Apekshit Kumar created GOBBLIN-1965:
---------------------------------------

             Summary: Extending Hive data movement CDC check to support table 
regex lookup
                 Key: GOBBLIN-1965
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1965
             Project: Apache Gobblin
          Issue Type: Bug
          Components: misc
    Affects Versions: 0.15.0
            Reporter: Apekshit Kumar


*Context:*
Currently, job lock acquisition fails whenever the NameNode (NN) has 
availability issues, and the job fails as a result.

 
{code:java}
mysql> select deployment_id, status, count(*) from gobblin_job_queue where 
created_date >= '2021-09-01' and created_date < '2021-10-01' and 
failure_exception like '%NullPointerException%' group by deployment_id, status 
order by deployment_id, status;
+---------------+--------+----------+
| deployment_id | status | count(*) |
+---------------+--------+----------+
| 1             | FAILED | 253      |
| 2             | FAILED | 6        |
| 230           | FAILED | 157      |
| 22702         | FAILED | 11       |
| 22703         | FAILED | 13       |
| 22704         | FAILED | 2        |
+---------------+--------+----------+
6 rows in set (1.04 sec)

mysql> select deployment_id, status, count(*) from gobblin_job_queue where 
created_date >= '2021-08-01' and created_date < '2021-09-01' and 
failure_exception like '%NullPointerException%' group by deployment_id, status 
order by deployment_id, status;
+---------------+--------+----------+
| deployment_id | status | count(*) |
+---------------+--------+----------+
| 1             | FAILED | 1091     |
| 3             | FAILED | 1598     |
| 230           | FAILED | 15870    |
+---------------+--------+----------+
3 rows in set (1.18 sec)
{code}
*Acceptance Criteria:*
Job lock acquisition should be made resilient to NN issues, probably by moving 
locks to ZooKeeper (ZK) or by retrying lock acquisition when NN issues 
(IOExceptions) occur.
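
A minimal sketch of the retry option, assuming the lock attempt can be wrapped 
in a small callback. The names RetryingLockAcquirer, LockAttempt and 
acquireWithRetry are hypothetical and are not existing Gobblin APIs. The retry 
count and backoff values are placeholders, not tuned recommendations.

```java
import java.io.IOException;

/** Hypothetical helper: retries a lock attempt when it fails with an IOException. */
public class RetryingLockAcquirer {

    /** Hypothetical stand-in for one lock attempt; not a Gobblin interface. */
    @FunctionalInterface
    public interface LockAttempt {
        boolean tryLock() throws IOException;
    }

    /**
     * Calls attempt.tryLock() up to maxAttempts times. An IOException (for
     * example, the NN being unreachable) triggers a sleep with exponential
     * backoff and another attempt. A normal return value (true or false) is
     * passed through without retrying, so a lock that is legitimately held by
     * someone else is not hammered.
     */
    public static boolean acquireWithRetry(LockAttempt attempt, int maxAttempts, long initialBackoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMillis = initialBackoffMillis;
        IOException lastFailure = null;
        for (int attemptNo = 1; attemptNo <= maxAttempts; attemptNo++) {
            try {
                return attempt.tryLock();
            } catch (IOException e) {
                lastFailure = e;
                if (attemptNo < maxAttempts) {
                    Thread.sleep(backoffMillis);
                    backoffMillis *= 2; // exponential backoff between attempts
                }
            }
        }
        // All attempts failed with IOException; surface the last one to the caller.
        throw lastFailure;
    }
}
```

A caller would pass its existing lock call as the callback, for example 
acquireWithRetry(() -> jobLock.tryLock(), 5, 1000L), where jobLock is whatever 
lock object the job already uses. Moving locks to ZK would be a separate 
change and is not covered by this sketch.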



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
