[ 
https://issues.apache.org/jira/browse/HIVE-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11685:
----------------------------------
    Description: 
CompactorMR submits an MR job to do compaction and waits for completion.
If the metastore needs to be restarted, the restart kills in-flight compactions.

Ideally we'd want to add the job ID to the COMPACTION_QUEUE table (and include 
that in SHOW COMPACTIONS) and poll for it or register a callback so that the 
job survives a Metastore restart.

Also, when running revokeTimedoutWorker(), make sure to use this JobId to kill 
the job if it's still running.
Alternatively, if it's still running, maybe just assign a new worker_id and 
let it continue to run.

  was:
CompactorMR submits MR job to do compaction and waits for completion.
If the metastore need to be restarted, it will kill in-flight compactions.

I ideally we'd want to add job ID to the COMPACTION_QUEUE table (and include 
that in SHOW COMPACTIONS) and poll for it or register a callback so that the 
job survives Metastore restart

Also, 
when running revokeTimedoutWorker() make sure to take use this JobId to kill 
the job is it's still running.
Alternatively, if it's still running, maybe just a assign a new worker_id and 
let it continue to run.


> Restarting Metastore kills Compactions - store Hadoop job id in 
> COMPACTION_QUEUE
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-11685
>                 URL: https://issues.apache.org/jira/browse/HIVE-11685
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore, Transactions
>    Affects Versions: 1.0.1
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>
> CompactorMR submits an MR job to do compaction and waits for completion.
> If the metastore needs to be restarted, the restart kills in-flight compactions.
> Ideally we'd want to add the job ID to the COMPACTION_QUEUE table (and include 
> that in SHOW COMPACTIONS) and poll for it or register a callback so that the 
> job survives a Metastore restart.
> Also, when running revokeTimedoutWorker(), make sure to use this JobId to kill 
> the job if it's still running.
> Alternatively, if it's still running, maybe just assign a new worker_id and 
> let it continue to run.
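
To make the two options for revokeTimedoutWorker() concrete, here is a rough 
sketch of how a stored job ID could be used. It is only an illustration under 
stated assumptions: CompactionQueueSketch, JobClient, Row, recordSubmission 
and revokeTimedOutWorkers are hypothetical names, the queue is modelled as an 
in-memory map rather than the real COMPACTION_QUEUE table, and none of this 
is Hive's or Hadoop's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of COMPACTION_QUEUE; illustrative only, not Hive's API.
class CompactionQueueSketch {

    // Hypothetical view of the cluster: reports whether a job is running and can kill it.
    interface JobClient {
        boolean isRunning(String jobId);
        void kill(String jobId);
    }

    // One queue row. jobId stands for the proposed new column.
    static final class Row {
        final long id;
        String workerId;  // null when no worker owns the row
        String jobId;     // null until a job has been submitted
        long heartbeat;   // millis of the last sign of life from the owning worker
        Row(long id) { this.id = id; }
    }

    // Sorted by row id so results come back in a predictable order.
    final Map<Long, Row> rows = new TreeMap<>();

    // Store the job id as soon as the job is submitted, so it can be found after a restart.
    void recordSubmission(long id, String workerId, String jobId, long now) {
        Row r = rows.computeIfAbsent(id, k -> new Row(k));
        r.workerId = workerId;
        r.jobId = jobId;
        r.heartbeat = now;
    }

    // Handle rows whose worker has been silent for longer than timeoutMs.
    // killRunning = true:  kill the recorded job and free the row for a fresh attempt.
    // killRunning = false: leave the job running and hand the row to newWorkerId.
    // Returns the ids of the rows that were changed.
    List<Long> revokeTimedOutWorkers(JobClient jobs, long now, long timeoutMs,
                                     boolean killRunning, String newWorkerId) {
        List<Long> touched = new ArrayList<>();
        for (Row r : rows.values()) {
            if (r.workerId == null || now - r.heartbeat <= timeoutMs) {
                continue; // unowned, or the worker is still within its timeout
            }
            boolean running = r.jobId != null && jobs.isRunning(r.jobId);
            if (running && !killRunning) {
                r.workerId = newWorkerId; // adopt the job instead of killing it
                r.heartbeat = now;
            } else {
                if (running) {
                    jobs.kill(r.jobId);
                }
                r.workerId = null; // release the row so it can be picked up again
                r.jobId = null;
            }
            touched.add(r.id);
        }
        return touched;
    }
}
```

The killRunning flag is just a way to show both alternatives side by side; a 
real change would presumably pick one of them, and would persist the job ID 
in the metastore database rather than in memory.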



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
