[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16805977#comment-16805977 ]

Vaibhav Gumashta commented on HIVE-21402:
-----------------------------------------

+1

> Compaction state remains 'working' when major compaction fails
> --------------------------------------------------------------
>
>                 Key: HIVE-21402
>                 URL: https://issues.apache.org/jira/browse/HIVE-21402
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 4.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>         Attachments: HIVE-21402.patch
>
>
> When calcite is not on the HMS classpath, and query based compaction is
> enabled then the compaction fails with NoClassDefFound error. Since the catch
> block only catches Exceptions the following code block is not executed:
> {code:java}
> } catch (Exception e) {
>   LOG.error("Caught exception while trying to compact " + ci +
>       ". Marking failed to avoid repeated failures, " +
>       StringUtils.stringifyException(e));
>   msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>   msc.abortTxns(Collections.singletonList(compactorTxnId));
> }
> {code}
> So the compaction is not set to failed.
> Would be better to catch Throwable instead of Exception

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
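The behaviour described in the issue can be reproduced outside Hive. The following is a minimal sketch, not Hive code: the class `CatchScopeDemo`, its methods, and the string states are hypothetical stand-ins for the worker's try/catch and the compaction status. It only relies on the fact that `NoClassDefFoundError` is an `Error`, so a `catch (Exception e)` handler never runs for it.

```java
/** Hypothetical stand-in for the worker's try/catch; this is not Hive code. */
final class CatchScopeDemo {

    /** Simulates a compaction step that fails the way a missing jar does. */
    static void compact() {
        throw new NoClassDefFoundError("org/apache/calcite/plan/RelOptRule");
    }

    /** Inner handler catches only Exception; the outer handler only "logs". */
    static String runCatchingException() {
        String state = "working";
        try {
            try {
                compact();
            } catch (Exception e) {
                // Never reached: NoClassDefFoundError is an Error, not an Exception.
                state = "failed";
            }
        } catch (Throwable outer) {
            // Stand-in for an outer loop that logs and moves on without touching the state.
        }
        return state;
    }

    /** Inner handler catches Throwable, so the state is updated. */
    static String runCatchingThrowable() {
        String state = "working";
        try {
            compact();
        } catch (Throwable t) {
            state = "failed";
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(runCatchingException());  // prints "working"
        System.out.println(runCatchingThrowable());  // prints "failed"
    }
}
```

The first variant leaves the state untouched even though the failure was seen and logged, which matches the symptom in the issue title.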
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802545#comment-16802545 ]

Peter Vary commented on HIVE-21402:
-----------------------------------

[~vgumashta]: The rethrown throwable will be caught again a few lines below, and will be printed again here:

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java#L239]

Basically we finish the worker job anyway (with or without the patch); before the patch, the status of the compaction is just not updated to match this. I think this is the only change the fix makes.
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802498#comment-16802498 ]

Vaibhav Gumashta commented on HIVE-21402:
-----------------------------------------

[~pvary] How about we catch the throwable, do the clean up and then throw it again?
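The catch, clean up, rethrow pattern suggested here could look roughly like the sketch below. Everything in it is an assumption made for illustration: `CleanupThenRethrowDemo` and its methods are hypothetical, and the `markFailed` / `abortTxns` strings merely stand in for the metastore calls quoted in the issue description. It is not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of "catch Throwable, clean up, rethrow"; not the Hive patch. */
final class CleanupThenRethrowDemo {

    /** Records what happened, in order, so the control flow is visible. */
    static final List<String> events = new ArrayList<>();

    /** One compaction attempt: on any Throwable, do the cleanup and rethrow it. */
    static void compactOnce() throws Throwable {
        try {
            throw new NoClassDefFoundError("org/apache/calcite/plan/RelOptRule");
        } catch (Throwable t) {
            events.add("markFailed");  // stand-in for msc.markFailed(...)
            events.add("abortTxns");   // stand-in for msc.abortTxns(...)
            throw t;                   // let the caller see the original failure
        }
    }

    /** Stand-in for the outer worker loop, which catches and logs whatever escapes. */
    static List<String> runWorkerIteration() {
        events.clear();
        try {
            compactOnce();
        } catch (Throwable t) {
            events.add("outer-logged:" + t.getClass().getSimpleName());
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        // prints [markFailed, abortTxns, outer-logged:NoClassDefFoundError]
        System.out.println(runWorkerIteration());
    }
}
```

In this sketch the cleanup always runs before the outer handler sees the failure, so the status is updated while the original error is still surfaced to the caller.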
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789655#comment-16789655 ]

Peter Vary commented on HIVE-21402:
-----------------------------------

Yeah, and that catch just prints out the error to the log and leaves the compaction in "working" status. That left me scratching my head for a while :D

My understanding of the compaction is the following (mostly from the documentation ATM):
* If a compaction fails then it is put into the COMPLETED_COMPACTION table with the status marked as failed, and it will be retried later if the conditions are still met.
* If the number of failures for that compaction is bigger than {{metastore.compactor.initiator.failed.compacts.threshold}} then it will not be scheduled again.
* If a compaction is found in the "working" state for longer than {{hive.compactor.worker.timeout}} by the initiator thread then it is put back to the "initiated" state, so it will be queued again later. The config comment says "declared failed", but I think it does not put a new entry into the COMPLETED_COMPACTION table, so it is not counted when checking against the failed.compacts.threshold.

So if my understanding of the above process is correct, then if we catch the Throwable we will have a few (by default 2) failed compactions very close to each other; on the other hand, if we do not catch Throwable then we will have a continuously "working" compaction forever.

Or maybe I am totally off - learning/learning/learning :) :) :)

Thanks,
Peter
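The trade-off described in this comment can be put into a toy model. This is a sketch of the process as understood in the comment, not of Hive's actual initiator or worker code: the class `CompactionRetryModel`, its method, and the "stop once the recorded failures reach the threshold" comparison are all assumptions made for illustration.

```java
/** Toy model of the retry behaviour described above; hypothetical, not Hive code. */
final class CompactionRetryModel {

    /**
     * Simulates up to {@code rounds} scheduling rounds for a compaction that always fails.
     *
     * @param marksFailed     true if the worker records the failure (catching Throwable),
     *                        false if the compaction is left "working" and later reset to
     *                        "initiated" by the timeout without a failure being recorded
     * @param failedThreshold stand-in for the failed-compactions threshold setting
     * @return {attempts made, failures recorded}
     */
    static int[] simulate(boolean marksFailed, int failedThreshold, int rounds) {
        int attempts = 0;
        int recordedFailures = 0;
        for (int i = 0; i < rounds; i++) {
            // Assumption: scheduling stops once recorded failures reach the threshold.
            if (recordedFailures >= failedThreshold) {
                break;
            }
            attempts++;
            if (marksFailed) {
                recordedFailures++;
            }
            // Otherwise nothing is recorded, so the threshold can never be reached.
        }
        return new int[] {attempts, recordedFailures};
    }

    public static void main(String[] args) {
        int[] withFix = simulate(true, 2, 10);
        int[] withoutFix = simulate(false, 2, 10);
        System.out.println(withFix[0] + " attempts, " + withFix[1] + " failures");       // 2 attempts, 2 failures
        System.out.println(withoutFix[0] + " attempts, " + withoutFix[1] + " failures"); // 10 attempts, 0 failures
    }
}
```

Under these assumptions, recording the failure bounds the number of retries, while leaving the compaction in "working" keeps it retrying for as long as the simulation runs.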
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789614#comment-16789614 ]

Ashutosh Chauhan commented on HIVE-21402:
-----------------------------------------

I am unsure of how to deal with unchecked exceptions. IMHO, it's not useful to catch Throwable, since in the case of an unchecked exception it's very likely that compaction will fail in the next iteration too; the error will probably be encountered every time (as was the case here with the missing jar). In such cases, it's better to let the Throwable escape (or raise InterruptedException) so that it's dealt with in the caller, which should then fail the process. For the end user it's not useful that HS2 keeps on running while every compaction fails.

On the other hand, there is already a catch(Throwable) in the outer loop: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java#L238
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789437#comment-16789437 ]

Peter Vary commented on HIVE-21402:
-----------------------------------

I agree with [~ashutoshc] that this is most probably a config issue. I usually use the same hive-site.xml for HMS and HS2. This might have caused the problem.

The relevant configs are:
{code:java}
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>5</value>
</property>
<property>
  <name>hive.compactor.crud.query.based</name>
  <value>true</value>
</property>
{code}

On the other hand, this highlights the issue that we can have other Throwables in that try-catch which might prevent setting the correct state of the compaction. So I think this change should go in anyway. What do you think?

Also - maybe in a follow-up Jira - it would be good to store the reason for the failure for the failed transactions. This could help troubleshooting tremendously.

[~vgumashta]: Is {{CompactionInfo.metaInfo}} designed to store info about the compaction? Could it be used to store the exception message for the failed compactions?

Thanks,
Peter
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788227#comment-16788227 ]

Ashutosh Chauhan commented on HIVE-21402:
-----------------------------------------

Actually, looking deeper, the actual compaction has now moved to ql, so compactions are run in HS2. HS2 should have calcite on the classpath. So, this is a deployment issue. cc: [~vgumashta]
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788066#comment-16788066 ]

Peter Vary commented on HIVE-21402:
-----------------------------------

[~ashutoshc]: Here is the exception I got:
{code:java}
18:04:35.600 [PeterVary-MBP15.local-33] ERROR org.apache.hadoop.hive.ql.txn.compactor.Worker - Caught an exception in the main loop of compactor worker PeterVary-MBP15.local-33, java.lang.NoClassDefFoundError: org/apache/calcite/plan/RelOptRule
	at org.apache.hadoop.hive.ql.Driver.dumpMetaCallTimingWithoutEx(Driver.java:1022)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:783)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1905)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1965)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1777)
	at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54)
	at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34)
	at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:407)
	at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:249)
	at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Unknown Source)
Caused by: java.lang.ClassNotFoundException: org.apache.calcite.plan.RelOptRule
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 11 more
{code}
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788012#comment-16788012 ]

Ashutosh Chauhan commented on HIVE-21402:
-----------------------------------------

[~pvary] What exception was thrown instead in that case?
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787144#comment-16787144 ]

Hive QA commented on HIVE-21402:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961525/HIVE-21402.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15819 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16389/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16389/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16389/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961525 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787114#comment-16787114 ]

Hive QA commented on HIVE-21402:
--------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 18s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 3s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16389/dev-support/hive-personality.sh |
| git revision | master / 8ab6ced |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16389/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.