veghlaci05 commented on code in PR #3955: URL: https://github.com/apache/hive/pull/3955#discussion_r1082283569
##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java:
##########
@@ -118,19 +119,23 @@ public void run() {
             singleRun.cancel(true);
             executor.shutdownNow();
             executor = getTimeoutHandlingExecutor();
+            err = true;
           } catch (ExecutionException e) {
             LOG.info("Exception during executing compaction", e);
+            err = true;
           } catch (InterruptedException ie) {
             // do not ignore interruption requests
             return;
+          } catch (Throwable t) {
+            err = true;
           }
           doPostLoopActions(System.currentTimeMillis() - startedAt);
           // If we didn't try to launch a job it either means there was no work to do or we got
-          // here as the result of a communication failure with the DB. Either way we want to wait
+          // here as the result of an error like communication failure with the DB, schema failures etc. Either way we want to wait
           // a bit before, otherwise we can start over the loop immediately.
-          if (!launchedJob && !stop.get()) {
+          if ((!launchedJob || err) && !stop.get()) {
             Thread.sleep(SLEEP_TIME);

Review Comment:
   @akshat0395 I think the sleep timeout should be increased starting from the second consecutive failed job execution, until it reaches some defined maximum value. However, with backoff we will still need the err flag to differentiate between genuine errors and iterations where there were simply no compactions to work on.
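   To make the backoff suggestion concrete, here is a minimal, self-contained Java sketch. It is not the Worker code from this PR; the names BackoffSketch, nextSleepMillis, BASE_SLEEP_TIME, MAX_SLEEP_TIME and consecutiveFailures are hypothetical and only illustrate how the err flag could drive an increasing sleep from the second consecutive failure onward, capped at a maximum:

   // Hypothetical sketch of consecutive-failure backoff; NOT the actual Worker implementation.
   public class BackoffSketch {

       private static final long BASE_SLEEP_TIME = 5_000L;   // hypothetical base interval (ms)
       private static final long MAX_SLEEP_TIME  = 320_000L; // hypothetical upper bound (ms)

       /**
        * Returns how long the worker loop should sleep. The first failed iteration
        * (and the "no work to do" case) sleeps the base interval; from the second
        * consecutive failure onward the interval doubles until it reaches the cap.
        */
       static long nextSleepMillis(int consecutiveFailures) {
           if (consecutiveFailures <= 1) {
               return BASE_SLEEP_TIME;
           }
           long backoff = BASE_SLEEP_TIME << Math.min(consecutiveFailures - 1, 10);
           return Math.min(backoff, MAX_SLEEP_TIME);
       }

       public static void main(String[] args) {
           // 0 or 1 failures -> base sleep; 2, 3, 4, ... -> doubling until capped.
           for (int failures = 0; failures <= 8; failures++) {
               System.out.println(failures + " consecutive failure(s) -> " + nextSleepMillis(failures) + " ms");
           }
       }
   }

   In the loop itself, consecutiveFailures would be incremented when err is true and reset to zero otherwise, so an ordinary "no work to do" iteration keeps sleeping the base interval while repeated errors back off toward the maximum.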