sv2000 commented on a change in pull request #3002:
URL: https://github.com/apache/incubator-gobblin/pull/3002#discussion_r432744369

##########
File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinYarnTaskRunner.java
##########
@@ -209,6 +210,12 @@ public static void main(String[] args) throws Exception {
     } catch (ParseException pe) {
       printUsage(options);
       System.exit(1);
+    } catch (ContainerHealthCheckException e) {
+      // Ideally, we should not be catching this exception, as this is indicative of a non-recoverable exception. However,
+      // simply propagating the exception may prevent the container exit due to the presence of non-daemon threads present
+      // in the application. Hence, we catch this exception to invoke System.exit() which in turn ensures that all non-daemon threads are killed.
+      LOGGER.error("Exception encountered: {}", e);
+      System.exit(1);

Review comment:
   The System.exit(1) call will terminate the JVM running the GobblinYarnTaskRunner container. The AMRM client running inside YarnService will then receive a callback with ContainerExitStatus == ABORTED. In this case, the hostAffinity check will return false, meaning that the replacement container can be allocated on a different physical host.
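
For illustration, here is a minimal, self-contained sketch of why the catch block calls System.exit(1) instead of letting the exception propagate. It is not Gobblin code: NonDaemonExitDemo, HealthCheckException, startWorker and runTask are hypothetical stand-ins for the task runner, ContainerHealthCheckException and the application's worker threads. The point is only that a blocked non-daemon thread keeps the JVM alive after main finishes, whereas System.exit() terminates the process regardless.

```java
import java.util.concurrent.CountDownLatch;

public class NonDaemonExitDemo {

  /** Hypothetical stand-in for ContainerHealthCheckException. */
  static class HealthCheckException extends RuntimeException {
    HealthCheckException(String message) {
      super(message);
    }
  }

  /** Starts a non-daemon thread that blocks until the latch is released. */
  static Thread startWorker(CountDownLatch stop) {
    Thread worker = new Thread(() -> {
      try {
        stop.await();
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      }
    }, "worker");
    // Non-daemon threads keep the JVM alive until they terminate.
    worker.setDaemon(false);
    worker.start();
    return worker;
  }

  /** Simulates a task that fails with a non-recoverable health check error. */
  static void runTask() {
    throw new HealthCheckException("container unhealthy");
  }

  public static void main(String[] args) {
    // The latch is never released here, so the worker would block forever.
    startWorker(new CountDownLatch(1));
    try {
      runTask();
    } catch (HealthCheckException e) {
      System.err.println("Exception encountered: " + e.getMessage());
      // If the exception were allowed to propagate, the main thread would die
      // but the blocked worker would keep the process running. System.exit()
      // terminates the JVM even though the worker is still alive.
      System.exit(1);
    }
  }
}
```

Removing the System.exit(1) line and rethrowing the exception leaves the process hanging on the worker thread, which is the situation the code comment in the diff describes.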