sv2000 commented on a change in pull request #2960: GOBBLIN-1120: Reinitialize HelixManager when Helix participant check … URL: https://github.com/apache/incubator-gobblin/pull/2960#discussion_r410495838
##########
File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixAssignedParticipantCheck.java
##########
@@ -104,20 +122,23 @@ public boolean isCompleted()
    */
   @Override
   public void execute() throws CommitStepException {
-    if (!helixManager.isConnected()) {
-      try {
-        helixManager.connect();
-      } catch (Exception e) {
-        throw new CommitStepException(String.format("Helix instance %s unable to connect to Helix/ZK", helixInstanceName));
-      }
-    }
-    TaskDriver taskDriver = new TaskDriver(helixManager);
     log.info(String.format("HelixParticipantCheck step called for Helix Instance: %s, Helix job: %s, Helix partition: %d", this.helixInstanceName, this.helixJob, this.partitionNum));

     //Query Helix to get the currently assigned participant for the Helix partitionNum
     Callable callable = () -> {
-      JobContext jobContext = taskDriver.getJobContext(helixJob);
+      JobContext jobContext;
+      try {
+        TaskDriver taskDriver = new TaskDriver(helixManager);
+        jobContext = taskDriver.getJobContext(helixJob);
+      } catch (Exception e) {

Review comment:
Helix throws an IllegalStateException (which is a RuntimeException) when the underlying ZkClient is closed. Since it is an unchecked exception, I added a clause to catch the generic Exception. It is hard to determine exactly why the ZkClient was closed in the first place. One possibility is a temporary network glitch that causes the container to lose connectivity with ZK. Another possibility is a ZK session expiry which, if not properly handled by the underlying client (i.e., the client does not re-establish a connection with the server), may cause the client to be closed. The logic here ensures that we refresh the Helix manager in these scenarios.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
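
The pattern described in the review comment above (catch the generic Exception so that unchecked failures such as IllegalStateException are covered, then refresh the manager) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the pull request: the class RefreshOnFailure, the Manager interface, and the factory parameter are hypothetical stand-ins for the real HelixManager and TaskDriver, and retrying the query once after the refresh is an assumption made only for this example.

```java
import java.util.concurrent.Callable;
import java.util.function.Supplier;

/** Sketch only: refresh a connection-backed manager when a query fails. All names are hypothetical. */
public class RefreshOnFailure {

  /** Hypothetical stand-in for a manager whose underlying client may have been closed. */
  public interface Manager {
    String query(String job); // may throw an unchecked exception, e.g. IllegalStateException
  }

  private final Supplier<Manager> factory; // knows how to build a fresh manager
  private Manager manager;

  public RefreshOnFailure(Supplier<Manager> factory) {
    this.factory = factory;
    this.manager = factory.get();
  }

  /** Returns a task that queries the manager and refreshes it if the query fails. */
  public Callable<String> queryTask(String job) {
    return () -> {
      try {
        return manager.query(job);
      } catch (Exception e) {
        // Catching Exception (not a specific checked type) also covers
        // unchecked failures such as IllegalStateException.
        manager = factory.get();   // refresh: replace the stale manager
        return manager.query(job); // assumption: retry once with the new manager
      }
    };
  }
}
```

If the first manager's query throws, the task swaps in a new manager from the factory and the second attempt runs against it; a failure on the second attempt propagates to the caller.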