sv2000 commented on a change in pull request #2960: GOBBLIN-1120: Reinitialize 
HelixManager when Helix participant check …
URL: https://github.com/apache/incubator-gobblin/pull/2960#discussion_r410495838
 
 

 ##########
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixAssignedParticipantCheck.java
 ##########
 @@ -104,20 +122,23 @@ public boolean isCompleted()
    */
   @Override
   public void execute() throws CommitStepException {
-    if (!helixManager.isConnected()) {
-      try {
-        helixManager.connect();
-      } catch (Exception e) {
-        throw new CommitStepException(String.format("Helix instance %s unable to connect to Helix/ZK", helixInstanceName));
-      }
-    }
-    TaskDriver taskDriver = new TaskDriver(helixManager);
     log.info(String.format("HelixParticipantCheck step called for Helix Instance: %s, Helix job: %s, Helix partition: %d",
         this.helixInstanceName, this.helixJob, this.partitionNum));
 
     //Query Helix to get the currently assigned participant for the Helix partitionNum
     Callable callable = () -> {
-      JobContext jobContext = taskDriver.getJobContext(helixJob);
+      JobContext jobContext;
+      try {
+        TaskDriver taskDriver = new TaskDriver(helixManager);
+        jobContext = taskDriver.getJobContext(helixJob);
+      } catch (Exception e) {
 
 Review comment:
   Helix throws an IllegalStateException (a RuntimeException) when the 
underlying ZkClient is closed. Since it is an unchecked exception, I added a 
clause to catch the generic Exception. 
   It is hard to determine exactly why the ZkClient is closed in the 
first place. One possibility is a temporary network glitch that causes the 
container to lose connectivity with ZK. Another possibility is a ZK session 
expiry which, if not properly handled by the underlying client (i.e., the 
client does not re-establish a connection with the server), may cause the 
client to be closed. The logic here ensures that we refresh the Helix manager 
in these scenarios. 
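
   To illustrate the intent, here is a minimal, self-contained sketch of the 
refresh-on-failure pattern. It is not the code in this PR and does not use the 
Helix API: `RefreshingCaller`, `Connection`, and `queryWithRefresh` are 
hypothetical names, and `Connection` merely stands in for a handle (such as a 
Helix manager) whose underlying client may have been closed. 

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of Gobblin or Helix. */
class RefreshingCaller {

  /** Hypothetical stand-in for a handle whose underlying client may be closed. */
  interface Connection {
    String query(String key);
  }

  private final Supplier<Connection> factory;
  private Connection connection;
  private int refreshCount = 0;

  RefreshingCaller(Supplier<Connection> factory) {
    this.factory = factory;
    this.connection = factory.get();
  }

  /**
   * Runs the query once. If it fails with any exception (including unchecked
   * ones such as IllegalStateException), builds a fresh connection and retries
   * exactly once. A failure on the retry propagates to the caller.
   */
  String queryWithRefresh(String key) {
    try {
      return connection.query(key);
    } catch (Exception e) {
      // Catching the generic Exception also covers RuntimeExceptions.
      connection = factory.get();
      refreshCount++;
      return connection.query(key);
    }
  }

  int getRefreshCount() {
    return refreshCount;
  }
}
```

   The sketch retries only once on purpose: if a freshly created connection 
also fails, the problem is unlikely to be a stale client, and the exception 
should surface to the caller. 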

