[ 
https://issues.apache.org/jira/browse/GOBBLIN-1702?focusedWorklogId=807214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-807214
 ]

ASF GitHub Bot logged work on GOBBLIN-1702:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Sep/22 23:06
            Start Date: 08/Sep/22 23:06
    Worklog Time Spent: 10m 
      Work Description: homatthew commented on code in PR #3556:
URL: https://github.com/apache/gobblin/pull/3556#discussion_r966489907


##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixUtils.java:
##########
@@ -278,9 +278,12 @@ static void waitJobCompletion(HelixManager helixManager, String workFlowName, St
           case STOPPING:
             log.info("Waiting for job {} to complete... State - {}", jobName, jobState);
             Thread.sleep(TimeUnit.SECONDS.toMillis(1L));
+            if (stoppingStateEndTime == 0) {
+              stoppingStateEndTime = currentTimeMillis + stoppingStateTimeoutInSeconds * 1000;
+            }
             // Workaround for a Helix bug where a job may be stuck in the STOPPING state due to an unresponsive task.
-            if (System.currentTimeMillis() > stoppingStateEndTime) {
-              log.info("Deleting workflow {}", workFlowName);
+            if (stoppingStateEndTime != 0 && System.currentTimeMillis() > stoppingStateEndTime) {

Review Comment:
   Nit: the `stoppingStateEndTime != 0` check is redundant; that case won't happen because of line 281.



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixUtils.java:
##########
@@ -260,7 +260,7 @@ static void waitJobCompletion(HelixManager helixManager, String workFlowName, St
       endTime = currentTimeMillis + timeoutInSeconds.get() * 1000;

Review Comment:
   `currentTimeMillis` should be renamed to something like start time, because it 
actually denotes the start of the job and not the current time. If we need the 
current time, just use the system time.
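
   A minimal sketch of the naming this comment suggests, not the code in the PR. The class `WaitTiming` and its methods are hypothetical; the idea is that the start timestamp is captured once and named as such, while the current time is read fresh from `System.currentTimeMillis()` on every check.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Gobblin or Helix.
final class WaitTiming {
  private WaitTiming() {}

  // The deadline is derived from the moment waiting started, so the input is named as a start time.
  public static long deadlineMillis(long startTimeMillis, long timeoutInSeconds) {
    return startTimeMillis + TimeUnit.SECONDS.toMillis(timeoutInSeconds);
  }

  // The caller passes the current time, typically System.currentTimeMillis(), read at each poll.
  public static boolean timedOut(long nowMillis, long deadlineMillis) {
    return nowMillis > deadlineMillis;
  }
}
```

   A caller would compute the deadline once from the start time, then call `WaitTiming.timedOut(System.currentTimeMillis(), deadline)` inside the polling loop.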





Issue Time Tracking
-------------------

    Worklog Id:     (was: 807214)
    Time Spent: 0.5h  (was: 20m)

> Fix Bug when wait and checking helix job state till completion
> --------------------------------------------------------------
>
>                 Key: GOBBLIN-1702
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1702
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-cluster
>            Reporter: Hanghang Liu
>            Assignee: Hung Tran
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently HelixUtils.waitJobCompletion() has a bug: when a job is in the STOPPING 
> state, it immediately tries to delete it instead of waiting for the job itself to 
> transition to the STOPPED state, because stoppingStateEndTime is not set 
> correctly.
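
The behavior described above can be illustrated with a small sketch. This is not the Gobblin implementation: the class `StoppingDeadline` and its `expired` method are hypothetical, and the sketch assumes the deadline should be measured from the first poll that observes STOPPING. The clock value is passed in as a parameter so the logic can be exercised without sleeping.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Gobblin or Helix.
final class StoppingDeadline {
  private final long timeoutMillis;
  // 0 is a sentinel meaning "STOPPING has not been observed yet".
  private long endTimeMillis = 0L;

  public StoppingDeadline(long timeoutInSeconds) {
    this.timeoutMillis = TimeUnit.SECONDS.toMillis(timeoutInSeconds);
  }

  // Call on every poll that sees the job in STOPPING.
  // Returns true only after the timeout has elapsed since the first such poll.
  public boolean expired(long nowMillis) {
    if (endTimeMillis == 0L) {
      // Assumption: the clock starts at the first STOPPING observation.
      endTimeMillis = nowMillis + timeoutMillis;
    }
    return nowMillis > endTimeMillis;
  }
}
```

If the deadline were left at its initial value of 0, the comparison against the current time would be true on the very first poll, which matches the immediate deletion the issue describes. With the lazy initialization, the first poll only arms the deadline.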



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
