homatthew commented on code in PR #3580:
URL: https://github.com/apache/gobblin/pull/3580#discussion_r997494999


##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java:
##########
@@ -431,8 +431,15 @@ private void cancelJobIfRequired(DeleteJobConfigArrivalEvent deleteJobArrival) t
       if (jobNameToWorkflowIdMap.containsKey(deleteJobArrival.getJobName())) {
         String workflowId = jobNameToWorkflowIdMap.get(deleteJobArrival.getJobName());
         TaskDriver taskDriver = new TaskDriver(this.jobHelixManager);
-        taskDriver.waitToStop(workflowId, this.helixJobStopTimeoutMillis);
-        LOGGER.info("Stopped workflow: {}", deleteJobArrival.getJobName());
+        // Cancel the job by calling either Delete or Stop Helix API
+        if (PropertiesUtils.getPropAsBoolean(jobConfig, GobblinClusterConfigurationKeys.CANCEL_HELIX_JOB_BY_DELETE,
+            GobblinClusterConfigurationKeys.DEFAULT_CANCEL_HELIX_JOB_BY_DELETE)) {
+          taskDriver.delete(workflowId);
+          LOGGER.info("Canceling Helix workflow: {} through delete API", deleteJobArrival.getJobName());
+        } else {

Review Comment:
   A few open questions for us to discuss:
   1. We call `waitToStop` in other places as well. Should we consider replacing all of those calls with the delete API? We use the taskrunner code basically everywhere, and that is the root cause of the long stop times.
   2. We should plan on removing this config if we see no issues after further testing. Can we create a JIRA ticket for this and reference it in a code comment?
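
For question 1, here is a rough sketch of how the stop-versus-delete choice could live in one shared helper, so that other `waitToStop` call sites could switch over in a single place. This is only an illustration, not code from the PR: `WorkflowDriver` is a hypothetical stand-in for the two `TaskDriver` calls visible in the diff (without their real exception signatures), `CANCEL_BY_DELETE_KEY` is a made-up key name, and the `false` default is an assumption, since the actual value of `DEFAULT_CANCEL_HELIX_JOB_BY_DELETE` is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class WorkflowCancelSketch {

  // Hypothetical stand-in for the two TaskDriver calls used in the diff.
  // The real Helix methods may declare checked exceptions; omitted here.
  interface WorkflowDriver {
    void waitToStop(String workflowId, long timeoutMillis);
    void delete(String workflowId);
  }

  // Hypothetical key name; the real key lives in GobblinClusterConfigurationKeys.
  static final String CANCEL_BY_DELETE_KEY = "example.cancel.helix.job.by.delete";

  // Single place that decides how a workflow is cancelled.
  // Returns the name of the path taken so callers can log it.
  static String cancelWorkflow(WorkflowDriver driver, Properties jobConfig,
      String workflowId, long stopTimeoutMillis) {
    // Assumed default of "false" keeps the old waitToStop behavior.
    boolean byDelete =
        Boolean.parseBoolean(jobConfig.getProperty(CANCEL_BY_DELETE_KEY, "false"));
    if (byDelete) {
      driver.delete(workflowId);
      return "delete";
    }
    driver.waitToStop(workflowId, stopTimeoutMillis);
    return "stop";
  }

  // Test double that records which driver call was made.
  static class RecordingDriver implements WorkflowDriver {
    final List<String> calls = new ArrayList<>();

    @Override
    public void waitToStop(String workflowId, long timeoutMillis) {
      calls.add("waitToStop:" + workflowId);
    }

    @Override
    public void delete(String workflowId) {
      calls.add("delete:" + workflowId);
    }
  }

  public static void main(String[] args) {
    RecordingDriver driver = new RecordingDriver();
    Properties jobConfig = new Properties();
    jobConfig.setProperty(CANCEL_BY_DELETE_KEY, "true");
    String path = cancelWorkflow(driver, jobConfig, "wf_1", 1000L);
    System.out.println(path + " " + driver.calls);
  }
}
```

If we go this way, removing the flag later (question 2) would only touch `cancelWorkflow` rather than every call site.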



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.