[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231723&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231723
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 21:53
Start Date: 23/Apr/19 21:53
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 231723)
Time Spent: 6h 10m  (was: 6h)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  
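To illustrate the behavior described in the issue, here is a minimal Java sketch. It is not the code from the pull request: only the property name is taken from the description, the class and method names are hypothetical, and a plain java.util.Properties stands in for the configuration carried by the DELETE Spec.

```java
import java.util.Properties;

final class DeleteSpecCancellation {
  // Property name quoted from the issue description.
  static final String CANCEL_KEY = "gobblin.cluster.shouldCancelRunningJobOnDelete";

  /** Returns true only when the DELETE spec explicitly sets the flag to true. */
  static boolean shouldCancelRunningJob(Properties deleteSpecProps) {
    return Boolean.parseBoolean(deleteSpecProps.getProperty(CANCEL_KEY, "false"));
  }

  /** Hypothetical helper: describes what a DELETE with these properties would do. */
  static String describeDelete(Properties deleteSpecProps) {
    return shouldCancelRunningJob(deleteSpecProps)
        ? "cancel running job, then delete JobSpec"
        : "delete JobSpec only";
  }
}
```

With no flag set, the sketch falls back to the default described above, which is to delete the JobSpec without touching a running job.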



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231637
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 18:55
Start Date: 23/Apr/19 18:55
Worklog Time Spent: 10m 
  Work Description: htran1 commented on issue #2609: GOBBLIN-744: Support 
cancellation of a Helix workflow via a DELETE Spec.
URL: 
https://github.com/apache/incubator-gobblin/pull/2609#issuecomment-485930900
 
 
   +1
 



Issue Time Tracking
---

Worklog Id: (was: 231637)
Time Spent: 6h  (was: 5h 50m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231462
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 15:33
Start Date: 23/Apr/19 15:33
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277740598
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -82,6 +101,105 @@ public void testJobShouldComplete()
     suite.waitForAndVerifyOutputFiles();
   }
 
+  /**
+   * An integration test for restarting a Helix workflow via a JobSpec. This test case starts a Helix cluster with
+   * a {@link FsScheduledJobConfigurationManager}. The test case does the following:
+   * <ul>
+   *   <li> add a {@link org.apache.gobblin.runtime.api.JobSpec} that uses a {@link org.apache.gobblin.cluster.SleepingCustomTaskSource})
+   *   to {@link IntegrationJobRestartViaSpecSuite#FS_SPEC_CONSUMER_DIR}.  which is picked by the JobConfigurationManager. </li>
+   *   <li> the JobConfigurationManager sends a notification to the GobblinHelixJobScheduler which schedules the job for execution. The JobSpec is
+   *   also added to the JobCatalog for persistence. Helix starts a Workflow for this JobSpec. </li>
+   *   <li> We then add a {@link org.apache.gobblin.runtime.api.JobSpec} with UPDATE Verb to {@link IntegrationJobRestartViaSpecSuite#FS_SPEC_CONSUMER_DIR}.
+   *   This signals GobblinHelixJobScheduler (and, Helix) to first cancel the running job (i.e., Helix Workflow) started in the previous step.
+   *   <li> We inspect the state of the zNode corresponding to the Workflow resource in Zookeeper to ensure that its {@link org.apache.helix.task.TargetState}
+   *   is STOP. </li>
+   *   <li> Once the cancelled job from the previous steps is completed, the job will be re-launched for execution by the GobblinHelixJobScheduler.
+   *   We confirm the execution by again inspecting the zNode and ensuring its TargetState is START. </li>
+   * </ul>
+   */
+  @Test (dependsOnMethods = { "testJobShouldGetCancelled" })
+  public void testJobRestartViaSpec() throws Exception {
+    this.suite = new IntegrationJobRestartViaSpecSuite();
+    HelixManager helixManager = getHelixManager();
+
+    IntegrationJobRestartViaSpecSuite restartViaSpecSuite = (IntegrationJobRestartViaSpecSuite) this.suite;
+
+    //Add a new JobSpec to the path monitored by the SpecConsumer
+    restartViaSpecSuite.addJobSpec(IntegrationJobRestartViaSpecSuite.JOB_ID, SpecExecutor.Verb.ADD.name());
+
+    //Start the cluster
+    restartViaSpecSuite.startCluster();
+
+    helixManager.connect();
+
+    AssertWithBackoff asserter1 = AssertWithBackoff.create().timeoutMs(3).maxSleepMs(1000).backoffFactor(1);
+    asserter1.assertTrue(isTaskStarted(helixManager, IntegrationJobRestartViaSpecSuite.JOB_ID),
+        "Waiting for the job to start...");
+
+    AssertWithBackoff asserter2 = AssertWithBackoff.create().maxSleepMs(100).timeoutMs(2000).backoffFactor(1);
+    asserter2.assertTrue(isTaskRunning(IntegrationJobRestartViaSpecSuite.TASK_STATE_FILE),"Waiting for the task to enter running state");
+
+    ZkClient zkClient = new ZkClient(this.zkConnectString);
+    PathBasedZkSerializer zkSerializer = ChainedPathZkSerializer.builder(new ZNRecordStreamingSerializer()).build();
+    zkClient.setZkSerializer(zkSerializer);
+
+    String clusterName = getHelixManager().getClusterName();
+    String zNodePath = Paths.get("/", clusterName, "CONFIGS", "RESOURCE", IntegrationJobRestartViaSpecSuite.JOB_ID).toString();
+
+    //Ensure that the Workflow is started
+    ZNRecord record = zkClient.readData(zNodePath);
+    String targetState = record.getSimpleField("TargetState");
+    Assert.assertEquals(targetState, TargetState.START.name());
+
+    //Add a JobSpec with UPDATE verb signalling the Helix cluster to restart the workflow
+    restartViaSpecSuite.addJobSpec(IntegrationJobRestartViaSpecSuite.JOB_ID, SpecExecutor.Verb.UPDATE.name());
+
+    AssertWithBackoff asserter3 = AssertWithBackoff.create().maxSleepMs(1000).timeoutMs(5000).backoffFactor(1);
+    asserter3.assertTrue(input -> {
+      //Inspect the zNode at the path corresponding to the Workflow resource. Ensure the target state of the resource is in
+      // the STOP state or that the zNode has been deleted.
+      ZNRecord recordNew = zkClient.readData(zNodePath, true);
+      String targetStateNew = null;
+      if (recordNew != null) {
+        targetStateNew = recordNew.getSimpleField("TargetState");
+      }
+      return recordNew == null || targetStateNew.equals(TargetState.STOP.name());
+    }, "Waiting for Workflow TargetState to be 

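One observation on the predicate in the hunk above: it calls equals on targetStateNew whenever the record exists, so a record without a "TargetState" field would raise a NullPointerException instead of returning false. The sketch below shows a null-safe form of the same check. It is only an illustration: a java.util.Map stands in for the record's simple fields, the class and method names are hypothetical, and the local enum just mirrors the two state names used in the diff.

```java
import java.util.Map;

final class WorkflowStateCheck {
  // Local stand-in mirroring the state names that appear in the diff.
  enum TargetState { START, STOP }

  /**
   * Returns true when the zNode is gone (null record) or its TargetState is STOP.
   * A record that lacks the TargetState field yields false rather than throwing.
   */
  static boolean isStoppedOrDeleted(Map<String, String> simpleFields) {
    if (simpleFields == null) {
      return true; // zNode deleted
    }
    // Calling equals on the constant avoids dereferencing a possibly null field value.
    return TargetState.STOP.name().equals(simpleFields.get("TargetState"));
  }
}
```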
[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231456
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 15:26
Start Date: 23/Apr/19 15:26
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277737062
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -17,24 +17,52 @@
 
 package org.apache.gobblin.cluster;
 
+import java.io.File;
+import java.io.IOException;
+
+import com.google.common.io.Files;
+
 import avro.shaded.com.google.common.base.Throwables;
 
 Review comment:
   spotted a faulty import from before
 



Issue Time Tracking
---

Worklog Id: (was: 231456)
Time Spent: 5h 20m  (was: 5h 10m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231464
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 15:34
Start Date: 23/Apr/19 15:34
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277741264
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/FsSpecConsumer.java
 ##
 @@ -74,6 +78,8 @@ public FsSpecConsumer(Config config) {
   return null;
 }
 
+    Arrays.sort(fileStatuses, Comparator.comparingLong(FileStatus::getModificationTime));
 
 Review comment:
   add a comment for why you're doing it and what you're expecting (ascending sort?)
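On the question of direction: Comparator.comparingLong orders by the extracted long in ascending order, so the call in the diff puts the file with the smallest modification time first. The sketch below demonstrates this with a hypothetical FileInfo class standing in for Hadoop's FileStatus. The comment wording is only a suggestion, and the stated reason (consuming specs in the order they were written) is an assumption about intent rather than something the diff confirms.

```java
import java.util.Arrays;
import java.util.Comparator;

final class SpecFileOrdering {
  /** Hypothetical stand-in for org.apache.hadoop.fs.FileStatus. */
  static final class FileInfo {
    final String name;
    final long modificationTime;

    FileInfo(String name, long modificationTime) {
      this.name = name;
      this.modificationTime = modificationTime;
    }

    long getModificationTime() {
      return modificationTime;
    }
  }

  /** Returns a copy of the input sorted oldest first; the input array is left untouched. */
  static FileInfo[] oldestFirst(FileInfo[] files) {
    FileInfo[] sorted = files.clone();
    // Sort ascending by modification time so that older spec files come first
    // (assumed intent: process specs in the order they were written).
    Arrays.sort(sorted, Comparator.comparingLong(FileInfo::getModificationTime));
    return sorted;
  }
}
```

A descending order would need an explicit reversed() on the comparator, which is one more reason to state the expectation in a comment.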
 



Issue Time Tracking
---

Worklog Id: (was: 231464)
Time Spent: 5h 50m  (was: 5h 40m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231461
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 15:31
Start Date: 23/Apr/19 15:31
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277739662
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -51,28 +68,30 @@ public void testJobShouldComplete()
     runAndVerify();
   }
 
-  @Test void testJobShouldGetCancelled() throws Exception {
-    this.suite =new IntegrationJobCancelSuite();
+  private HelixManager getHelixManager() {
     Config helixConfig = this.suite.getManagerConfig();
     String clusterName = helixConfig.getString(GobblinClusterConfigurationKeys.HELIX_CLUSTER_NAME_KEY);
     String instanceName = ConfigUtils.getString(helixConfig, GobblinClusterConfigurationKeys.HELIX_INSTANCE_NAME_KEY,
         GobblinClusterManager.class.getSimpleName());
-    String zkConnectString = helixConfig.getString(GobblinClusterConfigurationKeys.ZK_CONNECTION_STRING_KEY);
+    this.zkConnectString = helixConfig.getString(GobblinClusterConfigurationKeys.ZK_CONNECTION_STRING_KEY);
     HelixManager helixManager = HelixManagerFactory.getZKHelixManager(clusterName, instanceName, InstanceType.CONTROLLER, zkConnectString);
+    return helixManager;
+  }
 
+  @Test void testJobShouldGetCancelled() throws Exception {
+    this.suite =new IntegrationJobCancelSuite();
+    HelixManager helixManager = getHelixManager();
     suite.startCluster();
-
     helixManager.connect();
 
     TaskDriver taskDriver = new TaskDriver(helixManager);
 
-    while (TaskDriver.getWorkflowContext(helixManager, IntegrationJobCancelSuite.JOB_ID) == null) {
-      log.warn("Waiting for the job to start...");
-      Thread.sleep(1000L);
-    }
+    AssertWithBackoff asserter1 = AssertWithBackoff.create().maxSleepMs(1000).backoffFactor(1);
+    asserter1.assertTrue(isTaskStarted(helixManager, IntegrationJobCancelSuite.JOB_ID),
 
 Review comment:
   you could chain the entire call without needing the local variable asserter1
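A sketch of the chained form suggested here, written against a small hypothetical stand-in so that it compiles on its own. BackoffAssert is not Gobblin's AssertWithBackoff: its defaults, its BooleanSupplier parameter, and the awaitStarted helper are assumptions made for illustration. Only the fluent shape of the call is taken from the diff.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a fluent polling assertion; not Gobblin's AssertWithBackoff. */
final class BackoffAssert {
  private long timeoutMs = 30_000L;
  private long maxSleepMs = 1_000L;
  private double backoffFactor = 2.0;

  static BackoffAssert create() {
    return new BackoffAssert();
  }

  BackoffAssert timeoutMs(long value) { this.timeoutMs = value; return this; }
  BackoffAssert maxSleepMs(long value) { this.maxSleepMs = value; return this; }
  BackoffAssert backoffFactor(double value) { this.backoffFactor = value; return this; }

  /** Polls the condition until it holds, or throws TimeoutException once the deadline has passed. */
  void assertTrue(BooleanSupplier condition, String message)
      throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    long sleepMs = Math.min(10L, maxSleepMs);
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() >= deadline) {
        throw new TimeoutException(message);
      }
      Thread.sleep(sleepMs);
      // Grow the pause by the backoff factor, capped at maxSleepMs.
      sleepMs = Math.min(maxSleepMs, Math.max(1L, (long) Math.ceil(sleepMs * backoffFactor)));
    }
  }

  /** Usage: one chained expression, with no local variable holding the builder. */
  static void awaitStarted(BooleanSupplier jobStarted)
      throws TimeoutException, InterruptedException {
    BackoffAssert.create().maxSleepMs(1000).backoffFactor(1)
        .assertTrue(jobStarted, "Waiting for the job to start...");
  }
}
```

Chaining keeps each wait self-contained, which also removes the need for numbered locals such as asserter1 and asserter2.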
 



Issue Time Tracking
---

Worklog Id: (was: 231461)
Time Spent: 5.5h  (was: 5h 20m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231060
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 05:14
Start Date: 23/Apr/19 05:14
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277520130
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -82,6 +99,95 @@ public void testJobShouldComplete()
     suite.waitForAndVerifyOutputFiles();
   }
 
+  /**
+   * An integration test for cancelling a Helix workflow via a JobSpec. This test case starts a Helix cluster with
+   * a {@link FsScheduledJobConfigurationManager}. The test case does the following:
+   * <ul>
+   *   <li> add a {@link org.apache.gobblin.runtime.api.JobSpec} that uses a {@link org.apache.gobblin.cluster.SleepingCustomTaskSource})
+   *   to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.  which is picked by the JobConfigurationManager. </li>
+   *   <li> the JobConfigurationManager sends a notification to the GobblinHelixJobScheduler which schedules the job for execution. The JobSpec is
+   *   also added to the JobCatalog for persistence. Helix starts a Workflow for this JobSpec. </li>
+   *   <li> We then add a {@link org.apache.gobblin.runtime.api.JobSpec} with DELETE Verb to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.
+   *   This signals GobblinHelixJobScheduler (and, Helix) to delete the running job (i.e., Helix Workflow) started in the previous step. </li>
+   *   <li> Finally, we inspect the state of the zNode corresponding to the Workflow resource in Zookeeper to ensure that its {@link org.apache.helix.task.TargetState}
+   *   is STOP. </li>
+   * </ul>
+   */
+  @Test (dependsOnMethods = { "testJobShouldGetCancelled" })
+  public void testJobCancellationViaSpec() throws Exception {
+    this.suite = new IntegrationJobCancelViaSpecSuite();
+    HelixManager helixManager = getHelixManager();
+
+    IntegrationJobCancelViaSpecSuite cancelViaSpecSuite = (IntegrationJobCancelViaSpecSuite) this.suite;
+
+    //Add a new JobSpec to the path monitored by the SpecConsumer
+    cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, SpecExecutor.Verb.ADD.name());
+
+    //Start the cluster
+    cancelViaSpecSuite.startCluster();
+
+    helixManager.connect();
+
+    while (TaskDriver.getWorkflowContext(helixManager, IntegrationJobCancelViaSpecSuite.JOB_ID) == null) {
+      log.warn("Waiting for the job to start...");
+      Thread.sleep(1000L);
+    }
+
+    Assert.assertTrue(isTaskRunning(IntegrationJobCancelViaSpecSuite.TASK_STATE_FILE));
+
+    ZkClient zkClient = new ZkClient(this.zkConnectString);
+    PathBasedZkSerializer zkSerializer = ChainedPathZkSerializer.builder(new ZNRecordStreamingSerializer()).build();
+    zkClient.setZkSerializer(zkSerializer);
+
+    String clusterName = getHelixManager().getClusterName();
+    String zNodePath = Paths.get("/", clusterName, "CONFIGS", "RESOURCE", IntegrationJobCancelViaSpecSuite.JOB_ID).toString();
+
+    //Ensure that the Workflow is started
+    ZNRecord record = zkClient.readData(zNodePath);
+    String targetState = record.getSimpleField("TargetState");
+    Assert.assertEquals(targetState, TargetState.START.name());
+
+    //Add a JobSpec with DELETE verb signalling the Helix cluster to cancel the workflow
+    cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, SpecExecutor.Verb.DELETE.name());
+
+    int j = 0;
+    boolean successFlag = false;
+    while (true) {
 
 Review comment:
   Done.
 



Issue Time Tracking
---

Worklog Id: (was: 231060)
Time Spent: 5h 10m  (was: 5h)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231058
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 05:14
Start Date: 23/Apr/19 05:14
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277520077
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -17,24 +17,51 @@
 
 package org.apache.gobblin.cluster;
 
+import java.io.IOException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
 import avro.shaded.com.google.common.base.Throwables;
 import lombok.extern.slf4j.Slf4j;
 
 import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.TaskState;
 import org.apache.gobblin.runtime.task.BaseAbstractTask;
 
 @Slf4j
 public class SleepingTask extends BaseAbstractTask {
+  public static final String TASK_STATE_FILE_KEY = "task.state.file.path";
+
   private final long sleepTime;
+  private FileSystem fs;
 
 Review comment:
   Used Java File API.
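For readers following the thread, the idea behind the task state file can be sketched with the JDK alone: the task writes a marker file once it is running, and the test polls for that file. The sketch below uses java.nio.file rather than the java.io.File and Guava Files imports seen in the later revision of the diff, and the class and method names are hypothetical, so treat it as an illustration of the approach rather than the code in SleepingTask.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class RunningMarker {
  /** Creates the marker file (and any missing parent directories) to signal that the task is running. */
  static void markRunning(Path markerFile) throws IOException {
    Path parent = markerFile.getParent();
    if (parent != null) {
      Files.createDirectories(parent);
    }
    // Writing an empty byte array creates the file, or truncates it if it already exists.
    Files.write(markerFile, new byte[0]);
  }

  /** The test-side check: the task counts as running while the marker exists. */
  static boolean isTaskRunning(Path markerFile) {
    return Files.exists(markerFile);
  }

  /** Removes the marker so that a later test run does not see a stale file. */
  static void clear(Path markerFile) throws IOException {
    Files.deleteIfExists(markerFile);
  }
}
```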
 



Issue Time Tracking
---

Worklog Id: (was: 231058)
Time Spent: 4h 50m  (was: 4h 40m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231056&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231056
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 05:13
Start Date: 23/Apr/19 05:13
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277519926
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -324,10 +332,18 @@ public void handleUpdateJobConfigArrival(UpdateJobConfigArrivalEvent updateJobAr
   }
 
   @Subscribe
-  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent deleteJobArrival) {
+  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent deleteJobArrival) throws InterruptedException {
     LOGGER.info("Received delete for job configuration of job " + deleteJobArrival.getJobName());
     try {
       unscheduleJob(deleteJobArrival.getJobName());
+      Properties jobConfig = deleteJobArrival.getJobConfig();
 
 Review comment:
   Added a method.
 



Issue Time Tracking
---

Worklog Id: (was: 231056)
Time Spent: 4.5h  (was: 4h 20m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=231057&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231057
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 23/Apr/19 05:13
Start Date: 23/Apr/19 05:13
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277519974
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/FsScheduledJobConfigurationManager.java
 ##
 @@ -53,27 +51,26 @@ public FsScheduledJobConfigurationManager(EventBus eventBus, Config config, Muta
 
   @Override
   protected void fetchJobSpecs() throws ExecutionException, InterruptedException {
-    List> jobSpecs =
-        (List>) this._specConsumer.changedSpecs().get();
+    List> jobSpecs =
 
 Review comment:
   Done. 
 



Issue Time Tracking
---

Worklog Id: (was: 231057)
Time Spent: 4h 40m  (was: 4.5h)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230856&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230856
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 19:29
Start Date: 22/Apr/19 19:29
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277405599
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/suite/IntegrationJobCancelSuite.java
 ##
 @@ -26,19 +26,21 @@
 import com.typesafe.config.ConfigFactory;
 
 import org.apache.gobblin.cluster.GobblinClusterConfigurationKeys;
+import org.apache.gobblin.cluster.SleepingTask;
 import org.apache.gobblin.configuration.ConfigurationKeys;
 
 
 public class IntegrationJobCancelSuite extends IntegrationBasicSuite {
   public static final String JOB_ID = "job_HelloWorldTestJob_1234";
+  public static final String TASK_STATE_FILE = "/tmp/IntegrationJobCancelSuite/taskState/_RUNNING";
 
   @Override
   protected Map overrideJobConfigs(Config rawJobConfig) {
     Config newConfig = ConfigFactory.parseMap(ImmutableMap.of(
         ConfigurationKeys.SOURCE_CLASS_KEY, "org.apache.gobblin.cluster.SleepingCustomTaskSource",
         ConfigurationKeys.JOB_ID_KEY, JOB_ID,
         GobblinClusterConfigurationKeys.HELIX_JOB_TIMEOUT_ENABLED_KEY, Boolean.TRUE,
-        GobblinClusterConfigurationKeys.HELIX_JOB_TIMEOUT_SECONDS, 10L))
+        GobblinClusterConfigurationKeys.HELIX_JOB_TIMEOUT_SECONDS, 10L, SleepingTask.TASK_STATE_FILE_KEY, TASK_STATE_FILE))
 
 Review comment:
   formatting seems off
 



Issue Time Tracking
---

Worklog Id: (was: 230856)
Time Spent: 4h 20m  (was: 4h 10m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230855&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230855
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 19:28
Start Date: 22/Apr/19 19:28
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277405376
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -82,6 +99,95 @@ public void testJobShouldComplete()
 suite.waitForAndVerifyOutputFiles();
   }
 
+  /**
+   * An integration test for cancelling a Helix workflow via a JobSpec. This 
test case starts a Helix cluster with
+   * a {@link FsScheduledJobConfigurationManager}. The test case does the 
following:
+   * 
+   *add a {@link org.apache.gobblin.runtime.api.JobSpec} that uses a 
{@link org.apache.gobblin.cluster.SleepingCustomTaskSource})
+   *   to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.  
which is picked by the JobConfigurationManager. 
+   *the JobConfigurationManager sends a notification to the 
GobblinHelixJobScheduler which schedules the job for execution. The JobSpec is
+   *   also added to the JobCatalog for persistence. Helix starts a Workflow 
for this JobSpec. 
+   *We then add a {@link org.apache.gobblin.runtime.api.JobSpec} with 
DELETE Verb to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.
+   *   This signals GobblinHelixJobScheduler (and, Helix) to delete the 
running job (i.e., Helix Workflow) started in the previous step. 
+   *Finally, we inspect the state of the zNode corresponding to the 
Workflow resource in Zookeeper to ensure that its {@link 
org.apache.helix.task.TargetState}
+   *   is STOP. 
+   * 
+   */
+  @Test (dependsOnMethods = { "testJobShouldGetCancelled" })
+  public void testJobCancellationViaSpec() throws Exception {
+this.suite = new IntegrationJobCancelViaSpecSuite();
+HelixManager helixManager = getHelixManager();
+
+IntegrationJobCancelViaSpecSuite cancelViaSpecSuite = 
(IntegrationJobCancelViaSpecSuite) this.suite;
+
+//Add a new JobSpec to the path monitored by the SpecConsumer
+cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, 
SpecExecutor.Verb.ADD.name());
+
+//Start the cluster
+cancelViaSpecSuite.startCluster();
+
+helixManager.connect();
+
+while (TaskDriver.getWorkflowContext(helixManager, 
IntegrationJobCancelViaSpecSuite.JOB_ID) == null) {
+  log.warn("Waiting for the job to start...");
+  Thread.sleep(1000L);
+}
+
+
Assert.assertTrue(isTaskRunning(IntegrationJobCancelViaSpecSuite.TASK_STATE_FILE));
+
+ZkClient zkClient = new ZkClient(this.zkConnectString);
+PathBasedZkSerializer zkSerializer = ChainedPathZkSerializer.builder(new 
ZNRecordStreamingSerializer()).build();
+zkClient.setZkSerializer(zkSerializer);
+
+String clusterName = getHelixManager().getClusterName();
+String zNodePath = Paths.get("/", clusterName, "CONFIGS", "RESOURCE", 
IntegrationJobCancelViaSpecSuite.JOB_ID).toString();
+
+//Ensure that the Workflow is started
+ZNRecord record = zkClient.readData(zNodePath);
+String targetState = record.getSimpleField("TargetState");
+Assert.assertEquals(targetState, TargetState.START.name());
+
+//Add a JobSpec with DELETE verb signalling the Helix cluster to cancel 
the workflow
+cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, 
SpecExecutor.Verb.DELETE.name());
+
+int j = 0;
+boolean successFlag = false;
+while (true) {
 
 Review comment:
   AssertWithBackoff can be used here as well
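
   To illustrate the idea behind this suggestion, here is a minimal, self-contained sketch of bounded polling with exponential backoff. It is not the AssertWithBackoff API from Gobblin; `BackoffPoller`, `await`, and all parameter names are hypothetical and exist only for this example.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Gobblin: polls a condition with exponential backoff.
final class BackoffPoller {
  /**
   * Polls the condition until it holds or timeoutMillis elapses. The sleep between
   * attempts doubles each time, capped at maxSleepMillis.
   * Returns true if the condition held before the deadline, false on timeout or interrupt.
   */
  static boolean await(BooleanSupplier condition, long timeoutMillis,
                       long initialSleepMillis, long maxSleepMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    long sleep = initialSleepMillis;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      long remaining = deadline - System.currentTimeMillis();
      if (remaining <= 0) {
        return false;
      }
      try {
        // Never sleep past the deadline.
        Thread.sleep(Math.min(sleep, remaining));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
      sleep = Math.min(sleep * 2, maxSleepMillis);
    }
  }
}
```

   Unlike a bare `while (true)` loop with a counter, the loop above always terminates, and the test can fail with a clear assertion when `await` returns false.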
 



Issue Time Tracking
---

Worklog Id: (was: 230855)
Time Spent: 4h 10m  (was: 4h)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> This task supports the 

[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230854=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230854
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 19:27
Start Date: 22/Apr/19 19:27
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277405142
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -82,6 +99,95 @@ public void testJobShouldComplete()
 suite.waitForAndVerifyOutputFiles();
   }
 
+  /**
+   * An integration test for cancelling a Helix workflow via a JobSpec. This 
test case starts a Helix cluster with
+   * a {@link FsScheduledJobConfigurationManager}. The test case does the 
following:
+   * 
+   *add a {@link org.apache.gobblin.runtime.api.JobSpec} that uses a 
{@link org.apache.gobblin.cluster.SleepingCustomTaskSource})
+   *   to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.  
which is picked by the JobConfigurationManager. 
+   *the JobConfigurationManager sends a notification to the 
GobblinHelixJobScheduler which schedules the job for execution. The JobSpec is
+   *   also added to the JobCatalog for persistence. Helix starts a Workflow 
for this JobSpec. 
+   *We then add a {@link org.apache.gobblin.runtime.api.JobSpec} with 
DELETE Verb to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.
+   *   This signals GobblinHelixJobScheduler (and, Helix) to delete the 
running job (i.e., Helix Workflow) started in the previous step. 
+   *Finally, we inspect the state of the zNode corresponding to the 
Workflow resource in Zookeeper to ensure that its {@link 
org.apache.helix.task.TargetState}
+   *   is STOP. 
+   * 
+   */
+  @Test (dependsOnMethods = { "testJobShouldGetCancelled" })
+  public void testJobCancellationViaSpec() throws Exception {
+this.suite = new IntegrationJobCancelViaSpecSuite();
+HelixManager helixManager = getHelixManager();
+
+IntegrationJobCancelViaSpecSuite cancelViaSpecSuite = 
(IntegrationJobCancelViaSpecSuite) this.suite;
+
+//Add a new JobSpec to the path monitored by the SpecConsumer
+cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, 
SpecExecutor.Verb.ADD.name());
+
+//Start the cluster
+cancelViaSpecSuite.startCluster();
+
+helixManager.connect();
+
+while (TaskDriver.getWorkflowContext(helixManager, 
IntegrationJobCancelViaSpecSuite.JOB_ID) == null) {
 
 Review comment:
   Maybe you can use the AssertWithBackoff class from org.apache.gobblin.testing.
 



Issue Time Tracking
---

Worklog Id: (was: 230854)
Time Spent: 4h  (was: 3h 50m)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  





[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230853=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230853
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 19:23
Start Date: 22/Apr/19 19:23
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277403888
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -17,24 +17,51 @@
 
 package org.apache.gobblin.cluster;
 
+import java.io.IOException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
 import avro.shaded.com.google.common.base.Throwables;
 import lombok.extern.slf4j.Slf4j;
 
 import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.TaskState;
 import org.apache.gobblin.runtime.task.BaseAbstractTask;
 
 @Slf4j
 public class SleepingTask extends BaseAbstractTask {
+  public static final String TASK_STATE_FILE_KEY = "task.state.file.path";
+
   private final long sleepTime;
+  private FileSystem fs;
 
 Review comment:
   Do you need the Hadoop FileSystem here? If you use the Java file API, you can 
mark the file for deletion on process shutdown to ensure cleanup happens. 
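
   A minimal sketch of this suggestion, assuming the state file lives on the local file system. `TaskStateMarker` and `create` are hypothetical names, not code from the PR; `File.deleteOnExit()` is the standard JDK call, and it only takes effect on normal JVM termination.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not from the PR: writes an empty "task is running" marker file.
final class TaskStateMarker {
  /** Creates an empty marker file and registers it for deletion at normal JVM shutdown. */
  static File create(File dir, String name) {
    File marker = new File(dir, name);
    try {
      // Returns false if the file already exists; either way the file is present afterwards.
      marker.createNewFile();
    } catch (IOException e) {
      throw new UncheckedIOException("Could not create marker " + marker, e);
    }
    marker.deleteOnExit();
    return marker;
  }
}
```

   If the JVM is killed abruptly, the file is left behind, so a test that relies on it should probably also clean up explicitly.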
 



Issue Time Tracking
---

Worklog Id: (was: 230853)
Time Spent: 3h 50m  (was: 3h 40m)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230850=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230850
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 19:20
Start Date: 22/Apr/19 19:20
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277402811
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -116,7 +119,8 @@ public GobblinHelixJobScheduler(Properties properties,
 this.jobCatalog = jobCatalog;
 this.metricContext = Instrumented.getMetricContext(new 
org.apache.gobblin.configuration.State(properties), this.getClass());
 
-int metricsWindowSizeInMin = 
ConfigUtils.getInt(ConfigUtils.propertiesToConfig(this.properties),
+Config jobConfig = ConfigUtils.propertiesToConfig(this.properties);
 
 Review comment:
   This should be the recommended pattern, so we can use ConfigUtils for the 
rest of the config lookups without incurring repeated transformations between 
Properties and Config.  
 



Issue Time Tracking
---

Worklog Id: (was: 230850)
Time Spent: 3h 40m  (was: 3.5h)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230839
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 19:00
Start Date: 22/Apr/19 19:00
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277396163
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -324,10 +332,18 @@ public void 
handleUpdateJobConfigArrival(UpdateJobConfigArrivalEvent updateJobAr
   }
 
   @Subscribe
-  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) {
+  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) throws InterruptedException {
 LOGGER.info("Received delete for job configuration of job " + 
deleteJobArrival.getJobName());
 try {
   unscheduleJob(deleteJobArrival.getJobName());
+  Properties jobConfig = deleteJobArrival.getJobConfig();
 
 Review comment:
   This should go into a private method cancelJobIfRequired(), coupled with a 
public utility method cancelHelixJob.
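
   One possible shape for that split, as a sketch only. The property key is the one quoted in the issue description; `DeleteSpecHandling` and the `JobCanceller` interface are hypothetical stand-ins for the scheduler and for whatever actually stops the Helix workflow.

```java
import java.util.Properties;

// Hypothetical sketch, not the PR's code.
final class DeleteSpecHandling {
  // Key quoted in the GOBBLIN-744 issue description.
  static final String SHOULD_CANCEL_KEY = "gobblin.cluster.shouldCancelRunningJobOnDelete";

  /** Hypothetical stand-in for the utility that cancels the running Helix workflow. */
  interface JobCanceller {
    void cancel(String jobName);
  }

  /**
   * Cancels the job only when the DELETE spec asks for it. The default (key absent or
   * not "true") is to leave the running job alone. Returns whether a cancel was requested.
   */
  static boolean cancelJobIfRequired(String jobName, Properties jobConfig, JobCanceller canceller) {
    boolean shouldCancel = Boolean.parseBoolean(jobConfig.getProperty(SHOULD_CANCEL_KEY, "false"));
    if (shouldCancel) {
      canceller.cancel(jobName);
    }
    return shouldCancel;
  }
}
```

   Keeping the decision in one small method leaves the event handler short and makes the default behavior easy to unit test without a cluster.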
 



Issue Time Tracking
---

Worklog Id: (was: 230839)
Time Spent: 3h 20m  (was: 3h 10m)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230842=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230842
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 19:05
Start Date: 22/Apr/19 19:05
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277397907
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -116,7 +119,8 @@ public GobblinHelixJobScheduler(Properties properties,
 this.jobCatalog = jobCatalog;
 this.metricContext = Instrumented.getMetricContext(new 
org.apache.gobblin.configuration.State(properties), this.getClass());
 
-int metricsWindowSizeInMin = 
ConfigUtils.getInt(ConfigUtils.propertiesToConfig(this.properties),
+Config jobConfig = ConfigUtils.propertiesToConfig(this.properties);
 
 Review comment:
   Conversion of all the properties to Config looks unnecessary to me.
 



Issue Time Tracking
---

Worklog Id: (was: 230842)
Time Spent: 3.5h  (was: 3h 20m)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230810
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 17:58
Start Date: 22/Apr/19 17:58
Worklog Time Spent: 10m 
  Work Description: htran1 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277374246
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/FsScheduledJobConfigurationManager.java
 ##
 @@ -53,27 +51,26 @@ public FsScheduledJobConfigurationManager(EventBus 
eventBus, Config config, Muta
 
   @Override
   protected void fetchJobSpecs() throws ExecutionException, 
InterruptedException {
-List> jobSpecs =
-(List>) 
this._specConsumer.changedSpecs().get();
+List> jobSpecs =
 
 Review comment:
   Should make sure that the job configuration manager receives the specs in 
modification time order.
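
   To make the concern concrete: if a DELETE spec is processed before the ADD spec it refers to, the job ends up scheduled instead of removed. A minimal sketch of ordering by modification time follows; `SpecOrdering` and `SpecEntry` are hypothetical types for this example, not the spec consumer's real ones.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the PR's code.
final class SpecOrdering {
  /** Hypothetical pairing of a spec's verb with the modification time of its file. */
  static final class SpecEntry {
    final String verb;
    final long modifiedMillis;

    SpecEntry(String verb, long modifiedMillis) {
      this.verb = verb;
      this.modifiedMillis = modifiedMillis;
    }
  }

  /** Returns a copy sorted oldest first; the sort is stable, so ties keep their input order. */
  static List<SpecEntry> byModificationTime(List<SpecEntry> specs) {
    List<SpecEntry> sorted = new ArrayList<>(specs);
    sorted.sort(Comparator.comparingLong((SpecEntry e) -> e.modifiedMillis));
    return sorted;
  }
}
```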
 



Issue Time Tracking
---

Worklog Id: (was: 230810)
Time Spent: 3h 10m  (was: 3h)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230809=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230809
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 17:58
Start Date: 22/Apr/19 17:58
Worklog Time Spent: 10m 
  Work Description: htran1 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277373797
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/FsScheduledJobConfigurationManager.java
 ##
 @@ -53,27 +51,26 @@ public FsScheduledJobConfigurationManager(EventBus 
eventBus, Config config, Muta
 
   @Override
   protected void fetchJobSpecs() throws ExecutionException, 
InterruptedException {
-List> jobSpecs =
-(List>) 
this._specConsumer.changedSpecs().get();
+List> jobSpecs =
+(List>) 
this._specConsumer.changedSpecs().get();
 
-for (Pair entry : jobSpecs) {
-  Spec spec = entry.getValue();
+for (Pair entry : jobSpecs) {
+  JobSpec jobSpec = entry.getValue();
   SpecExecutor.Verb verb = entry.getKey();
   if (verb.equals(SpecExecutor.Verb.ADD) || 
verb.equals(SpecExecutor.Verb.UPDATE)) {
 
 Review comment:
   Should call `postUpdateJobConfigArrival` for the update case.
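
   A sketch of the dispatch this comment asks for, with ADD and UPDATE handled separately instead of sharing one branch. The `Verb` and `Action` enums are local, hypothetical stand-ins; only `postUpdateJobConfigArrival` is named in this thread, and the other two actions are assumed counterparts.

```java
// Hypothetical sketch, not the PR's code.
final class VerbDispatch {
  /** Local stand-in mirroring the verb names used in the diff. */
  enum Verb { ADD, UPDATE, DELETE }

  /** Which notification to post; POST_UPDATE corresponds to postUpdateJobConfigArrival. */
  enum Action { POST_NEW, POST_UPDATE, POST_DELETE }

  /** Maps each verb to its own action so that UPDATE is not treated as ADD. */
  static Action actionFor(Verb verb) {
    switch (verb) {
      case ADD:
        return Action.POST_NEW;
      case UPDATE:
        return Action.POST_UPDATE;
      case DELETE:
        return Action.POST_DELETE;
      default:
        throw new IllegalArgumentException("Unsupported verb: " + verb);
    }
  }
}
```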
 



Issue Time Tracking
---

Worklog Id: (was: 230809)
Time Spent: 3h  (was: 2h 50m)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230595=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230595
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 22/Apr/19 05:43
Start Date: 22/Apr/19 05:43
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277210433
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -82,6 +97,72 @@ public void testJobShouldComplete()
 suite.waitForAndVerifyOutputFiles();
   }
 
+  /**
+   * An integration test for cancelling a Helix workflow via a JobSpec. This 
test case starts a Helix cluster with
+   * a {@link FsScheduledJobConfigurationManager}. The test case does the 
following:
+   * 
+   *add a {@link org.apache.gobblin.runtime.api.JobSpec} that uses a 
{@link org.apache.gobblin.cluster.SleepingCustomTaskSource})
+   *   to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.  
which is picked by the JobConfigurationManager. 
+   *the JobConfigurationManager sends a notification to the 
GobblinHelixJobScheduler which schedules the job for execution. The JobSpec is
+   *   also added to the JobCatalog for persistence. Helix starts a Workflow 
for this JobSpec. 
+   *We then add a {@link org.apache.gobblin.runtime.api.JobSpec} with 
DELETE Verb to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.
+   *   This signals GobblinHelixJobScheduler (and, Helix) to delete the 
running job (i.e., Helix Workflow) started in the previous step. 
+   *Finally, we inspect the state of the zNode corresponding to the 
Workflow resource in Zookeeper to ensure that its {@link 
org.apache.helix.task.TargetState}
+   *   is STOP. 
+   * 
+   */
+  @Test (dependsOnMethods = { "testJobShouldGetCancelled" })
+  public void testJobCancellationViaSpec() throws Exception {
+this.suite = new IntegrationJobCancelViaSpecSuite();
+HelixManager helixManager = getHelixManager();
+
+IntegrationJobCancelViaSpecSuite cancelViaSpecSuite = 
(IntegrationJobCancelViaSpecSuite) this.suite;
+
+//Add a new JobSpec to the path monitored by the SpecConsumer
+cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, 
SpecExecutor.Verb.ADD.name());
+
+//Start the cluster
+cancelViaSpecSuite.startCluster();
+
+helixManager.connect();
+
+while (TaskDriver.getWorkflowContext(helixManager, 
IntegrationJobCancelViaSpecSuite.JOB_ID) == null) {
+  log.warn("Waiting for the job to start...");
+  Thread.sleep(1000L);
+}
+
+// Give the job some time to reach writer, where it sleeps
+Thread.sleep(2000L);
 
 Review comment:
   Modified SleepingTask to write an empty file declaring that it is RUNNING. 
However, the test case will wait only a bounded amount of time for the 
existence of this file, to avoid the pathological condition where the file 
cannot be created and the test case keeps polling for the existence of this 
file forever. 
 



Issue Time Tracking
---

Worklog Id: (was: 230595)
Time Spent: 2h 40m  (was: 2.5h)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230552=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230552
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 22:25
Start Date: 21/Apr/19 22:25
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277183593
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java
 ##
 @@ -159,4 +159,7 @@
   public static final String KILL_DUPLICATE_PLANNING_JOB = 
GOBBLIN_CLUSTER_PREFIX + "kill.duplicate.planningJob";
   public static final boolean DEFAULT_KILL_DUPLICATE_PLANNING_JOB = true;
 
+  public static final String SHOULD_CANCEL_RUNNING_JOB_ON_DELETE = 
GOBBLIN_CLUSTER_PREFIX + "shouldCancelRunningJobOnDelete";
+
 
 Review comment:
   Ack.
 



Issue Time Tracking
---

Worklog Id: (was: 230552)
Time Spent: 2.5h  (was: 2h 20m)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230549
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 22:16
Start Date: 21/Apr/19 22:16
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277183386
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -141,6 +143,10 @@ public GobblinHelixJobScheduler(Properties properties,
   metricsWindowSizeInMin);
 
 this.startServicesCompleted = false;
+
+this.helixJobStopTimeoutSeconds = 
ConfigUtils.getLong(ConfigUtils.propertiesToConfig(properties),
 
 Review comment:
   It's an easy change. 
 



Issue Time Tracking
---

Worklog Id: (was: 230549)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230550
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 22:19
Start Date: 21/Apr/19 22:19
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277183441
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -324,10 +330,17 @@ public void 
handleUpdateJobConfigArrival(UpdateJobConfigArrivalEvent updateJobAr
   }
 
   @Subscribe
-  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) {
+  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) throws InterruptedException {
 LOGGER.info("Received delete for job configuration of job " + 
deleteJobArrival.getJobName());
 try {
   unscheduleJob(deleteJobArrival.getJobName());
+  Properties jobConfig = deleteJobArrival.getJobConfig();
+  if (PropertiesUtils.getPropAsBoolean(jobConfig, 
GobblinClusterConfigurationKeys.SHOULD_CANCEL_RUNNING_JOB_ON_DELETE, "false")) {
 
 Review comment:
   @jhsenjaliya +1 for noticing that. If the default were true, it would impact 
job config updates as well.
 



Issue Time Tracking
---

Worklog Id: (was: 230550)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230548=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230548
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 22:16
Start Date: 21/Apr/19 22:16
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277183365
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -324,10 +330,17 @@ public void 
handleUpdateJobConfigArrival(UpdateJobConfigArrivalEvent updateJobAr
   }
 
   @Subscribe
-  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) {
+  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) throws InterruptedException {
 LOGGER.info("Received delete for job configuration of job " + 
deleteJobArrival.getJobName());
 try {
   unscheduleJob(deleteJobArrival.getJobName());
+  Properties jobConfig = deleteJobArrival.getJobConfig();
+  if (PropertiesUtils.getPropAsBoolean(jobConfig, 
GobblinClusterConfigurationKeys.SHOULD_CANCEL_RUNNING_JOB_ON_DELETE, "false")) {
 
 Review comment:
   Yes, it is to maintain backward compatibility. 
 



Issue Time Tracking
---

Worklog Id: (was: 230548)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230547=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230547
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 22:14
Start Date: 21/Apr/19 22:14
Worklog Time Spent: 10m 
  Work Description: jhsenjaliya commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277183340
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -324,10 +330,17 @@ public void 
handleUpdateJobConfigArrival(UpdateJobConfigArrivalEvent updateJobAr
   }
 
   @Subscribe
-  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) {
+  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) throws InterruptedException {
 LOGGER.info("Received delete for job configuration of job " + 
deleteJobArrival.getJobName());
 try {
   unscheduleJob(deleteJobArrival.getJobName());
+  Properties jobConfig = deleteJobArrival.getJobConfig();
+  if (PropertiesUtils.getPropAsBoolean(jobConfig, 
GobblinClusterConfigurationKeys.SHOULD_CANCEL_RUNNING_JOB_ON_DELETE, "false")) {
 
 Review comment:
   I think it has to be `False`, since to update a job config the 
`handleUpdateJobConfigArrival` method deletes the existing job config and 
inserts the new one. Or, if we want it to be `True`, then 
`handleUpdateJobConfigArrival` has to override this config while updating the 
jobConfig. #my2Cents.
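
A minimal sketch of the concern raised in this comment, assuming (as the comment states) that an update is handled as a delete followed by an add. `SchedulerSketch` and its methods are hypothetical stand-ins, not Gobblin's `GobblinHelixJobScheduler`; only the property key is taken from the issue description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for a scheduler; not Gobblin's GobblinHelixJobScheduler.
final class SchedulerSketch {
  // Key taken from the issue description.
  static final String CANCEL_KEY = "gobblin.cluster.shouldCancelRunningJobOnDelete";

  // Records what the scheduler would do, in order.
  final List<String> actions = new ArrayList<>();

  void handleDelete(String jobName, Properties jobConfig) {
    actions.add("unschedule:" + jobName);
    // Cancel the running job only when the flag is explicitly set to true.
    if (Boolean.parseBoolean(jobConfig.getProperty(CANCEL_KEY, "false"))) {
      actions.add("cancel:" + jobName);
    }
  }

  void handleAdd(String jobName, Properties jobConfig) {
    actions.add("schedule:" + jobName);
  }

  // Assumed update flow: delete the old config, then add the new one.
  void handleUpdate(String jobName, Properties jobConfig) {
    handleDelete(jobName, jobConfig);
    handleAdd(jobName, jobConfig);
  }

  public static void main(String[] args) {
    SchedulerSketch scheduler = new SchedulerSketch();
    scheduler.handleUpdate("job1", new Properties());
    System.out.println(scheduler.actions);
  }
}
```

With the default left at false, an update that goes through the delete path only unschedules and reschedules the job. If the default were true, the same update would also cancel the running job unless the update handler overrode the flag.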
 



Issue Time Tracking
---

Worklog Id: (was: 230547)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230545=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230545
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 21:56
Start Date: 21/Apr/19 21:56
Worklog Time Spent: 10m 
  Work Description: jhsenjaliya commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277183004
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -141,6 +143,10 @@ public GobblinHelixJobScheduler(Properties properties,
   metricsWindowSizeInMin);
 
 this.startServicesCompleted = false;
+
+this.helixJobStopTimeoutSeconds = 
ConfigUtils.getLong(ConfigUtils.propertiesToConfig(properties),
 
 Review comment:
   Probably not required to change at this point, but I see this usage of 
ConfigUtils.propertiesToConfig over and over, as if it's O(1). Maybe it needs a 
separate PR for reusing the created config.
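
A rough sketch of the reuse suggested here: convert the `Properties` once and read every setting from the resulting object. The `toConfig` and `getLong` helpers and both keys are hypothetical stand-ins written for this illustration; they are not Gobblin's `ConfigUtils`, and the conversion counter exists only to make the single conversion visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

final class ConfigReuseSketch {
  // Counts conversions so the reuse is observable in this sketch.
  static int conversions = 0;

  // Hypothetical stand-in for a Properties-to-config conversion that copies every entry.
  static Map<String, String> toConfig(Properties properties) {
    conversions++;
    Map<String, String> config = new HashMap<>();
    for (String name : properties.stringPropertyNames()) {
      config.put(name, properties.getProperty(name));
    }
    return config;
  }

  // Reads a long from the converted config, falling back to a default when absent.
  static long getLong(Map<String, String> config, String key, long defaultValue) {
    String raw = config.get(key);
    return raw == null ? defaultValue : Long.parseLong(raw.trim());
  }

  final long stopTimeoutSeconds;
  final long windowMinutes;

  ConfigReuseSketch(Properties properties) {
    // Convert once, then read every setting from the same object.
    Map<String, String> config = toConfig(properties);
    this.stopTimeoutSeconds = getLong(config, "sketch.stop.timeout.seconds", 10L);
    this.windowMinutes = getLong(config, "sketch.window.minutes", 5L);
  }
}
```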
 



Issue Time Tracking
---

Worklog Id: (was: 230545)
Time Spent: 1.5h  (was: 1h 20m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230543=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230543
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 21:46
Start Date: 21/Apr/19 21:46
Worklog Time Spent: 10m 
  Work Description: jhsenjaliya commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277182794
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java
 ##
 @@ -159,4 +159,7 @@
   public static final String KILL_DUPLICATE_PLANNING_JOB = 
GOBBLIN_CLUSTER_PREFIX + "kill.duplicate.planningJob";
   public static final boolean DEFAULT_KILL_DUPLICATE_PLANNING_JOB = true;
 
+  public static final String SHOULD_CANCEL_RUNNING_JOB_ON_DELETE = 
GOBBLIN_CLUSTER_PREFIX + "shouldCancelRunningJobOnDelete";
 
 Review comment:
   +1, same with variable name :) 
   btw, should this be `job.cancelRunningOnDelete`, since it's a job property 
just like `job.schedule`?
 



Issue Time Tracking
---

Worklog Id: (was: 230543)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230538=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230538
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 21:31
Start Date: 21/Apr/19 21:31
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277182228
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -82,6 +97,72 @@ public void testJobShouldComplete()
 suite.waitForAndVerifyOutputFiles();
   }
 
+  /**
+   * An integration test for cancelling a Helix workflow via a JobSpec. This 
test case starts a Helix cluster with
+   * a {@link FsScheduledJobConfigurationManager}. The test case does the 
following:
+   * 
+   *add a {@link org.apache.gobblin.runtime.api.JobSpec} that uses a 
{@link org.apache.gobblin.cluster.SleepingCustomTaskSource})
+   *   to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.  
which is picked by the JobConfigurationManager. 
+   *the JobConfigurationManager sends a notification to the 
GobblinHelixJobScheduler which schedules the job for execution. The JobSpec is
+   *   also added to the JobCatalog for persistence. Helix starts a Workflow 
for this JobSpec. 
+   *We then add a {@link org.apache.gobblin.runtime.api.JobSpec} with 
DELETE Verb to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.
+   *   This signals GobblinHelixJobScheduler (and, Helix) to delete the 
running job (i.e., Helix Workflow) started in the previous step. 
+   *Finally, we inspect the state of the zNode corresponding to the 
Workflow resource in Zookeeper to ensure that its {@link 
org.apache.helix.task.TargetState}
+   *   is STOP. 
+   * 
+   */
+  @Test (dependsOnMethods = { "testJobShouldGetCancelled" })
+  public void testJobCancellationViaSpec() throws Exception {
+this.suite = new IntegrationJobCancelViaSpecSuite();
+HelixManager helixManager = getHelixManager();
+
+IntegrationJobCancelViaSpecSuite cancelViaSpecSuite = 
(IntegrationJobCancelViaSpecSuite) this.suite;
+
+//Add a new JobSpec to the path monitored by the SpecConsumer
+cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, 
SpecExecutor.Verb.ADD.name());
+
+//Start the cluster
+cancelViaSpecSuite.startCluster();
+
+helixManager.connect();
+
+while (TaskDriver.getWorkflowContext(helixManager, 
IntegrationJobCancelViaSpecSuite.JOB_ID) == null) {
+  log.warn("Waiting for the job to start...");
+  Thread.sleep(1000L);
+}
+
+// Give the job some time to reach writer, where it sleeps
+Thread.sleep(2000L);
+
+ZkClient zkClient = new ZkClient(this.zkConnectString);
+PathBasedZkSerializer zkSerializer = ChainedPathZkSerializer.builder(new 
ZNRecordStreamingSerializer()).build();
+zkClient.setZkSerializer(zkSerializer);
+
+String clusterName = getHelixManager().getClusterName();
+String zNodePath = Paths.get("/", clusterName, "CONFIGS", "RESOURCE", 
IntegrationJobCancelViaSpecSuite.JOB_ID).toString();
+
+//Ensure that the Workflow is started
+ZNRecord record = zkClient.readData(zNodePath);
+String targetState = record.getSimpleField("TargetState");
+Assert.assertEquals(targetState, TargetState.START.name());
+
+//Add a JobSpec with DELETE verb signalling the Helix cluster to cancel 
the workflow
+cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, 
SpecExecutor.Verb.DELETE.name());
+
+//Give some time for the FsScheduledJobConfigurationManager to pick up the 
DELETE spec and send
+// DeleteJobConfigArrivalEvent.
+Thread.sleep(3000L);
 
 Review comment:
   Is there any way to avoid sleeping and instead wait on some other observable 
event to figure out whether the delete has been processed?
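
One common alternative to a fixed `Thread.sleep` is to poll an observable condition with a timeout. The helper below is a minimal sketch of that idea; `AwaitSketch` and `waitUntil` are hypothetical names and are not part of the PR or of Gobblin's test utilities.

```java
import java.util.function.BooleanSupplier;

final class AwaitSketch {
  // Polls the condition until it holds or the timeout elapses.
  // Returns true if the condition held, false on timeout or interruption.
  static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

In the test under review, the condition could plausibly re-read the workflow's `TargetState` field from ZooKeeper, as the test already does, and compare it with `STOP`. The test then finishes as soon as the delete is processed and fails with a clear timeout otherwise.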
 



Issue Time Tracking
---

Worklog Id: (was: 230538)
Time Spent: 1h  (was: 50m)


[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230537
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 21:31
Start Date: 21/Apr/19 21:31
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277182265
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java
 ##
 @@ -159,4 +159,7 @@
   public static final String KILL_DUPLICATE_PLANNING_JOB = 
GOBBLIN_CLUSTER_PREFIX + "kill.duplicate.planningJob";
   public static final boolean DEFAULT_KILL_DUPLICATE_PLANNING_JOB = true;
 
+  public static final String SHOULD_CANCEL_RUNNING_JOB_ON_DELETE = 
GOBBLIN_CLUSTER_PREFIX + "shouldCancelRunningJobOnDelete";
 
 Review comment:
   Suggestion: drop the prefix "should" from this config, so the config could be: 
cancelRunningJobOnDelete
 



Issue Time Tracking
---

Worklog Id: (was: 230537)
Time Spent: 50m  (was: 40m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230539=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230539
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 21:31
Start Date: 21/Apr/19 21:31
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277182213
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -82,6 +97,72 @@ public void testJobShouldComplete()
 suite.waitForAndVerifyOutputFiles();
   }
 
+  /**
+   * An integration test for cancelling a Helix workflow via a JobSpec. This 
test case starts a Helix cluster with
+   * a {@link FsScheduledJobConfigurationManager}. The test case does the 
following:
+   * 
+   *add a {@link org.apache.gobblin.runtime.api.JobSpec} that uses a 
{@link org.apache.gobblin.cluster.SleepingCustomTaskSource})
+   *   to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.  
which is picked by the JobConfigurationManager. 
+   *the JobConfigurationManager sends a notification to the 
GobblinHelixJobScheduler which schedules the job for execution. The JobSpec is
+   *   also added to the JobCatalog for persistence. Helix starts a Workflow 
for this JobSpec. 
+   *We then add a {@link org.apache.gobblin.runtime.api.JobSpec} with 
DELETE Verb to {@link IntegrationJobCancelViaSpecSuite#FS_SPEC_CONSUMER_DIR}.
+   *   This signals GobblinHelixJobScheduler (and, Helix) to delete the 
running job (i.e., Helix Workflow) started in the previous step. 
+   *Finally, we inspect the state of the zNode corresponding to the 
Workflow resource in Zookeeper to ensure that its {@link 
org.apache.helix.task.TargetState}
+   *   is STOP. 
+   * 
+   */
+  @Test (dependsOnMethods = { "testJobShouldGetCancelled" })
+  public void testJobCancellationViaSpec() throws Exception {
+this.suite = new IntegrationJobCancelViaSpecSuite();
+HelixManager helixManager = getHelixManager();
+
+IntegrationJobCancelViaSpecSuite cancelViaSpecSuite = 
(IntegrationJobCancelViaSpecSuite) this.suite;
+
+//Add a new JobSpec to the path monitored by the SpecConsumer
+cancelViaSpecSuite.addJobSpec(IntegrationJobCancelViaSpecSuite.JOB_ID, 
SpecExecutor.Verb.ADD.name());
+
+//Start the cluster
+cancelViaSpecSuite.startCluster();
+
+helixManager.connect();
+
+while (TaskDriver.getWorkflowContext(helixManager, 
IntegrationJobCancelViaSpecSuite.JOB_ID) == null) {
+  log.warn("Waiting for the job to start...");
+  Thread.sleep(1000L);
+}
+
+// Give the job some time to reach writer, where it sleeps
+Thread.sleep(2000L);
 
 Review comment:
   Is there any way to avoid sleeping and instead use something else to detect 
whether the job has reached the writer?
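
If the task and the test run in the same JVM (an assumption; it would not hold for a multi-process cluster), one option is for the task to signal a latch when it enters the writer, so the test can wait on that signal instead of sleeping. `WriterReachedSketch` is a hypothetical class written for this illustration, not part of the PR.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class WriterReachedSketch {
  // Counted down by the (hypothetical) task when it enters the writer phase.
  private final CountDownLatch reachedWriter = new CountDownLatch(1);

  // Called from the task side.
  void markWriterReached() {
    reachedWriter.countDown();
  }

  // Called from the test side; returns false on timeout or interruption.
  boolean awaitWriter(long timeout, TimeUnit unit) {
    try {
      return reachedWriter.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```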
 



Issue Time Tracking
---

Worklog Id: (was: 230539)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230536
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 21:31
Start Date: 21/Apr/19 21:31
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277182327
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobScheduler.java
 ##
 @@ -324,10 +330,17 @@ public void 
handleUpdateJobConfigArrival(UpdateJobConfigArrivalEvent updateJobAr
   }
 
   @Subscribe
-  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) {
+  public void handleDeleteJobConfigArrival(DeleteJobConfigArrivalEvent 
deleteJobArrival) throws InterruptedException {
 LOGGER.info("Received delete for job configuration of job " + 
deleteJobArrival.getJobName());
 try {
   unscheduleJob(deleteJobArrival.getJobName());
+  Properties jobConfig = deleteJobArrival.getJobConfig();
+  if (PropertiesUtils.getPropAsBoolean(jobConfig, 
GobblinClusterConfigurationKeys.SHOULD_CANCEL_RUNNING_JOB_ON_DELETE, "false")) {
 
 Review comment:
   Should the default be false or true? 
   It could be argued that the expected behavior on a job spec deletion is that 
running jobs should be cancelled. 
   
   Are you just trying to maintain backward compatibility with current 
jobs? 
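
To make the two behaviors concrete, here is a small sketch of how a DELETE spec's properties might be read with a default of false. The key comes from the issue description; `DeleteSpecFlagSketch`, its helpers, and the sample property text are illustrative assumptions, not code from the PR.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class DeleteSpecFlagSketch {
  // Key taken from the issue description.
  static final String CANCEL_KEY = "gobblin.cluster.shouldCancelRunningJobOnDelete";

  // True only if the flag is present and equals "true" (case-insensitive).
  static boolean shouldCancel(Properties jobConfig) {
    return Boolean.parseBoolean(jobConfig.getProperty(CANCEL_KEY, "false"));
  }

  // Parses properties-file text into a Properties object.
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  public static void main(String[] args) {
    // A DELETE spec without the flag keeps the existing behavior (no cancellation).
    Properties legacy = parse("job.name=job1\n");
    // A DELETE spec that opts in to cancelling the running job.
    Properties cancelling = parse("job.name=job1\n" + CANCEL_KEY + "=true\n");
    System.out.println(shouldCancel(legacy));
    System.out.println(shouldCancel(cancelling));
  }
}
```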
 



Issue Time Tracking
---

Worklog Id: (was: 230536)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230535
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 21/Apr/19 21:31
Start Date: 21/Apr/19 21:31
Worklog Time Spent: 10m 
  Work Description: shirshanka commented on pull request #2609: 
GOBBLIN-744: Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609#discussion_r277182277
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java
 ##
 @@ -159,4 +159,7 @@
   public static final String KILL_DUPLICATE_PLANNING_JOB = 
GOBBLIN_CLUSTER_PREFIX + "kill.duplicate.planningJob";
   public static final boolean DEFAULT_KILL_DUPLICATE_PLANNING_JOB = true;
 
+  public static final String SHOULD_CANCEL_RUNNING_JOB_ON_DELETE = 
GOBBLIN_CLUSTER_PREFIX + "shouldCancelRunningJobOnDelete";
+
 
 Review comment:
   Add the default value for this config here (following the standard Gobblin 
pattern).
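
The diff context above pairs `KILL_DUPLICATE_PLANNING_JOB` with `DEFAULT_KILL_DUPLICATE_PLANNING_JOB`. A sketch of the same pairing for the new key might look like the following; the `DEFAULT_SHOULD_CANCEL_RUNNING_JOB_ON_DELETE` name, the `ClusterKeysSketch` class, and the prefix value (inferred from the key in the issue description) are assumptions, not code from the PR.

```java
import java.util.Properties;

// Hypothetical holder class mirroring the key/default pairing visible in the diff.
final class ClusterKeysSketch {
  // Prefix inferred from the full key quoted in the issue description.
  static final String GOBBLIN_CLUSTER_PREFIX = "gobblin.cluster.";

  static final String SHOULD_CANCEL_RUNNING_JOB_ON_DELETE =
      GOBBLIN_CLUSTER_PREFIX + "shouldCancelRunningJobOnDelete";
  // Assumed constant name; the diff shown here does not define a default.
  static final boolean DEFAULT_SHOULD_CANCEL_RUNNING_JOB_ON_DELETE = false;

  // Reads the flag, falling back to the named default instead of a "false" literal.
  static boolean shouldCancel(Properties jobConfig) {
    String raw = jobConfig.getProperty(SHOULD_CANCEL_RUNNING_JOB_ON_DELETE);
    return raw == null
        ? DEFAULT_SHOULD_CANCEL_RUNNING_JOB_ON_DELETE
        : Boolean.parseBoolean(raw);
  }
}
```

Keeping the default next to the key means call sites do not need to repeat a string literal such as `"false"`.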
 



Issue Time Tracking
---

Worklog Id: (was: 230535)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230486=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230486
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 20/Apr/19 22:53
Start Date: 20/Apr/19 22:53
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on issue #2609: GOBBLIN-744: Support 
cancellation of a Helix workflow via a DELETE Spec.
URL: 
https://github.com/apache/incubator-gobblin/pull/2609#issuecomment-485190467
 
 
   @htran1 @yukuai518 @arjun4084346 @shirshanka Please review. 
 



Issue Time Tracking
---

Worklog Id: (was: 230486)
Time Spent: 20m  (was: 10m)

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
> Issue Type: Improvement
> Components: gobblin-cluster
> Affects Versions: 0.15.0
> Reporter: Sudarshan Vasudevan
> Assignee: Hung Tran
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  





[jira] [Work logged] (GOBBLIN-744) Support cancellation of a Helix workflow via a DELETE Spec

2019-04-20 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-744?focusedWorklogId=230484&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230484
 ]

ASF GitHub Bot logged work on GOBBLIN-744:
--

Author: ASF GitHub Bot
Created on: 20/Apr/19 22:47
Start Date: 20/Apr/19 22:47
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2609: GOBBLIN-744: 
Support cancellation of a Helix workflow via a DELETE Spec.
URL: https://github.com/apache/incubator-gobblin/pull/2609
 
 
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-744
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if applicable):
   This task supports the ability to interrupt and cancel a running job on a Gobblin Helix cluster via a DELETE Spec submitted to the JobConfigurationManager. The DELETE Spec should have "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a running job. The default behavior is to simply delete the corresponding JobSpec from the JobCatalog.
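
The behavior described above can be illustrated with a minimal sketch. Only the property name comes from the description; the class `DeleteSpecSketch`, the `Action` enum, and `resolveDeleteAction` are hypothetical names used for illustration and are not part of Gobblin's API. The description does not say whether the JobSpec is also removed when the job is cancelled, so the sketch leaves that open.

```java
import java.util.Properties;

// Hypothetical illustration only; not Gobblin code.
public class DeleteSpecSketch {
    // Property name as quoted in the issue description.
    public static final String CANCEL_ON_DELETE_KEY =
        "gobblin.cluster.shouldCancelRunningJobOnDelete";

    // Assumed outcomes of handling a DELETE Spec.
    public enum Action { DELETE_JOB_SPEC_ONLY, CANCEL_RUNNING_JOB }

    // Decide what a DELETE Spec asks for, defaulting to "delete only"
    // when the property is absent or not "true".
    public static Action resolveDeleteAction(Properties deleteSpecProps) {
        boolean cancel = Boolean.parseBoolean(
            deleteSpecProps.getProperty(CANCEL_ON_DELETE_KEY, "false"));
        return cancel ? Action.CANCEL_RUNNING_JOB : Action.DELETE_JOB_SPEC_ONLY;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(resolveDeleteAction(props));

        props.setProperty(CANCEL_ON_DELETE_KEY, "true");
        System.out.println(resolveDeleteAction(props));
    }
}
```

Defaulting to the delete-only action mirrors the stated default behavior, so a DELETE Spec without the property would not interrupt a running job in this sketch.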
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   IntegrationJobCancelViaSpecSuite and ClusterIntegrationTest.
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 230484)
Time Spent: 10m
Remaining Estimate: 0h

> Support cancellation of a Helix workflow via a DELETE Spec
> --
>
> Key: GOBBLIN-744
> URL: https://issues.apache.org/jira/browse/GOBBLIN-744
> Project: Apache Gobblin
> Issue Type: Improvement
> Components: gobblin-cluster
> Affects Versions: 0.15.0
> Reporter: Sudarshan Vasudevan
> Assignee: Hung Tran
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This task supports the ability to interrupt and cancel a running job on a 
> Gobblin Helix cluster via a DELETE Spec submitted to the 
> JobConfigurationManager. The DELETE Spec should have 
> "gobblin.cluster.shouldCancelRunningJobOnDelete" set to true for cancelling a 
> running job. The default behavior is to simply delete the corresponding 
> JobSpec from the JobCatalog. 
>  
>  


