Apache Gobblin's April report sign-off

2020-04-07 Thread Abhishek Tiwari
Hi mentors,

Please help us sign off the Gobblin's podling report:
https://cwiki.apache.org/confluence/display/INCUBATOR/April2020#Gobblin

Regards,
Abhishek


[jira] [Work logged] (GOBBLIN-1110) handle runtime exceptions in executeCancellation implementation of GobblinHelixJobLauncher

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1110?focusedWorklogId=418033=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-418033
 ]

ASF GitHub Bot logged work on GOBBLIN-1110:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 22:23
Start Date: 07/Apr/20 22:23
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #2950: [GOBBLIN-1110] 
Task driver stop
URL: https://github.com/apache/incubator-gobblin/pull/2950
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 418033)
Time Spent: 40m  (was: 0.5h)

> handle runtime exceptions in executeCancellation implementation of 
> GobblinHelixJobLauncher 
> ---
>
> Key: GOBBLIN-1110
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1110
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] asfgit closed pull request #2950: [GOBBLIN-1110] Task driver stop

2020-04-07 Thread GitBox
asfgit closed pull request #2950: [GOBBLIN-1110] Task driver stop
URL: https://github.com/apache/incubator-gobblin/pull/2950
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1110) handle runtime exceptions in executeCancellation implementation of GobblinHelixJobLauncher

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1110?focusedWorklogId=418032=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-418032
 ]

ASF GitHub Bot logged work on GOBBLIN-1110:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 22:22
Start Date: 07/Apr/20 22:22
Worklog Time Spent: 10m 
  Work Description: yukuai518 commented on issue #2950: [GOBBLIN-1110] Task 
driver stop
URL: 
https://github.com/apache/incubator-gobblin/pull/2950#issuecomment-610649504
 
 
   +1 LGTM
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 418032)
Time Spent: 0.5h  (was: 20m)

> handle runtime exceptions in executeCancellation implementation of 
> GobblinHelixJobLauncher 
> ---
>
> Key: GOBBLIN-1110
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1110
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] yukuai518 commented on issue #2950: [GOBBLIN-1110] Task driver stop

2020-04-07 Thread GitBox
yukuai518 commented on issue #2950: [GOBBLIN-1110] Task driver stop
URL: 
https://github.com/apache/incubator-gobblin/pull/2950#issuecomment-610649504
 
 
   +1 LGTM


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (GOBBLIN-1101) Enhance bulk api retry for ExceedQuota

2020-04-07 Thread Hung Tran (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran resolved GOBBLIN-1101.

Fix Version/s: 0.15.0
   Resolution: Fixed

Issue resolved by pull request #2942
[https://github.com/apache/incubator-gobblin/pull/2942]

> Enhance bulk api retry for ExceedQuota
> --
>
> Key: GOBBLIN-1101
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1101
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Alex Li
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> 1. ExceedQuota exception
> Below is SFDC doc about ExceedQuota
> {code:java}
> One of the limits customers frequently reach is the concurrent request limit. 
> Once a synchronous Apex request runs longer than 5 seconds, it begins 
> counting against this limit. Each organization is allowed 10 concurrent 
> long-running requests. If the limit is reached, any new synchronous Apex 
> request results in a runtime exception. This behavior occurs until the 
> organization’s requests are below the limit.
> {code}
> If the ExceedQuota exception happens, we should let the thread sleep 5 
> minutes and try again. There should not be a retryLimit for this exception.
> 2. Except stack in log file
> For example we set up retryLimit to 10, we retried 10 times,  and failed; we 
> need to print out exception stack in log file, there are 10 of them in the 
> exception stack.
> SSL Exception(root cause) retry and get > ExceedQuota retry and 
> get >  ExceedQuota a lot > 
> We'd better skip all the retry exception, only keep the root cause exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (GOBBLIN-1101) Enhance bulk api retry for ExceedQuota

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1101?focusedWorklogId=417727=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417727
 ]

ASF GitHub Bot logged work on GOBBLIN-1101:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 15:12
Start Date: 07/Apr/20 15:12
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #2942: 
GOBBLIN-1101(DSS-25241): Enhance bulk api retry for ExceedQuota
URL: https://github.com/apache/incubator-gobblin/pull/2942
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417727)
Time Spent: 4.5h  (was: 4h 20m)

> Enhance bulk api retry for ExceedQuota
> --
>
> Key: GOBBLIN-1101
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1101
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Alex Li
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> 1. ExceedQuota exception
> Below is SFDC doc about ExceedQuota
> {code:java}
> One of the limits customers frequently reach is the concurrent request limit. 
> Once a synchronous Apex request runs longer than 5 seconds, it begins 
> counting against this limit. Each organization is allowed 10 concurrent 
> long-running requests. If the limit is reached, any new synchronous Apex 
> request results in a runtime exception. This behavior occurs until the 
> organization’s requests are below the limit.
> {code}
> If the ExceedQuota exception happens, we should let the thread sleep 5 
> minutes and try again. There should not be a retryLimit for this exception.
> 2. Except stack in log file
> For example we set up retryLimit to 10, we retried 10 times,  and failed; we 
> need to print out exception stack in log file, there are 10 of them in the 
> exception stack.
> SSL Exception(root cause) retry and get > ExceedQuota retry and 
> get >  ExceedQuota a lot > 
> We'd better skip all the retry exception, only keep the root cause exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] asfgit closed pull request #2942: GOBBLIN-1101(DSS-25241): Enhance bulk api retry for ExceedQuota

2020-04-07 Thread GitBox
asfgit closed pull request #2942: GOBBLIN-1101(DSS-25241): Enhance bulk api 
retry for ExceedQuota
URL: https://github.com/apache/incubator-gobblin/pull/2942
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean up cyclic logic in task cancellation

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean 
up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404661941
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -64,19 +70,27 @@ public void run() {
 throw new IOException("File creation error: " + 
taskStateFile.getName());
   }
   long endTime = System.currentTimeMillis() + sleepTime * 1000;
-  while (System.currentTimeMillis() <= endTime) {
-Thread.sleep(1000L);
-log.warn("Sleeping for {} seconds", sleepTime);
-  }
+
+  
AssertWithBackoff.create().maxSleepMs(1000).timeoutMs(Integer.MAX_VALUE).backoffFactor(1).
 
 Review comment:
   What's the point of giving timeoutMs(Integer.MAX_VALUE) ?
   
   I think AssertWithBackoff should be used in place of Thread.Sleep when 
sleeping time is not known, e.g. when we are expecting for something to be true 
(or something to happen), but do not know when that will happen, and at the 
same time we do not want to sleep for a fixed amount of time which could be 1) 
too big, which would delay build, or 2) too small , that what we were expecting 
did not happen in that time.
   
   I know AssertWithBackoff is recommended in Gobblin team, but in this case, 
sleep time is fixed. AssertWithBackoff makes no sense here. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1068) Clean cyclic logic in task cancellation in Gobblin Task

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1068?focusedWorklogId=417582=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417582
 ]

ASF GitHub Bot logged work on GOBBLIN-1068:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 09:25
Start Date: 07/Apr/20 09:25
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2907: 
[GOBBLIN-1068]Clean up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404661941
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -64,19 +70,27 @@ public void run() {
 throw new IOException("File creation error: " + 
taskStateFile.getName());
   }
   long endTime = System.currentTimeMillis() + sleepTime * 1000;
-  while (System.currentTimeMillis() <= endTime) {
-Thread.sleep(1000L);
-log.warn("Sleeping for {} seconds", sleepTime);
-  }
+
+  
AssertWithBackoff.create().maxSleepMs(1000).timeoutMs(Integer.MAX_VALUE).backoffFactor(1).
 
 Review comment:
   What's the point of giving timeoutMs(Integer.MAX_VALUE) ?
   
   I think AssertWithBackoff should be used in place of Thread.Sleep when 
sleeping time is not known, e.g. when we are expecting for something to be true 
(or something to happen), but do not know when that will happen, and at the 
same time we do not want to sleep for a fixed amount of time which could be 1) 
too big, which would delay build, or 2) too small , that what we were expecting 
did not happen in that time.
   
   I know AssertWithBackoff is recommended in Gobblin team, but in this case, 
sleep time is fixed. AssertWithBackoff makes no sense here. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417582)
Time Spent: 3h 10m  (was: 3h)

> Clean cyclic logic in task cancellation in Gobblin Task
> ---
>
> Key: GOBBLIN-1068
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1068
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean up cyclic logic in task cancellation

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean 
up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404646726
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -19,27 +19,33 @@
 
 import java.io.File;
 import java.io.IOException;
+import java.util.concurrent.TimeoutException;
 
+import org.apache.gobblin.runtime.TaskContext;
 
 Review comment:
   It seems you are using wrong codestyle file, which is messing up imports in 
all of your PRs


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean up cyclic logic in task cancellation

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean 
up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404662868
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -167,9 +171,20 @@ public void testJobRestartViaSpec() throws Exception {
   return recordNew == null || 
targetStateNew.equals(TargetState.STOP.name());
 }, "Waiting for Workflow TargetState to be STOP");
 
-//Ensure that the SleepingTask did not terminate normally i.e. it was 
interrupted. We check this by ensuring
-// that the line "Hello World!" is not present in the logged output.
-suite.waitForAndVerifyOutputFiles();
+// Ensure that the SleepingTask did not terminate normally i.e. it was 
interrupted. We check this by ensuring
+// that the line "Hello World!" is not present in the logged output but 
only "Sleeping interrupted"
+// It is important to have back-off here since ZNode STOP state doesn't 
mean the actual sleeping task has been terminated
+// since shutdown of task is async.
+
AssertWithBackoff.create().maxSleepMs(1000).timeoutMs(12).backoffFactor(1).assertTrue(input
 -> {
 
 Review comment:
   should be `maxSleepMs(1000L).timeoutMs(12L)`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1068) Clean cyclic logic in task cancellation in Gobblin Task

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1068?focusedWorklogId=417581=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417581
 ]

ASF GitHub Bot logged work on GOBBLIN-1068:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 09:25
Start Date: 07/Apr/20 09:25
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2907: 
[GOBBLIN-1068]Clean up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404665523
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/suite/IntegrationJobCancelSuite.java
 ##
 @@ -20,20 +20,26 @@
 import org.junit.Assert;
 
 import com.typesafe.config.Config;
+import com.typesafe.config.ConfigValueFactory;
+
+import static org.apache.gobblin.cluster.SleepingTask.SLEEPING_TASK_SLEEP_TIME;
 
 
 public class IntegrationJobCancelSuite extends IntegrationBasicSuite {
   public static final String JOB_ID = "job_HelloWorldTestJob_1234";
   public static final String TASK_STATE_FILE = 
"/tmp/IntegrationJobCancelSuite/taskState/_RUNNING";
 
   public IntegrationJobCancelSuite(Config jobConfigOverrides) {
-super(jobConfigOverrides);
+// Put SleepingTask in long sleep allowing cancellation to happen.
+super(jobConfigOverrides.withValue(SLEEPING_TASK_SLEEP_TIME, 
ConfigValueFactory.fromAnyRef(100)));
   }
 
   @Override
   public void waitForAndVerifyOutputFiles() throws Exception {
+// SleepingTask is in an infinite sleep. The log line is printed only when 
a cancellation in invoked.
+Assert.assertTrue(verifyFileForMessage(this.jobLogOutputFile, "Sleep 
interrupted"));
+
 // If the job is cancelled, it should not have been able to write 'Hello 
World!'
 Assert.assertFalse(verifyFileForMessage(this.jobLogOutputFile, "Hello 
World!"));
 
 Review comment:
   Can we keep the deleted line. A bug was fixed and that line ensure that NPE 
does not happen. Yea, it can probably be replaced by 
Assert.assertFalse(verifyFileForMessage(this.jobLogOutputFile, "Exception"));
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417581)
Time Spent: 3h  (was: 2h 50m)

> Clean cyclic logic in task cancellation in Gobblin Task
> ---
>
> Key: GOBBLIN-1068
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1068
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (GOBBLIN-1068) Clean cyclic logic in task cancellation in Gobblin Task

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1068?focusedWorklogId=417580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417580
 ]

ASF GitHub Bot logged work on GOBBLIN-1068:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 09:25
Start Date: 07/Apr/20 09:25
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2907: 
[GOBBLIN-1068]Clean up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404662868
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/ClusterIntegrationTest.java
 ##
 @@ -167,9 +171,20 @@ public void testJobRestartViaSpec() throws Exception {
   return recordNew == null || 
targetStateNew.equals(TargetState.STOP.name());
 }, "Waiting for Workflow TargetState to be STOP");
 
-//Ensure that the SleepingTask did not terminate normally i.e. it was 
interrupted. We check this by ensuring
-// that the line "Hello World!" is not present in the logged output.
-suite.waitForAndVerifyOutputFiles();
+// Ensure that the SleepingTask did not terminate normally i.e. it was 
interrupted. We check this by ensuring
+// that the line "Hello World!" is not present in the logged output but 
only "Sleeping interrupted"
+// It is important to have back-off here since ZNode STOP state doesn't 
mean the actual sleeping task has been terminated
+// since shutdown of task is async.
+
AssertWithBackoff.create().maxSleepMs(1000).timeoutMs(12).backoffFactor(1).assertTrue(input
 -> {
 
 Review comment:
   should be `maxSleepMs(1000L).timeoutMs(12L)`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417580)
Time Spent: 3h  (was: 2h 50m)

> Clean cyclic logic in task cancellation in Gobblin Task
> ---
>
> Key: GOBBLIN-1068
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1068
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean up cyclic logic in task cancellation

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean 
up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404652815
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -64,19 +70,27 @@ public void run() {
 throw new IOException("File creation error: " + 
taskStateFile.getName());
   }
   long endTime = System.currentTimeMillis() + sleepTime * 1000;
-  while (System.currentTimeMillis() <= endTime) {
-Thread.sleep(1000L);
-log.warn("Sleeping for {} seconds", sleepTime);
-  }
+
+  
AssertWithBackoff.create().maxSleepMs(1000).timeoutMs(Integer.MAX_VALUE).backoffFactor(1).
+  assertTrue(new Predicate() {
+@Override
+public boolean apply(@Nullable Void input) {
+  return System.currentTimeMillis() > endTime;
+}
+  }, "Waiting for the job to start...");
+
   log.info("Hello World!");
   super.run();
 } catch (InterruptedException e) {
-  log.error("Sleep interrupted.");
+  log.info("Sleep interrupted.");
 
 Review comment:
   I think I wrongly wrote it error. warn seems more appropriate to me. If it 
comes here, it does not mean that the SleepingTask did anything **erroneous**, 
but got interrupted. could be info too, but looks like an important message, 
especially in SleepingTask


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean up cyclic logic in task cancellation

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2907: [GOBBLIN-1068]Clean 
up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404665523
 
 

 ##
 File path: 
gobblin-cluster/src/test/java/org/apache/gobblin/cluster/suite/IntegrationJobCancelSuite.java
 ##
 @@ -20,20 +20,26 @@
 import org.junit.Assert;
 
 import com.typesafe.config.Config;
+import com.typesafe.config.ConfigValueFactory;
+
+import static org.apache.gobblin.cluster.SleepingTask.SLEEPING_TASK_SLEEP_TIME;
 
 
 public class IntegrationJobCancelSuite extends IntegrationBasicSuite {
   public static final String JOB_ID = "job_HelloWorldTestJob_1234";
   public static final String TASK_STATE_FILE = 
"/tmp/IntegrationJobCancelSuite/taskState/_RUNNING";
 
   public IntegrationJobCancelSuite(Config jobConfigOverrides) {
-super(jobConfigOverrides);
+// Put SleepingTask in long sleep allowing cancellation to happen.
+super(jobConfigOverrides.withValue(SLEEPING_TASK_SLEEP_TIME, 
ConfigValueFactory.fromAnyRef(100)));
   }
 
   @Override
   public void waitForAndVerifyOutputFiles() throws Exception {
+// SleepingTask is in an infinite sleep. The log line is printed only when 
a cancellation in invoked.
+Assert.assertTrue(verifyFileForMessage(this.jobLogOutputFile, "Sleep 
interrupted"));
+
 // If the job is cancelled, it should not have been able to write 'Hello 
World!'
 Assert.assertFalse(verifyFileForMessage(this.jobLogOutputFile, "Hello 
World!"));
 
 Review comment:
   Can we keep the deleted line. A bug was fixed and that line ensure that NPE 
does not happen. Yea, it can probably be replaced by 
Assert.assertFalse(verifyFileForMessage(this.jobLogOutputFile, "Exception"));


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1068) Clean cyclic logic in task cancellation in Gobblin Task

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1068?focusedWorklogId=417578=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417578
 ]

ASF GitHub Bot logged work on GOBBLIN-1068:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 09:25
Start Date: 07/Apr/20 09:25
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2907: 
[GOBBLIN-1068]Clean up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404646726
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -19,27 +19,33 @@
 
 import java.io.File;
 import java.io.IOException;
+import java.util.concurrent.TimeoutException;
 
+import org.apache.gobblin.runtime.TaskContext;
 
 Review comment:
   It seems you are using wrong codestyle file, which is messing up imports in 
all of your PRs
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417578)
Time Spent: 2h 40m  (was: 2.5h)

> Clean cyclic logic in task cancellation in Gobblin Task
> ---
>
> Key: GOBBLIN-1068
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1068
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (GOBBLIN-1068) Clean cyclic logic in task cancellation in Gobblin Task

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1068?focusedWorklogId=417579=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417579
 ]

ASF GitHub Bot logged work on GOBBLIN-1068:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 09:25
Start Date: 07/Apr/20 09:25
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2907: 
[GOBBLIN-1068]Clean up cyclic logic in task cancellation
URL: https://github.com/apache/incubator-gobblin/pull/2907#discussion_r404652815
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -64,19 +70,27 @@ public void run() {
 throw new IOException("File creation error: " + 
taskStateFile.getName());
   }
   long endTime = System.currentTimeMillis() + sleepTime * 1000;
-  while (System.currentTimeMillis() <= endTime) {
-Thread.sleep(1000L);
-log.warn("Sleeping for {} seconds", sleepTime);
-  }
+
+  
AssertWithBackoff.create().maxSleepMs(1000).timeoutMs(Integer.MAX_VALUE).backoffFactor(1).
+  assertTrue(new Predicate() {
+@Override
+public boolean apply(@Nullable Void input) {
+  return System.currentTimeMillis() > endTime;
+}
+  }, "Waiting for the job to start...");
+
   log.info("Hello World!");
   super.run();
 } catch (InterruptedException e) {
-  log.error("Sleep interrupted.");
+  log.info("Sleep interrupted.");
 
 Review comment:
   I think I wrongly wrote it error. warn seems more appropriate to me. If it 
comes here, it does not mean that the SleepingTask did anything **erroneous**, 
but got interrupted. could be info too, but looks like an important message, 
especially in SleepingTask
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417579)
Time Spent: 2h 50m  (was: 2h 40m)

> Clean cyclic logic in task cancellation in Gobblin Task
> ---
>
> Key: GOBBLIN-1068
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1068
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404626735
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/FailedTask.java
 ##
 @@ -16,19 +16,30 @@
  */
 package org.apache.gobblin.runtime.task;
 
+import org.apache.gobblin.configuration.ConfigurationKeys;
 import org.apache.gobblin.configuration.WorkUnitState;
 import org.apache.gobblin.publisher.DataPublisher;
 import org.apache.gobblin.publisher.NoopPublisher;
 import org.apache.gobblin.runtime.JobState;
 import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.TaskState;
 import org.apache.gobblin.source.workunit.WorkUnit;
 
+import com.google.common.base.Throwables;
+
+import groovy.util.logging.Slf4j;
+
+
 /**
- * A task which raise an exception when run
+ * A task which returns "FAILED" state directly.
 
 Review comment:
   If the code on the left raises an exception, why the code on the right would 
not raise an exception? javadoc on the left is incorrect?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1083) Maximize helix retry by returning failure when cancel

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1083?focusedWorklogId=417538=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417538
 ]

ASF GitHub Bot logged work on GOBBLIN-1083:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 08:19
Start Date: 07/Apr/20 08:19
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2923: 
[GOBBLIN-1083] Unit test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404623597
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/FailedTask.java
 ##
 @@ -16,19 +16,30 @@
  */
 package org.apache.gobblin.runtime.task;
 
+import org.apache.gobblin.configuration.ConfigurationKeys;
 import org.apache.gobblin.configuration.WorkUnitState;
 import org.apache.gobblin.publisher.DataPublisher;
 import org.apache.gobblin.publisher.NoopPublisher;
 import org.apache.gobblin.runtime.JobState;
 import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.TaskState;
 import org.apache.gobblin.source.workunit.WorkUnit;
 
+import com.google.common.base.Throwables;
+
+import groovy.util.logging.Slf4j;
+
+
 /**
- * A task which raise an exception when run
+ * A task which returns "FAILED" state directly.
  */
+@Slf4j
 public class FailedTask extends BaseAbstractTask {
-  public FailedTask (TaskContext taskContext) {
+  private final TaskContext taskContext;
 
 Review comment:
   Oh yes, maybe have this in BaseAbstractTask so it can be removed from all 
child classes like I said in previous comment?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417538)
Time Spent: 2h 40m  (was: 2.5h)

> Maximize helix retry by returning failure when cancel
> -
>
> Key: GOBBLIN-1083
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1083
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404623597
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/FailedTask.java
 ##
 @@ -16,19 +16,30 @@
  */
 package org.apache.gobblin.runtime.task;
 
+import org.apache.gobblin.configuration.ConfigurationKeys;
 import org.apache.gobblin.configuration.WorkUnitState;
 import org.apache.gobblin.publisher.DataPublisher;
 import org.apache.gobblin.publisher.NoopPublisher;
 import org.apache.gobblin.runtime.JobState;
 import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.TaskState;
 import org.apache.gobblin.source.workunit.WorkUnit;
 
+import com.google.common.base.Throwables;
+
+import groovy.util.logging.Slf4j;
+
+
 /**
- * A task which raise an exception when run
+ * A task which returns "FAILED" state directly.
  */
+@Slf4j
 public class FailedTask extends BaseAbstractTask {
-  public FailedTask (TaskContext taskContext) {
+  private final TaskContext taskContext;
 
 Review comment:
   Oh yes, maybe have this in BaseAbstractTask so it can be removed from all 
child classes like I said in previous comment?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1083) Maximize helix retry by returning failure when cancel

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1083?focusedWorklogId=417537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417537
 ]

ASF GitHub Bot logged work on GOBBLIN-1083:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 08:18
Start Date: 07/Apr/20 08:18
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2923: 
[GOBBLIN-1083] Unit test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404623597
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/FailedTask.java
 ##
 @@ -16,19 +16,30 @@
  */
 package org.apache.gobblin.runtime.task;
 
+import org.apache.gobblin.configuration.ConfigurationKeys;
 import org.apache.gobblin.configuration.WorkUnitState;
 import org.apache.gobblin.publisher.DataPublisher;
 import org.apache.gobblin.publisher.NoopPublisher;
 import org.apache.gobblin.runtime.JobState;
 import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.TaskState;
 import org.apache.gobblin.source.workunit.WorkUnit;
 
+import com.google.common.base.Throwables;
+
+import groovy.util.logging.Slf4j;
+
+
 /**
- * A task which raise an exception when run
+ * A task which returns "FAILED" state directly.
  */
+@Slf4j
 public class FailedTask extends BaseAbstractTask {
-  public FailedTask (TaskContext taskContext) {
+  private final TaskContext taskContext;
 
 Review comment:
   Oh yes, maybe have this in BaseAbstractTask so it can be removed from all 
child classes?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417537)
Time Spent: 2.5h  (was: 2h 20m)

> Maximize helix retry by returning failure when cancel
> -
>
> Key: GOBBLIN-1083
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1083
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404623597
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/FailedTask.java
 ##
 @@ -16,19 +16,30 @@
  */
 package org.apache.gobblin.runtime.task;
 
+import org.apache.gobblin.configuration.ConfigurationKeys;
 import org.apache.gobblin.configuration.WorkUnitState;
 import org.apache.gobblin.publisher.DataPublisher;
 import org.apache.gobblin.publisher.NoopPublisher;
 import org.apache.gobblin.runtime.JobState;
 import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.TaskState;
 import org.apache.gobblin.source.workunit.WorkUnit;
 
+import com.google.common.base.Throwables;
+
+import groovy.util.logging.Slf4j;
+
+
 /**
- * A task which raise an exception when run
+ * A task which returns "FAILED" state directly.
  */
+@Slf4j
 public class FailedTask extends BaseAbstractTask {
-  public FailedTask (TaskContext taskContext) {
+  private final TaskContext taskContext;
 
 Review comment:
   Oh yes, maybe have this in BaseAbstractTask so it can be removed from all 
child classes?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404622306
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/BaseAbstractTask.java
 ##
 @@ -91,4 +94,13 @@ public String getProgress() {
   public boolean isSpeculativeExecutionSafe() {
 return false;
   }
+
+  /**
+   * Similar to org.apache.gobblin.runtime.Task#failTask(java.lang.Throwable), 
we need to propagate exception thrown
+   * in workunit level by setting {@link 
ConfigurationKeys#TASK_FAILURE_EXCEPTION_KEY}
+   */
+  protected void failTask(Throwable t, TaskContext taskContext) {
+
taskContext.getTaskState().setWorkingState(WorkUnitState.WorkingState.FAILED);
+
taskContext.getTaskState().setProp(ConfigurationKeys.TASK_FAILURE_EXCEPTION_KEY,
 Throwables.getStackTraceAsString(t));
+  }
 
 Review comment:
   No failureEvent here?
   Is there a reasonable way to not have duplicate code?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1083) Maximize helix retry by returning failure when cancel

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1083?focusedWorklogId=417536=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417536
 ]

ASF GitHub Bot logged work on GOBBLIN-1083:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 08:16
Start Date: 07/Apr/20 08:16
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2923: 
[GOBBLIN-1083] Unit test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404622306
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/BaseAbstractTask.java
 ##
 @@ -91,4 +94,13 @@ public String getProgress() {
   public boolean isSpeculativeExecutionSafe() {
 return false;
   }
+
+  /**
+   * Similar to org.apache.gobblin.runtime.Task#failTask(java.lang.Throwable), 
we need to propagate exception thrown
+   * in workunit level by setting {@link 
ConfigurationKeys#TASK_FAILURE_EXCEPTION_KEY}
+   */
+  protected void failTask(Throwable t, TaskContext taskContext) {
+
taskContext.getTaskState().setWorkingState(WorkUnitState.WorkingState.FAILED);
+
taskContext.getTaskState().setProp(ConfigurationKeys.TASK_FAILURE_EXCEPTION_KEY,
 Throwables.getStackTraceAsString(t));
+  }
 
 Review comment:
   No failureEvent here?
   Is there a reasonable way to not have duplicate code?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417536)
Time Spent: 2h 20m  (was: 2h 10m)

> Maximize helix retry by returning failure when cancel
> -
>
> Key: GOBBLIN-1083
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1083
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (GOBBLIN-1083) Maximize helix retry by returning failure when cancel

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1083?focusedWorklogId=417535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417535
 ]

ASF GitHub Bot logged work on GOBBLIN-1083:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 08:15
Start Date: 07/Apr/20 08:15
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2923: 
[GOBBLIN-1083] Unit test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404621465
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/BaseAbstractTask.java
 ##
 @@ -91,4 +94,13 @@ public String getProgress() {
   public boolean isSpeculativeExecutionSafe() {
 return false;
   }
+
+  /**
+   * Similar to org.apache.gobblin.runtime.Task#failTask(java.lang.Throwable), 
we need to propagate exception thrown
+   * in workunit level by setting {@link 
ConfigurationKeys#TASK_FAILURE_EXCEPTION_KEY}
+   */
+  protected void failTask(Throwable t, TaskContext taskContext) {
+
taskContext.getTaskState().setWorkingState(WorkUnitState.WorkingState.FAILED);
 
 Review comment:
   Should we save taskContext in BaseAbstractTask ? In this way, signature of 
this method can be consistent with that in Task.java
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417535)
Time Spent: 2h 10m  (was: 2h)

> Maximize helix retry by returning failure when cancel
> -
>
> Key: GOBBLIN-1083
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1083
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404621465
 
 

 ##
 File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/task/BaseAbstractTask.java
 ##
 @@ -91,4 +94,13 @@ public String getProgress() {
   public boolean isSpeculativeExecutionSafe() {
 return false;
   }
+
+  /**
+   * Similar to org.apache.gobblin.runtime.Task#failTask(java.lang.Throwable), 
we need to propagate exception thrown
+   * in workunit level by setting {@link 
ConfigurationKeys#TASK_FAILURE_EXCEPTION_KEY}
+   */
+  protected void failTask(Throwable t, TaskContext taskContext) {
+
taskContext.getTaskState().setWorkingState(WorkUnitState.WorkingState.FAILED);
 
 Review comment:
   Should we save taskContext in BaseAbstractTask ? In this way, signature of 
this method can be consistent with that in Task.java


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1083) Maximize helix retry by returning failure when cancel

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1083?focusedWorklogId=417525=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417525
 ]

ASF GitHub Bot logged work on GOBBLIN-1083:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 08:07
Start Date: 07/Apr/20 08:07
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2923: 
[GOBBLIN-1083] Unit test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404608709
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -73,10 +76,10 @@ public void run() {
 } catch (InterruptedException e) {
   log.error("Sleep interrupted.");
   Thread.currentThread().interrupt();
-  Throwables.propagate(e);
+  failTask(e, taskContext);
 
 Review comment:
   SleepingTask was written to test cancellation code or hanging issues. It is 
meant to sleep for some amount of time to simulate as a hanging task in 
production. It is supposed propagate exceptions like any normal task would do.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417525)
Time Spent: 2h  (was: 1h 50m)

> Maximize helix retry by returning failure when cancel
> -
>
> Key: GOBBLIN-1083
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1083
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404608709
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -73,10 +76,10 @@ public void run() {
 } catch (InterruptedException e) {
   log.error("Sleep interrupted.");
   Thread.currentThread().interrupt();
-  Throwables.propagate(e);
+  failTask(e, taskContext);
 
 Review comment:
   SleepingTask was written to test cancellation code or hanging issues. It is 
meant to sleep for some amount of time to simulate as a hanging task in 
production. It is supposed propagate exceptions like any normal task would do.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1083) Maximize helix retry by returning failure when cancel

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1083?focusedWorklogId=417515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417515
 ]

ASF GitHub Bot logged work on GOBBLIN-1083:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 07:54
Start Date: 07/Apr/20 07:54
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2923: 
[GOBBLIN-1083] Unit test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404608709
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -73,10 +76,10 @@ public void run() {
 } catch (InterruptedException e) {
   log.error("Sleep interrupted.");
   Thread.currentThread().interrupt();
-  Throwables.propagate(e);
+  failTask(e, taskContext);
 
 Review comment:
   SleepingTask was written to test cancellation code or hanging issues. It is 
meant to sleep for some amount to simulate as a hanging task in production. It 
is supposed propagate exceptions like any normal task would do.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417515)
Time Spent: 1h 50m  (was: 1h 40m)

> Maximize helix retry by returning failure when cancel
> -
>
> Key: GOBBLIN-1083
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1083
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404608709
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -73,10 +76,10 @@ public void run() {
 } catch (InterruptedException e) {
   log.error("Sleep interrupted.");
   Thread.currentThread().interrupt();
-  Throwables.propagate(e);
+  failTask(e, taskContext);
 
 Review comment:
   SleepingTask was written to test cancellation code or hanging issues. It is 
meant to sleep for some amount to simulate as a hanging task in production. It 
is supposed propagate exceptions like any normal task would do.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-1083) Maximize helix retry by returning failure when cancel

2020-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1083?focusedWorklogId=417512=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-417512
 ]

ASF GitHub Bot logged work on GOBBLIN-1083:
---

Author: ASF GitHub Bot
Created on: 07/Apr/20 07:51
Start Date: 07/Apr/20 07:51
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2923: 
[GOBBLIN-1083] Unit test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404606800
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -20,27 +20,30 @@
 import java.io.File;
 import java.io.IOException;
 
+import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.task.BaseAbstractTask;
+
 import com.google.common.base.Throwables;
 import com.google.common.io.Files;
 
 import lombok.extern.slf4j.Slf4j;
 
-import org.apache.gobblin.runtime.TaskContext;
-import org.apache.gobblin.runtime.TaskState;
-import org.apache.gobblin.runtime.task.BaseAbstractTask;
-
 @Slf4j
 public class SleepingTask extends BaseAbstractTask {
   public static final String TASK_STATE_FILE_KEY = "task.state.file.path";
+  public static final String SLEEPING_TASK_SLEEP_TIME = 
"data.publisher.sleep.time.in.seconds";
 
   private final long sleepTime;
   private File taskStateFile;
+  private final TaskContext taskContext;
 
   public SleepingTask(TaskContext taskContext) {
 super(taskContext);
-TaskState taskState = taskContext.getTaskState();
-sleepTime = 
taskState.getPropAsLong("data.publisher.sleep.time.in.seconds", 10L);
-taskStateFile = new File(taskState.getProp(TASK_STATE_FILE_KEY));
+this.taskContext = taskContext;
+sleepTime = 
taskContext.getTaskState().getPropAsLong(SLEEPING_TASK_SLEEP_TIME, 10L);
 
 Review comment:
   why `TaskState taskState = taskContext.getTaskState();` is removed?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 417512)
Time Spent: 1h 40m  (was: 1.5h)

> Maximize helix retry by returning failure when cancel
> -
>
> Key: GOBBLIN-1083
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1083
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Lei Sun
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit test improving & return failed when helix task cancelled

2020-04-07 Thread GitBox
arjun4084346 commented on a change in pull request #2923: [GOBBLIN-1083] Unit 
test improving & return failed when helix task cancelled
URL: https://github.com/apache/incubator-gobblin/pull/2923#discussion_r404606800
 
 

 ##
 File path: 
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/SleepingTask.java
 ##
 @@ -20,27 +20,30 @@
 import java.io.File;
 import java.io.IOException;
 
+import org.apache.gobblin.runtime.TaskContext;
+import org.apache.gobblin.runtime.task.BaseAbstractTask;
+
 import com.google.common.base.Throwables;
 import com.google.common.io.Files;
 
 import lombok.extern.slf4j.Slf4j;
 
-import org.apache.gobblin.runtime.TaskContext;
-import org.apache.gobblin.runtime.TaskState;
-import org.apache.gobblin.runtime.task.BaseAbstractTask;
-
 @Slf4j
 public class SleepingTask extends BaseAbstractTask {
   public static final String TASK_STATE_FILE_KEY = "task.state.file.path";
+  public static final String SLEEPING_TASK_SLEEP_TIME = 
"data.publisher.sleep.time.in.seconds";
 
   private final long sleepTime;
   private File taskStateFile;
+  private final TaskContext taskContext;
 
   public SleepingTask(TaskContext taskContext) {
 super(taskContext);
-TaskState taskState = taskContext.getTaskState();
-sleepTime = 
taskState.getPropAsLong("data.publisher.sleep.time.in.seconds", 10L);
-taskStateFile = new File(taskState.getProp(TASK_STATE_FILE_KEY));
+this.taskContext = taskContext;
+sleepTime = 
taskContext.getTaskState().getPropAsLong(SLEEPING_TASK_SLEEP_TIME, 10L);
 
 Review comment:
   why `TaskState taskState = taskContext.getTaskState();` is removed?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services