[jira] [Work logged] (GOBBLIN-771) emit a few metrics for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-771?focusedWorklogId=247803=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247803
 ]

ASF GitHub Bot logged work on GOBBLIN-771:
--

Author: ASF GitHub Bot
Created on: 24/May/19 01:06
Start Date: 24/May/19 01:06
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2635: 
[GOBBLIN-771] add  a few metrics for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2635#discussion_r287184812
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/Orchestrator.java
 ##
 @@ -241,10 +233,15 @@ public void orchestrate(Spec spec) throws Exception {
   if (!canRun(flowName, flowGroup, allowConcurrentExecution)) {
     _log.warn("Another instance of flowGroup: {}, flowName: {} running; Skipping flow execution since "
         + "concurrent executions are disabled for this flow.", flowGroup, flowName);
-    if (this.flowAlreadyRunningGauge.isPresent()) {
-      this.jobAlreadyRunning.incrementAndGet();
-    }
+    // We send a gauge with value 0 signifying that the flow could not be compiled because previous execution is already running
+    metricContext.newContextAwareGauge(
+        MetricRegistry.name(MetricReportUtils.GOBBLIN_SERVICE_METRICS_PREFIX, flowGroup, flowName, ServiceMetricNames.COMPILED),
+        () -> 0L);
 
 Review comment:
   No, 0 and 1 seem more intuitive. 0 means not compiled, 1 means compiled?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 247803)
Time Spent: 1h 10m  (was: 1h)

> emit a few metrics for gobblin service
> --
>
> Key: GOBBLIN-771
> URL: https://issues.apache.org/jira/browse/GOBBLIN-771
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-771) emit a few metrics for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-771?focusedWorklogId=247804=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247804
 ]

ASF GitHub Bot logged work on GOBBLIN-771:
--

Author: ASF GitHub Bot
Created on: 24/May/19 01:08
Start Date: 24/May/19 01:08
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2635: 
[GOBBLIN-771] add  a few metrics for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2635#discussion_r287185042
 
 

 ##
 File path: 
gobblin-metrics-libs/gobblin-metrics/src/main/java/org/apache/gobblin/metrics/ServiceMetricNames.java
 ##
 @@ -37,5 +37,5 @@
   public static final String RUN_IMMEDIATELY_FLOW_METER = "RunImmediatelyFlow";
 
   public static final String RUNNING_FLOWS_COUNTER = "RunningFlows";
-  public static final String FLOWS_ALREADY_RUNNING_GAUGE = "FlowsAlreadyRunning";
+  public static final String COMPILED = "Compiled";
 
 Review comment:
   I think 'Compiled' is ok, because we are appending flow details before it.
   e.g. ktwo.encryption_holdem_faro.Compiled = 1/0 seems better than
   ktwo.encryption_holdem_faro.CompiledFlows = 1/0
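
The naming argument above can be illustrated with a minimal sketch. It is not the code from the pull request: it swaps the Gobblin metric context for a plain map, and the class `CompiledGaugeSketch` and its methods `metricName`, `record`, and `read` are hypothetical names introduced only for this example. It assumes that name parts are joined with dots, as Dropwizard's `MetricRegistry.name` does, and it leaves out the service metrics prefix to match the example in the comment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class CompiledGaugeSketch {
  // Same suffix as ServiceMetricNames.COMPILED in the diff above.
  public static final String COMPILED = "Compiled";

  // Hypothetical stand-in for a metric registry: metric name -> gauge supplier.
  private final Map<String, Supplier<Long>> gauges = new ConcurrentHashMap<>();

  /** Joins the non-empty parts with dots to build a dotted metric name. */
  public static String metricName(String... parts) {
    StringBuilder sb = new StringBuilder();
    for (String part : parts) {
      if (part == null || part.isEmpty()) {
        continue;
      }
      if (sb.length() > 0) {
        sb.append('.');
      }
      sb.append(part);
    }
    return sb.toString();
  }

  /** Registers or replaces the per-flow gauge: 1 means compiled, 0 means not compiled. */
  public void record(String flowGroup, String flowName, boolean compiled) {
    long value = compiled ? 1L : 0L;
    gauges.put(metricName(flowGroup, flowName, COMPILED), () -> value);
  }

  /** Returns the current gauge value, or null if the flow has never reported. */
  public Long read(String flowGroup, String flowName) {
    Supplier<Long> gauge = gauges.get(metricName(flowGroup, flowName, COMPILED));
    return gauge == null ? null : gauge.get();
  }

  public static void main(String[] args) {
    CompiledGaugeSketch metrics = new CompiledGaugeSketch();
    // The flow was skipped because a previous execution is still running.
    metrics.record("ktwo", "encryption_holdem_faro", false);
    System.out.println(metricName("ktwo", "encryption_holdem_faro", COMPILED)
        + " = " + metrics.read("ktwo", "encryption_holdem_faro"));
  }
}
```

Because the flow group and flow name already qualify the metric, the short suffix stays readable. Replacing the supplier on every call is a simplification of this sketch; a real registry may reject a second registration under the same name.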
 



Issue Time Tracking
---

Worklog Id: (was: 247804)
Time Spent: 1h 20m  (was: 1h 10m)

> emit a few metrics for gobblin service
> --
>
> Key: GOBBLIN-771
> URL: https://issues.apache.org/jira/browse/GOBBLIN-771
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-775) Add job level retry for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-775?focusedWorklogId=247743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247743
 ]

ASF GitHub Bot logged work on GOBBLIN-775:
--

Author: ASF GitHub Bot
Created on: 23/May/19 22:11
Start Date: 23/May/19 22:11
Worklog Time Spent: 10m 
  Work Description: jack-moseley commented on pull request #2640: 
[GOBBLIN-775] Add job level retries for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2640#discussion_r287153533
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -391,6 +392,10 @@ private void pollAndAdvanceDag()
 jobExecutionPlan.setExecutionStatus(RUNNING);
 break;
 }
+
+if (jobStatus.isShouldRetry()) {
 
 Review comment:
   Yes, and they also have their status updated to running at the same time.
 



Issue Time Tracking
---

Worklog Id: (was: 247743)
Time Spent: 1h 20m  (was: 1h 10m)

> Add job level retry for gobblin service
> ---
>
> Key: GOBBLIN-775
> URL: https://issues.apache.org/jira/browse/GOBBLIN-775
> Project: Apache Gobblin
>  Issue Type: New Feature
>  Components: gobblin-service
>Reporter: Jack Moseley
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-775) Add job level retry for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-775?focusedWorklogId=247740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247740
 ]

ASF GitHub Bot logged work on GOBBLIN-775:
--

Author: ASF GitHub Bot
Created on: 23/May/19 22:10
Start Date: 23/May/19 22:10
Worklog Time Spent: 10m 
  Work Description: jack-moseley commented on pull request #2640: 
[GOBBLIN-775] Add job level retries for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2640#discussion_r287153336
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -457,6 +462,15 @@ private void submitJob(DagNode dagNode) {
   JobSpec jobSpec = DagManagerUtils.getJobSpec(dagNode);
   Map jobMetadata = TimingEventUtils.getJobMetadata(Maps.newHashMap(), jobExecutionPlan);
 
+  // Increment submission attempt
 
 Review comment:
   I don't think the incrementing logic should be in there because it could be 
called at a time other than job submission. But I moved the logic for updating 
the map there.
 



Issue Time Tracking
---

Worklog Id: (was: 247740)
Time Spent: 1h 10m  (was: 1h)

> Add job level retry for gobblin service
> ---
>
> Key: GOBBLIN-775
> URL: https://issues.apache.org/jira/browse/GOBBLIN-775
> Project: Apache Gobblin
>  Issue Type: New Feature
>  Components: gobblin-service
>Reporter: Jack Moseley
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-771) emit a few metrics for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-771?focusedWorklogId=247570=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247570
 ]

ASF GitHub Bot logged work on GOBBLIN-771:
--

Author: ASF GitHub Bot
Created on: 23/May/19 17:31
Start Date: 23/May/19 17:31
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2635: [GOBBLIN-771] 
add  a few metrics for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2635#discussion_r287054481
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/Orchestrator.java
 ##
 @@ -241,10 +233,15 @@ public void orchestrate(Spec spec) throws Exception {
   if (!canRun(flowName, flowGroup, allowConcurrentExecution)) {
     _log.warn("Another instance of flowGroup: {}, flowName: {} running; Skipping flow execution since "
         + "concurrent executions are disabled for this flow.", flowGroup, flowName);
-    if (this.flowAlreadyRunningGauge.isPresent()) {
-      this.jobAlreadyRunning.incrementAndGet();
-    }
+    // We send a gauge with value 0 signifying that the flow could not be compiled because previous execution is already running
+    metricContext.newContextAwareGauge(
+        MetricRegistry.name(MetricReportUtils.GOBBLIN_SERVICE_METRICS_PREFIX, flowGroup, flowName, ServiceMetricNames.COMPILED),
+        () -> 0L);
 
 Review comment:
   Can we use a gauge with values 1 and 2 instead of 0 and 1?
 



Issue Time Tracking
---

Worklog Id: (was: 247570)
Time Spent: 1h  (was: 50m)

> emit a few metrics for gobblin service
> --
>
> Key: GOBBLIN-771
> URL: https://issues.apache.org/jira/browse/GOBBLIN-771
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-771) emit a few metrics for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-771?focusedWorklogId=247569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247569
 ]

ASF GitHub Bot logged work on GOBBLIN-771:
--

Author: ASF GitHub Bot
Created on: 23/May/19 17:31
Start Date: 23/May/19 17:31
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2635: [GOBBLIN-771] 
add  a few metrics for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2635#discussion_r287041859
 
 

 ##
 File path: 
gobblin-metrics-libs/gobblin-metrics/src/main/java/org/apache/gobblin/metrics/ServiceMetricNames.java
 ##
 @@ -37,5 +37,5 @@
   public static final String RUN_IMMEDIATELY_FLOW_METER = "RunImmediatelyFlow";
 
   public static final String RUNNING_FLOWS_COUNTER = "RunningFlows";
-  public static final String FLOWS_ALREADY_RUNNING_GAUGE = "FlowsAlreadyRunning";
+  public static final String COMPILED = "Compiled";
 
 Review comment:
   Maybe "CompiledFlows"?
 



Issue Time Tracking
---

Worklog Id: (was: 247569)
Time Spent: 50m  (was: 40m)

> emit a few metrics for gobblin service
> --
>
> Key: GOBBLIN-771
> URL: https://issues.apache.org/jira/browse/GOBBLIN-771
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-775) Add job level retry for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-775?focusedWorklogId=247561=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247561
 ]

ASF GitHub Bot logged work on GOBBLIN-775:
--

Author: ASF GitHub Bot
Created on: 23/May/19 17:30
Start Date: 23/May/19 17:30
Worklog Time Spent: 10m 
  Work Description: autumnust commented on pull request #2640: 
[GOBBLIN-775] Add job level retries for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2640#discussion_r287050503
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManagerUtils.java
 ##
 @@ -75,8 +75,10 @@ static String getJobName(DagNode dagNode) {
    * @return a fully qualified name of the underlying job.
    */
   static String getFullyQualifiedJobName(DagNode dagNode) {
-    Config jobConfig = dagNode.getValue().getJobSpec().getConfig();
+    return getFullyQualifiedJobName(dagNode.getValue().getJobSpec().getConfig());
+  }
 
+  public static String getFullyQualifiedJobName(Config jobConfig) {
 
 Review comment:
   Let's give it another name instead.
 



Issue Time Tracking
---

Worklog Id: (was: 247561)
Time Spent: 40m  (was: 0.5h)

> Add job level retry for gobblin service
> ---
>
> Key: GOBBLIN-775
> URL: https://issues.apache.org/jira/browse/GOBBLIN-775
> Project: Apache Gobblin
>  Issue Type: New Feature
>  Components: gobblin-service
>Reporter: Jack Moseley
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-775) Add job level retry for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-775?focusedWorklogId=247563=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247563
 ]

ASF GitHub Bot logged work on GOBBLIN-775:
--

Author: ASF GitHub Bot
Created on: 23/May/19 17:30
Start Date: 23/May/19 17:30
Worklog Time Spent: 10m 
  Work Description: autumnust commented on pull request #2640: 
[GOBBLIN-775] Add job level retries for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2640#discussion_r287053928
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -391,6 +392,10 @@ private void pollAndAdvanceDag()
 jobExecutionPlan.setExecutionStatus(RUNNING);
 break;
 }
+
+if (jobStatus.isShouldRetry()) {
 
 Review comment:
   Only a failed job's status will have this flag turned on?
 



Issue Time Tracking
---

Worklog Id: (was: 247563)
Time Spent: 1h  (was: 50m)

> Add job level retry for gobblin service
> ---
>
> Key: GOBBLIN-775
> URL: https://issues.apache.org/jira/browse/GOBBLIN-775
> Project: Apache Gobblin
>  Issue Type: New Feature
>  Components: gobblin-service
>Reporter: Jack Moseley
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-775) Add job level retry for gobblin service

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-775?focusedWorklogId=247562=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247562
 ]

ASF GitHub Bot logged work on GOBBLIN-775:
--

Author: ASF GitHub Bot
Created on: 23/May/19 17:30
Start Date: 23/May/19 17:30
Worklog Time Spent: 10m 
  Work Description: autumnust commented on pull request #2640: 
[GOBBLIN-775] Add job level retries for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2640#discussion_r287051518
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -457,6 +462,15 @@ private void submitJob(DagNode dagNode) {
   JobSpec jobSpec = DagManagerUtils.getJobSpec(dagNode);
   Map jobMetadata = TimingEventUtils.getJobMetadata(Maps.newHashMap(), jobExecutionPlan);
 
+  // Increment submission attempt
 
 Review comment:
   Should this block be part of `TimingEventUtils.getJobMetadata`?
 



Issue Time Tracking
---

Worklog Id: (was: 247562)
Time Spent: 50m  (was: 40m)

> Add job level retry for gobblin service
> ---
>
> Key: GOBBLIN-775
> URL: https://issues.apache.org/jira/browse/GOBBLIN-775
> Project: Apache Gobblin
>  Issue Type: New Feature
>  Components: gobblin-service
>Reporter: Jack Moseley
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (GOBBLIN-780) Handle scenarios that cause the YarnAutoScalingManager to be stuck

2019-05-23 Thread Hung Tran (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran updated GOBBLIN-780:
--
Summary: Handle scenarios that cause the YarnAutoScalingManager to be stuck 
 (was: Handle scenarios that causes the YarnAutoScalingManager to be stuck)

> Handle scenarios that cause the YarnAutoScalingManager to be stuck
> --
>
> Key: GOBBLIN-780
> URL: https://issues.apache.org/jira/browse/GOBBLIN-780
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Priority: Major
>
> Issue 1: The YarnAutoScalingRunnable is run on a fixed schedule by a
> ScheduledExecutorService in YarnAutoScalingManager. If the runnable
> encounters an exception, the executor service will stop scheduling it.
> Catch all exceptions in the runnable, log them, and do not re-raise.
> Issue 2: The auto scaler may reduce the container count to 0. Helix will not 
> schedule any flows if there are no participants connected. This results in 
> the auto scaler keeping the container count at 0 and no progress is made. Fix 
> this by not allowing the container count to be reduced below 1.
>  
>  





[jira] [Work logged] (GOBBLIN-780) Handle scenarios that cause the YarnAutoScalingManager to be stuck

2019-05-23 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-780?focusedWorklogId=247520=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247520
 ]

ASF GitHub Bot logged work on GOBBLIN-780:
--

Author: ASF GitHub Bot
Created on: 23/May/19 16:29
Start Date: 23/May/19 16:29
Worklog Time Spent: 10m 
  Work Description: htran1 commented on pull request #2644: [GOBBLIN-780] 
Handle scenarios that cause the YarnAutoScalingManager …
URL: https://github.com/apache/incubator-gobblin/pull/2644
 
 
   …to be stuck
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [X] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-780
   
   
   ### Description
   - [X] Here are some details about my PR, including screenshots (if 
applicable):
   Issue 1: The YarnAutoScalingRunnable is run on a fixed schedule by a 
ScheduledExecutorService in YarnAutoScalingManager. If the runnable encounters 
an exception, the executor service will stop scheduling it. Catch all 
exceptions in the runnable, log them, and do not re-raise.
   
   Issue 2: The auto scaler may reduce the container count to 0. Helix will not 
schedule any flows if there are no participants connected. This results in the 
auto scaler keeping the container count at 0 and no progress is made. Fix this 
by not allowing the container count to be reduced below 1.
   
   ### Tests
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   
   ### Commits
   - [X] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 247520)
Time Spent: 10m
Remaining Estimate: 0h

> Handle scenarios that cause the YarnAutoScalingManager to be stuck
> --
>
> Key: GOBBLIN-780
> URL: https://issues.apache.org/jira/browse/GOBBLIN-780
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Issue 1: The YarnAutoScalingRunnable is run on a fixed schedule by a
> ScheduledExecutorService in YarnAutoScalingManager. If the runnable
> encounters an exception, the executor service will stop scheduling it.
> Catch all exceptions in the runnable, log them, and do not re-raise.
> Issue 2: The auto scaler may reduce the container count to 0. Helix will not 
> schedule any flows if there are no participants connected. This results in 
> the auto scaler keeping the container count at 0 and no progress is made. Fix 
> this by not allowing the container count to be reduced below 1.
>  
>  





[jira] [Created] (GOBBLIN-780) Handle scenarios that causes the YarnAutoScalingManager to be stuck

2019-05-23 Thread Hung Tran (JIRA)
Hung Tran created GOBBLIN-780:
-

 Summary: Handle scenarios that causes the YarnAutoScalingManager 
to be stuck
 Key: GOBBLIN-780
 URL: https://issues.apache.org/jira/browse/GOBBLIN-780
 Project: Apache Gobblin
  Issue Type: Task
Reporter: Hung Tran


Issue 1: The YarnAutoScalingRunnable is run on a fixed schedule by a 
ScheduledExecutorService in YarnAutoScalingManager. If the runnable encounters 
an exception, the executor service will stop scheduling it. Catch all 
exceptions in the runnable, log them, and do not re-raise.
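
The behavior behind Issue 1 can be reproduced with a minimal sketch that uses only java.util.concurrent. It is not the YarnAutoScalingManager code: the class `SafeScheduledRunnable` and the helpers `catchingAll` and `runsOfFailingTask` are hypothetical names for illustration, and catching `Throwable` is one possible reading of "catch all exceptions".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SafeScheduledRunnable {

  /** Wraps a task so that nothing it throws reaches the executor (hypothetical helper). */
  public static Runnable catchingAll(Runnable task) {
    return () -> {
      try {
        task.run();
      } catch (Throwable t) {
        // Log and do not re-raise: an escaping exception would suppress all later runs.
        System.err.println("Scheduled task failed; it will run again: " + t);
      }
    };
  }

  /**
   * Schedules a task that always throws and reports how many times it ran,
   * waiting until it has run targetRuns times or one second has passed.
   */
  public static int runsOfFailingTask(boolean wrap, int targetRuns) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    AtomicInteger runs = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(targetRuns);
    Runnable failing = () -> {
      runs.incrementAndGet();
      done.countDown();
      throw new IllegalStateException("simulated failure");
    };
    try {
      executor.scheduleAtFixedRate(wrap ? catchingAll(failing) : failing, 0, 10, TimeUnit.MILLISECONDS);
      done.await(1, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      executor.shutdownNow();
    }
    return runs.get();
  }

  public static void main(String[] args) {
    System.out.println("unwrapped runs: " + runsOfFailingTask(false, 3));
    System.out.println("wrapped runs:   " + runsOfFailingTask(true, 3));
  }
}
```

The unwrapped task runs once and is never scheduled again, because `scheduleAtFixedRate` suppresses subsequent executions after an exception. The wrapped task keeps running on every tick.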

Issue 2: The auto scaler may reduce the container count to 0. Helix will not 
schedule any flows if there are no participants connected. This results in the 
auto scaler keeping the container count at 0 and no progress is made. Fix this 
by not allowing the container count to be reduced below 1.
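
The floor described in Issue 2 amounts to clamping the computed target. A minimal sketch follows, assuming a sizing rule based on outstanding tasks per container; the issue does not describe that rule, so the class `ContainerCountFloor`, the method `targetContainers`, and both parameters are hypothetical.

```java
public class ContainerCountFloor {
  // Floor taken from the issue text: never scale below one container.
  public static final int MIN_CONTAINERS = 1;

  /**
   * Hypothetical sizing rule: enough containers to hold the outstanding tasks,
   * clamped so the result never drops below MIN_CONTAINERS.
   */
  public static int targetContainers(int outstandingTasks, int tasksPerContainer) {
    if (tasksPerContainer <= 0) {
      throw new IllegalArgumentException("tasksPerContainer must be positive");
    }
    int demand = Math.max(0, outstandingTasks);
    // Ceiling division: 9 tasks at 4 per container need 3 containers.
    int needed = (demand + tasksPerContainer - 1) / tasksPerContainer;
    return Math.max(MIN_CONTAINERS, needed);
  }

  public static void main(String[] args) {
    System.out.println("idle cluster: " + targetContainers(0, 4));
    System.out.println("nine tasks:   " + targetContainers(9, 4));
  }
}
```

With no outstanding work the target stays at one container, so a participant remains connected and new flows can still be scheduled.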

 

 


