[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293597&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293597
 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 04:34
Start Date: 13/Aug/19 04:34
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] 
Flow level sla
URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313215701
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -530,6 +545,48 @@ private void pollAndAdvanceDag()
   }
 }
 
+private ExecutionStatus getJobExecutionStatus(boolean slaKilled, JobStatus jobStatus) {
+  if (slaKilled) {
+    return CANCELLED;
+  } else {
+    if (jobStatus == null) {
+      return PENDING;
+    } else {
+      return valueOf(jobStatus.getEventName());
+    }
+  }
+}
+
+/**
+ * Check if the SLA is configured for the flow this job belongs to.
+ * If it is, this method will try to cancel the job when SLA is reached.
+ *
+ * @param node dag node of the job
+ * @return true if the job is killed because it reached sla
+ * @throws ExecutionException exception
+ * @throws InterruptedException exception
+ */
+private boolean slaKillIfNeeded(DagNode node) throws ExecutionException, InterruptedException {
+  long flowStartTime = DagManagerUtils.getFlowStartTime(node);
+  long currentTime = System.currentTimeMillis();
+  String dagId = DagManagerUtils.generateDagId(node);
+
+  long flowSla;
+  if (dagToSLA.containsKey(dagId)) {
+    flowSla = dagToSLA.get(dagId);
+  } else {
+    flowSla = DagManagerUtils.getFlowSLA(node);
+    dagToSLA.put(dagId, flowSla);
+  }
+
+  if (flowSla != DagManagerUtils.NO_SLA && currentTime > flowStartTime + flowSla) {
+    log.info("Job exceeded the SLA of {} ms. Killing it now...", flowSla);
 
 Review comment:
   log.info("Flow exceeded the SLA...")?
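
For readers following the hunk above, the SLA check can be isolated as a small standalone sketch. This is not Gobblin code: the class FlowSlaChecker, its method names, and the value chosen for NO_SLA are assumptions made for illustration; only the comparison and the per-dag caching mirror the diff.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the SLA check in the hunk above. */
class FlowSlaChecker {
  /** Sentinel for "no SLA configured"; plays the role of DagManagerUtils.NO_SLA (value assumed). */
  static final long NO_SLA = -1L;

  /** Caches the SLA per dag id so it is resolved only once per flow. */
  private final Map<String, Long> dagToSla = new HashMap<>();

  /** Returns the cached SLA for dagId, storing configuredSlaMillis on first use. */
  long resolveSla(String dagId, long configuredSlaMillis) {
    return dagToSla.computeIfAbsent(dagId, id -> configuredSlaMillis);
  }

  /** True when an SLA is set and the flow has been running longer than the SLA allows. */
  boolean isSlaExceeded(String dagId, long configuredSlaMillis, long flowStartMillis, long nowMillis) {
    long flowSla = resolveSla(dagId, configuredSlaMillis);
    return flowSla != NO_SLA && nowMillis > flowStartMillis + flowSla;
  }
}
```

Because the deadline is computed from the flow start time, every job of the same flow crosses it at the same instant, which is consistent with the suggestion to word the log message in terms of the flow rather than the job.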
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 293597)
Time Spent: 3.5h  (was: 3h 20m)

> add a flow level sla in gaas flows
> --
>
> Key: GOBBLIN-847
> URL: https://issues.apache.org/jira/browse/GOBBLIN-847
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Add a flow-level SLA in GaaS flows, because sometimes Azkaban jobs may not 
> start and hence never send any tracking event, or Azkaban may be down. In all 
> those cases, we might have to kill the job so we can start a new job.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293593&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293593
 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 04:34
Start Date: 13/Aug/19 04:34
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] 
Flow level sla
URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313216048
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -473,22 +490,20 @@ private void initialize(Dag dag)
 /**
  * Proceed the execution of each dag node based on job status.
  */
-private void pollAndAdvanceDag()
-    throws IOException {
+private void pollAndAdvanceDag() throws IOException, ExecutionException, InterruptedException {
   this.failedDagIdsFinishRunning.clear();
-
   Map>> nextSubmitted = Maps.newHashMap();
   List> nodesToCleanUp = Lists.newArrayList();
+
   for (DagNode node: this.jobToDag.keySet()) {
-    long pollStartTime = System.nanoTime();
+    boolean slaKilled = slaKillIfNeeded(node);
+
     JobStatus jobStatus = pollJobStatus(node);
 
 Review comment:
   Do we have to pollJobStatus if slaKilled is true? 
 



Issue Time Tracking
---

Worklog Id: (was: 293593)
Time Spent: 3h 10m  (was: 3h)



[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293594&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293594
 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 04:34
Start Date: 13/Aug/19 04:34
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] 
Flow level sla
URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313213068
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -427,25 +428,20 @@ private void cancelDag(String dagToCancel) throws 
ExecutionException, Interrupte
 List> dagNodesToCancel = this.dagToJobs.get(dagToCancel);
 log.info("Found {} DagNodes to cancel.", dagNodesToCancel.size());
 for (DagNode dagNodeToCancel : dagNodesToCancel) {
-  cancelDag(dagNodeToCancel);
+  cancelDagNode(dagNodeToCancel);
 
 Review comment:
   Will dagNodesToCancel include jobs that finished successfully? Or only jobs 
currently running?
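
If the list can contain finished jobs, one possible guard is to skip nodes that are already in a terminal state before cancelling. The sketch below is illustrative only: the Status enum, the Node class, and the choice of which states count as terminal are assumptions, not Gobblin types, and whether dagToJobs actually holds finished jobs is exactly the open question in the comment above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: select only the dag nodes that still need cancelling. */
class CancelFilter {
  /** Assumed job states; the real ExecutionStatus values may differ. */
  enum Status { PENDING, RUNNING, COMPLETE, FAILED, CANCELLED }

  /** Minimal stand-in for a dag node: a job name plus its last known status. */
  static final class Node {
    final String name;
    final Status status;

    Node(String name, Status status) {
      this.name = name;
      this.status = status;
    }
  }

  /** A job in a terminal state has nothing left to cancel. */
  static boolean isTerminal(Status status) {
    return status == Status.COMPLETE || status == Status.FAILED || status == Status.CANCELLED;
  }

  /** Returns the names of the nodes that are still pending or running, in input order. */
  static List<String> nodesToCancel(List<Node> nodes) {
    List<String> result = new ArrayList<>();
    for (Node node : nodes) {
      if (!isTerminal(node.status)) {
        result.add(node.name);
      }
    }
    return result;
  }
}
```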
 



Issue Time Tracking
---

Worklog Id: (was: 293594)
Time Spent: 3h 20m  (was: 3h 10m)



[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293595
 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 04:34
Start Date: 13/Aug/19 04:34
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] 
Flow level sla
URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313213742
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -454,6 +450,7 @@ private void sendCancellationEvent(JobExecutionPlan jobExecutionPlan) {
   if (this.eventSubmitter.isPresent()) {
     Map jobMetadata = TimingEventUtils.getJobMetadata(Maps.newHashMap(), jobExecutionPlan);
     this.eventSubmitter.get().getTimingEvent(TimingEvent.LauncherTimings.JOB_CANCEL).stop(jobMetadata);
+    this.eventSubmitter.get().getTimingEvent(TimingEvent.FlowTimings.FLOW_CANCEL).stop(jobMetadata);
 
 Review comment:
   So we will emit a FLOW_CANCEL event for every running job? Why not emit once?
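
One way to read the "emit once" suggestion is to keep the job-level event inside the per-job loop and move the flow-level event outside it. The sketch below records events as plain strings; the class CancellationEvents and its cancelFlow method are made up for illustration and do not use the EventSubmitter or TimingEvent APIs from the diff.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: one JOB_CANCEL per running job, one FLOW_CANCEL per flow. */
class CancellationEvents {
  /** Events in the order they were emitted, recorded as strings for inspection. */
  final List<String> emitted = new ArrayList<>();

  /** Cancels every running job of a flow and reports the flow-level cancellation once. */
  void cancelFlow(String flowId, List<String> runningJobs) {
    for (String job : runningJobs) {
      // Job-level event: emitted for each job that is being cancelled.
      emitted.add("JOB_CANCEL:" + job);
    }
    // Flow-level event: emitted a single time, after the loop.
    emitted.add("FLOW_CANCEL:" + flowId);
  }
}
```

With two running jobs this yields two job-level events followed by a single flow-level event, instead of one flow-level event per job.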
 



Issue Time Tracking
---

Worklog Id: (was: 293595)



[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293592&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293592
 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 04:34
Start Date: 13/Aug/19 04:34
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] 
Flow level sla
URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313215757
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -530,6 +545,48 @@ private void pollAndAdvanceDag()
   }
 }
 
+private ExecutionStatus getJobExecutionStatus(boolean slaKilled, JobStatus jobStatus) {
+  if (slaKilled) {
+    return CANCELLED;
+  } else {
+    if (jobStatus == null) {
+      return PENDING;
+    } else {
+      return valueOf(jobStatus.getEventName());
+    }
+  }
+}
+
+/**
+ * Check if the SLA is configured for the flow this job belongs to.
+ * If it is, this method will try to cancel the job when SLA is reached.
+ *
+ * @param node dag node of the job
+ * @return true if the job is killed because it reached sla
+ * @throws ExecutionException exception
+ * @throws InterruptedException exception
+ */
+private boolean slaKillIfNeeded(DagNode node) throws ExecutionException, InterruptedException {
+  long flowStartTime = DagManagerUtils.getFlowStartTime(node);
+  long currentTime = System.currentTimeMillis();
+  String dagId = DagManagerUtils.generateDagId(node);
+
+  long flowSla;
+  if (dagToSLA.containsKey(dagId)) {
+    flowSla = dagToSLA.get(dagId);
+  } else {
+    flowSla = DagManagerUtils.getFlowSLA(node);
+    dagToSLA.put(dagId, flowSla);
+  }
+
+  if (flowSla != DagManagerUtils.NO_SLA && currentTime > flowStartTime + flowSla) {
+    log.info("Job exceeded the SLA of {} ms. Killing it now...", flowSla);
+    cancelDagNode(node);
 
 Review comment:
   why not call cancelDag(dagId) here?
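
A rough sketch of what cancelling at dag level could look like: the first node that trips the SLA triggers a single dag-wide cancellation, and later nodes of the same dag only observe that it has already happened. The class DagSlaKiller, the cancelCalls counter, and the set of cancelled ids are hypothetical; the counter merely stands in for a call to cancelDag(dagId).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: cancel a whole dag once when its flow-level SLA is exceeded. */
class DagSlaKiller {
  /** Dag ids that have already been cancelled because of the SLA. */
  private final Set<String> cancelledDagIds = new HashSet<>();

  /** Counts dag-level cancellations; stands in for invoking cancelDag(dagId). */
  int cancelCalls = 0;

  /** Returns true if the dag this node belongs to is cancelled due to the SLA. */
  boolean slaKillIfNeeded(String dagId, boolean slaExceeded) {
    if (!slaExceeded) {
      return false;
    }
    // Set.add returns true only for the first node of this dag that gets here.
    if (cancelledDagIds.add(dagId)) {
      cancelCalls++;
    }
    return true;
  }
}
```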
 



Issue Time Tracking
---

Worklog Id: (was: 293592)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293596
 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 04:34
Start Date: 13/Aug/19 04:34
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] 
Flow level sla
URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313216361
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
 ##
 @@ -473,22 +490,20 @@ private void initialize(Dag dag)
 /**
  * Proceed the execution of each dag node based on job status.
  */
-private void pollAndAdvanceDag()
-    throws IOException {
+private void pollAndAdvanceDag() throws IOException, ExecutionException, InterruptedException {
   this.failedDagIdsFinishRunning.clear();
-
   Map>> nextSubmitted = Maps.newHashMap();
   List> nodesToCleanUp = Lists.newArrayList();
+
   for (DagNode node: this.jobToDag.keySet()) {
-    long pollStartTime = System.nanoTime();
+    boolean slaKilled = slaKillIfNeeded(node);
+
     JobStatus jobStatus = pollJobStatus(node);
-    Instrumented.updateTimer(this.jobStatusPolledTimer, System.nanoTime() - pollStartTime, TimeUnit.NANOSECONDS);
-    if (jobStatus == null) {
-      continue;
-    }
+
+    ExecutionStatus status = getJobExecutionStatus(slaKilled, jobStatus);
 
 Review comment:
   Why not ExecutionStatus status = (slaKilled) ? ExecutionStatus.CANCELLED : 
getJobExecutionStatus(pollJobStatus(node));? 
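
The point of the suggested ternary is that the status poll is skipped entirely when the job was already killed by the SLA. A minimal standalone illustration follows, with a Supplier standing in for pollJobStatus(node); the class LazyStatus, its enum, and the use of a plain event-name string are assumptions for this sketch, not the PR's code.

```java
import java.util.function.Supplier;

/** Hypothetical sketch: only poll the job status when the job was not SLA-killed. */
class LazyStatus {
  /** Assumed subset of execution states, for illustration only. */
  enum ExecutionStatus { PENDING, RUNNING, COMPLETE, CANCELLED }

  /**
   * Resolves the status of a job. The poll supplier is invoked only when
   * slaKilled is false, so a killed job costs no status lookup.
   */
  static ExecutionStatus resolve(boolean slaKilled, Supplier<String> poll) {
    if (slaKilled) {
      return ExecutionStatus.CANCELLED;
    }
    String eventName = poll.get();
    // A missing status is treated as PENDING, as in getJobExecutionStatus above.
    return eventName == null ? ExecutionStatus.PENDING : ExecutionStatus.valueOf(eventName);
  }
}
```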
 



Issue Time Tracking
---

Worklog Id: (was: 293596)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Updated] (GOBBLIN-822) upgrade log4j to log4j2

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-822:

Description: 
log4j2 has a routing appender that would be very useful and is probably the only 
way to achieve "job-specific log files" functionality without meddling with the 
fileHandler in log4j.

log4j2 also has many new features and performance benefits (ref: 
HIVE-11304).

  was:
log4j2 has routing appender that would be super useful and probably only way to 
achieve " job specific log files" functionality without meddling around 
fileHandler in log4j.

Also log4j2 has lot of new functionalities and performance benefits


> upgrade log4j to log4j2
> ---
>
> Key: GOBBLIN-822
> URL: https://issues.apache.org/jira/browse/GOBBLIN-822
> Project: Apache Gobblin
>  Issue Type: Sub-task
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> log4j2 has a routing appender that would be very useful and is probably the 
> only way to achieve "job-specific log files" functionality without meddling 
> with the fileHandler in log4j.
> log4j2 also has many new features and performance benefits (ref: 
> HIVE-11304).





[jira] [Work logged] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-854?focusedWorklogId=293542&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293542
 ]

ASF GitHub Bot logged work on GOBBLIN-854:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 02:19
Start Date: 13/Aug/19 02:19
Worklog Time Spent: 10m 
  Work Description: jhsenjaliya commented on pull request #2710: 
[GOBBLIN-854] use typesafe config instead of java properties
URL: https://github.com/apache/incubator-gobblin/pull/2710
 
 
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-854
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if 
applicable):
   In standalone mode, SchedulerDaemon uses Java properties. This ticket switches 
   it to Typesafe Config to standardize configuration across the modes and to let 
   the configuration benefit from Typesafe Config features. SchedulerDaemon also 
   takes two different config files as arguments, one as default and another as 
   custom; we probably only need one property file.
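
   For context, the default-plus-custom layering described above can be sketched with java.util.Properties alone. This is not SchedulerDaemon's code; the class LayeredProps and its layer method are hypothetical. With Typesafe Config, the comparable fallback would typically be written as custom.withFallback(defaults).

```java
import java.util.Properties;

/** Hypothetical sketch of layering a custom properties set over defaults. */
class LayeredProps {
  /**
   * Returns properties in which keys from custom take precedence and keys
   * missing from custom fall back to defaults.
   */
  static Properties layer(Properties defaults, Properties custom) {
    // The constructor argument is consulted by getProperty when a key is absent.
    Properties merged = new Properties(defaults);
    merged.putAll(custom);
    return merged;
  }
}
```

   Looking up a key present in both inputs returns the custom value, while a key present only in the defaults still resolves through the fallback.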
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason: not required, no change in functionality
   
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 293542)
Time Spent: 10m
Remaining Estimate: 0h

> update config reader in standalone mode
> ---
>
> Key: GOBBLIN-854
> URL: https://issues.apache.org/jira/browse/GOBBLIN-854
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In standalone mode, {{SchedulerDaemon}} uses Java properties. This ticket 
> switches it to Typesafe Config to standardize configuration across the modes 
> and to let the configuration benefit from Typesafe Config features.
> It also takes two different config files as arguments, one as default and 
> another as custom; we probably only need one property file.





[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-854:

Description: 
In standalone mode, {{SchedulerDaemon}} uses Java properties. This ticket 
switches it to Typesafe Config to standardize configuration across the modes 
and to let the configuration benefit from Typesafe Config features.

It also takes two different config files as arguments, one as default and 
another as custom; we probably only need one property file.

  was:standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is 
to use TypeSafe Config instead to make config standardized across the modes and 
also enable config to take benefits of TypeSafe functionalities.


> update config reader in standalone mode
> ---
>
> Key: GOBBLIN-854
> URL: https://issues.apache.org/jira/browse/GOBBLIN-854
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> standalone mode {{SchedulerDaemon}} uses java properties, this ticket is to 
> use TypeSafe Config instead to make config standardized across the modes and 
> also enable config to take benefits of TypeSafe functionalities.
> Also it takes 2 different config file as argument, one as default and another 
> as custom, we probably only need one property file.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)
Jay Sen created GOBBLIN-854:
---

 Summary: update config reader in standalone mode
 Key: GOBBLIN-854
 URL: https://issues.apache.org/jira/browse/GOBBLIN-854
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Jay Sen


standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to 
use TypeSafe Config instead to make config standardized across the modes and 
also enable config to take benefits of TypeSafe functionalities.





[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-854:

Affects Version/s: 0.14.0

> update config reader in standalone mode
> ---
>
> Key: GOBBLIN-854
> URL: https://issues.apache.org/jira/browse/GOBBLIN-854
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
>
> standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to 
> use TypeSafe Config instead to make config standardized across the modes and 
> also enable config to take benefits of TypeSafe functionalities.





[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-854:

Fix Version/s: 0.15.0

> update config reader in standalone mode
> ---
>
> Key: GOBBLIN-854
> URL: https://issues.apache.org/jira/browse/GOBBLIN-854
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to 
> use TypeSafe Config instead to make config standardized across the modes and 
> also enable config to take benefits of TypeSafe functionalities.





[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293533
 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 02:02
Start Date: 13/Aug/19 02:02
Worklog Time Spent: 10m 
  Work Description: codecov-io commented on issue #2702: [GOBBLIN-847] Flow 
level sla
URL: 
https://github.com/apache/incubator-gobblin/pull/2702#issuecomment-520659463
 
 

[GitHub] [incubator-gobblin] codecov-io commented on issue #2702: [GOBBLIN-847] Flow level sla

2019-08-12 Thread GitBox
codecov-io commented on issue #2702: [GOBBLIN-847] Flow level sla
URL: 
https://github.com/apache/incubator-gobblin/pull/2702#issuecomment-520659463
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=h1)
 Report
   > Merging 
[#2702](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-gobblin/commit/8903ebf3807af3369839069e2082afa70c7fe77e?src=pr&el=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `85.71%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=tree)
   
   ```diff
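   @@             Coverage Diff              @@
   ##             master    #2702      +/-   ##
   ============================================
   + Coverage      44.9%    44.9%   +<.01%
   - Complexity     8713     8718       +5
   ============================================
     Files          1879     1879
     Lines         70079    70129      +50
     Branches       7703     7707       +4
   ============================================
   + Hits          31466    31490      +24
   - Misses        35702    35730      +28
   + Partials       2911     2909       -2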
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=tree) 
| Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...pache/gobblin/configuration/ConfigurationKeys.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vY29uZmlndXJhdGlvbi9Db25maWd1cmF0aW9uS2V5cy5qYXZh)
 | `0% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[.../org/apache/gobblin/metrics/event/TimingEvent.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9UaW1pbmdFdmVudC5qYXZh)
 | `70% <ø> (ø)` | `15 <0> (ø)` | :arrow_down: |
   | 
[...time/spec\_executorInstance/MockedSpecExecutor.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19leGVjdXRvckluc3RhbmNlL01vY2tlZFNwZWNFeGVjdXRvci5qYXZh)
 | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...service/modules/orchestration/DagManagerUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9vcmNoZXN0cmF0aW9uL0RhZ01hbmFnZXJVdGlscy5qYXZh)
 | `84.28% <100%> (+3.95%)` | `30 <9> (+9)` | :arrow_up: |
   | 
[...blin/service/modules/orchestration/DagManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9vcmNoZXN0cmF0aW9uL0RhZ01hbmFnZXIuamF2YQ==)
 | `77.33% <86.04%> (+2.79%)` | `12 <1> (+1)` | :arrow_up: |
   | 
[...bblin/cluster/GobblinHelixJobLauncherListener.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4Sm9iTGF1bmNoZXJMaXN0ZW5lci5qYXZh)
 | `70% <0%> (-30%)` | `3% <0%> (-2%)` | |
   | 
[...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh)
 | `32.71% <0%> (-9.13%)` | `11% <0%> (-3%)` | |
   | 
[.../gobblin/cluster/HelixRetriggeringJobCallable.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhSZXRyaWdnZXJpbmdKb2JDYWxsYWJsZS5qYXZh)
 | `60.41% <0%> (-3.48%)` | `9% <0%> (ø)` | |
   | 
[...ache/gobblin/couchbase/writer/CouchbaseWriter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY291Y2hiYXNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvdWNoYmFzZS93cml0ZXIvQ291Y2hiYXNlV3JpdGVyLmphdmE=)
 | `66.27% <0%> (-2.33%)` | `11% <0%> (ø)` | |
   | 
[...pache/gobblin/cluster/GobblinHelixJobLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4Sm9iTGF1bmNoZXIuamF2YQ==)
 | `81.53% <0%> (-1.8%)` | `26% <0%> (-2%)` | |
   | ... and [6 
more](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=p

[GitHub] [incubator-gobblin] codecov-io commented on issue #2709: [GOBBLIN-853] Support multiple paths specified in flow config

2019-08-12 Thread GitBox
codecov-io commented on issue #2709: [GOBBLIN-853] Support multiple paths 
specified in flow config
URL: 
https://github.com/apache/incubator-gobblin/pull/2709#issuecomment-520650477
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=h1)
 Report
   > Merging 
[#2709](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc)
 will **increase** coverage by `0.02%`.
   > The diff coverage is `85.71%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=tree)
   
   ```diff
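   @@             Coverage Diff              @@
   ##             master    #2709      +/-   ##
   ============================================
   + Coverage     44.87%   44.89%   +0.02%
   - Complexity     8708     8714       +6
   ============================================
     Files          1879     1879
     Lines         70095    70125      +30
     Branches       7704     7711       +7
   ============================================
   + Hits          31455    31484      +29
   - Misses        35728    35729       +1
     Partials       2912     2912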
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=tree) 
| Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...gobblin/service/modules/spec/JobExecutionPlan.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9zcGVjL0pvYkV4ZWN1dGlvblBsYW4uamF2YQ==)
 | `75.8% <100%> (+0.39%)` | `9 <0> (ø)` | :arrow_down: |
   | 
[...lin/service/modules/flow/MultiHopFlowCompiler.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9mbG93L011bHRpSG9wRmxvd0NvbXBpbGVyLmphdmE=)
 | `67.92% <84.84%> (+5.58%)` | `13 <3> (+5)` | :arrow_up: |
   | 
[.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=)
 | `80.37% <0%> (+0.93%)` | `24% <0%> (ø)` | :arrow_down: |
   | 
[...n/service/modules/template/StaticFlowTemplate.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy90ZW1wbGF0ZS9TdGF0aWNGbG93VGVtcGxhdGUuamF2YQ==)
 | `90.9% <0%> (+1.51%)` | `16% <0%> (ø)` | :arrow_down: |
   | 
[...service/modules/dataset/BaseDatasetDescriptor.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9kYXRhc2V0L0Jhc2VEYXRhc2V0RGVzY3JpcHRvci5qYXZh)
 | `72.41% <0%> (+3.44%)` | `13% <0%> (+1%)` | :arrow_up: |
   | 
[...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=)
 | `92.85% <0%> (+7.14%)` | `3% <0%> (ø)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=footer).
 Last update 
[50280ee...011a563](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Work logged] (GOBBLIN-853) Support multiple paths specified in flow config

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-853?focusedWorklogId=293519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293519
 ]

ASF GitHub Bot logged work on GOBBLIN-853:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 01:11
Start Date: 13/Aug/19 01:11
Worklog Time Spent: 10m 
  Work Description: codecov-io commented on issue #2709: [GOBBLIN-853] 
Support multiple paths specified in flow config
URL: 
https://github.com/apache/incubator-gobblin/pull/2709#issuecomment-520650477
 
 


Issue Time Tracking
---

Worklog

[jira] [Work logged] (GOBBLIN-853) Support multiple paths specified in flow config

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-853?focusedWorklogId=293497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293497
 ]

ASF GitHub Bot logged work on GOBBLIN-853:
--

Author: ASF GitHub Bot
Created on: 13/Aug/19 00:22
Start Date: 13/Aug/19 00:22
Worklog Time Spent: 10m 
  Work Description: jack-moseley commented on pull request #2709: 
[GOBBLIN-853] Support multiple paths specified in flow config
URL: https://github.com/apache/incubator-gobblin/pull/2709
 
 
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-853
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if 
applicable):
   
   Support multiple paths in flow spec by splitting into multiple flow specs on 
the path property. Then the resulting dags are merged into a single dag, so 
each dataset will be a concurrent job within the dag.
   
   Also added a random string to the end of each job.name to avoid collisions, 
since job.name is assumed to be unique within a dag.
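
The splitting step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the property key `dataset.path`, the comma separator, the class `FlowSplitter`, and the suffix format are all assumptions made for the example, and the actual change operates on Gobblin flow specs and dags rather than plain maps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: one config listing several paths becomes one config per path.
final class FlowSplitter {
  static final String PATH_KEY = "dataset.path"; // assumed key name
  static final String JOB_NAME_KEY = "job.name";

  // Returns one copy of the config per comma-separated path, each with a
  // random suffix appended to job.name so names stay unique after merging.
  static List<Map<String, String>> split(Map<String, String> config) {
    List<Map<String, String>> result = new ArrayList<>();
    for (String path : config.getOrDefault(PATH_KEY, "").split(",")) {
      String trimmed = path.trim();
      if (trimmed.isEmpty()) {
        continue; // skip empty entries such as a trailing comma
      }
      Map<String, String> copy = new HashMap<>(config);
      copy.put(PATH_KEY, trimmed);
      String suffix = UUID.randomUUID().toString().substring(0, 8);
      copy.put(JOB_NAME_KEY, config.getOrDefault(JOB_NAME_KEY, "job") + "_" + suffix);
      result.add(copy);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    config.put(PATH_KEY, "/data/a, /data/b");
    config.put(JOB_NAME_KEY, "copy");
    for (Map<String, String> c : split(config)) {
      System.out.println(c.get(JOB_NAME_KEY) + " -> " + c.get(PATH_KEY));
    }
  }
}
```

Each returned map stands in for one per-path flow spec; merging the resulting dags into a single dag is left out of the sketch.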
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Added unit test
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 293497)
Time Spent: 10m
Remaining Estimate: 0h

> Support multiple paths specified in flow config
> ---
>
> Key: GOBBLIN-853
> URL: https://issues.apache.org/jira/browse/GOBBLIN-853
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Jack Moseley
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[GitHub] [incubator-gobblin] jack-moseley opened a new pull request #2709: [GOBBLIN-853] Support multiple paths specified in flow config

2019-08-12 Thread GitBox
jack-moseley opened a new pull request #2709: [GOBBLIN-853] Support multiple 
paths specified in flow config
URL: https://github.com/apache/incubator-gobblin/pull/2709
 
 


[jira] [Created] (GOBBLIN-853) Support multiple paths specified in flow config

2019-08-12 Thread Jack Moseley (JIRA)
Jack Moseley created GOBBLIN-853:


 Summary: Support multiple paths specified in flow config
 Key: GOBBLIN-853
 URL: https://issues.apache.org/jira/browse/GOBBLIN-853
 Project: Apache Gobblin
  Issue Type: Bug
Reporter: Jack Moseley








[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293451
 ]

ASF GitHub Bot logged work on GOBBLIN-852:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 22:40
Start Date: 12/Aug/19 22:40
Worklog Time Spent: 10m 
  Work Description: codecov-io commented on issue #2708: 
[GOBBLIN-852]Reorganize the code for hive registration to isolate function
URL: 
https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520621350
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=h1)
 Report
   > Merging 
[#2708](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `0%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=tree)
   
   ```diff
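   @@             Coverage Diff              @@
   ##             master    #2708      +/-   ##
   ============================================
   - Coverage     44.87%   44.86%   -0.01%
   - Complexity     8708     8709       +1
   ============================================
     Files          1879     1879
     Lines         70095    70098       +3
     Branches       7704     7705       +1
   ============================================
   - Hits          31455    31451       -4
   - Misses        35728    35735       +7
     Partials       2912     2912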
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=tree) 
| Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...e/gobblin/publisher/HiveRegistrationPublisher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3B1Ymxpc2hlci9IaXZlUmVnaXN0cmF0aW9uUHVibGlzaGVyLmphdmE=)
 | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...in/java/org/apache/gobblin/cluster/SingleTask.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvU2luZ2xlVGFzay5qYXZh)
 | `73.58% <0%> (-7.55%)` | `9% <0%> (ø)` | |
   | 
[...a/org/apache/gobblin/cluster/GobblinHelixTask.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4VGFzay5qYXZh)
 | `76.08% <0%> (-4.35%)` | `5% <0%> (ø)` | |
   | 
[.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==)
 | `64.78% <0%> (-0.94%)` | `29% <0%> (ø)` | |
   | 
[...main/java/org/apache/gobblin/util/HadoopUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvSGFkb29wVXRpbHMuamF2YQ==)
 | `30.53% <0%> (+0.33%)` | `24% <0%> (ø)` | :arrow_down: |
   | 
[...lin/restli/throttling/ZookeeperLeaderElection.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2Utc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3Jlc3RsaS90aHJvdHRsaW5nL1pvb2tlZXBlckxlYWRlckVsZWN0aW9uLmphdmE=)
 | `72.22% <0%> (+2.22%)` | `13% <0%> (ø)` | :arrow_down: |
   | 
[...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=)
 | `92.85% <0%> (+7.14%)` | `3% <0%> (ø)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=footer).
 Last update 
[50280ee...89bf954](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   
 

This is an automated message from the Apach

[GitHub] [incubator-gobblin] codecov-io commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function

2019-08-12 Thread GitBox
codecov-io commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive 
registration to isolate function
URL: 
https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520621350
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=h1) Report
   > Merging [#2708](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **decrease** coverage by `<.01%`.
   > The diff coverage is `0%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2708      +/-   ##
   
   - Coverage     44.87%   44.86%   -0.01%
   - Complexity     8708     8709       +1
   
     Files          1879     1879
     Lines         70095    70098       +3
     Branches       7704     7705       +1
   
   - Hits          31455    31451       -4
   - Misses        35728    35735       +7
     Partials       2912     2912
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...e/gobblin/publisher/HiveRegistrationPublisher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3B1Ymxpc2hlci9IaXZlUmVnaXN0cmF0aW9uUHVibGlzaGVyLmphdmE=) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | [...in/java/org/apache/gobblin/cluster/SingleTask.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvU2luZ2xlVGFzay5qYXZh) | `73.58% <0%> (-7.55%)` | `9% <0%> (ø)` | |
   | [...a/org/apache/gobblin/cluster/GobblinHelixTask.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4VGFzay5qYXZh) | `76.08% <0%> (-4.35%)` | `5% <0%> (ø)` | |
   | [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `64.78% <0%> (-0.94%)` | `29% <0%> (ø)` | |
   | [...main/java/org/apache/gobblin/util/HadoopUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvSGFkb29wVXRpbHMuamF2YQ==) | `30.53% <0%> (+0.33%)` | `24% <0%> (ø)` | :arrow_down: |
   | [...lin/restli/throttling/ZookeeperLeaderElection.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2Utc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3Jlc3RsaS90aHJvdHRsaW5nL1pvb2tlZXBlckxlYWRlckVsZWN0aW9uLmphdmE=) | `72.22% <0%> (+2.22%)` | `13% <0%> (ø)` | :arrow_down: |
   | [...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=) | `92.85% <0%> (+7.14%)` | `3% <0%> (ø)` | :arrow_down: |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=footer). Last update [50280ee...89bf954](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (GOBBLIN-824) upgrade to latest libraries in Gobblin

2019-08-12 Thread Jay Sen (JIRA)


[ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905621#comment-16905621
 ] 

Jay Sen commented on GOBBLIN-824:
-

GOBBLIN-818 does minor upgrades.

> upgrade to latest libraries in Gobblin
> --
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> A lot of libraries are old, such as Hadoop, Hive, etc.
> It won't be easy to simply compile Gobblin against a new version by passing it
> on the command line, because there have been a lot of changes over the last couple of years.
> Gobblin should use the latest versions:
> Hadoop: 2.9.x
> Hive: 2.3.5
> Pegasus: 24.0.2
> Avro: 1.8.2
> etc.
> Please feel free to mention which libraries should be updated as part of this
> overall upgrade process.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (GOBBLIN-824) upgrade to latest libraries in Gobblin

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-824:

Summary: upgrade to latest libraries in Gobblin  (was: upgrade libs 
versions in Gobblin)

> upgrade to latest libraries in Gobblin
> --
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> A lot of libraries are old, such as Hadoop, Hive, etc.
> It won't be easy to simply compile Gobblin against a new version by passing it
> on the command line, because there have been a lot of changes over the last couple of years.
> Gobblin should use the latest versions:
> Hadoop: 2.9.x
> Hive: 2.3.5
> Pegasus: 24.0.2
> Avro: 1.8.2
> etc.
> Please feel free to mention which libraries should be updated as part of this
> overall upgrade process.





[jira] [Updated] (GOBBLIN-824) upgrade libs versions in Gobblin

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-824:

Description: 
A lot of libraries are old, such as Hadoop, Hive, etc.
It won't be easy to simply compile Gobblin against a new version by passing it
on the command line, because there have been a lot of changes over the last couple of years.
Gobblin should use the latest versions:

Hadoop: 2.9.x
Hive: 2.3.5
Pegasus: 24.0.2

Avro: 1.8.2

etc.

Please feel free to mention which libraries should be updated as part of this overall
upgrade process.

  was:
A lot of libraries are old, such as Hadoop, Hive, etc.
It won't be easy to simply compile Gobblin against a new version by passing it
on the command line, because there have been a lot of changes over the last couple of years.
Gobblin should use the latest versions:

Hadoop: 2.7.7
Hive: 2.3.5
Pegasus: 24.0.2

etc.

Please feel free to mention which libraries should be updated as part of this overall
upgrade process.



> upgrade libs versions in Gobblin
> 
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Jay Sen
>Priority: Major
> Fix For: 0.15.0
>
>
> A lot of libraries are old, such as Hadoop, Hive, etc.
> It won't be easy to simply compile Gobblin against a new version by passing it
> on the command line, because there have been a lot of changes over the last couple of years.
> Gobblin should use the latest versions:
> Hadoop: 2.9.x
> Hive: 2.3.5
> Pegasus: 24.0.2
> Avro: 1.8.2
> etc.
> Please feel free to mention which libraries should be updated as part of this
> overall upgrade process.





[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Description: 
Gobblin uses the old Hive 1.0; compiling against Hive 1.2 fails because of
backward-incompatible changes in Hive 1.2.

We should also update the Hadoop libraries from 2.3 to at least 2.7.7, which is very
stable.

  was:
Gobblin uses the old Hive 1.x.
Hive 2.x has significant changes and some incompatible/deprecated classes.
While Hive 3.x is already in the pipeline along with Hadoop 3.x, we should move to
Hive 2.x so users don't have to apply manual fixes in order to use Gobblin.

We should also update the Hadoop libraries from 2.3 to at least 2.7.7, which is very
stable.



> upgrade default hadoop versions to 2.7.x and hive version to 1.2
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin uses the old Hive 1.0; compiling against Hive 1.2 fails because of
> backward-incompatible changes in Hive 1.2.
> We should also update the Hadoop libraries from 2.3 to at least 2.7.7, which is very
> stable.





[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2

2019-08-12 Thread Jay Sen (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Sen updated GOBBLIN-818:

Summary: upgrade default hadoop versions to 2.7.x and hive version to 1.2  
(was: Migrate to Hive 2.x as default)

> upgrade default hadoop versions to 2.7.x and hive version to 1.2
> 
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>
> Gobblin uses the old Hive 1.x.
> Hive 2.x has significant changes and some incompatible/deprecated classes.
> While Hive 3.x is already in the pipeline along with Hadoop 3.x, we should move
> to Hive 2.x so users don't have to apply manual fixes in order to use
> Gobblin.
> We should also update the Hadoop libraries from 2.3 to at least 2.7.7, which is very
> stable.





[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293377&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293377
 ]

ASF GitHub Bot logged work on GOBBLIN-852:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 20:55
Start Date: 12/Aug/19 20:55
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on pull request #2708: 
[GOBBLIN-852]Reorganize the code for hive registration to isolate function
URL: https://github.com/apache/incubator-gobblin/pull/2708#discussion_r313123196
 
 

 ##
 File path: 
gobblin-core/src/main/java/org/apache/gobblin/publisher/HiveRegistrationPublisher.java
 ##
 @@ -157,7 +146,9 @@ public void publishData(Collection states) throws IOExc
   if (isPathDedupeEnabled && 
pathsToRegisterFromSingleState.contains(path)){
 continue;
   }
-  pathsToRegisterFromSingleState.add(path);
+  if(isPathDedupeEnabled) {
 
 Review comment:
   Can this if () {} block be merged into the if () {} block above? E.g. if 
(isPathDedupeEnabled) { if (pathsToRegisterFromSingleState.contains(path)) { 
continue; } pathsToRegisterFromSingleState.add(path); }
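
The merge suggested in this review comment can be sketched in isolation. The class and method names below are hypothetical stand-ins rather than the code in `HiveRegistrationPublisher`, and the sketch assumes `pathsToRegisterFromSingleState` is a `java.util.Set`. Under that assumption, `Set.add` already reports whether the path was present, so the `contains` check and the `add` call collapse into a single condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the loop in publishData; only the dedupe guard is modeled.
final class PathDedupeSketch {

  // Returns the paths that would be registered, in encounter order.
  static List<String> pathsToRegister(List<String> candidates, boolean isPathDedupeEnabled) {
    Set<String> seen = new HashSet<>();
    List<String> registered = new ArrayList<>();
    for (String path : candidates) {
      // Set.add returns false when the element is already present, so one call
      // both detects a duplicate and records a new path. When dedupe is disabled,
      // short-circuit evaluation leaves the set untouched.
      if (isPathDedupeEnabled && !seen.add(path)) {
        continue;
      }
      registered.add(path);
    }
    return registered;
  }

  public static void main(String[] args) {
    List<String> paths = Arrays.asList("/data/a", "/data/b", "/data/a");
    System.out.println(pathsToRegister(paths, true));
    System.out.println(pathsToRegister(paths, false));
  }
}
```

With dedupe enabled the repeated path is skipped; with it disabled every path is kept and the set is never populated, which matches the intent of only calling `add` when `isPathDedupeEnabled` is true.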
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 293377)
Time Spent: 0.5h  (was: 20m)

> Reorganize the code for hive registration to isolate function
> -
>
> Key: GOBBLIN-852
> URL: https://issues.apache.org/jira/browse/GOBBLIN-852
> Project: Apache Gobblin
>  Issue Type: Task
>  Components: hive-registration
>Reporter: Zihan Li
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function

2019-08-12 Thread GitBox
sv2000 commented on a change in pull request #2708: [GOBBLIN-852]Reorganize the 
code for hive registration to isolate function
URL: https://github.com/apache/incubator-gobblin/pull/2708#discussion_r313123196
 
 

 ##
 File path: 
gobblin-core/src/main/java/org/apache/gobblin/publisher/HiveRegistrationPublisher.java
 ##
 @@ -157,7 +146,9 @@ public void publishData(Collection states) throws IOExc
   if (isPathDedupeEnabled && 
pathsToRegisterFromSingleState.contains(path)){
 continue;
   }
-  pathsToRegisterFromSingleState.add(path);
+  if(isPathDedupeEnabled) {
 
 Review comment:
   Can this if () {} block be merged into the if () {} block above? E.g. if 
(isPathDedupeEnabled) { if (pathsToRegisterFromSingleState.contains(path)) { 
continue; } pathsToRegisterFromSingleState.add(path); }
   
   




[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293363
 ]

ASF GitHub Bot logged work on GOBBLIN-851:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 20:29
Start Date: 12/Aug/19 20:29
Worklog Time Spent: 10m 
  Work Description: codecov-io commented on issue #2707: [GOBBLIN-851] 
Provide capability to disable Hive partition schema registration.
URL: 
https://github.com/apache/incubator-gobblin/pull/2707#issuecomment-520582425
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=h1) Report
   > Merging [#2707](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **increase** coverage by `0.01%`.
   > The diff coverage is `56.25%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2707      +/-   ##
   
   + Coverage     44.87%   44.88%   +0.01%
   - Complexity     8708     8712       +4
   
     Files          1879     1879
     Lines         70095    70103       +8
     Branches       7704     7706       +2
   
   + Hits          31455    31466      +11
   + Misses        35728    35721       -7
   - Partials       2912     2916       +4
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...che/gobblin/hive/metastore/HiveMetaStoreUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL21ldGFzdG9yZS9IaXZlTWV0YVN0b3JlVXRpbHMuamF2YQ==) | `31.83% <50%> (-0.15%)` | `12 <0> (ø)` | |
   | [...g/apache/gobblin/metrics/event/EventSubmitter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9FdmVudFN1Ym1pdHRlci5qYXZh) | `37.03% <50%> (-0.7%)` | `3 <0> (ø)` | |
   | [...apache/gobblin/hive/avro/HiveAvroSerDeManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL2F2cm8vSGl2ZUF2cm9TZXJEZU1hbmFnZXIuamF2YQ==) | `52.17% <57.14%> (-0.77%)` | `8 <1> (ø)` | |
   | [.../org/apache/gobblin/hive/HiveRegistrationUnit.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL0hpdmVSZWdpc3RyYXRpb25Vbml0LmphdmE=) | `47.39% <60%> (+0.37%)` | `33 <1> (+1)` | :arrow_up: |
   | [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `15.49% <0%> (+0.84%)` | `4% <0%> (+1%)` | :arrow_up: |
   | [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `80.37% <0%> (+0.93%)` | `24% <0%> (ø)` | :arrow_down: |
   | [...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh) | `39.25% <0%> (+3.73%)` | `13% <0%> (+1%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=footer). Last update [50280ee...3ac88ef](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   
 

This is an automated message from the Apache Git

[GitHub] [incubator-gobblin] codecov-io commented on issue #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.

2019-08-12 Thread GitBox
codecov-io commented on issue #2707: [GOBBLIN-851] Provide capability to 
disable Hive partition schema registration.
URL: 
https://github.com/apache/incubator-gobblin/pull/2707#issuecomment-520582425
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=h1) Report
   > Merging [#2707](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **increase** coverage by `0.01%`.
   > The diff coverage is `56.25%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2707      +/-   ##
   
   + Coverage     44.87%   44.88%   +0.01%
   - Complexity     8708     8712       +4
   
     Files          1879     1879
     Lines         70095    70103       +8
     Branches       7704     7706       +2
   
   + Hits          31455    31466      +11
   + Misses        35728    35721       -7
   - Partials       2912     2916       +4
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...che/gobblin/hive/metastore/HiveMetaStoreUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL21ldGFzdG9yZS9IaXZlTWV0YVN0b3JlVXRpbHMuamF2YQ==) | `31.83% <50%> (-0.15%)` | `12 <0> (ø)` | |
   | [...g/apache/gobblin/metrics/event/EventSubmitter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9FdmVudFN1Ym1pdHRlci5qYXZh) | `37.03% <50%> (-0.7%)` | `3 <0> (ø)` | |
   | [...apache/gobblin/hive/avro/HiveAvroSerDeManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL2F2cm8vSGl2ZUF2cm9TZXJEZU1hbmFnZXIuamF2YQ==) | `52.17% <57.14%> (-0.77%)` | `8 <1> (ø)` | |
   | [.../org/apache/gobblin/hive/HiveRegistrationUnit.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL0hpdmVSZWdpc3RyYXRpb25Vbml0LmphdmE=) | `47.39% <60%> (+0.37%)` | `33 <1> (+1)` | :arrow_up: |
   | [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `15.49% <0%> (+0.84%)` | `4% <0%> (+1%)` | :arrow_up: |
   | [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `80.37% <0%> (+0.93%)` | `24% <0%> (ø)` | :arrow_down: |
   | [...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh) | `39.25% <0%> (+3.73%)` | `13% <0%> (+1%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=footer). Last update [50280ee...3ac88ef](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [incubator-gobblin] ZihanLi58 commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function

2019-08-12 Thread GitBox
ZihanLi58 commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive 
registration to isolate function
URL: 
https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520576484
 
 
   @sv2000 @autumnust Can you help review this?




[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293356
 ]

ASF GitHub Bot logged work on GOBBLIN-852:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 20:11
Start Date: 12/Aug/19 20:11
Worklog Time Spent: 10m 
  Work Description: ZihanLi58 commented on issue #2708: 
[GOBBLIN-852]Reorganize the code for hive registration to isolate function
URL: 
https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520576484
 
 
   @sv2000 @autumnust Can you help review this?
 



Issue Time Tracking
---

Worklog Id: (was: 293356)
Time Spent: 20m  (was: 10m)

> Reorganize the code for hive registration to isolate function
> -
>
> Key: GOBBLIN-852
> URL: https://issues.apache.org/jira/browse/GOBBLIN-852
> Project: Apache Gobblin
>  Issue Type: Task
>  Components: hive-registration
>Reporter: Zihan Li
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[GitHub] [incubator-gobblin] ZihanLi58 opened a new pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function

2019-08-12 Thread GitBox
ZihanLi58 opened a new pull request #2708: [GOBBLIN-852]Reorganize the code for 
hive registration to isolate function
URL: https://github.com/apache/incubator-gobblin/pull/2708
 
 
   …(ETL-8815)
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following 
[GOBBLIN-852](https://issues.apache.org/jira/browse/GOBBLIN/) issues and 
references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-852
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if 
applicable):
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Run streaming job on azkaban.
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   




[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293354&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293354
 ]

ASF GitHub Bot logged work on GOBBLIN-852:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 20:10
Start Date: 12/Aug/19 20:10
Worklog Time Spent: 10m 
  Work Description: ZihanLi58 commented on pull request #2708: 
[GOBBLIN-852]Reorganize the code for hive registration to isolate function
URL: https://github.com/apache/incubator-gobblin/pull/2708
 
 
   …(ETL-8815)
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following 
[GOBBLIN-852](https://issues.apache.org/jira/browse/GOBBLIN/) issues and 
references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-852
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if 
applicable):
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Run streaming job on azkaban.
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 293354)
Time Spent: 10m
Remaining Estimate: 0h

> Reorganize the code for hive registration to isolate function
> -
>
> Key: GOBBLIN-852
> URL: https://issues.apache.org/jira/browse/GOBBLIN-852
> Project: Apache Gobblin
>  Issue Type: Task
>  Components: hive-registration
>Reporter: Zihan Li
>Assignee: Abhishek Tiwari
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (GOBBLIN-852) Reorganize the code for hive registration to isolate function

2019-08-12 Thread Zihan Li (JIRA)
Zihan Li created GOBBLIN-852:


 Summary: Reorganize the code for hive registration to isolate 
function
 Key: GOBBLIN-852
 URL: https://issues.apache.org/jira/browse/GOBBLIN-852
 Project: Apache Gobblin
  Issue Type: Task
  Components: hive-registration
Reporter: Zihan Li
Assignee: Abhishek Tiwari








[GitHub] [incubator-gobblin] yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.

2019-08-12 Thread GitBox
yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide 
capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313072043
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java
 ##
 @@ -211,7 +211,9 @@ private static StorageDescriptor 
getStorageDescriptor(HiveRegistrationUnit unit)
 State props = unit.getStorageProps();
 StorageDescriptor sd = new StorageDescriptor();
 sd.setParameters(getParameters(props));
-sd.setCols(getFieldSchemas(unit));
+if (unit instanceof HiveTable) {
 
 Review comment:
   Because we skip schema registration for partitions, the conversion from 
HivePartition to Partition fails with the exception below. 
addSchemaProperties is used for HivePartition creation, not for the 
HivePartition-to-Partition conversion.
   
   org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
   at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:119)
   at org.apache.hadoop.hive.serde2.avro.AvroSerDe.determineSchemaOrReturnErrorSchema(AvroSerDe.java:177)
   at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:103)
   at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:80)
   at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:520)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:399)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getFieldSchemas(HiveMetaStoreUtils.java:356)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getStorageDescriptor(HiveMetaStoreUtils.java:214)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getPartition(HiveMetaStoreUtils.java:164)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.addOrAlterPartition(HiveMetaStoreBasedRegister.java:458)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.registerPath(HiveMetaStoreBasedRegister.java:159)
   at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:109)
   at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:93)
   at org.apache.gobblin.util.executors.MDCPropagatingCallable.call(MDCPropagatingCallable.java:42)
   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   at java.lang.Thread.run(Thread.java:748)
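
A minimal sketch of why the `instanceof` guard in the diff avoids this exception. All class names here (`SketchUnit`, `SketchTable`, `SketchPartition`, `SketchDescriptor`, `StorageDescriptorSketch`) are hypothetical stand-ins, not the real Gobblin or Hive types; only the two property names are taken from the exception message above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for HiveRegistrationUnit, HiveTable and HivePartition.
class SketchUnit {
  final Map<String, String> serdeProps = new HashMap<>();
}

class SketchTable extends SketchUnit {}

class SketchPartition extends SketchUnit {}

// Hypothetical stand-in for the metastore StorageDescriptor.
class SketchDescriptor {
  List<String> cols = Collections.emptyList();
}

class StorageDescriptorSketch {
  // Mimics schema resolution: fails when neither schema property is present.
  static List<String> resolveColumns(SketchUnit unit) {
    String literal = unit.serdeProps.get("avro.schema.literal");
    String url = unit.serdeProps.get("avro.schema.url");
    if (literal == null && url == null) {
      throw new IllegalStateException(
          "Neither avro.schema.literal nor avro.schema.url specified");
    }
    List<String> cols = new ArrayList<>();
    cols.add(literal != null ? literal : url);
    return cols;
  }

  // Only table-level units resolve columns; partitions skip the lookup,
  // so a partition without schema properties no longer triggers the failure.
  static SketchDescriptor toDescriptor(SketchUnit unit) {
    SketchDescriptor sd = new SketchDescriptor();
    if (unit instanceof SketchTable) {
      sd.cols = resolveColumns(unit);
    }
    return sd;
  }
}
```

In this sketch, a `SketchPartition` with no schema properties yields a descriptor with an empty column list, while calling `resolveColumns` on it directly would throw.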




[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293310
 ]

ASF GitHub Bot logged work on GOBBLIN-851:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 18:42
Start Date: 12/Aug/19 18:42
Worklog Time Spent: 10m 
  Work Description: yukuai518 commented on pull request #2707: 
[GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313072043
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java
 ##
 @@ -211,7 +211,9 @@ private static StorageDescriptor 
getStorageDescriptor(HiveRegistrationUnit unit)
 State props = unit.getStorageProps();
 StorageDescriptor sd = new StorageDescriptor();
 sd.setParameters(getParameters(props));
-sd.setCols(getFieldSchemas(unit));
+if (unit instanceof HiveTable) {
 
 Review comment:
   Because we skip schema registration for partitions, the conversion from 
HivePartition to Partition will fail with the exception below. 
addSchemaProperties is used for HivePartition creation, not for the 
HivePartition to Partition conversion.
   
   org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
   at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:119)
   at org.apache.hadoop.hive.serde2.avro.AvroSerDe.determineSchemaOrReturnErrorSchema(AvroSerDe.java:177)
   at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:103)
   at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:80)
   at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:520)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:399)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getFieldSchemas(HiveMetaStoreUtils.java:356)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getStorageDescriptor(HiveMetaStoreUtils.java:214)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getPartition(HiveMetaStoreUtils.java:164)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.addOrAlterPartition(HiveMetaStoreBasedRegister.java:458)
   at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.registerPath(HiveMetaStoreBasedRegister.java:159)
   at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:109)
   at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:93)
   at org.apache.gobblin.util.executors.MDCPropagatingCallable.call(MDCPropagatingCallable.java:42)
   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   at java.lang.Thread.run(Thread.java:748)
 



Issue Time Tracking
---

Worklog Id: (was: 293310)
Time Spent: 1h  (was: 50m)

> Provide capability to disable hive schema registration in partition level
> -
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Kuai Yu
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We had problems when the table-level schema and the partition-level schema 
> diverge. Consider the case where a user registers two partitions, 2019/08/10 
> and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
> schema S2, but partition 2019/08/10 still has schema S1. 
> A query against the latest schema will then fail on the old partition.





[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293307&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293307
 ]

ASF GitHub Bot logged work on GOBBLIN-851:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 18:35
Start Date: 12/Aug/19 18:35
Worklog Time Spent: 10m 
  Work Description: yukuai518 commented on pull request #2707: 
[GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313068740
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java
 ##
 @@ -418,6 +421,11 @@ public T withTableName(String tableName) {
   return (T) this;
 }
 
+public T withRegisterSchema(boolean registerSchema) {
 
 Review comment:
   Put it this way: the actual implementation of getHivePartition is in the 
gobblin-hive-registration MP, which is outside the open source code base.
 



Issue Time Tracking
---

Worklog Id: (was: 293307)
Time Spent: 50m  (was: 40m)

> Provide capability to disable hive schema registration in partition level
> -
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Kuai Yu
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> We had problems when the table-level schema and the partition-level schema 
> diverge. Consider the case where a user registers two partitions, 2019/08/10 
> and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
> schema S2, but partition 2019/08/10 still has schema S1. 
> A query against the latest schema will then fail on the old partition.





[GitHub] [incubator-gobblin] yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.

2019-08-12 Thread GitBox
yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide 
capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313068740
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java
 ##
 @@ -418,6 +421,11 @@ public T withTableName(String tableName) {
   return (T) this;
 }
 
+public T withRegisterSchema(boolean registerSchema) {
 
 Review comment:
   Put it this way: the actual implementation of getHivePartition is in the 
gobblin-hive-registration MP, which is outside the open source code base.




[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293300
 ]

ASF GitHub Bot logged work on GOBBLIN-851:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 18:23
Start Date: 12/Aug/19 18:23
Worklog Time Spent: 10m 
  Work Description: yukuai518 commented on pull request #2707: 
[GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313063000
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java
 ##
 @@ -418,6 +421,11 @@ public T withTableName(String tableName) {
   return (T) this;
 }
 
+public T withRegisterSchema(boolean registerSchema) {
 
 Review comment:
   This is the builder method that lets our gobblin-hive-registration set 
whether the partition needs to register its schema.
 



Issue Time Tracking
---

Worklog Id: (was: 293300)
Time Spent: 40m  (was: 0.5h)

> Provide capability to disable hive schema registration in partition level
> -
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Kuai Yu
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We had problems when the table-level schema and the partition-level schema 
> diverge. Consider the case where a user registers two partitions, 2019/08/10 
> and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
> schema S2, but partition 2019/08/10 still has schema S1. 
> A query against the latest schema will then fail on the old partition.





[GitHub] [incubator-gobblin] yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.

2019-08-12 Thread GitBox
yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide 
capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313063000
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java
 ##
 @@ -418,6 +421,11 @@ public T withTableName(String tableName) {
   return (T) this;
 }
 
+public T withRegisterSchema(boolean registerSchema) {
 
 Review comment:
   This is the builder method that lets our gobblin-hive-registration set 
whether the partition needs to register its schema.
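
A minimal sketch of how such a self-typed builder method could look, mirroring the `return (T) this;` style visible in the diff. The classes `UnitSketch` and `PartitionUnitSketch` are hypothetical, and the default of `true` for `registerSchema` is an assumption, not something confirmed by the PR.

```java
// Hypothetical, simplified registration unit with a self-typed builder.
abstract class UnitSketch {
  final String tableName;
  final boolean registerSchema;

  UnitSketch(Builder<?> builder) {
    this.tableName = builder.tableName;
    this.registerSchema = builder.registerSchema;
  }

  abstract static class Builder<T extends Builder<T>> {
    String tableName;
    boolean registerSchema = true; // assumed default: keep registering the schema

    @SuppressWarnings("unchecked")
    T withTableName(String tableName) {
      this.tableName = tableName;
      return (T) this;
    }

    // Lets the caller decide whether this unit should register its schema.
    @SuppressWarnings("unchecked")
    T withRegisterSchema(boolean registerSchema) {
      this.registerSchema = registerSchema;
      return (T) this;
    }
  }
}

// Hypothetical partition-level unit built through the shared builder methods.
class PartitionUnitSketch extends UnitSketch {
  PartitionUnitSketch(Builder builder) {
    super(builder);
  }

  static class Builder extends UnitSketch.Builder<Builder> {
    PartitionUnitSketch build() {
      return new PartitionUnitSketch(this);
    }
  }
}
```

With this shape, a caller could write `new PartitionUnitSketch.Builder().withTableName("events").withRegisterSchema(false).build()` to obtain a partition unit that opts out of schema registration.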




[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293292&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293292
 ]

ASF GitHub Bot logged work on GOBBLIN-851:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 18:08
Start Date: 12/Aug/19 18:08
Worklog Time Spent: 10m 
  Work Description: autumnust commented on pull request #2707: 
[GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313056277
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java
 ##
 @@ -418,6 +421,11 @@ public T withTableName(String tableName) {
   return (T) this;
 }
 
+public T withRegisterSchema(boolean registerSchema) {
 
 Review comment:
   What's the usage of this field? 
 



Issue Time Tracking
---

Worklog Id: (was: 293292)
Time Spent: 20m  (was: 10m)

> Provide capability to disable hive schema registration in partition level
> -
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Kuai Yu
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We had problems when the table-level schema and the partition-level schema 
> diverge. Consider the case where a user registers two partitions, 2019/08/10 
> and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
> schema S2, but partition 2019/08/10 still has schema S1. 
> A query against the latest schema will then fail on the old partition.





[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293293
 ]

ASF GitHub Bot logged work on GOBBLIN-851:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 18:08
Start Date: 12/Aug/19 18:08
Worklog Time Spent: 10m 
  Work Description: autumnust commented on pull request #2707: 
[GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313055179
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java
 ##
 @@ -211,7 +211,9 @@ private static StorageDescriptor 
getStorageDescriptor(HiveRegistrationUnit unit)
 State props = unit.getStorageProps();
 StorageDescriptor sd = new StorageDescriptor();
 sd.setParameters(getParameters(props));
-sd.setCols(getFieldSchemas(unit));
+if (unit instanceof HiveTable) {
 
 Review comment:
   I believe this should happen inside `addSchemaProperties` method
 



Issue Time Tracking
---

Worklog Id: (was: 293293)
Time Spent: 0.5h  (was: 20m)

> Provide capability to disable hive schema registration in partition level
> -
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Kuai Yu
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We had problems when the table-level schema and the partition-level schema 
> diverge. Consider the case where a user registers two partitions, 2019/08/10 
> and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
> schema S2, but partition 2019/08/10 still has schema S1. 
> A query against the latest schema will then fail on the old partition.





[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.

2019-08-12 Thread GitBox
autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide 
capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313056277
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java
 ##
 @@ -418,6 +421,11 @@ public T withTableName(String tableName) {
   return (T) this;
 }
 
+public T withRegisterSchema(boolean registerSchema) {
 
 Review comment:
   What's the usage of this field? 




[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.

2019-08-12 Thread GitBox
autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide 
capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313055179
 
 

 ##
 File path: 
gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java
 ##
 @@ -211,7 +211,9 @@ private static StorageDescriptor 
getStorageDescriptor(HiveRegistrationUnit unit)
 State props = unit.getStorageProps();
 StorageDescriptor sd = new StorageDescriptor();
 sd.setParameters(getParameters(props));
-sd.setCols(getFieldSchemas(unit));
+if (unit instanceof HiveTable) {
 
 Review comment:
   I believe this should happen inside `addSchemaProperties` method




[GitHub] [incubator-gobblin] yukuai518 opened a new pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.

2019-08-12 Thread GitBox
yukuai518 opened a new pull request #2707: [GOBBLIN-851] Provide capability to 
disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707
 
 
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-851
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if 
applicable):
   We had problems when the table-level schema and the partition-level schema 
diverge. Consider the case where a user registers two partitions, 2019/08/10 
and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
schema S2, but partition 2019/08/10 still has schema S1. 
   
   A query against the latest schema will then fail on the old partition.
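
The divergence described above can be illustrated with a small sketch. `SchemaDivergenceSketch` and `divergentPartitions` are hypothetical names, and schemas are reduced to plain version labels (`S1`, `S2`); this is not how Gobblin or Hive represent schemas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SchemaDivergenceSketch {
  // Returns the partitions whose registered schema differs from the table-level schema.
  static List<String> divergentPartitions(String tableSchema,
                                          Map<String, String> partitionSchemas) {
    List<String> divergent = new ArrayList<>();
    for (Map.Entry<String, String> entry : partitionSchemas.entrySet()) {
      if (!entry.getValue().equals(tableSchema)) {
        divergent.add(entry.getKey());
      }
    }
    return divergent;
  }
}
```

For the two partitions in the description (2019/08/10 registered with S1, 2019/08/11 with S2) and a table-level schema of S2, `divergentPartitions` returns only `2019/08/10`, which is the partition a query on the latest schema would fail on.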
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
  Unit tests in gobblin-hive-registration module.
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   




[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293283
 ]

ASF GitHub Bot logged work on GOBBLIN-851:
--

Author: ASF GitHub Bot
Created on: 12/Aug/19 17:48
Start Date: 12/Aug/19 17:48
Worklog Time Spent: 10m 
  Work Description: yukuai518 commented on pull request #2707: 
[GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707
 
 
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-851
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if 
applicable):
   We had problems when the table-level schema and the partition-level schema 
diverge. Consider the case where a user registers two partitions, 2019/08/10 
and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
schema S2, but partition 2019/08/10 still has schema S1. 
   
   A query against the latest schema will then fail on the old partition.
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
  Unit tests in gobblin-hive-registration module.
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 293283)
Time Spent: 10m
Remaining Estimate: 0h

> Provide capability to disable hive schema registration in partition level
> -
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Kuai Yu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We had problems when the table-level schema and the partition-level schema 
> diverge. Consider the case where a user registers two partitions, 2019/08/10 
> and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
> schema S2, but partition 2019/08/10 still has schema S1. 
> A query against the latest schema will then fail on the old partition.





[jira] [Updated] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread Kuai Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuai Yu updated GOBBLIN-851:

Description: 
We had problems when the table-level schema and the partition-level schema 
diverge. Consider the case where a user registers two partitions, 2019/08/10 
and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
schema S2, but partition 2019/08/10 still has schema S1. 

A query against the latest schema will then fail on the old partition.

> Provide capability to disable hive schema registration in partition level
> -
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Kuai Yu
>Priority: Major
>
> We had problems when the table-level schema and the partition-level schema 
> diverge. Consider the case where a user registers two partitions, 2019/08/10 
> and 2019/08/11, but the schema changes in between (S1->S2). The table now has 
> schema S2, but partition 2019/08/10 still has schema S1. 
> A query against the latest schema will then fail on the old partition.





[jira] [Created] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level

2019-08-12 Thread Kuai Yu (JIRA)
Kuai Yu created GOBBLIN-851:
---

 Summary: Provide capability to disable hive schema registration in 
partition level
 Key: GOBBLIN-851
 URL: https://issues.apache.org/jira/browse/GOBBLIN-851
 Project: Apache Gobblin
  Issue Type: Improvement
Reporter: Kuai Yu








Re: Read locality on gobblin jobs

2019-08-12 Thread Kuai Yu
The Helix framework in cluster mode has no concept of data locality. I think 
that exists only in YARN/MR mode.

From: Jay Sen 
Sent: Sunday, August 11, 2019 5:25 PM
To: dev@gobblin.incubator.apache.org 
Subject: Read locality on gobblin jobs

Hi Dev Team,

When Gobblin runs in cluster mode or MR mode and a job needs to read data
from a Hadoop filesystem that is local, i.e. on the same Gobblin cluster,
does Gobblin or Helix figure out the data locality automatically (as in a
typical MR job)?
I doubt this is the case, but I wanted to find out whether there is a way
to achieve it anyway.

Some more context:
I am preparing the following Gobblin pipeline.

*external hadoop cluster *---job 1---> *gobblin hadoop cluster-1* ---job
2---> *gobblin hadoop cluster-2* job 3 --> *target platform*
(oracle/mysql)

Here, jobs 1, 2 and 3 are the Gobblin jobs.

I am also looking for suggestions on which Gobblin mode would be best for
this scenario.
I am currently looking at Gobblin cluster mode and MR mode.

Thanks
Jay


[GitHub] [incubator-gobblin] krishraman commented on issue #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs.

2019-08-12 Thread GitBox
krishraman commented on issue #2706: Emit WorkUnitsCreated Count Event for MR 
deployed jobs. 
URL: 
https://github.com/apache/incubator-gobblin/pull/2706#issuecomment-520502737
 
 
   Hi @htran1 @zxcware, can you please review this PR? Thanks 




[GitHub] [incubator-gobblin] codecov-io commented on issue #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs.

2019-08-12 Thread GitBox
codecov-io commented on issue #2706: Emit WorkUnitsCreated Count Event for MR 
deployed jobs. 
URL: 
https://github.com/apache/incubator-gobblin/pull/2706#issuecomment-520502431
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=h1)
 Report
   > Merging 
[#2706](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc)
 will **decrease** coverage by `40.71%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2706      +/-   ##
   
   - Coverage     44.87%    4.15%   -40.72%
   + Complexity     8708      736     -7972
   
     Files          1879     1879
     Lines         70095    70098        +3
     Branches       7704     7704
   
   - Hits          31455     2915    -28540
   - Misses        35728    66877    +31149
   + Partials       2912      306     -2606
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=tree) 
| Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...ava/org/apache/gobblin/metrics/event/JobEvent.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9Kb2JFdmVudC5qYXZh) | `0% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | [...pache/gobblin/runtime/mapreduce/MRJobLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvbWFwcmVkdWNlL01SSm9iTGF1bmNoZXIuamF2YQ==) | `54.49% <100%> (+0.38%)` | `18 <0> (ø)` | :arrow_down: |
   | [...n/converter/AvroStringFieldDecryptorConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY3J5cHRvLXByb3ZpZGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbnZlcnRlci9BdnJvU3RyaW5nRmllbGREZWNyeXB0b3JDb252ZXJ0ZXIuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (-2%)` | |
   | [...he/gobblin/cluster/TaskRunnerSuiteThreadModel.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvVGFza1J1bm5lclN1aXRlVGhyZWFkTW9kZWwuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (-5%)` | |
   | [...n/mapreduce/avro/AvroKeyCompactorOutputFormat.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb21wYWN0aW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbXBhY3Rpb24vbWFwcmVkdWNlL2F2cm8vQXZyb0tleUNvbXBhY3Rvck91dHB1dEZvcm1hdC5qYXZh) | `0% <0%> (-100%)` | `0% <0%> (-3%)` | |
   | [...apache/gobblin/fork/CopyNotSupportedException.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZm9yay9Db3B5Tm90U3VwcG9ydGVkRXhjZXB0aW9uLmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (-1%)` | |
   | [.../gobblin/kafka/writer/KafkaWriterCommonConfig.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2thZmthL3dyaXRlci9LYWZrYVdyaXRlckNvbW1vbkNvbmZpZy5qYXZh) | `0% <0%> (-100%)` | `0% <0%> (-7%)` | |
   | [...ker/task/TaskLevelPolicyCheckerBuilderFactory.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3F1YWxpdHljaGVja2VyL3Rhc2svVGFza0xldmVsUG9saWN5Q2hlY2tlckJ1aWxkZXJGYWN0b3J5LmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (-2%)` | |
   | [...bblin/data/management/copy/AllEqualComparator.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvQWxsRXF1YWxDb21wYXJhdG9yLmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (-2%)` | |
   | [...blin/converter/string/ObjectToStringConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbnZlcnRlci9zdHJpbmcvT2JqZWN0VG9TdHJpbmdDb252ZXJ0ZXIuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (-3%)` | |
   | ... and [1075 more](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree-more) | |
   

[GitHub] [incubator-gobblin] krishraman opened a new pull request #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs.

2019-08-12 Thread GitBox
krishraman opened a new pull request #2706: Emit WorkUnitsCreated Count Event 
for MR deployed jobs. 
URL: https://github.com/apache/incubator-gobblin/pull/2706
 
 
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-766] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-766
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if 
applicable):
   
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Tested on Yugioh by emitting the events to Kafka and verifying them with 
kafka-tool.
   
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   




[jira] [Reopened] (GOBBLIN-766) Emit Workunits created event in Apache gobblin

2019-08-12 Thread kraman (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kraman reopened GOBBLIN-766:


> Emit Workunits created event in Apache gobblin
> ----------------------------------------------
>
> Key: GOBBLIN-766
> URL: https://issues.apache.org/jira/browse/GOBBLIN-766
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: kraman
>Priority: Minor
> Fix For: 0.15.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Emit a new workunits-created metric to be captured for monitoring/alerting.


