[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows
[ https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293597&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293597 ]

ASF GitHub Bot logged work on GOBBLIN-847:
------------------------------------------

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 04:34
    Start Date: 13/Aug/19 04:34
    Worklog Time Spent: 10m
    Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] Flow level sla
    URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313215701

## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java ##

@@ -530,6 +545,48 @@ private void pollAndAdvanceDag()
       }
     }
+    private ExecutionStatus getJobExecutionStatus(boolean slaKilled, JobStatus jobStatus) {
+      if (slaKilled) {
+        return CANCELLED;
+      } else {
+        if (jobStatus == null) {
+          return PENDING;
+        } else {
+          return valueOf(jobStatus.getEventName());
+        }
+      }
+    }
+
+    /**
+     * Check if the SLA is configured for the flow this job belongs to.
+     * If it is, this method will try to cancel the job when SLA is reached.
+     *
+     * @param node dag node of the job
+     * @return true if the job is killed because it reached sla
+     * @throws ExecutionException exception
+     * @throws InterruptedException exception
+     */
+    private boolean slaKillIfNeeded(DagNode node) throws ExecutionException, InterruptedException {
+      long flowStartTime = DagManagerUtils.getFlowStartTime(node);
+      long currentTime = System.currentTimeMillis();
+      String dagId = DagManagerUtils.generateDagId(node);
+
+      long flowSla;
+      if (dagToSLA.containsKey(dagId)) {
+        flowSla = dagToSLA.get(dagId);
+      } else {
+        flowSla = DagManagerUtils.getFlowSLA(node);
+        dagToSLA.put(dagId, flowSla);
+      }
+
+      if (flowSla != DagManagerUtils.NO_SLA && currentTime > flowStartTime + flowSla) {
+        log.info("Job exceeded the SLA of {} ms. Killing it now...", flowSla);

Review comment:
log.info("Flow exceeded the SLA...")?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 293597)
    Time Spent: 3.5h (was: 3h 20m)

> add a flow level sla in gaas flows
> ----------------------------------
>
>                 Key: GOBBLIN-847
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-847
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Arjun Singh Bora
>            Priority: Major
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Add a flow-level SLA in GaaS flows, because sometimes Azkaban jobs may not
> start and hence never send any tracking event, or Azkaban may be down. In all
> those cases, we might have to kill the job so we can start a new job.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
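The SLA check quoted in the diff above is a deadline comparison combined with a per-DAG cache of the SLA value. A minimal, self-contained sketch of that logic is given here for reference. The class name `SlaChecker`, the `-1` value chosen for `NO_SLA`, and the lookup function passed to the constructor are illustrative assumptions; they are not the actual `DagManager` or `DagManagerUtils` API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical stand-in for the SLA bookkeeping discussed in the review; names are illustrative.
class SlaChecker {
  // Assumed sentinel for "no SLA configured"; the real value of DagManagerUtils.NO_SLA is not shown in the diff.
  static final long NO_SLA = -1L;

  private final Map<String, Long> dagToSla = new HashMap<>();
  private final ToLongFunction<String> slaLookup;

  SlaChecker(ToLongFunction<String> slaLookup) {
    this.slaLookup = slaLookup;
  }

  /** Returns true when an SLA is configured for the DAG and the flow has run past it. */
  boolean isSlaExceeded(String dagId, long flowStartTimeMs, long nowMs) {
    // computeIfAbsent covers the containsKey/get/put sequence: look the SLA up once, then reuse it.
    long flowSla = dagToSla.computeIfAbsent(dagId, id -> slaLookup.applyAsLong(id));
    return flowSla != NO_SLA && nowMs > flowStartTimeMs + flowSla;
  }

  public static void main(String[] args) {
    SlaChecker checker = new SlaChecker(id -> 60_000L); // every DAG gets a 60 s SLA in this demo
    long start = System.currentTimeMillis() - 120_000L;  // flow started two minutes ago
    System.out.println(checker.isSlaExceeded("demoDag", start, System.currentTimeMillis()));
  }
}
```

Because the SLA belongs to the flow rather than to a single job, a log message built on such a check would arguably name the flow, which is what the review comment suggests.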
[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows
[ https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293593&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293593 ]

ASF GitHub Bot logged work on GOBBLIN-847:
------------------------------------------

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 04:34
    Start Date: 13/Aug/19 04:34
    Worklog Time Spent: 10m
    Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] Flow level sla
    URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313216048

## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java ##

@@ -473,22 +490,20 @@ private void initialize(Dag dag)
     /**
      * Proceed the execution of each dag node based on job status.
      */
-    private void pollAndAdvanceDag()
-        throws IOException {
+    private void pollAndAdvanceDag() throws IOException, ExecutionException, InterruptedException {
       this.failedDagIdsFinishRunning.clear();
-
       Map>> nextSubmitted = Maps.newHashMap();
       List> nodesToCleanUp = Lists.newArrayList();
+
       for (DagNode node: this.jobToDag.keySet()) {
-        long pollStartTime = System.nanoTime();
+        boolean slaKilled = slaKillIfNeeded(node);
+
         JobStatus jobStatus = pollJobStatus(node);

Review comment:
Do we have to pollJobStatus if slaKilled is true?

Issue Time Tracking
-------------------

    Worklog Id: (was: 293593)
    Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows
[ https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293594&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293594 ]

ASF GitHub Bot logged work on GOBBLIN-847:
------------------------------------------

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 04:34
    Start Date: 13/Aug/19 04:34
    Worklog Time Spent: 10m
    Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] Flow level sla
    URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313213068

## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java ##

@@ -427,25 +428,20 @@ private void cancelDag(String dagToCancel) throws ExecutionException, Interrupte
       List> dagNodesToCancel = this.dagToJobs.get(dagToCancel);
       log.info("Found {} DagNodes to cancel.", dagNodesToCancel.size());
       for (DagNode dagNodeToCancel : dagNodesToCancel) {
-        cancelDag(dagNodeToCancel);
+        cancelDagNode(dagNodeToCancel);

Review comment:
Will dagNodesToCancel include jobs that finished successfully? Or only jobs currently running?

Issue Time Tracking
-------------------

    Worklog Id: (was: 293594)
    Time Spent: 3h 20m (was: 3h 10m)
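This thread does not say whether `dagToJobs` still holds jobs that have already finished. If it did, one way to address the question would be to skip nodes in a terminal state before cancelling. The sketch below only illustrates that idea; the `Status` enum, the choice of which states count as terminal, and the use of plain strings as node identifiers are all assumptions and do not mirror the real `DagNode` or `ExecutionStatus` types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: selects only the nodes that still need a cancel request.
class CancelFilter {
  // Assumed status set for illustration only.
  enum Status { PENDING, RUNNING, COMPLETE, FAILED, CANCELLED }

  static boolean isTerminal(Status status) {
    return status == Status.COMPLETE || status == Status.FAILED || status == Status.CANCELLED;
  }

  /** Returns the node names whose status is not terminal, in iteration order of the map. */
  static List<String> nodesToCancel(Map<String, Status> nodeStatuses) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, Status> entry : nodeStatuses.entrySet()) {
      if (!isTerminal(entry.getValue())) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Status> statuses = new LinkedHashMap<>();
    statuses.put("extract", Status.COMPLETE);
    statuses.put("convert", Status.RUNNING);
    statuses.put("publish", Status.PENDING);
    System.out.println(nodesToCancel(statuses));
  }
}
```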
[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows
[ https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293595 ]

ASF GitHub Bot logged work on GOBBLIN-847:
------------------------------------------

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 04:34
    Start Date: 13/Aug/19 04:34
    Worklog Time Spent: 10m
    Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] Flow level sla
    URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313213742

## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java ##

@@ -454,6 +450,7 @@ private void sendCancellationEvent(JobExecutionPlan jobExecutionPlan) {
       if (this.eventSubmitter.isPresent()) {
         Map jobMetadata = TimingEventUtils.getJobMetadata(Maps.newHashMap(), jobExecutionPlan);
         this.eventSubmitter.get().getTimingEvent(TimingEvent.LauncherTimings.JOB_CANCEL).stop(jobMetadata);
+        this.eventSubmitter.get().getTimingEvent(TimingEvent.FlowTimings.FLOW_CANCEL).stop(jobMetadata);

Review comment:
So we will emit a FLOW_CANCEL event for every running job? Why not emit once?

Issue Time Tracking
-------------------

    Worklog Id: (was: 293595)
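One possible way to emit the flow-level event only once, as the review comment asks, is to remember which flows have already been reported. The sketch below is a hedged illustration of that idea: `CancelEventEmitter`, its `emitted` list (a stand-in for the real event submitter), and the string event names are hypothetical and are not part of Gobblin's `EventSubmitter` or `TimingEvent` API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical emitter: one JOB_CANCEL per job, at most one FLOW_CANCEL per DAG.
class CancelEventEmitter {
  private final Set<String> flowsAlreadyReported = new HashSet<>();

  // Stand-in for an event sink; a real implementation would submit timing events instead.
  final List<String> emitted = new ArrayList<>();

  void onJobCancelled(String dagId, String jobName) {
    emitted.add("JOB_CANCEL:" + jobName);
    // Set.add returns true only the first time a given dagId is inserted.
    if (flowsAlreadyReported.add(dagId)) {
      emitted.add("FLOW_CANCEL:" + dagId);
    }
  }

  public static void main(String[] args) {
    CancelEventEmitter emitter = new CancelEventEmitter();
    emitter.onJobCancelled("dag1", "jobA");
    emitter.onJobCancelled("dag1", "jobB");
    System.out.println(emitter.emitted);
  }
}
```

A set of DAG ids is used here because cancellation of several jobs in the same flow can arrive in any order; whichever job is cancelled first triggers the single flow-level event.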
[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows
[ https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293592&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293592 ]

ASF GitHub Bot logged work on GOBBLIN-847:
------------------------------------------

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 04:34
    Start Date: 13/Aug/19 04:34
    Worklog Time Spent: 10m
    Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] Flow level sla
    URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313215757

## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java ##

@@ -530,6 +545,48 @@ private void pollAndAdvanceDag()
       }
     }
+    private ExecutionStatus getJobExecutionStatus(boolean slaKilled, JobStatus jobStatus) {
+      if (slaKilled) {
+        return CANCELLED;
+      } else {
+        if (jobStatus == null) {
+          return PENDING;
+        } else {
+          return valueOf(jobStatus.getEventName());
+        }
+      }
+    }
+
+    /**
+     * Check if the SLA is configured for the flow this job belongs to.
+     * If it is, this method will try to cancel the job when SLA is reached.
+     *
+     * @param node dag node of the job
+     * @return true if the job is killed because it reached sla
+     * @throws ExecutionException exception
+     * @throws InterruptedException exception
+     */
+    private boolean slaKillIfNeeded(DagNode node) throws ExecutionException, InterruptedException {
+      long flowStartTime = DagManagerUtils.getFlowStartTime(node);
+      long currentTime = System.currentTimeMillis();
+      String dagId = DagManagerUtils.generateDagId(node);
+
+      long flowSla;
+      if (dagToSLA.containsKey(dagId)) {
+        flowSla = dagToSLA.get(dagId);
+      } else {
+        flowSla = DagManagerUtils.getFlowSLA(node);
+        dagToSLA.put(dagId, flowSla);
+      }
+
+      if (flowSla != DagManagerUtils.NO_SLA && currentTime > flowStartTime + flowSla) {
+        log.info("Job exceeded the SLA of {} ms. Killing it now...", flowSla);
+        cancelDagNode(node);

Review comment:
why not call cancelDag(dagId) here?

Issue Time Tracking
-------------------

    Worklog Id: (was: 293592)
    Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows
[ https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293596 ]

ASF GitHub Bot logged work on GOBBLIN-847:
------------------------------------------

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 04:34
    Start Date: 13/Aug/19 04:34
    Worklog Time Spent: 10m
    Work Description: sv2000 commented on pull request #2702: [GOBBLIN-847] Flow level sla
    URL: https://github.com/apache/incubator-gobblin/pull/2702#discussion_r313216361

## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java ##

@@ -473,22 +490,20 @@ private void initialize(Dag dag)
     /**
      * Proceed the execution of each dag node based on job status.
      */
-    private void pollAndAdvanceDag()
-        throws IOException {
+    private void pollAndAdvanceDag() throws IOException, ExecutionException, InterruptedException {
       this.failedDagIdsFinishRunning.clear();
-
       Map>> nextSubmitted = Maps.newHashMap();
       List> nodesToCleanUp = Lists.newArrayList();
+
       for (DagNode node: this.jobToDag.keySet()) {
-        long pollStartTime = System.nanoTime();
+        boolean slaKilled = slaKillIfNeeded(node);
+
         JobStatus jobStatus = pollJobStatus(node);
-        Instrumented.updateTimer(this.jobStatusPolledTimer, System.nanoTime() - pollStartTime, TimeUnit.NANOSECONDS);
-        if (jobStatus == null) {
-          continue;
-        }
+
+        ExecutionStatus status = getJobExecutionStatus(slaKilled, jobStatus);

Review comment:
Why not ExecutionStatus status = (slaKilled) ? ExecutionStatus.CANCELLED : getJobExecutionStatus(pollJobStatus(node));?

Issue Time Tracking
-------------------

    Worklog Id: (was: 293596)
    Time Spent: 3.5h (was: 3h 20m)
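The point of the ternary suggested in the review comment is that the status poll is never evaluated once the job has been SLA-killed. A small sketch of that short-circuit follows. `StatusResolver`, its reduced `ExecutionStatus` enum, and the `Supplier<String>` standing in for `pollJobStatus(node)` plus `getEventName()` are assumptions made so the example is self-contained.

```java
import java.util.function.Supplier;

// Hypothetical resolver showing that the poll is skipped when the job was SLA-killed.
class StatusResolver {
  // Reduced, assumed status set; the real ExecutionStatus enum has more values.
  enum ExecutionStatus { PENDING, RUNNING, COMPLETE, CANCELLED }

  /**
   * @param slaKilled     whether the job was already cancelled because the flow SLA was exceeded
   * @param pollEventName deferred poll; yields the event name, or null when no status is available yet
   */
  static ExecutionStatus resolve(boolean slaKilled, Supplier<String> pollEventName) {
    if (slaKilled) {
      return ExecutionStatus.CANCELLED; // the supplier is never invoked on this path
    }
    String eventName = pollEventName.get();
    return eventName == null ? ExecutionStatus.PENDING : ExecutionStatus.valueOf(eventName);
  }

  public static void main(String[] args) {
    System.out.println(resolve(true, () -> { throw new IllegalStateException("should not poll"); }));
    System.out.println(resolve(false, () -> "RUNNING"));
  }
}
```

Passing the poll as a supplier keeps the "no status yet means PENDING" rule from the diff while making the skipped call explicit.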
[jira] [Updated] (GOBBLIN-822) upgrade log4j to log4j2
[ https://issues.apache.org/jira/browse/GOBBLIN-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-822:
----------------------------
    Description:
log4j2 has a routing appender that would be super useful and is probably the only way to achieve "job specific log files" functionality without meddling with the fileHandler in log4j.

Also, log4j2 has a lot of new functionality and performance benefits (ref: HIVE-11304)

  was:
log4j2 has a routing appender that would be super useful and is probably the only way to achieve "job specific log files" functionality without meddling with the fileHandler in log4j.

Also, log4j2 has a lot of new functionality and performance benefits

> upgrade log4j to log4j2
> -----------------------
>
>                 Key: GOBBLIN-822
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-822
>             Project: Apache Gobblin
>          Issue Type: Sub-task
>    Affects Versions: 0.14.0
>            Reporter: Jay Sen
>            Priority: Major
>             Fix For: 0.15.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> log4j2 has a routing appender that would be super useful and is probably the
> only way to achieve "job specific log files" functionality without meddling
> with the fileHandler in log4j.
> Also, log4j2 has a lot of new functionality and performance benefits (ref:
> HIVE-11304)
[jira] [Work logged] (GOBBLIN-854) update config reader in standalone mode
[ https://issues.apache.org/jira/browse/GOBBLIN-854?focusedWorklogId=293542&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293542 ]

ASF GitHub Bot logged work on GOBBLIN-854:
------------------------------------------

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 02:19
    Start Date: 13/Aug/19 02:19
    Worklog Time Spent: 10m
    Work Description: jhsenjaliya commented on pull request #2710: [GOBBLIN-854] use typesafe config instead of java properties
    URL: https://github.com/apache/incubator-gobblin/pull/2710

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-854

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):
Standalone mode's SchedulerDaemon uses Java properties; this ticket is to use TypeSafe Config instead, to standardize config across the modes and let it benefit from TypeSafe Config's features.
It also takes 2 different config files as arguments, one as default and another as custom; we probably only need one property file.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
not required, no change in functionality

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Issue Time Tracking
-------------------

    Worklog Id: (was: 293542)
    Time Spent: 10m
    Remaining Estimate: 0h

> update config reader in standalone mode
> ---------------------------------------
>
>                 Key: GOBBLIN-854
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-854
>             Project: Apache Gobblin
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Jay Sen
>            Priority: Major
>             Fix For: 0.15.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Standalone mode's {{SchedulerDaemon}} uses Java properties; this ticket is to
> use TypeSafe Config instead, to standardize config across the modes and let
> it benefit from TypeSafe Config's features.
> It also takes 2 different config files as arguments, one as default and
> another as custom; we probably only need one property file.
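For readers unfamiliar with the "default plus custom" arrangement the ticket describes, the sketch below shows how two property sources can be layered with plain `java.util.Properties`, which is the mechanism the ticket proposes to move away from. It is only an illustration: `LayeredProps`, the inline property text, and the key names are hypothetical, and the sketch does not reproduce how `SchedulerDaemon` actually reads its files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration of layering a custom property source over a default one.
class LayeredProps {
  static Properties load(String defaultText, String customText) {
    try {
      Properties defaults = new Properties();
      defaults.load(new StringReader(defaultText));
      // Properties(defaults) consults the defaults whenever a key is missing from the custom layer.
      Properties merged = new Properties(defaults);
      merged.load(new StringReader(customText));
      return merged;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Key names below are made up for the demo.
    Properties props = load("demo.threads=4\ndemo.dir=/tmp/jobs\n", "demo.threads=8\n");
    System.out.println(props.getProperty("demo.threads") + " " + props.getProperty("demo.dir"));
  }
}
```

Note that values inherited from the default layer are visible through `getProperty` but not through `get` or `keySet`, which is one of the quirks that a single, unified configuration source avoids.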
[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode
[ https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-854:
----------------------------
    Description:
standalone mode {{SchedulerDaemon}} uses java properties, this ticket is to use TypeSafe Config instead to make config standardized across the modes and also enable config to take benefits of TypeSafe functionalities.

Also it takes 2 different config file as argument, one as default and another as custom, we probably only need one property file.

  was:
standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to use TypeSafe Config instead to make config standardized across the modes and also enable config to take benefits of TypeSafe functionalities.
[jira] [Created] (GOBBLIN-854) update config reader in standalone mode
Jay Sen created GOBBLIN-854:
-------------------------------

             Summary: update config reader in standalone mode
                 Key: GOBBLIN-854
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-854
             Project: Apache Gobblin
          Issue Type: Improvement
            Reporter: Jay Sen

standalone mode \{{SchedulerDaemon}} uses java properties, this ticket is to use TypeSafe Config instead to make config standardized across the modes and also enable config to take benefits of TypeSafe functionalities.
[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode
[ https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-854:

    Affects Version/s: 0.14.0

> update config reader in standalone mode
> ---
>
>                 Key: GOBBLIN-854
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-854
>             Project: Apache Gobblin
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Jay Sen
>            Priority: Major
>
> Standalone mode {{SchedulerDaemon}} uses Java properties; this ticket is to
> use Typesafe Config instead, to standardize config across the modes and let
> config benefit from Typesafe Config functionality.
[jira] [Updated] (GOBBLIN-854) update config reader in standalone mode
[ https://issues.apache.org/jira/browse/GOBBLIN-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-854:

    Fix Version/s: 0.15.0

> update config reader in standalone mode
> ---
>
>                 Key: GOBBLIN-854
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-854
>             Project: Apache Gobblin
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Jay Sen
>            Priority: Major
>             Fix For: 0.15.0
>
> Standalone mode {{SchedulerDaemon}} uses Java properties; this ticket is to
> use Typesafe Config instead, to standardize config across the modes and let
> config benefit from Typesafe Config functionality.
[jira] [Work logged] (GOBBLIN-847) add a flow level sla in gaas flows
[ https://issues.apache.org/jira/browse/GOBBLIN-847?focusedWorklogId=293533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293533 ]

ASF GitHub Bot logged work on GOBBLIN-847:
--
                Author: ASF GitHub Bot
            Created on: 13/Aug/19 02:02
            Start Date: 13/Aug/19 02:02
    Worklog Time Spent: 10m
      Work Description: codecov-io commented on issue #2702: [GOBBLIN-847] Flow level sla
   URL: https://github.com/apache/incubator-gobblin/pull/2702#issuecomment-520659463
[GitHub] [incubator-gobblin] codecov-io commented on issue #2702: [GOBBLIN-847] Flow level sla
codecov-io commented on issue #2702: [GOBBLIN-847] Flow level sla
URL: https://github.com/apache/incubator-gobblin/pull/2702#issuecomment-520659463

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=h1) Report
> Merging [#2702](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/8903ebf3807af3369839069e2082afa70c7fe77e?src=pr&el=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `85.71%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##             master    #2702      +/-   ##
============================================
+ Coverage      44.9%    44.9%   +<.01%
- Complexity     8713     8718       +5
============================================
  Files          1879     1879
  Lines         70079    70129      +50
  Branches       7703     7707       +4
============================================
+ Hits          31466    31490      +24
- Misses        35702    35730      +28
+ Partials       2911     2909       -2
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...pache/gobblin/configuration/ConfigurationKeys.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vY29uZmlndXJhdGlvbi9Db25maWd1cmF0aW9uS2V5cy5qYXZh) | `0% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [.../org/apache/gobblin/metrics/event/TimingEvent.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9UaW1pbmdFdmVudC5qYXZh) | `70% <ø> (ø)` | `15 <0> (ø)` | :arrow_down: |
| [...time/spec\_executorInstance/MockedSpecExecutor.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19leGVjdXRvckluc3RhbmNlL01vY2tlZFNwZWNFeGVjdXRvci5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...service/modules/orchestration/DagManagerUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9vcmNoZXN0cmF0aW9uL0RhZ01hbmFnZXJVdGlscy5qYXZh) | `84.28% <100%> (+3.95%)` | `30 <9> (+9)` | :arrow_up: |
| [...blin/service/modules/orchestration/DagManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9vcmNoZXN0cmF0aW9uL0RhZ01hbmFnZXIuamF2YQ==) | `77.33% <86.04%> (+2.79%)` | `12 <1> (+1)` | :arrow_up: |
| [...bblin/cluster/GobblinHelixJobLauncherListener.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4Sm9iTGF1bmNoZXJMaXN0ZW5lci5qYXZh) | `70% <0%> (-30%)` | `3% <0%> (-2%)` | |
| [...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh) | `32.71% <0%> (-9.13%)` | `11% <0%> (-3%)` | |
| [.../gobblin/cluster/HelixRetriggeringJobCallable.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhSZXRyaWdnZXJpbmdKb2JDYWxsYWJsZS5qYXZh) | `60.41% <0%> (-3.48%)` | `9% <0%> (ø)` | |
| [...ache/gobblin/couchbase/writer/CouchbaseWriter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY291Y2hiYXNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvdWNoYmFzZS93cml0ZXIvQ291Y2hiYXNlV3JpdGVyLmphdmE=) | `66.27% <0%> (-2.33%)` | `11% <0%> (ø)` | |
| [...pache/gobblin/cluster/GobblinHelixJobLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4Sm9iTGF1bmNoZXIuamF2YQ==) | `81.53% <0%> (-1.8%)` | `26% <0%> (-2%)` | |
| ... and [6 more](https://codecov.io/gh/apache/incubator-gobblin/pull/2702/diff?src=pr&el=tree-more) | |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2702?src=p
[GitHub] [incubator-gobblin] codecov-io commented on issue #2709: [GOBBLIN-853] Support multiple paths specified in flow config
codecov-io commented on issue #2709: [GOBBLIN-853] Support multiple paths specified in flow config
URL: https://github.com/apache/incubator-gobblin/pull/2709#issuecomment-520650477

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=h1) Report
> Merging [#2709](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **increase** coverage by `0.02%`.
> The diff coverage is `85.71%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##             master    #2709      +/-   ##
+ Coverage     44.87%   44.89%   +0.02%
- Complexity     8708     8714       +6
  Files          1879     1879
  Lines         70095    70125      +30
  Branches       7704     7711       +7
+ Hits          31455    31484      +29
- Misses        35728    35729       +1
  Partials       2912     2912
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...gobblin/service/modules/spec/JobExecutionPlan.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9zcGVjL0pvYkV4ZWN1dGlvblBsYW4uamF2YQ==) | `75.8% <100%> (+0.39%)` | `9 <0> (ø)` | :arrow_down: |
| [...lin/service/modules/flow/MultiHopFlowCompiler.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9mbG93L011bHRpSG9wRmxvd0NvbXBpbGVyLmphdmE=) | `67.92% <84.84%> (+5.58%)` | `13 <3> (+5)` | :arrow_up: |
| [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `80.37% <0%> (+0.93%)` | `24% <0%> (ø)` | :arrow_down: |
| [...n/service/modules/template/StaticFlowTemplate.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy90ZW1wbGF0ZS9TdGF0aWNGbG93VGVtcGxhdGUuamF2YQ==) | `90.9% <0%> (+1.51%)` | `16% <0%> (ø)` | :arrow_down: |
| [...service/modules/dataset/BaseDatasetDescriptor.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9kYXRhc2V0L0Jhc2VEYXRhc2V0RGVzY3JpcHRvci5qYXZh) | `72.41% <0%> (+3.44%)` | `13% <0%> (+1%)` | :arrow_up: |
| [...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2709/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=) | `92.85% <0%> (+7.14%)` | `3% <0%> (ø)` | :arrow_down: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=footer). Last update [50280ee...011a563](https://codecov.io/gh/apache/incubator-gobblin/pull/2709?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
[jira] [Work logged] (GOBBLIN-853) Support multiple paths specified in flow config
[ https://issues.apache.org/jira/browse/GOBBLIN-853?focusedWorklogId=293519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293519 ]

ASF GitHub Bot logged work on GOBBLIN-853:
--
                Author: ASF GitHub Bot
            Created on: 13/Aug/19 01:11
            Start Date: 13/Aug/19 01:11
    Worklog Time Spent: 10m
      Work Description: codecov-io commented on issue #2709: [GOBBLIN-853] Support multiple paths specified in flow config
   URL: https://github.com/apache/incubator-gobblin/pull/2709#issuecomment-520650477

Issue Time Tracking
---

    Worklog
[jira] [Work logged] (GOBBLIN-853) Support multiple paths specified in flow config
[ https://issues.apache.org/jira/browse/GOBBLIN-853?focusedWorklogId=293497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293497 ]

ASF GitHub Bot logged work on GOBBLIN-853:
--
                Author: ASF GitHub Bot
            Created on: 13/Aug/19 00:22
            Start Date: 13/Aug/19 00:22
    Worklog Time Spent: 10m
      Work Description: jack-moseley commented on pull request #2709: [GOBBLIN-853] Support multiple paths specified in flow config
   URL: https://github.com/apache/incubator-gobblin/pull/2709

Issue Time Tracking
---

    Worklog Id:     (was: 293497)
    Time Spent: 10m
    Remaining Estimate: 0h

> Support multiple paths specified in flow config
> ---
>
>                 Key: GOBBLIN-853
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-853
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Jack Moseley
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
[GitHub] [incubator-gobblin] jack-moseley opened a new pull request #2709: [GOBBLIN-853] Support multiple paths specified in flow config
jack-moseley opened a new pull request #2709: [GOBBLIN-853] Support multiple paths specified in flow config
URL: https://github.com/apache/incubator-gobblin/pull/2709

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-853

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):

Support multiple paths in flow spec by splitting into multiple flow specs on the path property. Then the resulting dags are merged into a single dag, so each dataset will be a concurrent job within the dag. Also added a random string to the end of each job.name to avoid collisions, since job.name is assumed to be unique within a dag.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

Added unit test

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"
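The splitting step in the PR description above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `FlowPathSplitter`, the property key `flow.input.path`, and the comma-separated path format are hypothetical and not taken from the PR; only `job.name` appears in the description. Merging the resulting DAGs is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper for illustration only; not Gobblin's implementation.
class FlowPathSplitter {
  // Assumed property key holding a comma-separated list of paths.
  static final String PATH_KEY = "flow.input.path";
  static final String JOB_NAME_KEY = "job.name";

  /** Returns one copy of the flow config per path listed under PATH_KEY. */
  static List<Map<String, String>> splitOnPath(Map<String, String> flowConfig) {
    List<Map<String, String>> specs = new ArrayList<>();
    String paths = flowConfig.getOrDefault(PATH_KEY, "");
    for (String path : paths.split(",")) {
      String trimmed = path.trim();
      if (trimmed.isEmpty()) {
        continue; // skip empty entries such as a trailing comma
      }
      Map<String, String> spec = new HashMap<>(flowConfig);
      spec.put(PATH_KEY, trimmed);
      // Make job.name unique per spec so names do not collide after merging.
      spec.computeIfPresent(JOB_NAME_KEY, (key, name) -> uniqueJobName(name));
      specs.add(spec);
    }
    return specs;
  }

  /** Appends a short random suffix to a job name. */
  static String uniqueJobName(String jobName) {
    return jobName + "_" + UUID.randomUUID().toString().substring(0, 8);
  }
}
```

With a config containing `flow.input.path=/data/a, /data/b` and `job.name=copy`, `splitOnPath` yields two configs, one per path, each with a `job.name` of the form `copy_<random suffix>`.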
[jira] [Created] (GOBBLIN-853) Support multiple paths specified in flow config
Jack Moseley created GOBBLIN-853:

             Summary: Support multiple paths specified in flow config
                 Key: GOBBLIN-853
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-853
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Jack Moseley
[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function
[ https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293451 ]

ASF GitHub Bot logged work on GOBBLIN-852:
--
                Author: ASF GitHub Bot
            Created on: 12/Aug/19 22:40
            Start Date: 12/Aug/19 22:40
    Worklog Time Spent: 10m
      Work Description: codecov-io commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function
   URL: https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520621350
[GitHub] [incubator-gobblin] codecov-io commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function
codecov-io commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function
URL: https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520621350

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=h1) Report
> Merging [#2708](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **decrease** coverage by `<.01%`.
> The diff coverage is `0%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##             master    #2708      +/-   ##
- Coverage     44.87%   44.86%   -0.01%
- Complexity     8708     8709       +1
  Files          1879     1879
  Lines         70095    70098       +3
  Branches       7704     7705       +1
- Hits          31455    31451       -4
- Misses        35728    35735       +7
  Partials       2912     2912
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...e/gobblin/publisher/HiveRegistrationPublisher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3B1Ymxpc2hlci9IaXZlUmVnaXN0cmF0aW9uUHVibGlzaGVyLmphdmE=) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...in/java/org/apache/gobblin/cluster/SingleTask.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvU2luZ2xlVGFzay5qYXZh) | `73.58% <0%> (-7.55%)` | `9% <0%> (ø)` | |
| [...a/org/apache/gobblin/cluster/GobblinHelixTask.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4VGFzay5qYXZh) | `76.08% <0%> (-4.35%)` | `5% <0%> (ø)` | |
| [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `64.78% <0%> (-0.94%)` | `29% <0%> (ø)` | |
| [...main/java/org/apache/gobblin/util/HadoopUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvSGFkb29wVXRpbHMuamF2YQ==) | `30.53% <0%> (+0.33%)` | `24% <0%> (ø)` | :arrow_down: |
| [...lin/restli/throttling/ZookeeperLeaderElection.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2Utc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3Jlc3RsaS90aHJvdHRsaW5nL1pvb2tlZXBlckxlYWRlckVsZWN0aW9uLmphdmE=) | `72.22% <0%> (+2.22%)` | `13% <0%> (ø)` | :arrow_down: |
| [...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2708/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=) | `92.85% <0%> (+7.14%)` | `3% <0%> (ø)` | :arrow_down: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=footer). Last update [50280ee...89bf954](https://codecov.io/gh/apache/incubator-gobblin/pull/2708?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[jira] [Commented] (GOBBLIN-824) upgrade to latest libraries in Gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905621#comment-16905621 ]

Jay Sen commented on GOBBLIN-824:

GOBBLIN-818 does minor upgrades.

> upgrade to latest libraries in Gobblin
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
> Issue Type: Improvement
> Affects Versions: 0.14.0
> Reporter: Jay Sen
> Priority: Major
> Fix For: 0.15.0
>
> A lot of libs are old, like Hadoop, Hive, etc.
> It won't be easy to just compile Gobblin against new versions by passing the new
> version on the command line; there have been a lot of changes over the last couple of years.
> Gobblin should use the latest versions:
> Hadoop: 2.9.x
> Hive: 2.3.5
> Pegasus: 24.0.2
> Avro: 1.8.2
> etc.
> Please feel free to mention which libs should be updated as part of this
> overall upgrade process.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-824) upgrade to latest libraries in Gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-824:
Summary: upgrade to latest libraries in Gobblin (was: upgrade libs versions in Gobblin)

> upgrade to latest libraries in Gobblin
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
> Issue Type: Improvement
> Affects Versions: 0.14.0
> Reporter: Jay Sen
> Priority: Major
> Fix For: 0.15.0
>
> A lot of libs are old, like Hadoop, Hive, etc.
> It won't be easy to just compile Gobblin against new versions by passing the new
> version on the command line; there have been a lot of changes over the last couple of years.
> Gobblin should use the latest versions:
> Hadoop: 2.9.x
> Hive: 2.3.5
> Pegasus: 24.0.2
> Avro: 1.8.2
> etc.
> Please feel free to mention which libs should be updated as part of this
> overall upgrade process.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-824) upgrade libs versions in Gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-824:
Description:
A lot of libs are old, like Hadoop, Hive, etc.
It won't be easy to just compile Gobblin against new versions by passing the new version on the command line; there have been a lot of changes over the last couple of years.
Gobblin should use the latest versions:
Hadoop: 2.9.x
Hive: 2.3.5
Pegasus: 24.0.2
Avro: 1.8.2
etc.
Please feel free to mention which libs should be updated as part of this overall upgrade process.

was:
A lot of libs are old, like Hadoop, Hive, etc.
It won't be easy to just compile Gobblin against new versions by passing the new version on the command line; there have been a lot of changes over the last couple of years.
Gobblin should use the latest versions:
Hadoop: 2.7.7
Hive: 2.3.5
Pegasus: 24.0.2
etc.
Please feel free to mention which libs should be updated as part of this overall upgrade process.

> upgrade libs versions in Gobblin
>
> Key: GOBBLIN-824
> URL: https://issues.apache.org/jira/browse/GOBBLIN-824
> Project: Apache Gobblin
> Issue Type: Improvement
> Affects Versions: 0.14.0
> Reporter: Jay Sen
> Priority: Major
> Fix For: 0.15.0
>
> A lot of libs are old, like Hadoop, Hive, etc.
> It won't be easy to just compile Gobblin against new versions by passing the new
> version on the command line; there have been a lot of changes over the last couple of years.
> Gobblin should use the latest versions:
> Hadoop: 2.9.x
> Hive: 2.3.5
> Pegasus: 24.0.2
> Avro: 1.8.2
> etc.
> Please feel free to mention which libs should be updated as part of this
> overall upgrade process.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-818:
Description:
Gobblin uses the old Hive 1.0 and is not compatible with Hive 1.2 at compile time because of backward-incompatible changes in Hive 1.2.
We should also update the Hadoop libs from 2.3 to at least 2.7.7, which is very stable.

was:
Gobblin uses the old Hive 1.x.
Hive 2.x has significant changes and some incompatible/deprecated classes.
While Hive 3.x is already in the pipeline along with Hadoop 3.x, we should move to Hive 2.x so users don't have to deal with manual fixes for their use of Gobblin.
We should also update the Hadoop libs from 2.3 to at least 2.7.7, which is very stable.

> upgrade default hadoop versions to 2.7.x and hive version to 1.2
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
>
> Gobblin uses the old Hive 1.0 and is not compatible with Hive 1.2 at compile time
> because of backward-incompatible changes in Hive 1.2.
> We should also update the Hadoop libs from 2.3 to at least 2.7.7, which is very
> stable.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-818) upgrade default hadoop versions to 2.7.x and hive version to 1.2
[ https://issues.apache.org/jira/browse/GOBBLIN-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Sen updated GOBBLIN-818:
Summary: upgrade default hadoop versions to 2.7.x and hive version to 1.2 (was: MIgrate to Hive 2.x as default)

> upgrade default hadoop versions to 2.7.x and hive version to 1.2
>
> Key: GOBBLIN-818
> URL: https://issues.apache.org/jira/browse/GOBBLIN-818
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
>
> Gobblin uses the old Hive 1.x.
> Hive 2.x has significant changes and some incompatible/deprecated classes.
> While Hive 3.x is already in the pipeline along with Hadoop 3.x, we should move
> to Hive 2.x so users don't have to deal with manual fixes for their use of
> Gobblin.
> We should also update the Hadoop libs from 2.3 to at least 2.7.7, which is very
> stable.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function
[ https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293377&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293377 ] ASF GitHub Bot logged work on GOBBLIN-852: -- Author: ASF GitHub Bot Created on: 12/Aug/19 20:55 Start Date: 12/Aug/19 20:55 Worklog Time Spent: 10m Work Description: sv2000 commented on pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function URL: https://github.com/apache/incubator-gobblin/pull/2708#discussion_r313123196 ## File path: gobblin-core/src/main/java/org/apache/gobblin/publisher/HiveRegistrationPublisher.java ## @@ -157,7 +146,9 @@ public void publishData(Collection states) throws IOExc if (isPathDedupeEnabled && pathsToRegisterFromSingleState.contains(path)){ continue; } - pathsToRegisterFromSingleState.add(path); + if(isPathDedupeEnabled) { Review comment: can this if () {} block be merged inside the above if() {} block? e.g. if (isPathDedupeEnabled) { pathsToRegisterFromSingleState.contains(path) ? continue: pathsToRegisterFromSingleState.add(path);} This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293377) Time Spent: 0.5h (was: 20m) > Reorganize the code for hive registration to isolate function > - > > Key: GOBBLIN-852 > URL: https://issues.apache.org/jira/browse/GOBBLIN-852 > Project: Apache Gobblin > Issue Type: Task > Components: hive-registration >Reporter: Zihan Li >Assignee: Abhishek Tiwari >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function
sv2000 commented on a change in pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function URL: https://github.com/apache/incubator-gobblin/pull/2708#discussion_r313123196 ## File path: gobblin-core/src/main/java/org/apache/gobblin/publisher/HiveRegistrationPublisher.java ## @@ -157,7 +146,9 @@ public void publishData(Collection states) throws IOExc if (isPathDedupeEnabled && pathsToRegisterFromSingleState.contains(path)){ continue; } - pathsToRegisterFromSingleState.add(path); + if(isPathDedupeEnabled) { Review comment: can this if () {} block be merged inside the above if() {} block? e.g. if (isPathDedupeEnabled) { pathsToRegisterFromSingleState.contains(path) ? continue: pathsToRegisterFromSingleState.add(path);} This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
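As written, the one-liner in the review comment would not compile in Java, because `continue` is a statement and cannot appear as a branch of the `?:` operator. One way to get the merged check the reviewer seems to be asking for is to rely on `Set.add` returning `false` when the element is already present. The following is only a minimal sketch under assumptions: the class `PathDedupeSketch`, the method `pathsToRegister`, and the `registered` list are hypothetical stand-ins for the loop body in `publishData`, not the code in the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the loop in HiveRegistrationPublisher.publishData. */
class PathDedupeSketch {

  /** Returns the paths that would be registered, in encounter order. */
  static List<String> pathsToRegister(List<String> paths, boolean isPathDedupeEnabled) {
    Set<String> pathsToRegisterFromSingleState = new HashSet<>();
    List<String> registered = new ArrayList<>();
    for (String path : paths) {
      // Set.add returns false when the element is already in the set, so a single
      // call both records the path and reports whether it was seen before.
      // With dedupe disabled, the short-circuit && means the set is never touched.
      if (isPathDedupeEnabled && !pathsToRegisterFromSingleState.add(path)) {
        continue;
      }
      registered.add(path); // assumption: stands in for the real registration work
    }
    return registered;
  }

  public static void main(String[] args) {
    List<String> paths = Arrays.asList("/data/a", "/data/b", "/data/a");
    System.out.println(pathsToRegister(paths, true));  // prints [/data/a, /data/b]
    System.out.println(pathsToRegister(paths, false)); // prints [/data/a, /data/b, /data/a]
  }
}
```

This keeps the behaviour of the two separate `if` blocks in the diff (the set is only populated when dedupe is enabled) while folding them into one condition.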
[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293363 ]

ASF GitHub Bot logged work on GOBBLIN-851:

Author: ASF GitHub Bot
Created on: 12/Aug/19 20:29
Start Date: 12/Aug/19 20:29
Worklog Time Spent: 10m
Work Description: codecov-io commented on issue #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#issuecomment-520582425

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=h1) Report
> Merging [#2707](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **increase** coverage by `0.01%`.
> The diff coverage is `56.25%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##             master    #2707      +/-   ##
+ Coverage     44.87%   44.88%   +0.01%
- Complexity     8708     8712       +4
  Files          1879     1879
  Lines         70095    70103       +8
  Branches       7704     7706       +2
+ Hits          31455    31466      +11
+ Misses        35728    35721       -7
- Partials       2912     2916       +4
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...che/gobblin/hive/metastore/HiveMetaStoreUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL21ldGFzdG9yZS9IaXZlTWV0YVN0b3JlVXRpbHMuamF2YQ==) | `31.83% <50%> (-0.15%)` | `12 <0> (ø)` | |
| [...g/apache/gobblin/metrics/event/EventSubmitter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9FdmVudFN1Ym1pdHRlci5qYXZh) | `37.03% <50%> (-0.7%)` | `3 <0> (ø)` | |
| [...apache/gobblin/hive/avro/HiveAvroSerDeManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL2F2cm8vSGl2ZUF2cm9TZXJEZU1hbmFnZXIuamF2YQ==) | `52.17% <57.14%> (-0.77%)` | `8 <1> (ø)` | |
| [.../org/apache/gobblin/hive/HiveRegistrationUnit.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL0hpdmVSZWdpc3RyYXRpb25Vbml0LmphdmE=) | `47.39% <60%> (+0.37%)` | `33 <1> (+1)` | :arrow_up: |
| [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `15.49% <0%> (+0.84%)` | `4% <0%> (+1%)` | :arrow_up: |
| [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `80.37% <0%> (+0.93%)` | `24% <0%> (ø)` | :arrow_down: |
| [...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh) | `39.25% <0%> (+3.73%)` | `13% <0%> (+1%)` | :arrow_up: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=footer). Last update [50280ee...3ac88ef](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git
[GitHub] [incubator-gobblin] codecov-io commented on issue #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
codecov-io commented on issue #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#issuecomment-520582425

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=h1) Report
> Merging [#2707](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **increase** coverage by `0.01%`.
> The diff coverage is `56.25%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##             master    #2707      +/-   ##
+ Coverage     44.87%   44.88%   +0.01%
- Complexity     8708     8712       +4
  Files          1879     1879
  Lines         70095    70103       +8
  Branches       7704     7706       +2
+ Hits          31455    31466      +11
+ Misses        35728    35721       -7
- Partials       2912     2916       +4
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...che/gobblin/hive/metastore/HiveMetaStoreUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL21ldGFzdG9yZS9IaXZlTWV0YVN0b3JlVXRpbHMuamF2YQ==) | `31.83% <50%> (-0.15%)` | `12 <0> (ø)` | |
| [...g/apache/gobblin/metrics/event/EventSubmitter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9FdmVudFN1Ym1pdHRlci5qYXZh) | `37.03% <50%> (-0.7%)` | `3 <0> (ø)` | |
| [...apache/gobblin/hive/avro/HiveAvroSerDeManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL2F2cm8vSGl2ZUF2cm9TZXJEZU1hbmFnZXIuamF2YQ==) | `52.17% <57.14%> (-0.77%)` | `8 <1> (ø)` | |
| [.../org/apache/gobblin/hive/HiveRegistrationUnit.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL0hpdmVSZWdpc3RyYXRpb25Vbml0LmphdmE=) | `47.39% <60%> (+0.37%)` | `33 <1> (+1)` | :arrow_up: |
| [...main/java/org/apache/gobblin/yarn/YarnService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFyblNlcnZpY2UuamF2YQ==) | `15.49% <0%> (+0.84%)` | `4% <0%> (+1%)` | :arrow_up: |
| [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `80.37% <0%> (+0.93%)` | `24% <0%> (ø)` | :arrow_down: |
| [...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2707/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh) | `39.25% <0%> (+3.73%)` | `13% <0%> (+1%)` | :arrow_up: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=footer). Last update [50280ee...3ac88ef](https://codecov.io/gh/apache/incubator-gobblin/pull/2707?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [incubator-gobblin] ZihanLi58 commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function
ZihanLi58 commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function URL: https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520576484 @sv2000 @autumnust Can you help review this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function
[ https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293356 ] ASF GitHub Bot logged work on GOBBLIN-852: -- Author: ASF GitHub Bot Created on: 12/Aug/19 20:11 Start Date: 12/Aug/19 20:11 Worklog Time Spent: 10m Work Description: ZihanLi58 commented on issue #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function URL: https://github.com/apache/incubator-gobblin/pull/2708#issuecomment-520576484 @sv2000 @autumnust Can you help review this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293356) Time Spent: 20m (was: 10m) > Reorganize the code for hive registration to isolate function > - > > Key: GOBBLIN-852 > URL: https://issues.apache.org/jira/browse/GOBBLIN-852 > Project: Apache Gobblin > Issue Type: Task > Components: hive-registration >Reporter: Zihan Li >Assignee: Abhishek Tiwari >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] ZihanLi58 opened a new pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function
ZihanLi58 opened a new pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function URL: https://github.com/apache/incubator-gobblin/pull/2708 …(ETL-8815) Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [GOBBLIN-852](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-852 ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: Run streaming job on azkaban. ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-852) Reorganize the code for hive registration to isolate function
[ https://issues.apache.org/jira/browse/GOBBLIN-852?focusedWorklogId=293354&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293354 ] ASF GitHub Bot logged work on GOBBLIN-852: -- Author: ASF GitHub Bot Created on: 12/Aug/19 20:10 Start Date: 12/Aug/19 20:10 Worklog Time Spent: 10m Work Description: ZihanLi58 commented on pull request #2708: [GOBBLIN-852]Reorganize the code for hive registration to isolate function URL: https://github.com/apache/incubator-gobblin/pull/2708 …(ETL-8815) Dear Gobblin maintainers, Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below! ### JIRA - [ ] My PR addresses the following [GOBBLIN-852](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR" - https://issues.apache.org/jira/browse/GOBBLIN-852 ### Description - [ ] Here are some details about my PR, including screenshots (if applicable): ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: Run streaming job on azkaban. ### Commits - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293354) Time Spent: 10m Remaining Estimate: 0h > Reorganize the code for hive registration to isolate function > - > > Key: GOBBLIN-852 > URL: https://issues.apache.org/jira/browse/GOBBLIN-852 > Project: Apache Gobblin > Issue Type: Task > Components: hive-registration >Reporter: Zihan Li >Assignee: Abhishek Tiwari >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-852) Reorganize the code for hive registration to isolate function
Zihan Li created GOBBLIN-852: Summary: Reorganize the code for hive registration to isolate function Key: GOBBLIN-852 URL: https://issues.apache.org/jira/browse/GOBBLIN-852 Project: Apache Gobblin Issue Type: Task Components: hive-registration Reporter: Zihan Li Assignee: Abhishek Tiwari -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313072043

## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java ##
@@ -211,7 +211,9 @@ private static StorageDescriptor getStorageDescriptor(HiveRegistrationUnit unit)
     State props = unit.getStorageProps();
     StorageDescriptor sd = new StorageDescriptor();
     sd.setParameters(getParameters(props));
-    sd.setCols(getFieldSchemas(unit));
+    if (unit instanceof HiveTable) {

Review comment:
Because we skip schema registration for partitions, the conversion from HivePartition to Partition will fail with the exception below. addSchemaProperties applies to HivePartition creation, not to the HivePartition-to-Partition conversion.

org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
    at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:119)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.determineSchemaOrReturnErrorSchema(AvroSerDe.java:177)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:103)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:80)
    at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:520)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:399)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getFieldSchemas(HiveMetaStoreUtils.java:356)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getStorageDescriptor(HiveMetaStoreUtils.java:214)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getPartition(HiveMetaStoreUtils.java:164)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.addOrAlterPartition(HiveMetaStoreBasedRegister.java:458)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.registerPath(HiveMetaStoreBasedRegister.java:159)
    at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:109)
    at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:93)
    at org.apache.gobblin.util.executors.MDCPropagatingCallable.call(MDCPropagatingCallable.java:42)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
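To make the review comment concrete: the stack trace shows `getStorageDescriptor` calling `getFieldSchemas`, which initializes the Avro SerDe and throws when no schema property is present. The sketch below models that interaction with small stand-in classes. Everything in it is an assumption made for illustration: the classes are simplified placeholders rather than the Gobblin or Hive types, the `avroSchemaLiteral` field is hypothetical, and because the diff hunk is cut off right after the `if`, the body of the guard (only setting columns for tables) is a guess at the intent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins; the real Gobblin and Hive classes carry far more state. */
class StorageDescriptorGuardSketch {

  static class HiveRegistrationUnit {
    final String avroSchemaLiteral; // null models "no schema properties registered"
    HiveRegistrationUnit(String avroSchemaLiteral) { this.avroSchemaLiteral = avroSchemaLiteral; }
  }

  static class HiveTable extends HiveRegistrationUnit {
    HiveTable(String avroSchemaLiteral) { super(avroSchemaLiteral); }
  }

  static class HivePartition extends HiveRegistrationUnit {
    HivePartition(String avroSchemaLiteral) { super(avroSchemaLiteral); }
  }

  static class StorageDescriptor {
    List<String> cols = new ArrayList<>();
  }

  /** Models schema resolution that fails when no schema property is present. */
  static List<String> getFieldSchemas(HiveRegistrationUnit unit) {
    if (unit.avroSchemaLiteral == null) {
      throw new IllegalStateException("Neither avro.schema.literal nor avro.schema.url specified");
    }
    return Arrays.asList(unit.avroSchemaLiteral.split(","));
  }

  static StorageDescriptor getStorageDescriptor(HiveRegistrationUnit unit) {
    StorageDescriptor sd = new StorageDescriptor();
    // Assumed guard body: only tables resolve columns, so a partition that
    // carries no schema properties never reaches getFieldSchemas.
    if (unit instanceof HiveTable) {
      sd.cols = getFieldSchemas(unit);
    }
    return sd;
  }

  public static void main(String[] args) {
    System.out.println(getStorageDescriptor(new HiveTable("id,name")).cols); // prints [id, name]
    System.out.println(getStorageDescriptor(new HivePartition(null)).cols);  // prints []
  }
}
```

Without the `instanceof` check, the second call would throw, which mirrors the failure path in the stack trace.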
[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293310 ]

ASF GitHub Bot logged work on GOBBLIN-851:

Author: ASF GitHub Bot
Created on: 12/Aug/19 18:42
Start Date: 12/Aug/19 18:42
Worklog Time Spent: 10m
Work Description: yukuai518 commented on pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313072043

## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java ##
@@ -211,7 +211,9 @@ private static StorageDescriptor getStorageDescriptor(HiveRegistrationUnit unit)
     State props = unit.getStorageProps();
     StorageDescriptor sd = new StorageDescriptor();
     sd.setParameters(getParameters(props));
-    sd.setCols(getFieldSchemas(unit));
+    if (unit instanceof HiveTable) {

Review comment:
Because we skip schema registration for partitions, the conversion from HivePartition to Partition will fail with the exception below. addSchemaProperties applies to HivePartition creation, not to the HivePartition-to-Partition conversion.

org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
    at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:119)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.determineSchemaOrReturnErrorSchema(AvroSerDe.java:177)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:103)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:80)
    at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:520)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:399)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getFieldSchemas(HiveMetaStoreUtils.java:356)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getStorageDescriptor(HiveMetaStoreUtils.java:214)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreUtils.getPartition(HiveMetaStoreUtils.java:164)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.addOrAlterPartition(HiveMetaStoreBasedRegister.java:458)
    at org.apache.gobblin.hive.metastore.HiveMetaStoreBasedRegister.registerPath(HiveMetaStoreBasedRegister.java:159)
    at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:109)
    at org.apache.gobblin.hive.HiveRegister$1.call(HiveRegister.java:93)
    at org.apache.gobblin.util.executors.MDCPropagatingCallable.call(MDCPropagatingCallable.java:42)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
Worklog Id: (was: 293310)
Time Spent: 1h (was: 50m)

> Provide capability to disable hive schema registration in partition level
>
> Key: GOBBLIN-851
> URL: https://issues.apache.org/jira/browse/GOBBLIN-851
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Kuai Yu
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> We have had problems when the table-level schema and the partition-level schema diverge.
> Consider the case where a user registers two partitions, 2019/08/10 and
> 2019/08/11, and the schema changes in between (S1->S2). The table now has
> schema S2, but partition 2019/08/10 still has schema S1.
> A query against the latest schema then fails on the old partition.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293307&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293307 ] ASF GitHub Bot logged work on GOBBLIN-851: -- Author: ASF GitHub Bot Created on: 12/Aug/19 18:35 Start Date: 12/Aug/19 18:35 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313068740 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java ## @@ -418,6 +421,11 @@ public T withTableName(String tableName) { return (T) this; } +public T withRegisterSchema(boolean registerSchema) { Review comment: Put it this way, the getHivePartition's actual implementation is in gobblin-hive-registration MP, which is out of the open source code base. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293307) Time Spent: 50m (was: 40m) > Provide capability to disable hive schema registration in partition level > - > > Key: GOBBLIN-851 > URL: https://issues.apache.org/jira/browse/GOBBLIN-851 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Kuai Yu >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > We had problems when table level schema and partition level schema diverges. > Think about the case when user register two partitions : 2019/08/10, > 2019/08/11, but schema changes in between(S1->S2). Now the table level has > schema S2, but 2019/08/10 will have schema S1. > Query on the latest schema will cause the old partition failure. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313068740 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java ## @@ -418,6 +421,11 @@ public T withTableName(String tableName) { return (T) this; } +public T withRegisterSchema(boolean registerSchema) { Review comment: Put it this way, the getHivePartition's actual implementation is in gobblin-hive-registration MP, which is out of the open source code base. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293300 ] ASF GitHub Bot logged work on GOBBLIN-851: -- Author: ASF GitHub Bot Created on: 12/Aug/19 18:23 Start Date: 12/Aug/19 18:23 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313063000 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java ## @@ -418,6 +421,11 @@ public T withTableName(String tableName) { return (T) this; } +public T withRegisterSchema(boolean registerSchema) { Review comment: This is the builder method that lets our gobblin-hive-registration set whether the partition needs to register its schema. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293300) Time Spent: 40m (was: 0.5h) > Provide capability to disable hive schema registration in partition level > - > > Key: GOBBLIN-851 > URL: https://issues.apache.org/jira/browse/GOBBLIN-851 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Kuai Yu >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > We had problems when table level schema and partition level schema diverges. > Think about the case when user register two partitions : 2019/08/10, > 2019/08/11, but schema changes in between(S1->S2). Now the table level has > schema S2, but 2019/08/10 will have schema S1. > Query on the latest schema will cause the old partition failure. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
yukuai518 commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313063000 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java ## @@ -418,6 +421,11 @@ public T withTableName(String tableName) { return (T) this; } +public T withRegisterSchema(boolean registerSchema) { Review comment: This is the builder method that lets our gobblin-hive-registration set whether the partition needs to register its schema. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
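For readers following the thread, the builder method discussed in this review comment can be pictured with a minimal sketch. Only the two method signatures `withTableName` and `withRegisterSchema` come from the quoted diff; the class names `RegistrationUnitSketch` and `PartitionBuilder`, the getter names, and the default value of `true` are all hypothetical, and this is not the actual `HiveRegistrationUnit` code.

```java
// Hypothetical, simplified stand-in for a registration unit; not Gobblin's HiveRegistrationUnit.
class RegistrationUnitSketch {
  private final String tableName;
  private final boolean registerSchema;

  private RegistrationUnitSketch(Builder<?> builder) {
    this.tableName = builder.tableName;
    this.registerSchema = builder.registerSchema;
  }

  String getTableName() {
    return tableName;
  }

  boolean isRegisterSchema() {
    return registerSchema;
  }

  // Self-typed builder so that subclasses keep their own type when chaining calls.
  static class Builder<T extends Builder<T>> {
    private String tableName;
    private boolean registerSchema = true; // assumed default: keep registering the schema

    @SuppressWarnings("unchecked")
    T withTableName(String tableName) {
      this.tableName = tableName;
      return (T) this;
    }

    @SuppressWarnings("unchecked")
    T withRegisterSchema(boolean registerSchema) {
      this.registerSchema = registerSchema;
      return (T) this;
    }

    RegistrationUnitSketch build() {
      return new RegistrationUnitSketch(this);
    }
  }

  // Hypothetical partition-flavoured builder, used only to show the chaining.
  static class PartitionBuilder extends Builder<PartitionBuilder> {}

  public static void main(String[] args) {
    RegistrationUnitSketch partition = new PartitionBuilder()
        .withTableName("events")
        .withRegisterSchema(false) // caller opts this partition out of schema registration
        .build();
    System.out.println(partition.getTableName() + " registerSchema=" + partition.isRegisterSchema());
  }
}
```

Under these assumptions the flag is just carried on the unit; whatever code later builds the metastore object would read it and decide whether to attach schema information.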
[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293292&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293292 ] ASF GitHub Bot logged work on GOBBLIN-851: -- Author: ASF GitHub Bot Created on: 12/Aug/19 18:08 Start Date: 12/Aug/19 18:08 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313056277 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java ## @@ -418,6 +421,11 @@ public T withTableName(String tableName) { return (T) this; } +public T withRegisterSchema(boolean registerSchema) { Review comment: What's the usage of this field? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293292) Time Spent: 20m (was: 10m) > Provide capability to disable hive schema registration in partition level > - > > Key: GOBBLIN-851 > URL: https://issues.apache.org/jira/browse/GOBBLIN-851 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Kuai Yu >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > We had problems when table level schema and partition level schema diverges. > Think about the case when user register two partitions : 2019/08/10, > 2019/08/11, but schema changes in between(S1->S2). Now the table level has > schema S2, but 2019/08/10 will have schema S1. > Query on the latest schema will cause the old partition failure. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293293 ] ASF GitHub Bot logged work on GOBBLIN-851: -- Author: ASF GitHub Bot Created on: 12/Aug/19 18:08 Start Date: 12/Aug/19 18:08 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313055179 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java ## @@ -211,7 +211,9 @@ private static StorageDescriptor getStorageDescriptor(HiveRegistrationUnit unit) State props = unit.getStorageProps(); StorageDescriptor sd = new StorageDescriptor(); sd.setParameters(getParameters(props)); -sd.setCols(getFieldSchemas(unit)); +if (unit instanceof HiveTable) { Review comment: I believe this should happen inside `addSchemaProperties` method This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293293) Time Spent: 0.5h (was: 20m) > Provide capability to disable hive schema registration in partition level > - > > Key: GOBBLIN-851 > URL: https://issues.apache.org/jira/browse/GOBBLIN-851 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Kuai Yu >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > We had problems when table level schema and partition level schema diverges. > Think about the case when user register two partitions : 2019/08/10, > 2019/08/11, but schema changes in between(S1->S2). Now the table level has > schema S2, but 2019/08/10 will have schema S1. > Query on the latest schema will cause the old partition failure. 
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313056277 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/HiveRegistrationUnit.java ## @@ -418,6 +421,11 @@ public T withTableName(String tableName) { return (T) this; } +public T withRegisterSchema(boolean registerSchema) { Review comment: What's the usage of this field? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-gobblin] autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
autumnust commented on a change in pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707#discussion_r313055179 ## File path: gobblin-hive-registration/src/main/java/org/apache/gobblin/hive/metastore/HiveMetaStoreUtils.java ## @@ -211,7 +211,9 @@ private static StorageDescriptor getStorageDescriptor(HiveRegistrationUnit unit) State props = unit.getStorageProps(); StorageDescriptor sd = new StorageDescriptor(); sd.setParameters(getParameters(props)); -sd.setCols(getFieldSchemas(unit)); +if (unit instanceof HiveTable) { Review comment: I believe this should happen inside `addSchemaProperties` method This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
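The change under review replaces an unconditional `sd.setCols(getFieldSchemas(unit))` with a check on `unit instanceof HiveTable`. The body of that branch is not shown in the quoted diff, so the following is only a rough sketch of the presumed intent: resolve columns for tables and skip them for partitions. All types here (`Unit`, `Table`, `Partition`, `columnsFor`) are hypothetical stand-ins, not Gobblin or Hive classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins; not Gobblin's HiveTable, HivePartition, or HiveMetaStoreUtils.
class StorageDescriptorSketch {
  interface Unit {
    List<String> fieldSchemas();
  }

  static class Table implements Unit {
    private final List<String> columns;

    Table(List<String> columns) {
      this.columns = columns;
    }

    public List<String> fieldSchemas() {
      return columns;
    }
  }

  static class Partition implements Unit {
    // Assumption for illustration: a partition may have no resolvable schema,
    // so asking for its field schemas fails.
    public List<String> fieldSchemas() {
      throw new IllegalStateException("can't determine schema for partition");
    }
  }

  // Only tables contribute columns; partitions are left without a column list.
  static List<String> columnsFor(Unit unit) {
    if (unit instanceof Table) {
      return unit.fieldSchemas();
    }
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    System.out.println(columnsFor(new Table(Arrays.asList("id:int", "name:string"))));
    System.out.println(columnsFor(new Partition()));
  }
}
```

The reviewer's suggestion to move the check into `addSchemaProperties` would change where the decision lives, not the decision itself.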
[GitHub] [incubator-gobblin] yukuai518 opened a new pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration.
yukuai518 opened a new pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-851

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):

We had problems when the table-level schema and the partition-level schema diverge. Consider the case where a user registers two partitions, 2019/08/10 and 2019/08/11, but the schema changes in between (S1->S2). The table now has schema S2, but partition 2019/08/10 still has schema S1. A query using the latest schema will then fail on the old partition.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

Unit tests in gobblin-hive-registration module.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Work logged] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?focusedWorklogId=293283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293283 ] ASF GitHub Bot logged work on GOBBLIN-851: -- Author: ASF GitHub Bot Created on: 12/Aug/19 17:48 Start Date: 12/Aug/19 17:48 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2707: [GOBBLIN-851] Provide capability to disable Hive partition schema registration. URL: https://github.com/apache/incubator-gobblin/pull/2707

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-851

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):

We had problems when the table-level schema and the partition-level schema diverge. Consider the case where a user registers two partitions, 2019/08/10 and 2019/08/11, but the schema changes in between (S1->S2). The table now has schema S2, but partition 2019/08/10 still has schema S1. A query using the latest schema will then fail on the old partition.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

Unit tests in gobblin-hive-registration module.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 293283) Time Spent: 10m Remaining Estimate: 0h > Provide capability to disable hive schema registration in partition level > - > > Key: GOBBLIN-851 > URL: https://issues.apache.org/jira/browse/GOBBLIN-851 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Kuai Yu >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > We had problems when table level schema and partition level schema diverges. > Think about the case when user register two partitions : 2019/08/10, > 2019/08/11, but schema changes in between(S1->S2). Now the table level has > schema S2, but 2019/08/10 will have schema S1. > Query on the latest schema will cause the old partition failure. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
[ https://issues.apache.org/jira/browse/GOBBLIN-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuai Yu updated GOBBLIN-851: Description: We had problems when table level schema and partition level schema diverges. Think about the case when user register two partitions : 2019/08/10, 2019/08/11, but schema changes in between(S1->S2). Now the table level has schema S2, but 2019/08/10 will have schema S1. Query on the latest schema will cause the old partition failure. > Provide capability to disable hive schema registration in partition level > - > > Key: GOBBLIN-851 > URL: https://issues.apache.org/jira/browse/GOBBLIN-851 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Kuai Yu >Priority: Major > > We had problems when table level schema and partition level schema diverges. > Think about the case when user register two partitions : 2019/08/10, > 2019/08/11, but schema changes in between(S1->S2). Now the table level has > schema S2, but 2019/08/10 will have schema S1. > Query on the latest schema will cause the old partition failure. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (GOBBLIN-851) Provide capability to disable hive schema registration in partition level
Kuai Yu created GOBBLIN-851: --- Summary: Provide capability to disable hive schema registration in partition level Key: GOBBLIN-851 URL: https://issues.apache.org/jira/browse/GOBBLIN-851 Project: Apache Gobblin Issue Type: Improvement Reporter: Kuai Yu -- This message was sent by Atlassian JIRA (v7.6.14#76016)
Re: Read locality on gobblin jobs
The Helix framework in cluster mode has no concept of data locality. I think locality is only available in YARN/MR mode.

From: Jay Sen
Sent: Sunday, August 11, 2019 5:25 PM
To: dev@gobblin.incubator.apache.org
Subject: Read locality on gobblin jobs

Hi Dev Team,

When Gobblin runs in cluster mode or MR mode and a job needs to read data from a Hadoop filesystem that is local, i.e. on the same Gobblin cluster, does Gobblin or Helix figure out the data locality automatically (as in a typical MR job)? I doubt this is the case, but I wanted to find out whether there is a way to achieve it anyway.

Some more context: I am preparing the following Gobblin pipeline.

*external hadoop cluster* ---job 1---> *gobblin hadoop cluster-1* ---job 2---> *gobblin hadoop cluster-2* ---job 3---> *target platform* (oracle/mysql)

Here jobs 1, 2, and 3 are the Gobblin jobs. I am also looking for suggestions on which Gobblin mode would be best for this scenario; I am currently looking at Gobblin cluster mode and MR mode.

Thanks
Jay
[GitHub] [incubator-gobblin] krishraman commented on issue #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs.
krishraman commented on issue #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs. URL: https://github.com/apache/incubator-gobblin/pull/2706#issuecomment-520502737 Hi @htran1 @zxcware Can you please review this PR? Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-gobblin] codecov-io commented on issue #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs.
codecov-io commented on issue #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs. URL: https://github.com/apache/incubator-gobblin/pull/2706#issuecomment-520502431

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=h1) Report
> Merging [#2706](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/50280ee6bd591e74746faf4cb4452733095e3c36?src=pr&el=desc) will **decrease** coverage by `40.71%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/graphs/tree.svg?width=650&token=4MgURJ0bGc&height=150&src=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##             master    #2706      +/-   ##
- Coverage     44.87%    4.15%   -40.72%
+ Complexity     8708      736     -7972
  Files          1879     1879
  Lines         70095    70098        +3
  Branches       7704     7704
- Hits          31455     2915    -28540
- Misses        35728    66877    +31149
+ Partials       2912      306     -2606
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2706?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...ava/org/apache/gobblin/metrics/event/JobEvent.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1tZXRyaWNzLWxpYnMvZ29iYmxpbi1tZXRyaWNzLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0cmljcy9ldmVudC9Kb2JFdmVudC5qYXZh) | `0% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...pache/gobblin/runtime/mapreduce/MRJobLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvbWFwcmVkdWNlL01SSm9iTGF1bmNoZXIuamF2YQ==) | `54.49% <100%> (+0.38%)` | `18 <0> (ø)` | :arrow_down: |
| [...n/converter/AvroStringFieldDecryptorConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY3J5cHRvLXByb3ZpZGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbnZlcnRlci9BdnJvU3RyaW5nRmllbGREZWNyeXB0b3JDb252ZXJ0ZXIuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (-2%)` | |
| [...he/gobblin/cluster/TaskRunnerSuiteThreadModel.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvVGFza1J1bm5lclN1aXRlVGhyZWFkTW9kZWwuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (-5%)` | |
| [...n/mapreduce/avro/AvroKeyCompactorOutputFormat.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb21wYWN0aW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbXBhY3Rpb24vbWFwcmVkdWNlL2F2cm8vQXZyb0tleUNvbXBhY3Rvck91dHB1dEZvcm1hdC5qYXZh) | `0% <0%> (-100%)` | `0% <0%> (-3%)` | |
| [...apache/gobblin/fork/CopyNotSupportedException.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZm9yay9Db3B5Tm90U3VwcG9ydGVkRXhjZXB0aW9uLmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (-1%)` | |
| [.../gobblin/kafka/writer/KafkaWriterCommonConfig.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2thZmthL3dyaXRlci9LYWZrYVdyaXRlckNvbW1vbkNvbmZpZy5qYXZh) | `0% <0%> (-100%)` | `0% <0%> (-7%)` | |
| [...ker/task/TaskLevelPolicyCheckerBuilderFactory.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3F1YWxpdHljaGVja2VyL3Rhc2svVGFza0xldmVsUG9saWN5Q2hlY2tlckJ1aWxkZXJGYWN0b3J5LmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (-2%)` | |
| [...bblin/data/management/copy/AllEqualComparator.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvQWxsRXF1YWxDb21wYXJhdG9yLmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (-2%)` | |
| [...blin/converter/string/ObjectToStringConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbnZlcnRlci9zdHJpbmcvT2JqZWN0VG9TdHJpbmdDb252ZXJ0ZXIuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (-3%)` | |
| ... and [1075 more](https://codecov.io/gh/apache/incubator-gobblin/pull/2706/diff?src=pr&el=tree-more) | |

-- [Conti
[GitHub] [incubator-gobblin] krishraman opened a new pull request #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs.
krishraman opened a new pull request #2706: Emit WorkUnitsCreated Count Event for MR deployed jobs. URL: https://github.com/apache/incubator-gobblin/pull/2706

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-766] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-766

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

Tested on Yugioh by emitting the events to Kafka and verifying them with kafka-tool.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Reopened] (GOBBLIN-766) Emit Workunits created event in Apache gobblin
[ https://issues.apache.org/jira/browse/GOBBLIN-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kraman reopened GOBBLIN-766: > Emit Workunits created event in Apache gobblin > - > > Key: GOBBLIN-766 > URL: https://issues.apache.org/jira/browse/GOBBLIN-766 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: kraman >Priority: Minor > Fix For: 0.15.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Emit a new workunits created metric to be captured for monitoring/Alerting -- This message was sent by Atlassian JIRA (v7.6.14#76016)