[jira] [Work logged] (GOBBLIN-804) Fix config member variable not being set
[ https://issues.apache.org/jira/browse/GOBBLIN-804?focusedWorklogId=259272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259272 ]

ASF GitHub Bot logged work on GOBBLIN-804:
--
Author: ASF GitHub Bot
Created on: 13/Jun/19 04:51
Start Date: 13/Jun/19 04:51
Worklog Time Spent: 10m
Work Description: asfgit commented on pull request #2670: [GOBBLIN-804] Fix config member variable not being set
URL: https://github.com/apache/incubator-gobblin/pull/2670

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 259272)
Time Spent: 0.5h (was: 20m)

> Fix config member variable not being set
>
> Key: GOBBLIN-804
> URL: https://issues.apache.org/jira/browse/GOBBLIN-804
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Jack Moseley
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [incubator-gobblin] asfgit closed pull request #2670: [GOBBLIN-804] Fix config member variable not being set
asfgit closed pull request #2670: [GOBBLIN-804] Fix config member variable not being set
URL: https://github.com/apache/incubator-gobblin/pull/2670

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [incubator-gobblin] jack-moseley commented on issue #2670: [GOBBLIN-804] Fix config member variable not being set
jack-moseley commented on issue #2670: [GOBBLIN-804] Fix config member variable not being set
URL: https://github.com/apache/incubator-gobblin/pull/2670#issuecomment-501524188

@sv2000 please merge
[jira] [Work logged] (GOBBLIN-804) Fix config member variable not being set
[ https://issues.apache.org/jira/browse/GOBBLIN-804?focusedWorklogId=259216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259216 ]

ASF GitHub Bot logged work on GOBBLIN-804:
--
Author: ASF GitHub Bot
Created on: 13/Jun/19 01:54
Start Date: 13/Jun/19 01:54
Worklog Time Spent: 10m
Work Description: jack-moseley commented on pull request #2670: [GOBBLIN-804] Fix config member variable not being set
URL: https://github.com/apache/incubator-gobblin/pull/2670

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-804

### Description
- [ ] Here are some details about my PR, including screenshots (if applicable):
#2647 removed this line; `this.config` should be set, since there is a getter and subclasses could call it.

### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
trivial

### Commits
- [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Issue Time Tracking
---
Worklog Id: (was: 259216)
Time Spent: 10m
Remaining Estimate: 0h

> Fix config member variable not being set
>
> Key: GOBBLIN-804
> URL: https://issues.apache.org/jira/browse/GOBBLIN-804
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Jack Moseley
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
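The kind of bug the PR description refers to can be illustrated with a minimal Java sketch. The class names (`BaseService`, `GreetingService`, `ConfigFieldDemo`) and the `greeting` property are hypothetical and chosen only for illustration; they are not the Gobblin classes changed in #2670, and the sketch assumes nothing about the real constructor beyond "a field with a getter must be assigned".

```java
import java.util.Properties;

// Hypothetical base class, not the Gobblin class touched by the PR.
abstract class BaseService {
  // Declaring the field final makes the compiler reject a constructor
  // that forgets to assign it.
  private final Properties config;

  protected BaseService(Properties config) {
    // Dropping this assignment is the bug: getConfig() would then hand
    // null to every subclass that calls it.
    this.config = config;
  }

  protected Properties getConfig() {
    return this.config;
  }
}

// Hypothetical subclass that relies on the inherited getter.
class GreetingService extends BaseService {
  GreetingService(Properties config) {
    super(config);
  }

  String greeting() {
    return getConfig().getProperty("greeting", "hello");
  }
}

class ConfigFieldDemo {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("greeting", "hi");
    System.out.println(new GreetingService(props).greeting()); // prints "hi"
  }
}
```

If the field in the real class is not `final`, a missing assignment compiles cleanly and only surfaces as a `NullPointerException` in a subclass, which may be why the removal went unnoticed.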
[jira] [Work logged] (GOBBLIN-759) DistCP files modified in last n days within a look back period
[ https://issues.apache.org/jira/browse/GOBBLIN-759?focusedWorklogId=259196&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259196 ]

ASF GitHub Bot logged work on GOBBLIN-759:
--
Author: ASF GitHub Bot
Created on: 13/Jun/19 00:52
Start Date: 13/Jun/19 00:52
Worklog Time Spent: 10m
Work Description: amarnathkarthik commented on issue #2633: GOBBLIN-759: Added feature to support DistCP to copy files that were …
URL: https://github.com/apache/incubator-gobblin/pull/2633#issuecomment-501507352

@jhsenjaliya Pushed the changes, please review

Issue Time Tracking
---
Worklog Id: (was: 259196)
Time Spent: 3h 10m (was: 3h)

> DistCP files modified in last n days within a look back period
> --
>
> Key: GOBBLIN-759
> URL: https://issues.apache.org/jira/browse/GOBBLIN-759
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Karthik Amarnath
> Priority: Major
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> *Feature Request:*
> # DistCP only the files modified in last n days within the look back window.
> # DistCP will copy only the files modified even when the source file which were NOT modified in last n days in the destination directory.
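The first item of the feature request can be sketched as a two-condition filter. This is only one reading of the request, not the code in PR #2633: it assumes that each file belongs to a dated partition, that the look-back window applies to the partition date, and that "modified in last n days" applies to the file's modification time. `FileEntry`, `RecentlyModifiedFilter`, and all parameter names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical description of a candidate source file.
class FileEntry {
  final String path;
  final Instant partitionTime; // assumed: the date the containing partition represents
  final Instant modifiedTime;  // assumed: the file's last-modified timestamp

  FileEntry(String path, Instant partitionTime, Instant modifiedTime) {
    this.path = path;
    this.partitionTime = partitionTime;
    this.modifiedTime = modifiedTime;
  }
}

class RecentlyModifiedFilter {
  /**
   * Returns the paths of files whose partition lies inside the look-back
   * window and whose modification time is no older than modifiedWithin.
   */
  static List<String> select(List<FileEntry> files, Instant now,
      Duration lookBack, Duration modifiedWithin) {
    Instant windowStart = now.minus(lookBack);
    Instant modifiedCutoff = now.minus(modifiedWithin);
    List<String> selected = new ArrayList<>();
    for (FileEntry file : files) {
      boolean inWindow = !file.partitionTime.isBefore(windowStart)
          && !file.partitionTime.isAfter(now);
      boolean recentlyModified = !file.modifiedTime.isBefore(modifiedCutoff);
      if (inWindow && recentlyModified) {
        selected.add(file.path);
      }
    }
    return selected;
  }
}
```

With a 7-day look-back and a 2-day modification limit, a file in a 3-day-old partition that was rewritten yesterday would be selected, while an untouched file in the same partition would be skipped.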
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=259003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259003 ]

ASF GitHub Bot logged work on GOBBLIN-800:
--
Author: ASF GitHub Bot
Created on: 12/Jun/19 19:09
Start Date: 12/Jun/19 19:09
Worklog Time Spent: 10m
Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry
URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r293072861

##
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/ForkMetrics.java
##
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.gobblin.runtime.util;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+
+import com.google.common.collect.ImmutableList;
+
+import org.apache.gobblin.configuration.ConfigurationKeys;
+import org.apache.gobblin.configuration.State;
+import org.apache.gobblin.metrics.GobblinMetrics;
+import org.apache.gobblin.metrics.MetricContext;
+import org.apache.gobblin.metrics.Tag;
+import org.apache.gobblin.runtime.TaskState;
+import org.apache.gobblin.runtime.fork.Fork;
+
+/**
+ * An extension to {@link GobblinMetrics} specifically for {@link Fork}.
+ */
+public class ForkMetrics extends GobblinMetrics {
+  private static final String FORK_METRICS_BRANCH_NAME_KEY = "forkBranchName";
+
+  protected ForkMetrics(TaskState taskState, int index) {
+    super(name(taskState, index), parentContextForFork(taskState), getForkMetricsTags(taskState, index));
+  }
+
+  private static MetricContext parentContextForFork(TaskState taskState) {
+    return TaskMetrics.get(METRICS_ID_PREFIX + taskState.getJobId() + "." + taskState.getTaskId()).getMetricContext();
+  }
+
+  public static ForkMetrics get(final TaskState taskState, int index) {
+    return (ForkMetrics) GOBBLIN_METRICS_REGISTRY.getOrDefault(name(taskState, index), new Callable<GobblinMetrics>() {
+      @Override
+      public GobblinMetrics call() throws Exception {
+        return new ForkMetrics(taskState, index);
+      }
+    });
+  }
+
+  /**
+   * Creates a unique {@link String} representing this branch.
+   */
+  private static String getForkMetricsId(State state, int index) {
+    return state.getProp(ConfigurationKeys.FORK_BRANCH_NAME_KEY + "." + index,
+        ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + index);
+  }
+
+  /**
+   * Creates a {@link List} of {@link Tag}s for a {@link Fork} instance. The {@link Tag}s are purely based on the
+   * index and the branch name.
+   */
+  private static List<Tag<?>> getForkMetricsTags(State state, int index) {
+    return ImmutableList.<Tag<?>>of(new Tag<>(FORK_METRICS_BRANCH_NAME_KEY, getForkMetricsId(state, index)));

Review comment:
> Minor comments. LGTM. One question: can we enhance the ClusterIntegrationTest to verify that metrics objects are getting removed from the metrics registry?

@sv2000 I have added the metrics cleanup validation in ClusterIntegrationTest. Please review.

Issue Time Tracking
---
Worklog Id: (was: 259003)
Time Spent: 1h 40m (was: 1.5h)

> Remove the metric context cache from GobblinMetricsRegistry
> ---
>
> Key: GOBBLIN-800
> URL: https://issues.apache.org/jira/browse/GOBBLIN-800
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Kuai Yu
> Priority: Major
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> Remove the metric context cache from GobblinMetricsRegistry
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=259002&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259002 ]

ASF GitHub Bot logged work on GOBBLIN-800:
--
Author: ASF GitHub Bot
Created on: 12/Jun/19 19:08
Start Date: 12/Jun/19 19:08
Worklog Time Spent: 10m
Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry
URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r293072511

##
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java
##
@@ -451,13 +452,26 @@ private Task createTaskRunnable(WorkUnitState workUnitState, CountDownLatch coun
   public void runAndOptionallyCommitTaskAttempt(CommitPolicy multiTaskAttemptCommitPolicy)
       throws IOException, InterruptedException {
-    run();
-    if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) {
-      this.log.info("Will commit tasks directly.");
-      commit();
-    } else if (!isSpeculativeExecutionSafe()) {
-      throw new RuntimeException(
-          "Speculative execution is enabled. However, the task context is not safe for speculative execution.");
+    try {
+      run();
+      if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) {
+        this.log.info("Will commit tasks directly.");
+        commit();
+      } else if (!isSpeculativeExecutionSafe()) {
+        throw new RuntimeException(
+            "Speculative execution is enabled. However, the task context is not safe for speculative execution.");
+      }
+    } finally {
+      // During the task execution, the fork/task instances will create metric contexts (fork, task, job, container)
+      // along the hierarchy up to the root metric context. Although root metric context has a weak reference to
+      // those metric contexts, they are meanwhile cached by GobblinMetricsRegistry. Here we will remove all those
+      // strong reference from the cache to make sure it can be reclaimed by Java GC when JVM has run out of memory.
+
+      this.tasks.forEach(task-> {
+        TaskMetrics.remove(task);
+      });
+
+      JobMetrics.remove(GobblinMetrics.METRICS_ID_PREFIX + jobState.getJobId());

Review comment:
@ibuenros Please check my latest change where a creatorTag was added to make sure the JobMetrics can be removed by the correct owner.

Issue Time Tracking
---
Worklog Id: (was: 259002)
Time Spent: 1.5h (was: 1h 20m)

> Remove the metric context cache from GobblinMetricsRegistry
> ---
>
> Key: GOBBLIN-800
> URL: https://issues.apache.org/jira/browse/GOBBLIN-800
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Kuai Yu
> Priority: Major
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Remove the metric context cache from GobblinMetricsRegistry
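The cleanup pattern discussed in this review thread (a registry that holds strong references, with entries removed in a `finally` block so they become eligible for garbage collection) can be sketched in isolation. `SimpleMetricsRegistry` and `TaskAttemptSketch` are hypothetical stand-ins written for this illustration; they are not Gobblin's `GobblinMetricsRegistry`, `TaskMetrics`, or `JobMetrics`, and the `"metrics."` id prefix is an assumption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry: the map holds strong references, so an entry
// stays reachable until it is explicitly removed.
class SimpleMetricsRegistry {
  private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

  // Get-or-create: the factory runs only if the id is not cached yet.
  Object getOrCreate(String id, Callable<Object> factory) {
    return cache.computeIfAbsent(id, key -> {
      try {
        return factory.call();
      } catch (Exception e) {
        throw new IllegalStateException("Could not create metrics for " + key, e);
      }
    });
  }

  void remove(String id) {
    cache.remove(id);
  }

  int size() {
    return cache.size();
  }
}

class TaskAttemptSketch {
  // Runs the work and always drops the cached entry afterwards, whether
  // the work completed normally or threw.
  static void runAttempt(SimpleMetricsRegistry registry, String jobId, Runnable work) {
    String metricsId = "metrics." + jobId; // assumed id scheme
    try {
      registry.getOrCreate(metricsId, Object::new);
      work.run();
    } finally {
      registry.remove(metricsId);
    }
  }
}
```

A check in the spirit of the validation requested for ClusterIntegrationTest would assert that `registry.size()` is back to zero after the attempt, including when the work throws.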
[GitHub] [incubator-gobblin] autumnust commented on issue #2658: [GOBBLIN-793] Separate SpecSerDe from SpecCatalogs and add GsonSpecSerDe
autumnust commented on issue #2658: [GOBBLIN-793] Separate SpecSerDe from SpecCatalogs and add GsonSpecSerDe
URL: https://github.com/apache/incubator-gobblin/pull/2658#issuecomment-501379152

+1 LGTM.
[jira] [Work logged] (GOBBLIN-802) register Gauge in innerMetricsContext
[ https://issues.apache.org/jira/browse/GOBBLIN-802?focusedWorklogId=258880&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-258880 ]

ASF GitHub Bot logged work on GOBBLIN-802:
--
Author: ASF GitHub Bot
Created on: 12/Jun/19 16:52
Start Date: 12/Jun/19 16:52
Worklog Time Spent: 10m
Work Description: asfgit commented on pull request #2668: [GOBBLIN-802] change gauge metrics context to RootMetricsContext
URL: https://github.com/apache/incubator-gobblin/pull/2668

Issue Time Tracking
---
Worklog Id: (was: 258880)
Time Spent: 0.5h (was: 20m)

> register Gauge in innerMetricsContext
> -
>
> Key: GOBBLIN-802
> URL: https://issues.apache.org/jira/browse/GOBBLIN-802
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Arjun Singh Bora
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
[GitHub] [incubator-gobblin] asfgit closed pull request #2668: [GOBBLIN-802] change gauge metrics context to RootMetricsContext
asfgit closed pull request #2668: [GOBBLIN-802] change gauge metrics context to RootMetricsContext
URL: https://github.com/apache/incubator-gobblin/pull/2668