[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=260580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-260580 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 14/Jun/19 17:15 Start Date: 14/Jun/19 17:15 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 260580) Time Spent: 2h 10m (was: 2h) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Fix For: 0.15.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=259968=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259968 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 13/Jun/19 22:32 Start Date: 13/Jun/19 22:32 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r293601532 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/JobMetrics.java ## @@ -36,67 +40,112 @@ * * @author Yinan Li */ +@Slf4j public class JobMetrics extends GobblinMetrics { + public static final CreatorTag DEFAULT_CREATOR_TAG = new CreatorTag( "driver"); protected final String jobName; - - protected JobMetrics(JobState job) { -this(job, null); + @Getter + protected final CreatorTag creatorTag; + protected JobMetrics(JobState job, CreatorTag tag) { +this(job, null, tag); } - protected JobMetrics(JobState job, MetricContext parentContext) { + protected JobMetrics(JobState job, MetricContext parentContext, CreatorTag creatorTag) { super(name(job), parentContext, tagsForJob(job)); this.jobName = job.getJobName(); +this.creatorTag = creatorTag; + } + + public static class CreatorTag extends Tag { +public CreatorTag(String value) { + super("creator", value); +} } /** * Get a new {@link GobblinMetrics} instance for a given job. + * This method has been deprecated. Please consider to use {@link JobMetrics#get(String, String, CreatorTag)} * * @param jobName job name * @param jobId job ID * @return a new {@link GobblinMetrics} instance for the given job */ + @Deprecated public static JobMetrics get(String jobName, String jobId) { -return get(new JobState(jobName, jobId)); +return get(new JobState(jobName, jobId), DEFAULT_CREATOR_TAG); } /** * Get a new {@link GobblinMetrics} instance for a given job. * + * @param creatorTag the unique id which can tell who initiates this get operation + * @param jobName job name * @param jobId job ID + * @param creatorTag who creates this job metrics * @return a new {@link GobblinMetrics} instance for the given job */ - public static JobMetrics get(String jobId) { -return get(null, jobId); + public static JobMetrics get(String jobName, String jobId, CreatorTag creatorTag) { +return get(new JobState(jobName, jobId), creatorTag); } /** * Get a new {@link GobblinMetrics} instance for a given job. * + * @param creatorTag the unique id which can tell who initiates this get operation + * @param jobId job ID + * @return a new {@link GobblinMetrics} instance for the given job + */ + public static JobMetrics get(String jobId, CreatorTag creatorTag) { +return get(null, jobId, creatorTag); + } + + /** + * Get a new {@link GobblinMetrics} instance for a given job. + * + * @param creatorTag the unique id which can tell who initiates this get operation * @param jobState the given {@link JobState} instance * @param parentContext is the parent {@link MetricContext} * @return a {@link JobMetrics} instance */ - public static JobMetrics get(final JobState jobState, final MetricContext parentContext) { + public static JobMetrics get(final JobState jobState, final MetricContext parentContext, CreatorTag creatorTag) { return (JobMetrics) GOBBLIN_METRICS_REGISTRY.getOrDefault(name(jobState), new Callable() { @Override public GobblinMetrics call() throws Exception { -return new JobMetrics(jobState, parentContext); +return new JobMetrics(jobState, parentContext, creatorTag); } }); } /** * Get a {@link JobMetrics} instance for the job with the given {@link JobState} instance. + * This method has been deprecated. Please consider to use {@link JobMetrics#get(JobState, CreatorTag)}. Review comment: Use `@deprecated` instead. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 259968) Time Spent: 1h 50m (was: 1h 40m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter:
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=259969=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259969 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 13/Jun/19 22:32 Start Date: 13/Jun/19 22:32 Worklog Time Spent: 10m Work Description: ibuenros commented on issue #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#issuecomment-501902393 +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 259969) Time Spent: 2h (was: 1h 50m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=259003=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259003 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 12/Jun/19 19:09 Start Date: 12/Jun/19 19:09 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r293072861 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/ForkMetrics.java ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.gobblin.runtime.util; + +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableList; + +import org.apache.gobblin.configuration.ConfigurationKeys; +import org.apache.gobblin.configuration.State; +import org.apache.gobblin.metrics.GobblinMetrics; +import org.apache.gobblin.metrics.MetricContext; +import org.apache.gobblin.metrics.Tag; +import org.apache.gobblin.runtime.TaskState; +import org.apache.gobblin.runtime.fork.Fork; + +/** + * An extension to {@link GobblinMetrics} specifically for {@link Fork}. + */ +public class ForkMetrics extends GobblinMetrics { + private static final String FORK_METRICS_BRANCH_NAME_KEY = "forkBranchName"; + + protected ForkMetrics(TaskState taskState, int index) { +super(name(taskState, index), parentContextForFork(taskState), getForkMetricsTags(taskState, index)); + } + + private static MetricContext parentContextForFork(TaskState taskState) { +return TaskMetrics.get(METRICS_ID_PREFIX + taskState.getJobId() + "." + taskState.getTaskId()).getMetricContext(); + } + + public static ForkMetrics get(final TaskState taskState, int index) { +return (ForkMetrics) GOBBLIN_METRICS_REGISTRY.getOrDefault(name(taskState, index), new Callable() { + @Override + public GobblinMetrics call() throws Exception { +return new ForkMetrics(taskState, index); + } +}); + } + + /** + * Creates a unique {@link String} representing this branch. + */ + private static String getForkMetricsId(State state, int index) { +return state.getProp(ConfigurationKeys.FORK_BRANCH_NAME_KEY + "." + index, +ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + index); + } + + /** + * Creates a {@link List} of {@link Tag}s for a {@link Fork} instance. The {@link Tag}s are purely based on the + * index and the branch name. + */ + private static List> getForkMetricsTags(State state, int index) { +return ImmutableList.>of(new Tag<>(FORK_METRICS_BRANCH_NAME_KEY, getForkMetricsId(state, index))); Review comment: > Minor comments. LGTM. One question: can we enhance the ClusterIntegrationTest to verify that metrics objects are getting removed from the metrics registry? @sv2000 I have added the metrics cleanup validation in ClusterIntegrationTest. Please review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 259003) Time Spent: 1h 40m (was: 1.5h) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=259002=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-259002 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 12/Jun/19 19:08 Start Date: 12/Jun/19 19:08 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r293072511 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java ## @@ -451,13 +452,26 @@ private Task createTaskRunnable(WorkUnitState workUnitState, CountDownLatch coun public void runAndOptionallyCommitTaskAttempt(CommitPolicy multiTaskAttemptCommitPolicy) throws IOException, InterruptedException { -run(); -if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { - this.log.info("Will commit tasks directly."); - commit(); -} else if (!isSpeculativeExecutionSafe()) { - throw new RuntimeException( - "Speculative execution is enabled. However, the task context is not safe for speculative execution."); +try { + run(); + if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { +this.log.info("Will commit tasks directly."); +commit(); + } else if (!isSpeculativeExecutionSafe()) { +throw new RuntimeException( +"Speculative execution is enabled. However, the task context is not safe for speculative execution."); + } +} finally { + // During the task execution, the fork/task instances will create metric contexts (fork, task, job, container) + // along the hierarchy up to the root metric context. Although root metric context has a weak reference to + // those metric contexts, they are meanwhile cached by GobblinMetricsRegistry. Here we will remove all those + // strong reference from the cache to make sure it can be reclaimed by Java GC when JVM has run out of memory. + + this.tasks.forEach(task-> { +TaskMetrics.remove(task); + }); + + JobMetrics.remove(GobblinMetrics.METRICS_ID_PREFIX + jobState.getJobId()); Review comment: @ibuenros Please check my latest change where a creatorTag was added to make sure the JobMetrics can be removed by the correct owner. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 259002) Time Spent: 1.5h (was: 1h 20m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=257296=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257296 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 10/Jun/19 23:50 Start Date: 10/Jun/19 23:50 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r292232789 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/ForkMetrics.java ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.gobblin.runtime.util; + +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableList; + +import org.apache.gobblin.configuration.ConfigurationKeys; +import org.apache.gobblin.configuration.State; +import org.apache.gobblin.metrics.GobblinMetrics; +import org.apache.gobblin.metrics.MetricContext; +import org.apache.gobblin.metrics.Tag; +import org.apache.gobblin.runtime.TaskState; +import org.apache.gobblin.runtime.fork.Fork; + +/** + * An extension to {@link GobblinMetrics} specifically for {@link Fork}. + */ +public class ForkMetrics extends GobblinMetrics { + private static final String FORK_METRICS_BRANCH_NAME_KEY = "forkBranchName"; + + protected ForkMetrics(TaskState taskState, int index) { +super(name(taskState, index), parentContextForFork(taskState), getForkMetricsTags(taskState, index)); + } + + private static MetricContext parentContextForFork(TaskState taskState) { +return TaskMetrics.get(METRICS_ID_PREFIX + taskState.getJobId() + "." + taskState.getTaskId()).getMetricContext(); + } + + public static ForkMetrics get(final TaskState taskState, int index) { +return (ForkMetrics) GOBBLIN_METRICS_REGISTRY.getOrDefault(name(taskState, index), new Callable() { + @Override + public GobblinMetrics call() throws Exception { +return new ForkMetrics(taskState, index); + } +}); + } + + /** + * Creates a unique {@link String} representing this branch. + */ + private static String getForkMetricsId(State state, int index) { +return state.getProp(ConfigurationKeys.FORK_BRANCH_NAME_KEY + "." + index, +ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + index); + } + + /** + * Creates a {@link List} of {@link Tag}s for a {@link Fork} instance. The {@link Tag}s are purely based on the + * index and the branch name. + */ + private static List> getForkMetricsTags(State state, int index) { +return ImmutableList.>of(new Tag<>(FORK_METRICS_BRANCH_NAME_KEY, getForkMetricsId(state, index))); Review comment: The getForkMetricsId never returns null, because it has a default value. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 257296) Time Spent: 1h 10m (was: 1h) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=257297=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257297 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 10/Jun/19 23:50 Start Date: 10/Jun/19 23:50 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r292213096 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java ## @@ -451,13 +452,26 @@ private Task createTaskRunnable(WorkUnitState workUnitState, CountDownLatch coun public void runAndOptionallyCommitTaskAttempt(CommitPolicy multiTaskAttemptCommitPolicy) throws IOException, InterruptedException { -run(); -if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { - this.log.info("Will commit tasks directly."); - commit(); -} else if (!isSpeculativeExecutionSafe()) { - throw new RuntimeException( - "Speculative execution is enabled. However, the task context is not safe for speculative execution."); +try { + run(); + if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { +this.log.info("Will commit tasks directly."); +commit(); + } else if (!isSpeculativeExecutionSafe()) { +throw new RuntimeException( +"Speculative execution is enabled. However, the task context is not safe for speculative execution."); + } +} finally { + // During the task execution, the fork/task instances will create metric contexts (fork, task, job, container) + // along the hierarchy up to the root metric context. Although root metric context has a weak reference to + // those metric contexts, they are meanwhile cached by GobblinMetricsRegistry. Here we will remove all those + // strong reference from the cache to make sure it can be reclaimed by Java GC when JVM has run out of memory. + + this.tasks.forEach(task-> { +TaskMetrics.remove(task); + }); + + JobMetrics.remove(GobblinMetrics.METRICS_ID_PREFIX + jobState.getJobId()); Review comment: Most of our Gobblin code runs in distributed mode, like MRJobLauncher, HelixJobLauncher, etc. I will move this logic outside of GobblinMultiTaskAttempt to make sure it doesn't break LocalJobLauncher. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 257297) Time Spent: 1h 20m (was: 1h 10m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=257245=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257245 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 10/Jun/19 22:16 Start Date: 10/Jun/19 22:16 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r292213096 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java ## @@ -451,13 +452,26 @@ private Task createTaskRunnable(WorkUnitState workUnitState, CountDownLatch coun public void runAndOptionallyCommitTaskAttempt(CommitPolicy multiTaskAttemptCommitPolicy) throws IOException, InterruptedException { -run(); -if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { - this.log.info("Will commit tasks directly."); - commit(); -} else if (!isSpeculativeExecutionSafe()) { - throw new RuntimeException( - "Speculative execution is enabled. However, the task context is not safe for speculative execution."); +try { + run(); + if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { +this.log.info("Will commit tasks directly."); +commit(); + } else if (!isSpeculativeExecutionSafe()) { +throw new RuntimeException( +"Speculative execution is enabled. However, the task context is not safe for speculative execution."); + } +} finally { + // During the task execution, the fork/task instances will create metric contexts (fork, task, job, container) + // along the hierarchy up to the root metric context. Although root metric context has a weak reference to + // those metric contexts, they are meanwhile cached by GobblinMetricsRegistry. Here we will remove all those + // strong reference from the cache to make sure it can be reclaimed by Java GC when JVM has run out of memory. + + this.tasks.forEach(task-> { +TaskMetrics.remove(task); + }); + + JobMetrics.remove(GobblinMetrics.METRICS_ID_PREFIX + jobState.getJobId()); Review comment: Most of our Gobblin code runs in distributed mode, like MRJobLauncher, HelixJobLauncher, etc. I will add the defensive check to make sure that this GobblinMultiTaskAttempt was actually in the separate process. But even for LocalJobLauncher, GobblinMetricsRegistry is actually a cache, there is no guarantee that JobMetrics cannot be evicted if TaskMetrics is still alive. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 257245) Time Spent: 1h (was: 50m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=257239=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257239 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 10/Jun/19 22:10 Start Date: 10/Jun/19 22:10 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r292211586 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/ForkMetrics.java ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.gobblin.runtime.util; + +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableList; + +import org.apache.gobblin.configuration.ConfigurationKeys; +import org.apache.gobblin.configuration.State; +import org.apache.gobblin.metrics.GobblinMetrics; +import org.apache.gobblin.metrics.MetricContext; +import org.apache.gobblin.metrics.Tag; +import org.apache.gobblin.runtime.TaskState; +import org.apache.gobblin.runtime.fork.Fork; + +/** + * An extension to {@link GobblinMetrics} specifically for {@link Fork}. + */ +public class ForkMetrics extends GobblinMetrics { Review comment: This is just refactoring This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 257239) Time Spent: 50m (was: 40m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=257236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257236 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 10/Jun/19 21:55 Start Date: 10/Jun/19 21:55 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r292207566 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/ForkMetrics.java ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.gobblin.runtime.util; + +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableList; + +import org.apache.gobblin.configuration.ConfigurationKeys; +import org.apache.gobblin.configuration.State; +import org.apache.gobblin.metrics.GobblinMetrics; +import org.apache.gobblin.metrics.MetricContext; +import org.apache.gobblin.metrics.Tag; +import org.apache.gobblin.runtime.TaskState; +import org.apache.gobblin.runtime.fork.Fork; + +/** + * An extension to {@link GobblinMetrics} specifically for {@link Fork}. + */ +public class ForkMetrics extends GobblinMetrics { Review comment: Is this just refactoring or is different logic being added? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 257236) Time Spent: 40m (was: 0.5h) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=256229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256229 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 07/Jun/19 21:23 Start Date: 07/Jun/19 21:23 Worklog Time Spent: 10m Work Description: sv2000 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r291757745 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/ForkMetrics.java ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + + +package org.apache.gobblin.runtime.util; + +import java.util.List; +import java.util.concurrent.Callable; + +import com.google.common.collect.ImmutableList; + +import org.apache.gobblin.configuration.ConfigurationKeys; +import org.apache.gobblin.configuration.State; +import org.apache.gobblin.metrics.GobblinMetrics; +import org.apache.gobblin.metrics.MetricContext; +import org.apache.gobblin.metrics.Tag; +import org.apache.gobblin.runtime.TaskState; +import org.apache.gobblin.runtime.fork.Fork; + +/** + * An extension to {@link GobblinMetrics} specifically for {@link Fork}. + */ +public class ForkMetrics extends GobblinMetrics { + private static final String FORK_METRICS_BRANCH_NAME_KEY = "forkBranchName"; + + protected ForkMetrics(TaskState taskState, int index) { +super(name(taskState, index), parentContextForFork(taskState), getForkMetricsTags(taskState, index)); + } + + private static MetricContext parentContextForFork(TaskState taskState) { +return TaskMetrics.get(METRICS_ID_PREFIX + taskState.getJobId() + "." + taskState.getTaskId()).getMetricContext(); + } + + public static ForkMetrics get(final TaskState taskState, int index) { +return (ForkMetrics) GOBBLIN_METRICS_REGISTRY.getOrDefault(name(taskState, index), new Callable() { + @Override + public GobblinMetrics call() throws Exception { +return new ForkMetrics(taskState, index); + } +}); + } + + /** + * Creates a unique {@link String} representing this branch. + */ + private static String getForkMetricsId(State state, int index) { +return state.getProp(ConfigurationKeys.FORK_BRANCH_NAME_KEY + "." + index, +ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + index); + } + + /** + * Creates a {@link List} of {@link Tag}s for a {@link Fork} instance. The {@link Tag}s are purely based on the + * index and the branch name. + */ + private static List> getForkMetricsTags(State state, int index) { +return ImmutableList.>of(new Tag<>(FORK_METRICS_BRANCH_NAME_KEY, getForkMetricsId(state, index))); Review comment: Maybe add a null check here i.e. if (getForkMetricsId(state,index) != null)? Else, ImmutableList.of() may throw an exception. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 256229) Time Spent: 0.5h (was: 20m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=256228=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256228 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 07/Jun/19 21:23 Start Date: 07/Jun/19 21:23 Worklog Time Spent: 10m Work Description: sv2000 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r291756656 ## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/TaskMetrics.java ## @@ -61,15 +62,35 @@ public GobblinMetrics call() throws Exception { /** * Remove the {@link TaskMetrics} instance for the task with the given {@link TaskMetrics} instance. + * Please note this method is invoked by job driver so it won't delete any underlying {@link ForkMetrics} + * because the {@link org.apache.gobblin.runtime.fork.Fork} can be created on different nodes. * * @param taskState the given {@link TaskState} instance */ public static void remove(TaskState taskState) { remove(name(taskState)); } + /** + * Remove the {@link TaskMetrics} instance for the task with the given {@link TaskMetrics} instance. + * Please note that this will also delete the underlying {@link ForkMetrics} related to this specific task. + * + * @param task the given task instance + */ + public static void remove(Task task) { +task.getForks().forEach(forkOpt -> { + remove(ForkMetrics.name(task.getTaskState(), forkOpt.get().getIndex())); +}); + +remove(name(task)); + } + private static String name(TaskState taskState) { -return "gobblin.metrics." + taskState.getJobId() + "." + taskState.getTaskId(); +return METRICS_ID_PREFIX + taskState.getJobId() + "." + taskState.getTaskId(); + } + + private static String name(Task task) { +return METRICS_ID_PREFIX + task.getJobId() + "." + task.getTaskId(); Review comment: return name(task.getTaskState())? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 256228) Time Spent: 0.5h (was: 20m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-800) Remove the metric context cache from GobblinMetricsRegistry
[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=256152=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256152 ] ASF GitHub Bot logged work on GOBBLIN-800: -- Author: ASF GitHub Bot Created on: 07/Jun/19 18:53 Start Date: 07/Jun/19 18:53 Worklog Time Spent: 10m Work Description: yukuai518 commented on issue #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#issuecomment-49595 @sv2000 can you help review? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 256152) Time Spent: 20m (was: 10m) > Remove the metric context cache from GobblinMetricsRegistry > --- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Kuai Yu >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)