[ https://issues.apache.org/jira/browse/GOBBLIN-800?focusedWorklogId=257245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257245 ]
ASF GitHub Bot logged work on GOBBLIN-800: ------------------------------------------ Author: ASF GitHub Bot Created on: 10/Jun/19 22:16 Start Date: 10/Jun/19 22:16 Worklog Time Spent: 10m Work Description: yukuai518 commented on pull request #2667: [GOBBLIN-800] Remove the metric context cache from GobblinMetricsRegistry URL: https://github.com/apache/incubator-gobblin/pull/2667#discussion_r292213096 ########## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java ########## @@ -451,13 +452,26 @@ private Task createTaskRunnable(WorkUnitState workUnitState, CountDownLatch coun public void runAndOptionallyCommitTaskAttempt(CommitPolicy multiTaskAttemptCommitPolicy) throws IOException, InterruptedException { - run(); - if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { - this.log.info("Will commit tasks directly."); - commit(); - } else if (!isSpeculativeExecutionSafe()) { - throw new RuntimeException( - "Speculative execution is enabled. However, the task context is not safe for speculative execution."); + try { + run(); + if (multiTaskAttemptCommitPolicy.equals(GobblinMultiTaskAttempt.CommitPolicy.IMMEDIATE)) { + this.log.info("Will commit tasks directly."); + commit(); + } else if (!isSpeculativeExecutionSafe()) { + throw new RuntimeException( + "Speculative execution is enabled. However, the task context is not safe for speculative execution."); + } + } finally { + // During the task execution, the fork/task instances will create metric contexts (fork, task, job, container) + // along the hierarchy up to the root metric context. Although root metric context has a weak reference to + // those metric contexts, they are meanwhile cached by GobblinMetricsRegistry. Here we will remove all those + // strong reference from the cache to make sure it can be reclaimed by Java GC when JVM has run out of memory. + + this.tasks.forEach(task-> { + TaskMetrics.remove(task); + }); + + JobMetrics.remove(GobblinMetrics.METRICS_ID_PREFIX + jobState.getJobId()); Review comment: Most of our Gobblin code runs in distributed mode, like MRJobLauncher, HelixJobLauncher, etc. I will add the defensive check to make sure that this GobblinMultiTaskAttempt was actually in the separate process. But even for LocalJobLauncher, GobblinMetricsRegistry is actually a cache, there is no guarantee that JobMetrics cannot be evicted if TaskMetrics is still alive. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 257245) Time Spent: 1h (was: 50m) > Remove the metric context cache from GobblinMetricsRegistry > ----------------------------------------------------------- > > Key: GOBBLIN-800 > URL: https://issues.apache.org/jira/browse/GOBBLIN-800 > Project: Apache Gobblin > Issue Type: Bug > Reporter: Kuai Yu > Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Remove the metric context cache from GobblinMetricsRegistry -- This message was sent by Atlassian JIRA (v7.6.3#76005)