[ 
https://issues.apache.org/jira/browse/BEAM-9474?focusedWorklogId=400426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-400426
 ]

ASF GitHub Bot logged work on BEAM-9474:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Mar/20 22:49
            Start Date: 09/Mar/20 22:49
    Worklog Time Spent: 10m 
      Work Description: mxm commented on pull request #11084: [BEAM-9474] 
Improve robustness of BundleFactory and ProcessEnvironment
URL: https://github.com/apache/beam/pull/11084#discussion_r390003729
 
 

 ##########
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java
 ##########
 @@ -166,11 +168,20 @@ public static DefaultJobBundleFactory create(
         CacheBuilder.newBuilder()
             .removalListener(
                 (RemovalNotification<Environment, WrappedSdkHarnessClient> notification) -> {
-                  int refCount = notification.getValue().unref();
-                  LOG.debug(
-                      "Removed environment {} with {} remaining bundle references.",
-                      notification.getKey(),
-                      refCount);
+                  WrappedSdkHarnessClient client = notification.getValue();
 
 Review comment:
   It doesn't work, though, if we do not ensure dereferencing under all 
circumstances. We need a safeguard here, also considering that other runners may 
not dereference correctly. Generally, it is hard to guarantee dereferencing 
because of the nesting of DoFnRunners, which may not even allow closing the 
bundle in error cases. I considered not doing this, but I think it is the safer 
route.
   
   If you take a step back, when would the reference counting really be useful? 
Every restarted job will run in a new classloader anyway, so the environment 
will never be recycled. When we call close, we should tear down everything. 
   
   Taking another step back, the reference counting should really be removed 
entirely. It was error-prone from the beginning, leading to subtle problems with 
dereferencing. If you don't mind, I'd remove it. What do you think?
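   
   To make the safeguard concrete, here is a minimal sketch in plain Java. It 
is not the actual Beam code: RefCountedClient, ref(), isClosed() and 
onRemoval() are hypothetical names standing in for WrappedSdkHarnessClient and 
the cache's removal listener, and the initial reference count of zero is an 
assumption. The idea is that eviction closes the environment unconditionally 
and that close is idempotent, so a late unref cannot close it a second time.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for WrappedSdkHarnessClient; not the actual Beam class. */
final class RefCountedClient implements AutoCloseable {
  // Number of bundles that currently hold a reference (assumed to start at zero).
  private final AtomicInteger bundleRefCount = new AtomicInteger(0);
  // Guards against closing the underlying environment more than once.
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private final AutoCloseable environment;

  RefCountedClient(AutoCloseable environment) {
    this.environment = environment;
  }

  /** Registers one more bundle reference and returns the new count. */
  int ref() {
    return bundleRefCount.incrementAndGet();
  }

  /** Releases one reference; closes the environment when none remain. */
  int unref() throws Exception {
    int remaining = bundleRefCount.decrementAndGet();
    if (remaining <= 0) {
      close();
    }
    return remaining;
  }

  /** Idempotent: only the first call closes the underlying environment. */
  @Override
  public void close() throws Exception {
    if (closed.compareAndSet(false, true)) {
      environment.close();
    }
  }

  boolean isClosed() {
    return closed.get();
  }

  /**
   * Safeguard for cache eviction: close even if bundles never dereferenced.
   * Returns the number of references that were still outstanding.
   */
  static int onRemoval(RefCountedClient client) throws Exception {
    int outstanding = client.bundleRefCount.get();
    client.close();
    return outstanding;
  }
}
```

   With this shape, a runner that forgets to dereference no longer leaks the 
environment, and one that dereferences late does no harm. It also shows why the 
counter adds little once eviction always closes.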
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 400426)
    Time Spent: 2h 20m  (was: 2h 10m)

> Environment cleanup is not robust enough and may leak resources
> ---------------------------------------------------------------
>
>                 Key: BEAM-9474
>                 URL: https://issues.apache.org/jira/browse/BEAM-9474
>             Project: Beam
>          Issue Type: Bug
>          Components: java-fn-execution
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Major
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The cleanup code in {{DefaultJobBundleFactory}} and its {{RemoteEnvironment}}s 
> may leak resources. This is especially a concern when the execution engine 
> reuses the same JVM or underlying machines for multiple runs of a pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
