[ 
https://issues.apache.org/jira/browse/BEAM-7750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-7750:
----------------------------------

This Jira ticket has a pull request attached to it, but is still open. Did the 
pull request resolve the issue? If so, could you please mark it resolved? This 
will help the project have a clear view of its open issues.

> Pipeline instances are not garbage collected
> --------------------------------------------
>
>                 Key: BEAM-7750
>                 URL: https://issues.apache.org/jira/browse/BEAM-7750
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.14.0
>         Environment: OS: Debian rodete.
> Tested using: 
> Beam versions: 2.13.0, 2.15.0.dev
> Python versions: Python 2.7, Python 3.7.
> Runners:  DirectRunner, DataflowRunner.
>            Reporter: Alexey Strokach
>            Priority: P3
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> It seems that Apache Beam's Pipeline instances are not garbage collected, 
> even if the pipelines are finished or cancelled and there are no references 
> to those pipelines in the Python interpreter.
> For pipelines executed in a script, this is not a problem. However, for 
> interactive pipelines executed inside a Jupyter notebook, this limits how 
> well we can track and remove the dependencies of those pipelines. For 
> example, if a pipeline reads from some cache, it would be nice to be able to 
> delete that cache once there are no references to it from pipelines or the 
> global namespace.
> The issue can be reproduced using the following script: 
> [https://gist.github.com/ostrokach/a16556dc77c96b87fe23c2fbd8fb6346].
> -----
> On further examination, it turns out that this is caused by the 
> [{{_PubSubReadEvaluator._subscription_cache}}|https://github.com/apache/beam/blob/27bb5bc7b244809e7f6022adb2730d10204ce4d3/sdks/python/apache_beam/runners/direct/transform_evaluator.py#L418]
>  class attribute, which keeps references to all {{ReadFromPubSub}} transforms.
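
For readers unfamiliar with this failure mode, here is a minimal sketch of how a class-level cache can keep objects alive after all other references are gone. It is not Beam's code: Transform, LeakyEvaluator, WeakEvaluator and is_collected_after_use are hypothetical names made up for illustration, and the weak-reference variant is only one possible mitigation, not necessarily the fix applied in the attached pull request.

```python
import gc
import weakref


class Transform:
    """Hypothetical stand-in for a transform such as ReadFromPubSub."""

    def __init__(self, name):
        self.name = name


class LeakyEvaluator:
    # Class attribute: shared by all instances and alive as long as the class.
    _cache = {}

    def __init__(self, transform):
        # The dict key is a strong reference to the transform.
        self._cache[transform] = "subscription-for-" + transform.name


class WeakEvaluator:
    # Assumed mitigation: an entry is dropped once its key is otherwise unreferenced.
    _cache = weakref.WeakKeyDictionary()

    def __init__(self, transform):
        self._cache[transform] = "subscription-for-" + transform.name


def is_collected_after_use(evaluator_cls):
    """Return True if the transform is freed once the caller drops it."""
    transform = Transform("t")
    ref = weakref.ref(transform)
    evaluator_cls(transform)  # the evaluator instance itself is discarded at once
    del transform
    gc.collect()
    return ref() is None


leaky_collected = is_collected_after_use(LeakyEvaluator)
weak_collected = is_collected_after_use(WeakEvaluator)
print(leaky_collected, weak_collected)
```

With the plain dict, the transform survives because the class attribute still references it. With the weak-keyed mapping, the entry goes away together with the last outside reference.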



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
