[ https://issues.apache.org/jira/browse/BEAM-7750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ismaël Mejía updated BEAM-7750:
-------------------------------
Status: Open (was: Triage Needed)
> Pipeline instances are not garbage collected
> --------------------------------------------
>
> Key: BEAM-7750
> URL: https://issues.apache.org/jira/browse/BEAM-7750
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Affects Versions: 2.14.0
> Environment: OS: Debian rodete.
> Tested using:
> Beam versions: 2.13.0, 2.15.0.dev
> Python versions: Python 2.7, Python 3.7.
> Runners: DirectRunner, DataflowRunner.
> Reporter: Alexey Strokach
> Priority: Minor
>
> It seems that Apache Beam's Pipeline instances are not garbage collected,
> even after the pipelines have finished or been cancelled and no references
> to them remain in the Python interpreter.
> For pipelines executed in a script, this is not a problem. However, for
> interactive pipelines executed inside a Jupyter notebook, this limits how
> well we can track and remove the dependencies of those pipelines. For
> example, if a pipeline reads from some cache, it would be nice to be able to
> delete that cache once there are no references to it from pipelines or the
> global namespace.
> The issue can be reproduced using the following script:
> [https://github.com/ostrokach/beam-notebooks/blob/48718038e63342a5f3acc31352a6326fffd34888/scripts/error_pipeline_gc.py]