[ https://issues.apache.org/jira/browse/BEAM-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charles Chen resolved BEAM-3512.
--------------------------------
Resolution: Fixed
Fix Version/s: 2.3.0
> Python PTransform overrides do not completely remove the overridden transform
> -----------------------------------------------------------------------------
>
> Key: BEAM-3512
> URL: https://issues.apache.org/jira/browse/BEAM-3512
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Affects Versions: 2.2.0
> Reporter: Charles Chen
> Assignee: Charles Chen
> Priority: Major
> Fix For: 2.3.0
>
>
> In Python Beam runners, we support PTransformOverrides, which allow runners
> to override the behavior of specific transforms. The override mechanism is
> meant to excise the original transform (along with its composite child
> transforms, if any) from the pipeline graph and graft the replacement onto
> the inputs and outputs of the original transform. However, the mechanism
> does not completely remove the original transform and its children.
> Specifically:
> 1. The composite transform's parts are not removed from the overridden
> AppliedPTransform; we only attempt to remove their labels.
> 2. Labels are not removed at every level of the recursion: a bug in the
> label-removal logic causes only the labels on every other level of the
> nested composite PTransform hierarchy to be removed (see
> [https://github.com/apache/beam/blob/61777b4338733d99f4f858be0d7d0313ec138a06/sdks/python/apache_beam/pipeline.py#L171]).
> 3. Some excised parts of the pipeline still seem to be run: if we set
> "part.transform = None" for each overridden composite part of the pipeline,
> some tests fail with an error indicating that it could not execute the "None"
> transform. This implies that these parts are still somehow present in the
> execution graph.
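
The "every other level" behavior in item 2 can be reproduced with a toy model. The sketch below is not Beam's code: `Node`, `build_chain`, and both removal functions are hypothetical names, and the buggy variant only assumes one plausible shape for such a bug (recursing into a part's children instead of into the part itself).

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """Hypothetical stand-in for an applied transform in a composite tree."""
    label: str
    parts: List["Node"] = field(default_factory=list)


def all_labels(node: Node) -> Set[str]:
    """Collect the labels of a node and all of its nested parts."""
    labels = {node.label}
    for part in node.parts:
        labels |= all_labels(part)
    return labels


def remove_labels_skipping_levels(node: Node, labels: Set[str]) -> None:
    """Assumed bug shape: each part's label is removed, but the recursion
    is started on the part's children, whose own labels are never removed."""
    for part in node.parts:
        labels.discard(part.label)
        for grandchild in part.parts:
            # Only the grandchild's parts are handled by this call;
            # the grandchild's own label is skipped.
            remove_labels_skipping_levels(grandchild, labels)


def remove_labels_recursively(node: Node, labels: Set[str]) -> None:
    """Corrected variant: recurse on each part so no level is skipped."""
    for part in node.parts:
        labels.discard(part.label)
        remove_labels_recursively(part, labels)


def build_chain(depth: int) -> Node:
    """Build a nested chain L0 -> L1 -> ... -> L<depth>."""
    node = Node(f"L{depth}")
    for level in range(depth - 1, -1, -1):
        node = Node(f"L{level}", [node])
    return node


if __name__ == "__main__":
    root = build_chain(4)

    buggy = all_labels(root)
    remove_labels_skipping_levels(root, buggy)
    print(sorted(buggy))  # labels left behind by the buggy variant

    fixed = all_labels(root)
    remove_labels_recursively(root, fixed)
    print(sorted(fixed))  # only the root label remains
```

In this model the root label is left in place by both functions, since they only walk the root's parts. The difference is that the buggy variant leaves the labels at even depths behind.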