[
https://issues.apache.org/jira/browse/BEAM-11362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241133#comment-17241133
]
Ning Kang commented on BEAM-11362:
----------------------------------
This is intended behavior following the change in
https://github.com/apache/beam/pull/13335.
InteractiveRunner users who define a pipeline outside the main scope (such as
within a function) have to call `ib.watch({'pipeline_name': pipeline_instance})`
or `ib.watch(locals())` to track the pipeline in the interactive environment.
Otherwise, they will run into this issue.
This does not affect Beam notebook users, because they either define pipelines
in the main scope (directly in notebook cells) or pass the pipeline/PCollection
objects around and use them in `ib.show()`, `ib.show_graph()` or
`ib.collect()` (these APIs track the PCollections/pipelines automatically).
It only affects users who use InteractiveRunner to materialize PCollections,
for example in tests, where they don't really care about the pipeline object.
The fix should be to add an explicit `watch` call in the same scope, right
after defining the pipeline.
> retrieved_user_pipeline.visit(CacheableUnboundedPCollectionVisitor())
> AttributeError: 'NoneType' object has no attribute 'visit'
> --------------------------------------------------------------------------------------------------------------------------------
>
> Key: BEAM-11362
> URL: https://issues.apache.org/jira/browse/BEAM-11362
> Project: Beam
> Issue Type: Improvement
> Components: runner-py-interactive
> Reporter: Ning Kang
> Assignee: Ning Kang
> Priority: P2
>
> Traceback (most recent call last):
> File
> "/build/work/f1a2b3ea7c34e8e49f1a90317e5acde7889a/google3/runfiles/google3/photos/vision/features/delf/extract/global_descriptor/python/revisited_datasets/beam/eval_utils_test.py",
> line 614, in testProduceOutputVisualization
> _ = pipeline.run()
> File
> "/build/work/f1a2b3ea7c34e8e49f1a90317e5acde7889a/google3/runfiles/google3/third_party/py/apache_beam/pipeline.py",
> line 553, in run
> return self.runner.run_pipeline(self, self._options)
> File
> "/build/work/f1a2b3ea7c34e8e49f1a90317e5acde7889a/google3/runfiles/google3/third_party/py/apache_beam/runners/interactive/interactive_runner.py",
> line 136, in run_pipeline
> inst.watch_sources(pipeline)
> File
> "/build/work/f1a2b3ea7c34e8e49f1a90317e5acde7889a/google3/runfiles/google3/third_party/py/apache_beam/runners/interactive/pipeline_instrument.py",
> line 1008, in watch_sources
> retrieved_user_pipeline.visit(CacheableUnboundedPCollectionVisitor())
> AttributeError: 'NoneType' object has no attribute 'visit'
> Probably related to this change: https://github.com/apache/beam/pull/13335