Thanks for the heads-up, Ning! I haven't tried out interactive Beam, but
this puts it back on my radar :)
Cheers,
Max
On 04.12.19 20:45, Ning Kang wrote:
*If you are not an Interactive Beam user, you can ignore this email.*
Hi Interactive Beam users,
We've recently changed how Interactive Beam discovers the
pipelines and PCollections defined in your notebook or code.
If you write Beam pipelines with the InteractiveRunner directly in
notebook cells like the Interactive Beam Examples
<https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/examples/Interactive%20Beam%20Example.ipynb> or
define everything in "__main__", you will not be affected by the changes.
If you define your pipelines in a local scope such as a function (unit
tests are one example scenario) and you rely on interactive features to
introspect the data of a PCollection after a pipeline run, you might see
an error like "raise ValueError('PCollection not available, please run
the pipeline.')".
This happens because Interactive Beam now "watches" the "__main__" scope
by default to provide its features implicitly, so variables defined in
other scopes are not picked up. To avoid the error, you only
need to tell Interactive Beam to "watch
<https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/interactive_beam.py#L36>"
your local scopes too.
Here is an example of the fix:
from apache_beam.runners.interactive import interactive_beam
...
def some_func(...):
    p = beam.Pipeline(InteractiveRunner())
    pcoll = p | 'SomeTransform' >> SomeTransform()
    ...
    # Tell Interactive Beam about the variables defined in this function.
    interactive_beam.watch(locals())
    result = p.run()
    ...
...
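To see why the extra call matters, here is a toy model of scope watching. It is only a sketch and not Interactive Beam's actual implementation: ToyWatcher, name_of, build_in_function and fake_main are hypothetical names, and a plain dict stands in for the "__main__" namespace.

```python
class ToyWatcher:
    """Toy registry of watched scopes (hypothetical, not Beam code)."""

    def __init__(self, main_scope):
        # The "main" scope is watched by default, mirroring the default
        # behaviour described above. It is kept as a live reference.
        self._scopes = [main_scope]

    def watch(self, scope):
        # Snapshot the given name -> object mapping, e.g. locals().
        self._scopes.append(dict(scope))

    def name_of(self, obj):
        # Return the variable name under which obj was seen, or None.
        for scope in self._scopes:
            for name, value in scope.items():
                if value is obj:
                    return name
        return None


def build_in_function(watcher, call_watch):
    pcoll = object()  # stand-in for a PCollection defined in local scope
    if call_watch:
        watcher.watch(locals())
    return pcoll


fake_main = {}  # stand-in for the "__main__" namespace
watcher = ToyWatcher(fake_main)
fake_main['top_level_pcoll'] = object()

print(watcher.name_of(fake_main['top_level_pcoll']))       # top_level_pcoll
print(watcher.name_of(build_in_function(watcher, False)))  # None
print(watcher.name_of(build_in_function(watcher, True)))   # pcoll
```

In this toy model, the object created inside the function is invisible until the function passes locals() to watch(), which is the same shape as the fix above.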
Thanks for using Interactive Beam!
Ning.