Thanks for the heads-up, Ning! I haven't tried out interactive Beam, but this puts it back on my radar :)

Cheers,
Max

On 04.12.19 20:45, Ning Kang wrote:
*If you are not an Interactive Beam user, you can ignore this email.*

Hi Interactive Beam users,

We've recently changed how Interactive Beam discovers the pipelines and PCollections defined in your notebook or code.

If you write Beam pipelines with the InteractiveRunner directly in notebook cells like the Interactive Beam Examples <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/examples/Interactive%20Beam%20Example.ipynb> or define everything in "__main__", you will not be affected by the changes.

If you define your pipelines in a local scope such as a function (for example, in unit tests) and you rely on interactive features to introspect the data of a PCollection after a pipeline run, you might see an error such as "raise ValueError('PCollection not available, please run the pipeline.')".

This is because Interactive Beam now "watches" only the "__main__" scope by default to provide its features implicitly. To avoid the error, you only need to tell Interactive Beam to "watch <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/interactive_beam.py#L36>" your local scopes too.
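Here is an example of the fix:

import apache_beam as beam
from apache_beam.runners.interactive import interactive_beam
from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
...
def some_func(...):
    p = beam.Pipeline(InteractiveRunner())
    pcoll = p | 'SomeTransform' >> SomeTransform()
    ...
    # Register this function's local scope so that p and pcoll are watched.
    interactive_beam.watch(locals())
    result = p.run()
    ...
...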
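
To illustrate why the extra watch call matters, here is a toy model in plain Python. It is only a sketch of the general idea and not Interactive Beam's actual implementation: ToyWatcher, name_of, build_without_watch and build_with_watch are hypothetical names made up for this example, and a plain object() stands in for a PCollection.

```python
class ToyWatcher:
    """Hypothetical stand-in for a registry of watched scopes."""

    def __init__(self):
        self._scopes = []

    def watch(self, scope):
        # Keep a reference to the given name -> object mapping.
        self._scopes.append(scope)

    def name_of(self, obj):
        # Return the variable name bound to obj in any watched scope,
        # or None if no watched scope holds that exact object.
        for scope in self._scopes:
            for name, value in list(scope.items()):
                if value is obj:
                    return name
        return None


def build_without_watch(watcher):
    pcoll = object()  # stand-in for a PCollection
    return pcoll


def build_with_watch(watcher):
    pcoll = object()  # stand-in for a PCollection
    watcher.watch(locals())  # register this function's local scope
    return pcoll


watcher = ToyWatcher()
missed = watcher.name_of(build_without_watch(watcher))
found = watcher.name_of(build_with_watch(watcher))
print(missed, found)
```

In this toy model, an object created inside a function is invisible to the watcher until the function's locals() are registered, which mirrors the situation described above.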

Thanks for using Interactive Beam!

Ning.
