*If you are not an Interactive Beam user, you can ignore this email.*

Hi everyone,
Recently, we've been actively building more Interactive Beam features <https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit?usp=sharing> on top of the existing InteractiveRunner. One of the things we've changed is which PCollections are cached and available through *get_result(pcoll)*.

If your unit tests or code execute a pipeline with the InteractiveRunner and check the data of a PCollection through *get_result(pcoll)*, that code might now fail with an error saying "raise ValueError('PCollection not available, please run the pipeline.')".

This is because Interactive Beam now automatically figures out which PCollections have been assigned to variables in the user-defined pipelines in your code/tests/notebooks by looking at a "watched" scope of variable definitions. By default, everything defined in "__main__" is watched. So if you've defined a pipeline in a local scope such as a function, Interactive Beam cannot "watch" it and therefore does not cache data for those PCollections.

The fix is a one-line change: watch your local scope. Something like:

    from apache_beam.runners.interactive import interactive_beam
    ...
    def some_func(...):
      p = beam.Pipeline(InteractiveRunner())
      pcoll = p | 'SomeTransform' >> SomeTransform()
      ...
      # Make the variables in this function's scope visible to Interactive Beam.
      interactive_beam.watch(locals())
      result = p.run()
      ...
    ...

Thanks for using Interactive Beam!

Ning.
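
P.S. For anyone wondering why the local scope matters, here is a toy model of the idea in plain Python. It is only a sketch and not how Interactive Beam is actually implemented: the `watch` and `watched_names` functions and the `FakePCollection` class below are hypothetical stand-ins defined in the snippet itself, and the snapshot-a-dict approach is an assumption made to keep the example small.

```python
# Toy model only: NOT Interactive Beam's implementation.
# All names here (watch, watched_names, FakePCollection) are hypothetical.

_watched = []  # snapshots of the namespaces that have been "watched"


def watch(namespace):
    """Remember a namespace (a dict of variable name -> object)."""
    _watched.append(dict(namespace))


def watched_names(kind):
    """Return the sorted names of watched variables that are instances of kind."""
    return sorted(
        name
        for namespace in _watched
        for name, value in namespace.items()
        if isinstance(value, kind)
    )


class FakePCollection:
    """Stand-in for a PCollection; carries no data."""


def build_without_watch():
    pcoll = FakePCollection()  # lives only in this function's local scope
    return pcoll


def build_with_watch():
    pcoll_local = FakePCollection()
    watch(locals())  # expose this function's variables to the registry
    return pcoll_local


top_level = FakePCollection()
watch(globals())  # plays the role of the default "__main__" scope

hidden = build_without_watch()
before = watched_names(FakePCollection)  # the local 'pcoll' was never seen

visible = build_with_watch()
after = watched_names(FakePCollection)  # now 'pcoll_local' is known too
```

In this toy model, a variable created inside a function is invisible to the registry until the function hands over `locals()`, which mirrors why the one-line `interactive_beam.watch(locals())` call is needed.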
