AnandInguva commented on code in PR #26236: URL: https://github.com/apache/beam/pull/26236#discussion_r1164397085
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -141,3 +141,12 @@ However, it may be possible to pre-build the SDK containers and perform the depe
 Dataflow, see [Pre-building the python SDK custom container image with extra dependencies](https://cloud.google.com/dataflow/docs/guides/using-custom-containers#prebuild).
 **NOTE**: This feature is available only for the `Dataflow Runner v2`.
+
+## Pickling and Managing Main Session
+
+Pickling in the Python SDK is set up to pickle the state of the global namespace. By default, global imports, functions, and variables defined in the main session are not saved during the serialization of a Dataflow job.
+Thus, one might encounter unexpected `NameErrors` when running a `DoFn` on Dataflow Runner. To resolve this, manage the main session by

Review Comment:
   ```suggestion
   Thus, one might encounter unexpected `NameError`s when running a `DoFn` on Dataflow Runner. To resolve this, manage the main session by
   ```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and
use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
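The documentation text in the diff stops at "manage the main session by"; in the Beam Python SDK this is typically done with the `--save_main_session` pipeline option, exposed programmatically through `SetupOptions`. A minimal sketch of enabling it follows; the pipeline contents and the `re` import are illustrative assumptions, not taken from the PR:

```python
# Minimal sketch: enable save_main_session so that names defined in the main
# module (such as the `re` import below) are pickled and shipped to workers.
# The pipeline itself is an illustrative assumption, not part of the PR.
import re  # imported in the main session; a lambda running on a worker needs it

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

options = PipelineOptions()
# Equivalent to passing --save_main_session on the command line. Without it,
# the worker may raise NameError: name 're' is not defined when FlatMap runs.
options.view_as(SetupOptions).save_main_session = True

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | beam.Create(["to be", "or not to be"])
        | beam.FlatMap(lambda line: re.findall(r"[A-Za-z']+", line))
        | beam.Map(print)
    )
```

The same behavior can be requested by passing `--save_main_session` as a command-line pipeline option when submitting the job to the Dataflow Runner.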
