This is an automated email from the ASF dual-hosted git repository.
pabloem pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
new 3a2dc20 Update pardo.md
new 22a68fa Merge pull request #14542 from pkch/patch-1
3a2dc20 is described below
commit 3a2dc20dddbb54abc16e9314b739045c044b5b43
Author: Max <[email protected]>
AuthorDate: Wed Apr 14 14:26:19 2021 -0700
Update pardo.md
Fix docs for `setup` method:
1. It is not correct that it is called once per initialization, since it is
not called at all when the initialization happens in the pipeline construction
stage. I clarified that it's called during deserialization on the worker.
2. The phrase "setup need not to be cached, so it could be called more than
once per worker." is completely unclear. Perhaps it is meant to be "The objects
created in setup need not be cached by the user because it is called whenever
the DoFn instance is initialized", but then the implication goes in the wrong
direction. I removed it since it is hard to figure out what the intended
meaning was. Instead I explained why this method may be called more than once
per worker.
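
The lifecycle described in the two points above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Beam's runner code: `FakeDoFn`, `serialize`, `deserialize`, `run_on_worker`, and the `db://example` string are all hypothetical names, and JSON stands in for whatever serialization the real SDK performs. It shows that `setup` is skipped at construction time and runs once for each instance a worker deserializes, so it can run more than once per worker.

```python
import json

# Records every setup() call so the lifecycle can be inspected afterwards.
setup_log = []


class FakeDoFn:
    """Hypothetical stand-in for a DoFn; not the real apache_beam class."""

    def __init__(self, dsn):
        # Runs at pipeline construction time. setup() is NOT called here.
        self.dsn = dsn
        self.connection = None

    def setup(self):
        # Runs on the worker, once per deserialized instance.
        setup_log.append(self.dsn)
        self.connection = "connected:" + self.dsn

    def process(self, element):
        yield element + " via " + self.connection


def serialize(fn):
    # Only plain configuration travels to the worker, never the connection.
    return json.dumps({"dsn": fn.dsn})


def deserialize(payload):
    # Rebuild the instance without calling __init__, as unpickling would.
    fn = FakeDoFn.__new__(FakeDoFn)
    fn.dsn = json.loads(payload)["dsn"]
    fn.connection = None
    return fn


def run_on_worker(payload, elements):
    # Toy worker: create a fresh instance, call setup(), then process.
    fn = deserialize(payload)
    fn.setup()
    return [out for element in elements for out in fn.process(element)]


original = FakeDoFn("db://example")
payload = serialize(original)

# The same worker may create several instances, e.g. for parallel bundles.
first = run_on_worker(payload, ["a", "b"])
second = run_on_worker(payload, ["c"])
```

After this runs, `original.connection` is still `None` while `setup_log` has two entries, one per deserialized instance.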
---
.../content/en/documentation/transforms/python/elementwise/pardo.md | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md b/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md
index 5021e57..efa64ab 100644
--- a/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md
+++ b/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md
@@ -90,8 +90,9 @@ You can also customize what to do when a
starts and finishes with `start_bundle` and `finish_bundle`.
* [`DoFn.setup()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.setup):
- Called *once per `DoFn` instance* when the `DoFn` instance is initialized.
- `setup` need not to be cached, so it could be called more than once per worker.
+ Called whenever the `DoFn` instance is deserialized on the worker. This means it can be called more than once per worker because
+ multiple instances of a given `DoFn` subclass may be created (e.g., due to parallelization, or due to garbage collection after a period
+ of disuse).
This is a good place to connect to database instances, open network connections or other resources.
* [`DoFn.start_bundle()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.start_bundle):