This is an automated email from the ASF dual-hosted git repository.

pabloem pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
     new 3a2dc20  Update pardo.md
     new 22a68fa  Merge pull request #14542 from pkch/patch-1
3a2dc20 is described below

commit 3a2dc20dddbb54abc16e9314b739045c044b5b43
Author: Max <[email protected]>
AuthorDate: Wed Apr 14 14:26:19 2021 -0700

    Update pardo.md
    
    Fix docs for `setup` method:
    
    1. It is not correct that it is called once per initialization, since it is
    not called at all when the initialization happens in the pipeline construction
    stage. I clarified that it's called during deserialization on the worker.
    2. The phrase "setup need not to be cached, so it could be called more than
    once per worker." is completely unclear. Perhaps it is meant to be "The objects
    created in setup need not be cached by the user because it is called whenever
    the DoFn instance is initialized", but then the implication goes in the wrong
    direction. I removed it since it is hard to figure out what the intended
    meaning was. Instead I explained why this method may be called more than once
    per worker.
---
 .../content/en/documentation/transforms/python/elementwise/pardo.md  | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md b/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md
index 5021e57..efa64ab 100644
--- a/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md
+++ b/website/www/site/content/en/documentation/transforms/python/elementwise/pardo.md
@@ -90,8 +90,9 @@ You can also customize what to do when a
 starts and finishes with `start_bundle` and `finish_bundle`.
 
 * [`DoFn.setup()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.setup):
-  Called *once per `DoFn` instance* when the `DoFn` instance is initialized.
-  `setup` need not to be cached, so it could be called more than once per worker.
+  Called whenever the `DoFn` instance is deserialized on the worker. This means it can be called more than once per worker because
+  multiple instances of a given `DoFn` subclass may be created (e.g., due to parallelization, or due to garbage collection after a period
+  of disuse).
   This is a good place to connect to database instances, open network connections or other resources.
 
 * [`DoFn.start_bundle()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.start_bundle):
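
The behavior described in the updated `setup` text can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions: it does not import Apache Beam, and `ConnectingFn`, `serialize`, and `load_on_worker` are hypothetical names that imitate, in plain Python, an instance being built at pipeline construction time and then deserialized more than once on a worker. It does not show how a Beam runner is implemented.

```python
import json


class ConnectingFn:
    """Hypothetical stand-in for a DoFn subclass (does not use Apache Beam)."""

    setup_calls = 0  # counts setup() calls across all instances in this process

    def __init__(self, url):
        # Runs at pipeline construction time; keep only serializable state here.
        self.url = url
        self.connection = None

    def setup(self):
        # Runs on the simulated worker after deserialization; acquire resources here.
        ConnectingFn.setup_calls += 1
        self.connection = f"connection to {self.url}"

    def process(self, element):
        return f"{element} via {self.connection}"


def serialize(fn):
    """Hypothetical serializer: capture the instance state as JSON."""
    return json.dumps(vars(fn))


def load_on_worker(payload):
    """Hypothetical worker step: rebuild an instance without __init__, then call setup()."""
    fn = ConnectingFn.__new__(ConnectingFn)
    fn.__dict__.update(json.loads(payload))
    fn.setup()
    return fn


# Construction stage: __init__ runs, setup() is not called.
payload = serialize(ConnectingFn("db://example"))

# Simulated worker: the same payload is deserialized twice (for example, for two
# parallel threads), so setup() runs once per resulting instance.
first = load_on_worker(payload)
second = load_on_worker(payload)
```

In this simulation `setup()` is never called on the instance created at construction time, and it is called once for each deserialized copy, which is why per-instance resources such as connections would be opened there rather than in `__init__`.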
