[ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=293357&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293357
 ]

ASF GitHub Bot logged work on BEAM-7389:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Aug/19 20:20
            Start Date: 12/Aug/19 20:20
    Worklog Time Spent: 10m 
      Work Description: davidcavazos commented on pull request #9257: 
[BEAM-7389] Add DoFn methods sample
URL: https://github.com/apache/beam/pull/9257#discussion_r313109881
 
 

 ##########
 File path: 
sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo.py
 ##########
 @@ -81,3 +82,44 @@ def process(self, elem, timestamp=beam.DoFn.TimestampParam, 
window=beam.DoFn.Win
     # pylint: enable=line-too-long
     if test:
       test(dofn_params)
+
+
+def pardo_dofn_methods(test=None):
+  # [START pardo_dofn_methods]
+  import apache_beam as beam
+
+  class DoFnMethods(beam.DoFn):
+    def __init__(self):
+      print('__init__')
+      self.window = beam.window.GlobalWindow()
+
+    def setup(self):
+      print('setup')
 
 Review comment:
   I tried doing that here, but the code sample ended up looking a lot more 
cluttered and intimidating. I think it looked better in the docs themselves. 
Here's an extract of what I was adding in the docs:
   
   A 
[`DoFn`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn)
 can be customized with a number of methods that can help create more complex 
behaviors. You can customize what a worker will do when it starts and shuts 
down with `setup` and `teardown`. You can also customize what to do when a 
[*bundle of 
elements*](https://beam.apache.org/documentation/execution-model/#bundling-and-persistence)
 starts and when a bundle finishes with `start_bundle` and `finish_bundle`.
   
   * 
[`DoFn.setup()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.setup):
 Called *once per worker* when the worker is starting to run. This is a good 
place to connect to database instances, open network connections, or acquire 
other resources.
   
   * 
[`DoFn.start_bundle()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.start_bundle):
 Called *once per bundle of elements* before calling `process` on the first 
element of the bundle. This is a good place to start keeping track of the 
bundle elements.
   
   * [**`DoFn.process(element, *args, 
**kwargs)`**](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.process)
 *[required]:* Called *once per element*; it can *yield zero or more elements*.
   
   * 
[`DoFn.finish_bundle()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.finish_bundle):
 Called *once per bundle of elements* after calling `process` on the last 
element of the bundle; it can *yield zero or more elements*. This is a good place 
to do batch calls on a bundle of elements, such as running a database query. 
For example, you can initialize a batch in `start_bundle`, add elements to the 
batch in `process` instead of yielding them, then run a batch query on 
those elements in `finish_bundle` and yield all the results.
   
     Note that yielded elements from `finish_bundle` must be of the type 
`apache_beam.utils.windowed_value.WindowedValue`. You will need to provide a 
timestamp as a unix timestamp, which you can get from the last processed 
element. You will also need to provide a window, which you can get from the 
last processed element like in the example below.
   
   * 
[`DoFn.teardown()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.teardown):
 Called *once per worker* when the worker is shutting down. This is a good 
place to close database connections, network connections, or other resources.
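
   The call order and the batching pattern described above can be sketched in plain Python, without Beam. The class below only mirrors the method names of a `DoFn`; `BatchingDoFn`, `run_bundles`, and the uppercase "query" are hypothetical names made up for illustration. In a real pipeline you would subclass `beam.DoFn`, the runner would make these calls, and `finish_bundle` would have to yield `WindowedValue` objects as noted above, which this sketch leaves out.

```python
class BatchingDoFn:
    """Plain-Python stand-in that mirrors the DoFn method names (not a real Beam DoFn)."""

    def setup(self):
        # Once per worker: acquire long-lived resources here.
        self.calls = ['setup']

    def start_bundle(self):
        # Once per bundle: start a fresh batch.
        self.calls.append('start_bundle')
        self.batch = []

    def process(self, element):
        # Once per element: hold the element instead of emitting it.
        self.calls.append('process')
        self.batch.append(element)

    def finish_bundle(self):
        # Once per bundle: one "batch query" over everything collected.
        # Uppercasing stands in for a real database call (assumption).
        self.calls.append('finish_bundle')
        results = [element.upper() for element in self.batch]
        self.batch = []
        return results

    def teardown(self):
        # Once per worker: release resources here.
        self.calls.append('teardown')


def run_bundles(fn, bundles):
    """Hypothetical driver that calls the methods in the order described above."""
    outputs = []
    fn.setup()
    try:
        for bundle in bundles:
            fn.start_bundle()
            for element in bundle:
                # process may return None when it emits nothing.
                outputs.extend(fn.process(element) or ())
            outputs.extend(fn.finish_bundle() or ())
    finally:
        fn.teardown()
    return outputs


fn = BatchingDoFn()
print(run_bundles(fn, [['a', 'b'], ['c']]))
print(fn.calls)
```

   Holding elements in `process` and emitting them from `finish_bundle` means one call per bundle rather than one per element, which is the point of the batching pattern.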
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 293357)
    Time Spent: 39.5h  (was: 39h 20m)

> Colab examples for element-wise transforms (Python)
> ---------------------------------------------------
>
>                 Key: BEAM-7389
>                 URL: https://issues.apache.org/jira/browse/BEAM-7389
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Rose Nguyen
>            Assignee: David Cavazos
>            Priority: Minor
>          Time Spent: 39.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
