[ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=309184&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-309184
 ]

ASF GitHub Bot logged work on BEAM-7389:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Sep/19 20:17
            Start Date: 09/Sep/19 20:17
    Worklog Time Spent: 10m 
      Work Description: davidcavazos commented on pull request #9503: 
[BEAM-7389] Add code examples for ParDo page
URL: https://github.com/apache/beam/pull/9503#discussion_r322433992
 
 

 ##########
 File path: website/src/documentation/transforms/python/element-wise/pardo.md
 ##########
 @@ -19,26 +19,180 @@ limitations under the License.
 -->
 
 # ParDo
-<table align="left">
-    <a target="_blank" class="button"
+
+<script type="text/javascript">
+localStorage.setItem('language', 'language-py')
+</script>
+
+<table>
+  <td>
+    <a class="button" target="_blank"
         href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.ParDo">
-      <img src="https://beam.apache.org/images/logos/sdks/python.png" width="20px" height="20px"
-           alt="Pydoc" />
-     Pydoc
+      <img src="https://beam.apache.org/images/logos/sdks/python.png"
+          width="20px" height="20px" alt="Pydoc" />
+      Pydoc
     </a>
+  </td>
 </table>
 <br>
-A transform for generic parallel processing. A `ParDo` transform considers each
-element in the input `PCollection`, performs some processing function
-(your user code) on that element, and emits zero or more elements to
-an output PCollection.
 
-See more information in the [Beam Programming Guide]({{ site.baseurl }}/documentation/programming-guide/#pardo).
+A transform for generic parallel processing.
+A `ParDo` transform considers each element in the input `PCollection`,
+performs some processing function (your user code) on that element,
+and emits zero or more elements to an output `PCollection`.
+
+See more information in the
+[Beam Programming Guide]({{ site.baseurl }}/documentation/programming-guide/#pardo).
 
 ## Examples
-See [BEAM-7389](https://issues.apache.org/jira/browse/BEAM-7389) for updates. 
 
-## Related transforms 
-* [FlatMap]({{ site.baseurl }}/documentation/transforms/python/elementwise/flatmap) behaves the same as `Map`, but for each input it may produce zero or more outputs.
-* [Filter]({{ site.baseurl }}/documentation/transforms/python/elementwise/filter) is useful if the function is just 
-  deciding whether to output an element or not.
\ No newline at end of file
+In the following examples, we explore how to create custom `DoFn`s and access
+the timestamp and windowing information.
+
+### Example 1: ParDo with a simple DoFn
+
+The following example defines a simple `DoFn` class called `SplitWords`
+which stores the `delimiter` as an object field.
+The `process` method is called once per element,
+and it can yield zero or more output elements.
+
+```py
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo.py tag:pardo_dofn %}```
+
+Output `PCollection` after `ParDo`:
+
+```
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo_test.py tag:plants %}```
+
+<table>
+  <td>
+    <a class="button" target="_blank"
+        href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo.py">
+      <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+        width="20px" height="20px" alt="View on GitHub" />
+      View on GitHub
+    </a>
+  </td>
+</table>
+<br>
+
+### Example 2: ParDo with timestamp and window information
+
+We add new parameters to the `process` method to bind parameter values at runtime.
+
+* [`beam.DoFn.TimestampParam`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.TimestampParam)
+  binds the timestamp information as an
+  [`apache_beam.utils.timestamp.Timestamp`](https://beam.apache.org/releases/pydoc/current/apache_beam.utils.timestamp.html#apache_beam.utils.timestamp.Timestamp)
+  object.
+* [`beam.DoFn.WindowParam`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.WindowParam)
+  binds the window information as the appropriate
+  [`apache_beam.transforms.window.*Window`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.window.html)
+  object.
+
+```py
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo.py tag:pardo_dofn_params %}```
+
+`stdout` output:
+
+```
+{% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo_test.py tag:dofn_params %}```
+
+<table>
+  <td>
+    <a class="button" target="_blank"
+        href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo.py">
+      <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+        width="20px" height="20px" alt="View on GitHub" />
+      View on GitHub
+    </a>
+  </td>
+</table>
+<br>
+
+### Example 3: ParDo with DoFn methods
+
+A [`DoFn`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn)
+can be customized with a number of methods that can help create more complex behaviors.
+You can customize what a worker does when it starts and shuts down with `setup` and `teardown`.
+You can also customize what to do when a
+[*bundle of elements*](https://beam.apache.org/documentation/execution-model/#bundling-and-persistence)
+starts and finishes with `start_bundle` and `finish_bundle`.
+
+* [`DoFn.setup()`](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.DoFn.setup):
+  Called *once per worker* when the worker is starting to run.
 
 Review comment:
   Got it, clarified
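
For context on the `SplitWords` `DoFn` that Example 1 in the diff describes, here is a minimal sketch. It is not the PR's actual snippet, which lives in `pardo.py` and is not reproduced in this message. The default delimiter, the filtering of empty strings, and the `DoFnBase` fallback alias are assumptions added so the sketch runs with or without Beam installed.

```python
try:
    import apache_beam as beam
    DoFnBase = beam.DoFn
except ImportError:
    # Assumption: fall back to a plain class so the sketch runs without Beam.
    DoFnBase = object


class SplitWords(DoFnBase):
    """Splits each input string on a delimiter stored as an object field."""

    def __init__(self, delimiter=','):
        super().__init__()
        self.delimiter = delimiter

    def process(self, text):
        # Called once per element; it may yield zero or more outputs.
        for word in text.split(self.delimiter):
            if word:  # assumption: skip empty pieces so '' yields nothing
                yield word


# Call process() directly to illustrate the per-element behavior.
fn = SplitWords(',')
words = [w for line in ['a,b', '', 'c'] for w in fn.process(line)]
print(words)
```

In a pipeline, such a class would typically be applied with `beam.ParDo(SplitWords(','))`. Calling `process` directly, as above, only illustrates that one input element can produce zero, one, or several outputs.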
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 309184)
    Time Spent: 56.5h  (was: 56h 20m)

> Colab examples for element-wise transforms (Python)
> ---------------------------------------------------
>
>                 Key: BEAM-7389
>                 URL: https://issues.apache.org/jira/browse/BEAM-7389
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Rose Nguyen
>            Assignee: David Cavazos
>            Priority: Minor
>          Time Spent: 56.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)
