[ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=291489&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-291489
 ]

ASF GitHub Bot logged work on BEAM-7389:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Aug/19 18:22
            Start Date: 08/Aug/19 18:22
    Worklog Time Spent: 10m 
      Work Description: rosetn commented on pull request #9261: [BEAM-7389] Add 
code examples for Partition page
URL: https://github.com/apache/beam/pull/9261#discussion_r312179194
 
 

 ##########
 File path: 
website/src/documentation/transforms/python/element-wise/partition.md
 ##########
 @@ -39,12 +46,110 @@ You cannot determine the number of partitions in 
mid-pipeline
 See more information in the [Beam Programming Guide]({{ site.baseurl 
}}/documentation/programming-guide/#partition).
 
 ## Examples
-See [BEAM-7389](https://issues.apache.org/jira/browse/BEAM-7389) for updates. 
 
-## Related transforms 
-* [Filter]({{ site.baseurl 
}}/documentation/transforms/python/elementwise/filter) is useful if the 
function is just 
+In the following examples, we create a pipeline with a `PCollection` of 
produce their icon, name, and duration.
+Then, we apply `Partition` in multiple ways to split the `PCollection` into 
multiple `PCollections`.
+
+`Partition` accepts a function that receives the number of partitions,
+and returns the index of the desired partition for the element.
+The number of partitions passed must be a positive integer,
+and it must return an integer in the range `0` to `num_partitions-1`.
+
+### Example 1: Partition with a function
+
+In the following example, we have a known list of durations.
+We partition the `PCollection` into one `PCollection` for every duration type.
+
+```py
+{% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/partition.py
 tag:partition_function %}```
+
+Output `PCollection`s:
+
+```
+{% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/partition_test.py
 tag:partitions %}```
+
+<table>
+  <td>
+    <a class="button" target="_blank"
+        
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/partition.py">
+      <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+        width="20px" height="20px" alt="View on GitHub" />
+      View on GitHub
+    </a>
+  </td>
+</table>
+<br>
+
+### Example 2: Partition with a lambda function
+
+We can also use lambda functions to simplify **Example 1**.
+
+```py
+{% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/partition.py
 tag:partition_lambda %}```
+
+Output `PCollection`s:
+
+```
+{% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/partition_test.py
 tag:partitions %}```
+
+<table>
+  <td>
+    <a class="button" target="_blank"
+        
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/partition.py">
+      <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+        width="20px" height="20px" alt="View on GitHub" />
+      View on GitHub
+    </a>
+  </td>
+</table>
+<br>
+
+### Example 3: Partition with multiple arguments
+
+You can pass functions with multiple arguments to `Partition`.
+They are passed as additional positional arguments or keyword arguments to the 
function.
+
+In this example, `split_dataset` takes `plant`, `num_partitions`, and `ratio` 
as arguments.
+`num_partitions` is used by `Partitions` as a positional argument,
+while any other argument will be passed to `split_dataset`.
 
 Review comment:
   Can you add some more explanation using the concrete details? For 
example, what does the sample do with the arguments in 
beam.Partition(split_dataset, 2, ratio=[8, 2]), and how does that affect the 
output? 
   
   "test_dataset" and "train_dataset" might make sense to some, but aren't 
generic enough. Either rename them to something like "dataset1" and "dataset2", 
or explain them in your text before the code.
   
   Something along the lines of "In this example, we want to split the dataset 
into a training dataset and a testing dataset using `Partition`." And 
elaborate on what setting the ratio to [8, 2] does.
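   
   To make the request concrete, here is a minimal sketch of the kind of 
explanation that could accompany the snippet. It is an assumption about how 
`split_dataset` might work, not the code from the PR: the bucketing scheme 
(sum of character codes modulo the ratio total) and the sample `plants` data 
are hypothetical. It runs without Beam so the effect of `ratio=[8, 2]` can be 
seen directly.

```python
import json


def split_dataset(plant, num_partitions, ratio):
    """Return a partition index for `plant`, weighted by `ratio`.

    Hypothetical sketch: `ratio` holds one relative weight per partition.
    """
    assert num_partitions == len(ratio)
    # Deterministic bucket in [0, sum(ratio)) derived from the element itself.
    bucket = sum(map(ord, json.dumps(plant, sort_keys=True))) % sum(ratio)
    total = 0
    for index, weight in enumerate(ratio):
        total += weight
        if bucket < total:
            return index
    return len(ratio) - 1


# Hypothetical sample elements, only for illustration.
plants = [
    {"name": "Strawberry", "duration": "perennial"},
    {"name": "Carrot", "duration": "biennial"},
    {"name": "Eggplant", "duration": "perennial"},
    {"name": "Tomato", "duration": "annual"},
    {"name": "Potato", "duration": "perennial"},
]

# With ratio=[8, 2] the total is 10: buckets 0-7 map to partition 0
# (train) and buckets 8-9 map to partition 1 (test).
train, test = [], []
for plant in plants:
    (train, test)[split_dataset(plant, 2, ratio=[8, 2])].append(plant)
```

   In a pipeline, the same function would be passed as 
`beam.Partition(split_dataset, 2, ratio=[8, 2])`: `2` becomes 
`num_partitions`, and `ratio` is forwarded to `split_dataset` as a keyword 
argument. Over a large dataset, roughly 80% of the elements would be expected 
in the first output and 20% in the second, though the exact split depends on 
the data.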
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 291489)
    Time Spent: 38h 50m  (was: 38h 40m)

> Colab examples for element-wise transforms (Python)
> ---------------------------------------------------
>
>                 Key: BEAM-7389
>                 URL: https://issues.apache.org/jira/browse/BEAM-7389
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Rose Nguyen
>            Assignee: David Cavazos
>            Priority: Minor
>          Time Spent: 38h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
