[ https://issues.apache.org/jira/browse/BEAM-7700?focusedWorklogId=282476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282476 ]

ASF GitHub Bot logged work on BEAM-7700:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Jul/19 06:51
            Start Date: 25/Jul/19 06:51
    Worklog Time Spent: 10m 
      Work Description: robinyqiu commented on pull request #9129: [BEAM-7700] Java transform catalog
URL: https://github.com/apache/beam/pull/9129#discussion_r307122989
 
 

 ##########
 File path: website/src/documentation/transforms/java/element-wise/partition.md
 ##########
 @@ -0,0 +1,62 @@
+---
+layout: section
+title: "Partition"
+permalink: /documentation/transforms/java/elementwise/partition/
+section_menu: section-menu/documentation.html
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Partition
+<table align="left">
+    <a target="_blank" class="button"
+        href="https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/transforms/Partition.html">
+      <img src="https://beam.apache.org/images/logos/sdks/java.png" width="20px" height="20px"
+           alt="Javadoc" />
+     Javadoc
+    </a>
+</table>
+<br>
+Separates elements in a collection into multiple output collections. The partitioning function contains the logic that determines which output collection each element of the input collection goes to.
+
+The number of partitions must be determined at graph construction time. You cannot determine the number of partitions mid-pipeline.
+
+See more information in the [Beam Programming Guide]({{ site.baseurl }}/documentation/programming-guide/#partition).
+
+## Examples
+**Example**: dividing a `PCollection` into percentile groups
+
+```java
+// Provide an int value with the desired number of result partitions, and a PartitionFn that represents the
+// partitioning function. In this example, we define the PartitionFn in-line. Returns a PCollectionList
+// containing each of the resulting partitions as individual PCollection objects.
+PCollection<Student> students = ...;
+// Split students up into 10 partitions, by percentile:
+PCollectionList<Student> studentsByPercentile =
+    students.apply(Partition.of(10, new PartitionFn<Student>() {
+        @Override
+        public int partitionFor(Student student, int numPartitions) {
+            return student.getPercentile()  // 0..99
+                 * numPartitions / 100;
+        }}));
+
+// You can extract each partition from the PCollectionList using the get method, as follows:
+PCollection<Student> fortiethPercentile = studentsByPercentile.get(4);
+```
+
+## Related transforms 
+* [Filter]({{ site.baseurl }}/documentation/transforms/java/elementwise/filter) is useful if the function is just
+  deciding whether to output an element or not.
+* [ParDo]({{ site.baseurl }}/documentation/transforms/java/elementwise/pardo) is the most general element-wise mapping
+  operation, and includes other abilities such as multiple output collections and side inputs.
+* [CoGroupByKey]({{ site.baseurl }}/documentation/transforms/java/aggregation/cogroupbykey)
 
 Review comment:
   Not sure how `CoGroupByKey` is related.
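
As a side note on the quoted example, the index arithmetic inside `partitionFor` can be checked without a Beam dependency. The sketch below is only an illustration: the class name `PartitionIndexSketch` and the standalone `partitionFor(int, int)` helper are hypothetical and not part of the PR or of the Beam SDK. It assumes, as the quoted comment states, that percentiles fall in the range 0..99.

```java
public class PartitionIndexSketch {

    // Hypothetical helper that mirrors the arithmetic of the quoted PartitionFn:
    // maps a percentile in 0..99 to a partition index in 0..numPartitions-1.
    public static int partitionFor(int percentile, int numPartitions) {
        if (percentile < 0 || percentile > 99) {
            throw new IllegalArgumentException("percentile must be in 0..99");
        }
        // Integer division truncates, so the result never reaches numPartitions.
        return percentile * numPartitions / 100;
    }

    public static void main(String[] args) {
        int numPartitions = 10;
        for (int percentile : new int[] {0, 9, 10, 45, 99}) {
            System.out.println(percentile + " -> " + partitionFor(percentile, numPartitions));
        }
    }
}
```

With 10 partitions, percentiles 40 through 49 all map to index 4, which is the partition the quoted example retrieves with `studentsByPercentile.get(4)`.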
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 282476)
    Time Spent: 1h 50m  (was: 1h 40m)

> Java transform catalog
> ----------------------
>
>                 Key: BEAM-7700
>                 URL: https://issues.apache.org/jira/browse/BEAM-7700
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Rose Nguyen
>            Assignee: Rose Nguyen
>            Priority: Minor
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Create catalog of core transforms (Java)
> -Java transforms overview
> -Links to Javadocs
> -Brief description
> -Related transforms
> -Links to programming guide
> -Examples section to integrate Colab notebooks
>  
> See BEAM-7464 for Python.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
