This is an automated email from the ASF dual-hosted git repository.
altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
new 94a46c8 Improve basic explanation of Beam PTransforms.
new 930efef Merge pull request #14943 from robertwb/transforms-doc
94a46c8 is described below
commit 94a46c8d9b815908ee8dc75d33f2ae108763ebb6
Author: Robert Bradshaw <[email protected]>
AuthorDate: Thu Jun 3 17:25:32 2021 -0700
Improve basic explanation of Beam PTransforms.
The existing definition is an (out-of-date) list of the primitives,
and comments about implementing them as a runner author, but doesn't
explain at all what they are from a user's point of view which is
better suited for this page.
---
.../www/site/content/en/documentation/basics.md | 24 ++++++----------------
1 file changed, 6 insertions(+), 18 deletions(-)
diff --git a/website/www/site/content/en/documentation/basics.md b/website/www/site/content/en/documentation/basics.md
index bccd2b9..f21ceda 100644
--- a/website/www/site/content/en/documentation/basics.md
+++ b/website/www/site/content/en/documentation/basics.md
@@ -42,27 +42,15 @@ transforms, there are some special features worth highlighting.
A pipeline in Beam is a graph of PTransforms operating on PCollections. A
pipeline is constructed by a user in their SDK of choice, and makes its way to
-your runner either via the SDK directly or via the Runner API's (forthcoming)
+your runner either via the SDK directly or via the Runner API's
RPC interfaces.
### PTransforms
-In Beam, a PTransform can be one of the five primitives or it can be a
-composite transform encapsulating a subgraph. The primitives are:
-
- * [_Read_](#implementing-the-read-primitive) - parallel connectors to external
- systems
- * [_ParDo_](#implementing-the-pardo-primitive) - per element processing
- * [_GroupByKey_](#implementing-the-groupbykey-and-window-primitive) -
- aggregating elements per key and window
- * [_Flatten_](#implementing-the-flatten-primitive) - union of PCollections
- * [_Window_](#implementing-the-window-primitive) - set the windowing strategy
- for a PCollection
-
-When implementing a runner, these are the operations you need to implement.
-Composite transforms may or may not be important to your runner. If you expose
-a UI, maintaining some of the composite structure will make the pipeline easier
-for a user to understand. But the result of processing is not changed.
+A `PTransform` represents a data processing operation, or a step,
+in your pipeline. A `PTransform` can be applied to one or more
+`PCollection` objects as input which performs some processing on the elements of that
+`PCollection` and produces zero or more output `PCollection` objects.
### PCollections
@@ -173,7 +161,7 @@ The UDFs of Beam are:
* _Coder_ - encodes user data; some coders have standard formats and are not
really UDFs
The various types of user-defined functions will be described further alongside
-the primitives that use them.
+the [_PTransforms_](#ptransforms) that use them.
### Runner