[ 
https://issues.apache.org/jira/browse/BEAM-11758?focusedWorklogId=668708&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-668708
 ]

ASF GitHub Bot logged work on BEAM-11758:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Oct/21 23:59
            Start Date: 21/Oct/21 23:59
    Worklog Time Spent: 10m 
      Work Description: melap commented on a change in pull request #15763:
URL: https://github.com/apache/beam/pull/15763#discussion_r734121803



##########
File path: website/www/site/content/en/documentation/basics.md
##########
@@ -137,45 +147,158 @@ windowing strategy, and
 [GroupByKey](#implementing-the-groupbykey-and-window-primitive) primitive,
 which has behavior governed by the windowing strategy.
 
-### User-Defined Functions (UDFs)
-
-Beam has seven varieties of user-defined function (UDF). A Beam pipeline
-may contain UDFs written in a language other than your runner, or even multiple
-languages in the same pipeline (see the [Runner API](#the-runner-api)) so the
-definitions are language-independent (see the [Fn API](#the-fn-api)).
-
-The UDFs of Beam are:
-
- * _DoFn_ - per-element processing function (used in ParDo)
- * _WindowFn_ - places elements in windows and merges windows (used in Window
-   and GroupByKey)
+### Aggregation
+
+Aggregation is computing a value from multiple (1 or more) input elements. In
+Beam, the primary computational pattern for aggregation is to group all elements
+with a common key and window then combine each group of elements using an
+associative and commutative operation. This is similar to the "Reduce" operation
+in the [MapReduce](https://en.wikipedia.org/wiki/MapReduce) model, though it is
+enhanced to work with unbounded input streams as well as bounded data sets.
+
+<img src="/images/aggregation.png" alt="Aggregation of elements." width="120px">
+
+*Figure 1: Aggregation of elements. Elements with the same color represent those
+with a common key and window.*
+
+Some simple aggregation transforms include `Count` (computes the count of all
+elements in the aggregation), `Max` (computes the maximum element in the
+aggregation), and `Sum` (computes the sum of all elements in the aggregation).
+
+When elements are grouped and emitted as a bag, the aggregation is known as
+`GroupByKey` (the associative/commutative operation is bag union). In this case,
+the output is no smaller than the input. Often, you will apply an operation such
+as summation, called a `CombineFn`, in which the output is significantly smaller
+than the input. In this case the aggregation is called `CombinePerKey`.
+
+The associativity and commutativity of a `CombineFn` allows runners to
+automatically apply some optimizations:
+
+ * _Combiner lifting_: This is the most significant optimization. Input elements

Review comment:
       Ok, moved the bullets to the programming guide and added the pointer to the aggregation page
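
For readers of this thread, the aggregation pattern described in the quoted diff (group elements by key, then combine each group with an associative and commutative operation) can be sketched in plain Python. This is only an illustration and does not use the Beam SDK: `combine_per_key` and `lifted_combine_per_key` are hypothetical helper names, the `shards` list stands in for input split across workers, and windowing is left out. The second helper assumes that combiner lifting means combining partial results on each shard before the final per-key combine.

```python
from collections import defaultdict
from functools import reduce
from operator import add


def combine_per_key(pairs, combine):
    """Group (key, value) pairs by key and fold each group with `combine`."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(combine, values) for key, values in groups.items()}


def lifted_combine_per_key(shards, combine):
    """Combine within each shard first, then combine the partial results.

    This is only valid when `combine` is associative and commutative,
    because the partial results may be merged in any grouping and order.
    """
    partials = [combine_per_key(shard, combine) for shard in shards]
    merged = [(key, value) for partial in partials for key, value in partial.items()]
    return combine_per_key(merged, combine)


# Hypothetical input, already split into two shards.
shards = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("a", 5)],
]
all_pairs = [pair for shard in shards for pair in shard]

direct = combine_per_key(all_pairs, add)      # one pass over all elements
lifted = lifted_combine_per_key(shards, add)  # partial sums per shard, then merged
print(direct, lifted)
```

Both calls give the same per-key sums. In the lifted version, each shard passes along at most one partial value per key rather than every element, which is why the output of a combine can be much smaller than its input.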




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 668708)
    Time Spent: 1h 40m  (was: 1.5h)

> Create concepts guide in the Beam documentation
> -----------------------------------------------
>
>                 Key: BEAM-11758
>                 URL: https://issues.apache.org/jira/browse/BEAM-11758
>             Project: Beam
>          Issue Type: New Feature
>          Components: website
>            Reporter: David Huntsperger
>            Assignee: Melissa Pashniak
>            Priority: P3
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Create a conceptual guide to help new users understand Beam concepts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
