This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new a89b07d  Publishing website 2021/12/07 06:03:36 at commit 6d63a70
a89b07d is described below

commit a89b07df119671d4dcb216913f33723d82798d96
Author: jenkins <[email protected]>
AuthorDate: Tue Dec 7 06:03:37 2021 +0000

    Publishing website 2021/12/07 06:03:36 at commit 6d63a70
---
 .../documentation/basics/index.html                |  71 +++++++++++---
 website/generated-content/documentation/index.xml  | 107 ++++++++++++++++++---
 .../documentation/programming-guide/index.html     |  11 ++-
 website/generated-content/sitemap.xml              |   2 +-
 4 files changed, 157 insertions(+), 34 deletions(-)

diff --git a/website/generated-content/documentation/basics/index.html 
b/website/generated-content/documentation/basics/index.html
index 89478b1..d74214d 100644
--- a/website/generated-content/documentation/basics/index.html
+++ b/website/generated-content/documentation/basics/index.html
@@ -18,7 +18,7 @@
 function addPlaceholder(){$('input:text').attr('placeholder',"What are you 
looking for?");}
 function endSearch(){var 
search=document.querySelector(".searchBar");search.classList.add("disappear");var
 icons=document.querySelector("#iconsBar");icons.classList.remove("disappear");}
 function blockScroll(){$("body").toggleClass("fixedPosition");}
-function openMenu(){addPlaceholder();blockScroll();}</script><div 
class="clearfix container-main-content"><div class="section-nav closed" 
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back 
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list 
data-section-nav><li><span 
class=section-nav-list-main-title>Documentation</span></li><li><a 
href=/documentation>Using the Documentation</a></li><li 
class=section-nav-item--collapsible><span class=section-nav-lis [...]
+function openMenu(){addPlaceholder();blockScroll();}</script><div 
class="clearfix container-main-content"><div class="section-nav closed" 
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back 
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list 
data-section-nav><li><span 
class=section-nav-list-main-title>Documentation</span></li><li><a 
href=/documentation>Using the Documentation</a></li><li 
class=section-nav-item--collapsible><span class=section-nav-lis [...]
 data-parallel processing pipelines. To get started with Beam, you&rsquo;ll 
need to
 understand an important set of core concepts:</p><ul><li><a 
href=#pipeline><em>Pipeline</em></a> - A pipeline is a user-constructed graph of
 transformations that defines the desired data processing 
operations.</li><li><a href=#pcollection><em>PCollection</em></a> - A 
<code>PCollection</code> is a data set or data
@@ -33,7 +33,13 @@ a <code>PCollection</code>. The schema for a 
<code>PCollection</code> defines el
 <code>PCollection</code> as an ordered list of named fields.</li><li><a 
href=/documentation/sdks/java/><em>SDK</em></a> - A language-specific library 
that lets
 pipeline authors build transforms, construct their pipelines, and submit
 them to a runner.</li><li><a href=#runner><em>Runner</em></a> - A runner runs 
a Beam pipeline using the capabilities of
-your chosen data processing engine.</li><li><a 
href=#trigger><em>Trigger</em></a> - A trigger determines when to aggregate the 
results of
+your chosen data processing engine.</li><li><a 
href=#window><em>Window</em></a> - A <code>PCollection</code> can be subdivided 
into windows based on
+the timestamps of the individual elements. Windows enable grouping operations
+over collections that grow over time by dividing the collection into windows
+of finite collections.</li><li><a href=#watermark><em>Watermark</em></a> - A 
watermark is a guess as to when all data in a
+certain window is expected to have arrived. This is needed because data isn’t
+always guaranteed to arrive in a pipeline in time order, or to always arrive
+at predictable intervals.</li><li><a href=#trigger><em>Trigger</em></a> - A 
trigger determines when to aggregate the results of
 each window.</li><li><a href=#state-and-timers><em>State and timers</em></a> - 
Per-key state and timer callbacks
 are lower level primitives that give you full control over aggregating input
 collections that grow over time.</li><li><a 
href=#splittable-dofn><em>Splittable DoFn</em></a> - Splittable DoFns let you 
process
@@ -95,18 +101,17 @@ responsible for providing initial timestamps. The runner 
must propagate and
 aggregate timestamps. If the timestamp is not important, such as with certain
 batch processing jobs where elements do not denote events, the timestamp will 
be
 the minimum representable timestamp, often referred to colloquially as 
&ldquo;negative
-infinity&rdquo;.</p><h4 id=watermarks>Watermarks</h4><p>Every 
<code>PCollection</code> must have a watermark that estimates how complete the
-<code>PCollection</code> is.</p><p>The watermark is a guess that 
&ldquo;we&rsquo;ll never see an element with an earlier
+infinity&rdquo;.</p><h4 id=watermarks>Watermarks</h4><p>Every 
<code>PCollection</code> must have a <a href=#watermark>watermark</a> that 
estimates how
+complete the <code>PCollection</code> is.</p><p>The watermark is a guess that 
&ldquo;we&rsquo;ll never see an element with an earlier
 timestamp&rdquo;. Data sources are responsible for producing a watermark. The 
runner
 must implement watermark propagation as PCollections are processed, merged, and
 partitioned.</p><p>The contents of a <code>PCollection</code> are complete 
when a watermark advances to
 &ldquo;infinity&rdquo;. In this manner, you can discover that an unbounded 
PCollection is
-finite.</p><h4 id=windowed-elements>Windowed elements</h4><p>Every element in 
a <code>PCollection</code> resides in a window. No element resides in
-multiple windows; two elements can be equal except for their window, but they
-are not the same.</p><p>When elements are read from the outside world, they 
arrive in the global window.
-When they are written to the outside world, they are effectively placed back
+finite.</p><h4 id=windowed-elements>Windowed elements</h4><p>Every element in 
a <code>PCollection</code> resides in a <a href=#window>window</a>. No element
+resides in multiple windows; two elements can be equal except for their window,
+but they are not the same.</p><p>When elements are written to the outside 
world, they are effectively placed back
 into the global window. Transforms that write data and don&rsquo;t take this
-perspective probably risks data loss.</p><p>A window has a maximum timestamp. 
When the watermark exceeds the maximum
+perspective risk data loss.</p><p>A window has a maximum timestamp. When the 
watermark exceeds the maximum
 timestamp plus the user-specified allowed lateness, the window is expired. All
 data related to an expired window might be discarded at any time.</p><h4 
id=coder>Coder</h4><p>Every <code>PCollection</code> has a coder, which is a 
specification of the binary format
 of the elements.</p><p>In Beam, the user&rsquo;s pipeline can be written in a 
language other than the
@@ -150,8 +155,8 @@ the transform. For example, when using <code>ParDo</code>, 
user-defined code spe
 operation to apply to every element. For <code>Combine</code>, it specifies 
how values
 should be combined. By using <a 
href=/documentation/patterns/cross-language/>cross-language transforms</a>,
 a Beam pipeline can contain UDFs written in a different language, or even
-multiple languages in the same pipeline.</p><p>Beam has several varieties of 
UDFs:</p><ul><li><a href=/programming-guide/#pardo><em>DoFn</em></a> - 
per-element processing function (used
-in <code>ParDo</code>)</li><li><a 
href=/programming-guide/#setting-your-pcollections-windowing-function><em>WindowFn</em></a>
 -
+multiple languages in the same pipeline.</p><p>Beam has several varieties of 
UDFs:</p><ul><li><a 
href=/documentation/programming-guide/#pardo><em>DoFn</em></a> - per-element 
processing
+function (used in <code>ParDo</code>)</li><li><a 
href=/documentation/programming-guide/#setting-your-pcollections-windowing-function><em>WindowFn</em></a>
 -
 places elements in windows and merges windows (used in <code>Window</code> and
 <code>GroupByKey</code>)</li><li><a 
href=/documentation/programming-guide/#side-inputs><em>ViewFn</em></a> - adapts 
a
 materialized <code>PCollection</code> to a particular interface (used in side 
inputs)</li><li><a 
href=/documentation/programming-guide/#side-inputs-windowing><em>WindowMappingFn</em></a>
 -
@@ -167,7 +172,7 @@ without communicating or sharing state with any of the 
other copies. Each copy
 of your user code function might be retried or run multiple times, depending on
 the pipeline runner and the processing backend that you choose for your
 pipeline. Beam also supports stateful processing through the
-<a href=/blog/stateful-processing/>stateful processing API</a>.</p><p>For more 
information about user-defined functions, see the following 
pages:</p><ul><li><a 
href=/documentation/programming-guide/#requirements-for-writing-user-code-for-beam-transforms>Requirements
 for writing user code for Beam transforms</a></li><li><a 
href=/documentation/programming-guide/#pardo>Beam Programming Guide: 
ParDo</a></li><li><a 
href=/programming-guide/#setting-your-pcollections-windowing-function>Beam Pro 
[...]
+<a href=/blog/stateful-processing/>stateful processing API</a>.</p><p>For more 
information about user-defined functions, see the following 
pages:</p><ul><li><a 
href=/documentation/programming-guide/#requirements-for-writing-user-code-for-beam-transforms>Requirements
 for writing user code for Beam transforms</a></li><li><a 
href=/documentation/programming-guide/#pardo>Beam Programming Guide: 
ParDo</a></li><li><a 
href=/documentation/programming-guide/#setting-your-pcollections-windowing-fun 
[...]
 schema for a <code>PCollection</code> defines elements of that 
<code>PCollection</code> as an ordered
 list of named fields. Each field has a name, a type, and possibly a set of user
 options.</p><p>In many cases, the element type in a <code>PCollection</code> 
has a structure that can be
@@ -188,7 +193,45 @@ Flink runner translates a Beam pipeline into a Flink job. 
The Direct Runner runs
 pipelines locally so you can test, debug, and validate that your pipeline
 adheres to the Apache Beam model as closely as possible.</p><p>For an 
up-to-date list of Beam runners and which features of the Apache Beam
 model they support, see the runner
-<a href=/documentation/runners/capability-matrix/>capability 
matrix</a>.</p><p>For more information about runners, see the following 
pages:</p><ul><li><a href=/documentation/#choosing-a-runner>Choosing a 
Runner</a></li><li><a href=/documentation/runners/capability-matrix/>Beam 
Capability Matrix</a></li></ul><h3 id=trigger>Trigger</h3><p>When collecting 
and grouping data into windows, Beam uses <em>triggers</em> to
+<a href=/documentation/runners/capability-matrix/>capability 
matrix</a>.</p><p>For more information about runners, see the following 
pages:</p><ul><li><a href=/documentation/#choosing-a-runner>Choosing a 
Runner</a></li><li><a href=/documentation/runners/capability-matrix/>Beam 
Capability Matrix</a></li></ul><h3 id=window>Window</h3><p>Windowing subdivides 
a <code>PCollection</code> into <em>windows</em> according to the timestamps
+of its individual elements. Windows enable grouping operations over unbounded
+collections by dividing the collection into windows of finite 
collections.</p><p>A <em>windowing function</em> tells the runner how to assign 
elements to one or more
+initial windows, and how to merge windows of grouped elements. Each element in 
a
+<code>PCollection</code> can only be in one window, so if a windowing function 
specifies
+multiple windows for an element, the element is conceptually duplicated into
+each of the windows, and each copy is identical except for its 
window.</p><p>Transforms that aggregate multiple elements, such as 
<code>GroupByKey</code> and <code>Combine</code>,
+work implicitly on a per-window basis; they process each 
<code>PCollection</code> as a
+succession of multiple, finite windows, though the entire collection itself may
+be of unbounded size.</p><p>Beam provides several windowing 
functions:</p><ul><li><strong>Fixed time windows</strong> (also known as 
&ldquo;tumbling windows&rdquo;) represent a consistent
+duration, non-overlapping time interval in the data 
stream.</li><li><strong>Sliding time windows</strong> (also known as 
&ldquo;hopping windows&rdquo;) also represent time
+intervals in the data stream; however, sliding time windows can 
overlap.</li><li><strong>Per-session windows</strong> define windows that 
contain elements that are within a
+certain gap duration of another element.</li><li><strong>Single global 
window</strong>: by default, all data in a <code>PCollection</code> is assigned 
to
+the single global window, and late data is 
discarded.</li><li><strong>Calendar-based windows</strong> (not supported by 
the Beam SDK for Python)</li></ul><p>You can also define your own windowing 
function if you have more complex
+requirements.</p><p>For example, let&rsquo;s say we have a 
<code>PCollection</code> that uses fixed-time windowing,
+with windows that are five minutes long. For each window, Beam must collect all
+the data with an event time timestamp in the given window range (between 0:00
+and 4:59 in the first window, for instance). Data with timestamps outside that
+range (data from 5:00 or later) belongs to a different window.</p><p>Two 
concepts are closely related to windowing and covered in the following
+sections: <a href=#watermark>watermarks</a> and <a 
href=#trigger>triggers</a>.</p><p>For more information about windows, see the 
following pages:</p><ul><li><a 
href=/documentation/programming-guide/#windowing>Beam Programming Guide: 
Windowing</a></li><li><a 
href=/documentation/programming-guide/#setting-your-pcollections-windowing-function>Beam
 Programming Guide: WindowFn</a></li></ul><h3 id=watermark>Watermark</h3><p>In 
any data processing system, there is a certain amount of lag between [...]
+a data event occurs (the “event time”, determined by the timestamp on the data
+element itself) and the time the actual data element gets processed at any 
stage
+in your pipeline (the “processing time”, determined by the clock on the system
+processing the element). In addition, data isn’t always guaranteed to arrive in
+a pipeline in time order, or to always arrive at predictable intervals. For
+example, you might have intermediate systems that don&rsquo;t preserve order, 
or you
+might have two servers that timestamp data but one has a better network
+connection.</p><p>To address this potential unpredictability, Beam tracks a 
<em>watermark</em>. A
+watermark is a guess as to when all data in a certain window is expected to 
have
+arrived in the pipeline. You can also think of this as “we’ll never see an
+element with an earlier timestamp”.</p><p>Data sources are responsible for 
producing a watermark, and every <code>PCollection</code>
+must have a watermark that estimates how complete the <code>PCollection</code> 
is. The
+contents of a <code>PCollection</code> are complete when a watermark advances 
to
+“infinity”. In this manner, you might discover that an unbounded 
<code>PCollection</code>
+is finite. After the watermark progresses past the end of a window, any further
+element that arrives with a timestamp in that window is considered <em>late 
data</em>.</p><p>Triggers are a related concept that allows you to modify and 
refine the windowing
+strategy for a <code>PCollection</code>. You can use triggers to decide when 
each
+individual window aggregates and reports its results, including how the window
+emits late elements.</p><p>For more information about watermarks, see the 
following page:</p><ul><li><a 
href=/documentation/programming-guide/#watermarks-and-late-data>Beam 
Programming Guide: Watermarks and late data</a></li></ul><h3 
id=trigger>Trigger</h3><p>When collecting and grouping data into windows, Beam 
uses <em>triggers</em> to
 determine when to emit the aggregated results of each window (referred to as a
 <em>pane</em>). If you use Beam’s default windowing configuration and default 
trigger,
 Beam outputs the aggregated result when it estimates all data has arrived, and
@@ -259,7 +302,7 @@ checkpoint the sub-element and the runner repeats step 
2.</li></ol><p>You can al
 processing. For example, if you write a splittable <code>DoFn</code> to watch 
a set of
 directories and output filenames as they arrive, you can split to subdivide the
 work of different directories. This allows the runner to split off a hot
-directory and give it additional resources.</p><p>For more information about 
Splittable <code>DoFn</code>, see the following pages:</p><ul><li><a 
href=/documentation/programming-guide/#splittable-dofns>Splittable 
DoFns</a></li><li><a href=/blog/splittable-do-fn-is-available/>Splittable DoFn 
in Apache Beam is Ready to Use</a></li></ul><div class=feedback><p 
class=update>Last updated on 2021/10/21</p><h3>Have you found everything you 
were looking for?</h3><p class=description>Was it all us [...]
+directory and give it additional resources.</p><p>For more information about 
Splittable <code>DoFn</code>, see the following pages:</p><ul><li><a 
href=/documentation/programming-guide/#splittable-dofns>Splittable 
DoFns</a></li><li><a href=/blog/splittable-do-fn-is-available/>Splittable DoFn 
in Apache Beam is Ready to Use</a></li></ul><div class=feedback><p 
class=update>Last updated on 2021/12/06</p><h3>Have you found everything you 
were looking for?</h3><p class=description>Was it all us [...]
 <a href=http://www.apache.org>The Apache Software Foundation</a>
 | <a href=/privacy_policy>Privacy Policy</a>
 | <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam 
logo, and the Apache feather logo are either registered trademarks or 
trademarks of The Apache Software Foundation. All other products or name brands 
are trademarks of their respective holders, including The Apache Software 
Foundation.</div></div></div></div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/documentation/index.xml 
b/website/generated-content/documentation/index.xml
index fc279a0..53c27a4 100644
--- a/website/generated-content/documentation/index.xml
+++ b/website/generated-content/documentation/index.xml
@@ -3205,6 +3205,14 @@ pipeline authors build transforms, construct their 
pipelines, and submit
 them to a runner.&lt;/li>
 &lt;li>&lt;a href="#runner">&lt;em>Runner&lt;/em>&lt;/a> - A runner runs a 
Beam pipeline using the capabilities of
 your chosen data processing engine.&lt;/li>
+&lt;li>&lt;a href="#window">&lt;em>Window&lt;/em>&lt;/a> - A 
&lt;code>PCollection&lt;/code> can be subdivided into windows based on
+the timestamps of the individual elements. Windows enable grouping operations
+over collections that grow over time by dividing the collection into windows
+of finite collections.&lt;/li>
+&lt;li>&lt;a href="#watermark">&lt;em>Watermark&lt;/em>&lt;/a> - A watermark 
is a guess as to when all data in a
+certain window is expected to have arrived. This is needed because data isn’t
+always guaranteed to arrive in a pipeline in time order, or to always arrive
+at predictable intervals.&lt;/li>
 &lt;li>&lt;a href="#trigger">&lt;em>Trigger&lt;/em>&lt;/a> - A trigger 
determines when to aggregate the results of
 each window.&lt;/li>
 &lt;li>&lt;a href="#state-and-timers">&lt;em>State and timers&lt;/em>&lt;/a> - 
Per-key state and timer callbacks
@@ -3316,8 +3324,8 @@ batch processing jobs where elements do not denote 
events, the timestamp will be
 the minimum representable timestamp, often referred to colloquially as 
&amp;ldquo;negative
 infinity&amp;rdquo;.&lt;/p>
 &lt;h4 id="watermarks">Watermarks&lt;/h4>
-&lt;p>Every &lt;code>PCollection&lt;/code> must have a watermark that 
estimates how complete the
-&lt;code>PCollection&lt;/code> is.&lt;/p>
+&lt;p>Every &lt;code>PCollection&lt;/code> must have a &lt;a 
href="#watermark">watermark&lt;/a> that estimates how
+complete the &lt;code>PCollection&lt;/code> is.&lt;/p>
 &lt;p>The watermark is a guess that &amp;ldquo;we&amp;rsquo;ll never see an 
element with an earlier
 timestamp&amp;rdquo;. Data sources are responsible for producing a watermark. 
The runner
 must implement watermark propagation as PCollections are processed, merged, and
@@ -3326,13 +3334,12 @@ partitioned.&lt;/p>
 &amp;ldquo;infinity&amp;rdquo;. In this manner, you can discover that an 
unbounded PCollection is
 finite.&lt;/p>
 &lt;h4 id="windowed-elements">Windowed elements&lt;/h4>
-&lt;p>Every element in a &lt;code>PCollection&lt;/code> resides in a window. 
No element resides in
-multiple windows; two elements can be equal except for their window, but they
-are not the same.&lt;/p>
-&lt;p>When elements are read from the outside world, they arrive in the global 
window.
-When they are written to the outside world, they are effectively placed back
+&lt;p>Every element in a &lt;code>PCollection&lt;/code> resides in a &lt;a 
href="#window">window&lt;/a>. No element
+resides in multiple windows; two elements can be equal except for their window,
+but they are not the same.&lt;/p>
+&lt;p>When elements are written to the outside world, they are effectively 
placed back
 into the global window. Transforms that write data and don&amp;rsquo;t take 
this
-perspective probably risks data loss.&lt;/p>
+perspective risk data loss.&lt;/p>
 &lt;p>A window has a maximum timestamp. When the watermark exceeds the maximum
 timestamp plus the user-specified allowed lateness, the window is expired. All
 data related to an expired window might be discarded at any time.&lt;/p>
@@ -3410,9 +3417,9 @@ a Beam pipeline can contain UDFs written in a different 
language, or even
 multiple languages in the same pipeline.&lt;/p>
 &lt;p>Beam has several varieties of UDFs:&lt;/p>
 &lt;ul>
-&lt;li>&lt;a href="/programming-guide/#pardo">&lt;em>DoFn&lt;/em>&lt;/a> - 
per-element processing function (used
-in &lt;code>ParDo&lt;/code>)&lt;/li>
-&lt;li>&lt;a 
href="/programming-guide/#setting-your-pcollections-windowing-function">&lt;em>WindowFn&lt;/em>&lt;/a>
 -
+&lt;li>&lt;a 
href="/documentation/programming-guide/#pardo">&lt;em>DoFn&lt;/em>&lt;/a> - 
per-element processing
+function (used in &lt;code>ParDo&lt;/code>)&lt;/li>
+&lt;li>&lt;a 
href="/documentation/programming-guide/#setting-your-pcollections-windowing-function">&lt;em>WindowFn&lt;/em>&lt;/a>
 -
 places elements in windows and merges windows (used in 
&lt;code>Window&lt;/code> and
 &lt;code>GroupByKey&lt;/code>)&lt;/li>
 &lt;li>&lt;a 
href="/documentation/programming-guide/#side-inputs">&lt;em>ViewFn&lt;/em>&lt;/a>
 - adapts a
@@ -3439,7 +3446,7 @@ pipeline. Beam also supports stateful processing through 
the
 &lt;ul>
 &lt;li>&lt;a 
href="/documentation/programming-guide/#requirements-for-writing-user-code-for-beam-transforms">Requirements
 for writing user code for Beam transforms&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#pardo">Beam Programming 
Guide: ParDo&lt;/a>&lt;/li>
-&lt;li>&lt;a 
href="/programming-guide/#setting-your-pcollections-windowing-function">Beam 
Programming Guide: WindowFn&lt;/a>&lt;/li>
+&lt;li>&lt;a 
href="/documentation/programming-guide/#setting-your-pcollections-windowing-function">Beam
 Programming Guide: WindowFn&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#combine">Beam Programming 
Guide: CombineFn&lt;/a>&lt;/li>
 &lt;li>&lt;a 
href="/documentation/programming-guide/#data-encoding-and-type-safety">Beam 
Programming Guide: Coder&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#side-inputs">Beam 
Programming Guide: Side inputs&lt;/a>&lt;/li>
@@ -3482,6 +3489,73 @@ model they support, see the runner
 &lt;li>&lt;a href="/documentation/#choosing-a-runner">Choosing a 
Runner&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/runners/capability-matrix/">Beam Capability 
Matrix&lt;/a>&lt;/li>
 &lt;/ul>
+&lt;h3 id="window">Window&lt;/h3>
+&lt;p>Windowing subdivides a &lt;code>PCollection&lt;/code> into 
&lt;em>windows&lt;/em> according to the timestamps
+of its individual elements. Windows enable grouping operations over unbounded
+collections by dividing the collection into windows of finite 
collections.&lt;/p>
+&lt;p>A &lt;em>windowing function&lt;/em> tells the runner how to assign 
elements to one or more
+initial windows, and how to merge windows of grouped elements. Each element in 
a
+&lt;code>PCollection&lt;/code> can only be in one window, so if a windowing 
function specifies
+multiple windows for an element, the element is conceptually duplicated into
+each of the windows, and each copy is identical except for its window.&lt;/p>
+&lt;p>Transforms that aggregate multiple elements, such as 
&lt;code>GroupByKey&lt;/code> and &lt;code>Combine&lt;/code>,
+work implicitly on a per-window basis; they process each 
&lt;code>PCollection&lt;/code> as a
+succession of multiple, finite windows, though the entire collection itself may
+be of unbounded size.&lt;/p>
+&lt;p>Beam provides several windowing functions:&lt;/p>
+&lt;ul>
+&lt;li>&lt;strong>Fixed time windows&lt;/strong> (also known as 
&amp;ldquo;tumbling windows&amp;rdquo;) represent a consistent
+duration, non-overlapping time interval in the data stream.&lt;/li>
+&lt;li>&lt;strong>Sliding time windows&lt;/strong> (also known as 
&amp;ldquo;hopping windows&amp;rdquo;) also represent time
+intervals in the data stream; however, sliding time windows can 
overlap.&lt;/li>
+&lt;li>&lt;strong>Per-session windows&lt;/strong> define windows that contain 
elements that are within a
+certain gap duration of another element.&lt;/li>
+&lt;li>&lt;strong>Single global window&lt;/strong>: by default, all data in a 
&lt;code>PCollection&lt;/code> is assigned to
+the single global window, and late data is discarded.&lt;/li>
+&lt;li>&lt;strong>Calendar-based windows&lt;/strong> (not supported by the 
Beam SDK for Python)&lt;/li>
+&lt;/ul>
+&lt;p>You can also define your own windowing function if you have more complex
+requirements.&lt;/p>
+&lt;p>For example, let&amp;rsquo;s say we have a 
&lt;code>PCollection&lt;/code> that uses fixed-time windowing,
+with windows that are five minutes long. For each window, Beam must collect all
+the data with an event time timestamp in the given window range (between 0:00
+and 4:59 in the first window, for instance). Data with timestamps outside that
+range (data from 5:00 or later) belongs to a different window.&lt;/p>
+&lt;p>Two concepts are closely related to windowing and covered in the 
following
+sections: &lt;a href="#watermark">watermarks&lt;/a> and &lt;a 
href="#trigger">triggers&lt;/a>.&lt;/p>
+&lt;p>For more information about windows, see the following pages:&lt;/p>
+&lt;ul>
+&lt;li>&lt;a href="/documentation/programming-guide/#windowing">Beam 
Programming Guide: Windowing&lt;/a>&lt;/li>
+&lt;li>&lt;a 
href="/documentation/programming-guide/#setting-your-pcollections-windowing-function">Beam
 Programming Guide: WindowFn&lt;/a>&lt;/li>
+&lt;/ul>
+&lt;h3 id="watermark">Watermark&lt;/h3>
+&lt;p>In any data processing system, there is a certain amount of lag between 
the time
+a data event occurs (the “event time”, determined by the timestamp on the data
+element itself) and the time the actual data element gets processed at any 
stage
+in your pipeline (the “processing time”, determined by the clock on the system
+processing the element). In addition, data isn’t always guaranteed to arrive in
+a pipeline in time order, or to always arrive at predictable intervals. For
+example, you might have intermediate systems that don&amp;rsquo;t preserve 
order, or you
+might have two servers that timestamp data but one has a better network
+connection.&lt;/p>
+&lt;p>To address this potential unpredictability, Beam tracks a 
&lt;em>watermark&lt;/em>. A
+watermark is a guess as to when all data in a certain window is expected to 
have
+arrived in the pipeline. You can also think of this as “we’ll never see an
+element with an earlier timestamp”.&lt;/p>
+&lt;p>Data sources are responsible for producing a watermark, and every 
&lt;code>PCollection&lt;/code>
+must have a watermark that estimates how complete the 
&lt;code>PCollection&lt;/code> is. The
+contents of a &lt;code>PCollection&lt;/code> are complete when a watermark 
advances to
+“infinity”. In this manner, you might discover that an unbounded 
&lt;code>PCollection&lt;/code>
+is finite. After the watermark progresses past the end of a window, any further
+element that arrives with a timestamp in that window is considered &lt;em>late 
data&lt;/em>.&lt;/p>
+&lt;p>Triggers are a related concept that allows you to modify and refine the 
windowing
+strategy for a &lt;code>PCollection&lt;/code>. You can use triggers to decide 
when each
+individual window aggregates and reports its results, including how the window
+emits late elements.&lt;/p>
+&lt;p>For more information about watermarks, see the following page:&lt;/p>
+&lt;ul>
+&lt;li>&lt;a 
href="/documentation/programming-guide/#watermarks-and-late-data">Beam 
Programming Guide: Watermarks and late data&lt;/a>&lt;/li>
+&lt;/ul>
 &lt;h3 id="trigger">Trigger&lt;/h3>
 &lt;p>When collecting and grouping data into windows, Beam uses 
&lt;em>triggers&lt;/em> to
 determine when to emit the aggregated results of each window (referred to as a
@@ -8893,9 +8967,12 @@ window.&lt;/p>
 &lt;/ul>
 &lt;p>You can also define your own &lt;code>WindowFn&lt;/code> if you have a 
more complex need.&lt;/p>
 &lt;p>Note that each element can logically belong to more than one window, 
depending
-on the windowing function you use. Sliding time windowing, for example, creates
-overlapping windows wherein a single element can be assigned to multiple
-windows.&lt;/p>
+on the windowing function you use. Sliding time windowing, for example, can
+create overlapping windows wherein a single element can be assigned to multiple
+windows. However, each element in a &lt;code>PCollection&lt;/code> can only be 
in one window, so
+if an element is assigned to multiple windows, the element is conceptually
+duplicated into each of the windows, and each copy is identical except for 
its
+window.&lt;/p>
 &lt;h4 id="fixed-time-windows">8.2.1. Fixed time windows&lt;/h4>
 &lt;p>The simplest form of windowing is using &lt;strong>fixed time 
windows&lt;/strong>: given a
 timestamped &lt;code>PCollection&lt;/code> which might be continuously 
updating, each window
diff --git 
a/website/generated-content/documentation/programming-guide/index.html 
b/website/generated-content/documentation/programming-guide/index.html
index 1ef8286..82acce2 100644
--- a/website/generated-content/documentation/programming-guide/index.html
+++ b/website/generated-content/documentation/programming-guide/index.html
@@ -2558,9 +2558,12 @@ for that <code>PCollection</code>. The 
<code>GroupByKey</code> transform groups
 subsequent <code>ParDo</code> transform gets applied multiple times per key, 
once for each
 window.</p><h3 id=provided-windowing-functions>8.2. Provided windowing 
functions</h3><p>You can define different kinds of windows to divide the 
elements of your
 <code>PCollection</code>. Beam provides several windowing functions, 
including:</p><ul><li>Fixed Time Windows</li><li>Sliding Time 
Windows</li><li>Per-Session Windows</li><li>Single Global 
Window</li><li>Calendar-based Windows (not supported by the Beam SDK for Python 
or Go)</li></ul><p>You can also define your own <code>WindowFn</code> if you 
have a more complex need.</p><p>Note that each element can logically belong to 
more than one window, depending
-on the windowing function you use. Sliding time windowing, for example, creates
-overlapping windows wherein a single element can be assigned to multiple
-windows.</p><h4 id=fixed-time-windows>8.2.1. Fixed time windows</h4><p>The 
simplest form of windowing is using <strong>fixed time windows</strong>: given a
+on the windowing function you use. Sliding time windowing, for example, can
+create overlapping windows wherein a single element can be assigned to multiple
+windows. However, each element in a <code>PCollection</code> can only be in 
one window, so
+if an element is assigned to multiple windows, the element is conceptually
+duplicated into each of the windows and each element is identical except for 
its
+window.</p><h4 id=fixed-time-windows>8.2.1. Fixed time windows</h4><p>The 
simplest form of windowing is using <strong>fixed time windows</strong>: given a
 timestamped <code>PCollection</code> which might be continuously updating, 
each window
 might capture (for example) all elements with timestamps that fall into a 30
 second interval.</p><p>A fixed time window represents a consistent duration, 
non overlapping time
@@ -4307,7 +4310,7 @@ expansionAddr := &#34;localhost:8097&#34;
 outT := beam.UnnamedOutput(typex.New(reflectx.String))
 res := beam.CrossLanguage(s, urn, payload, expansionAddr, 
beam.UnnamedInput(inputPCol), outT)
    </code></pre></div></div></li><li><p>After the job has been submitted to 
the Beam runner, shutdown the expansion service by
-terminating the expansion service process.</p></li></ol><h3 
id=x-lang-transform-runner-support>13.3. Runner Support</h3><p>Currently, 
portable runners such as Flink, Spark, and the Direct runner can be used with 
multi-language pipelines.</p><p>Google Cloud Dataflow supports multi-language 
pipelines through the Dataflow Runner v2 backend architecture.</p><div 
class=feedback><p class=update>Last updated on 2021/11/18</p><h3>Have you found 
everything you were looking for?</h3><p class=descr [...]
+terminating the expansion service process.</p></li></ol><h3 
id=x-lang-transform-runner-support>13.3. Runner Support</h3><p>Currently, 
portable runners such as Flink, Spark, and the Direct runner can be used with 
multi-language pipelines.</p><p>Google Cloud Dataflow supports multi-language 
pipelines through the Dataflow Runner v2 backend architecture.</p><div 
class=feedback><p class=update>Last updated on 2021/12/06</p><h3>Have you found 
everything you were looking for?</h3><p class=descr [...]
 <a href=http://www.apache.org>The Apache Software Foundation</a>
 | <a href=/privacy_policy>Privacy Policy</a>
 | <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam 
logo, and the Apache feather logo are either registered trademarks or 
trademarks of The Apache Software Foundation. All other products or name brands 
are trademarks of their respective holders, including The Apache Software 
Foundation.</div></div></div></div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 78bbfff..5047cde 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.34.0/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-12-01T21:32:04+03:00</lastmod></url><url><loc>/blog/g
 [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.34.0/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-12-01T21:32:04+03:00</lastmod></url><url><loc>/blog/g
 [...]
\ No newline at end of file
