[beam] branch asf-site updated: Publishing website 2021/10/27 00:01:45 at commit 5cb634e

git-site-role Tue, 26 Oct 2021 17:02:35 -0700

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git



The following commit(s) were added to refs/heads/asf-site by this push:
     new b5fbe9e  Publishing website 2021/10/27 00:01:45 at commit 5cb634e
b5fbe9e is described below

commit b5fbe9e0dc7f19c65d536b81f83089af018a322f
Author: jenkins <[email protected]>
AuthorDate: Wed Oct 27 00:01:46 2021 +0000

    Publishing website 2021/10/27 00:01:45 at commit 5cb634e
---
 .../documentation/basics/index.html                | 26 +++++++++++++--
 website/generated-content/documentation/index.xml  | 38 ++++++++++++++++++++++
 website/generated-content/sitemap.xml              |  2 +-
 3 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/website/generated-content/documentation/basics/index.html 
b/website/generated-content/documentation/basics/index.html
index 14a033f..8dc0acd 100644
--- a/website/generated-content/documentation/basics/index.html
+++ b/website/generated-content/documentation/basics/index.html
@@ -18,7 +18,7 @@
 function addPlaceholder(){$('input:text').attr('placeholder',"What are you 
looking for?");}
 function endSearch(){var 
search=document.querySelector(".searchBar");search.classList.add("disappear");var
 icons=document.querySelector("#iconsBar");icons.classList.remove("disappear");}
 function blockScroll(){$("body").toggleClass("fixedPosition");}
-function openMenu(){addPlaceholder();blockScroll();}</script><div 
class="clearfix container-main-content"><div class="section-nav closed" 
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back 
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list 
data-section-nav><li><span 
class=section-nav-list-main-title>Documentation</span></li><li><a 
href=/documentation>Using the Documentation</a></li><li 
class=section-nav-item--collapsible><span class=section-nav-lis [...]
+function openMenu(){addPlaceholder();blockScroll();}</script><div 
class="clearfix container-main-content"><div class="section-nav closed" 
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back 
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list 
data-section-nav><li><span 
class=section-nav-list-main-title>Documentation</span></li><li><a 
href=/documentation>Using the Documentation</a></li><li 
class=section-nav-item--collapsible><span class=section-nav-lis [...]
 data-parallel processing pipelines. To get started with Beam, you&rsquo;ll 
need to
 understand an important set of core concepts:</p><ul><li><a 
href=#pipeline><em>Pipeline</em></a> - A pipeline is a user-constructed graph of
 transformations that defines the desired data processing 
operations.</li><li><a href=#pcollection><em>PCollection</em></a> - A 
<code>PCollection</code> is a data set or data
@@ -33,7 +33,10 @@ a <code>PCollection</code>. The schema for a 
<code>PCollection</code> defines el
 <code>PCollection</code> as an ordered list of named fields.</li><li><a 
href=/documentation/sdks/java/><em>SDK</em></a> - A language-specific library 
that lets
 pipeline authors build transforms, construct their pipelines, and submit
 them to a runner.</li><li><a href=#runner><em>Runner</em></a> - A runner runs 
a Beam pipeline using the capabilities of
-your chosen data processing engine.</li></ul><p>The following sections cover 
these concepts in more detail and provide links to
+your chosen data processing engine.</li><li><a 
href=#splittable-dofn><em>Splittable DoFn</em></a> - Splittable DoFns let you 
process
+elements in a non-monolithic way. You can checkpoint the processing of an
+element, and the runner can split the remaining work to yield additional
+parallelism.</li></ul><p>The following sections cover these concepts in more 
detail and provide links to
 additional documentation.</p><h3 id=pipeline>Pipeline</h3><p>A Beam pipeline 
is a graph (specifically, a
 <a href=https://en.wikipedia.org/wiki/Directed_acyclic_graph>directed acyclic 
graph</a>)
 of all the data and computations in your data processing task. This includes
@@ -182,7 +185,24 @@ Flink runner translates a Beam pipeline into a Flink job. 
The Direct Runner runs
 pipelines locally so you can test, debug, and validate that your pipeline
 adheres to the Apache Beam model as closely as possible.</p><p>For an 
up-to-date list of Beam runners and which features of the Apache Beam
 model they support, see the runner
-<a href=/documentation/runners/capability-matrix/>capability 
matrix</a>.</p><p>For more information about runners, see the following 
pages:</p><ul><li><a href=/documentation/#choosing-a-runner>Choosing a 
Runner</a></li><li><a href=/documentation/runners/capability-matrix/>Beam 
Capability Matrix</a></li></ul><div class=feedback><p class=update>Last updated 
on 2021/10/25</p><h3>Have you found everything you were looking for?</h3><p 
class=description>Was it all useful and clear? Is there an [...]
+<a href=/documentation/runners/capability-matrix/>capability 
matrix</a>.</p><p>For more information about runners, see the following 
pages:</p><ul><li><a href=/documentation/#choosing-a-runner>Choosing a 
Runner</a></li><li><a href=/documentation/runners/capability-matrix/>Beam 
Capability Matrix</a></li></ul><h3 id=splittable-dofn>Splittable 
DoFn</h3><p>Splittable <code>DoFn</code> (SDF) is a generalization of 
<code>DoFn</code> that lets you process
+elements in a non-monolithic way. Splittable <code>DoFn</code> makes it easier 
to create
+complex, modular I/O connectors in Beam.</p><p>A regular <code>ParDo</code> 
processes an entire element at a time, applying your regular
+<code>DoFn</code> and waiting for the call to terminate. When you instead 
apply a
+splittable <code>DoFn</code> to each element, the runner has the option of 
splitting the
+element&rsquo;s processing into smaller tasks. You can checkpoint the 
processing of an
+element, and you can split the remaining work to yield additional 
parallelism.</p><p>For example, imagine you want to read every line from very 
large text files.
+When you write your splittable <code>DoFn</code>, you can have separate pieces 
of logic to
+read a segment of a file, split a segment of a file into sub-segments, and
+report progress through the current segment. The runner can then invoke your
+splittable <code>DoFn</code> intelligently to split up each input and read 
portions
+separately, in parallel.</p><p>A common computation pattern has the following 
steps:</p><ol><li>The runner splits an incoming element before starting any 
processing.</li><li>The runner starts running your processing logic on each 
sub-element.</li><li>If the runner notices that some sub-elements are taking 
longer than others,
+the runner splits those sub-elements further and repeats step 2.</li><li>The 
sub-element either finishes processing, or the user chooses to
+checkpoint the sub-element and the runner repeats step 2.</li></ol><p>You can 
also write your splittable <code>DoFn</code> so the runner can split the 
unbounded
+processing. For example, if you write a splittable <code>DoFn</code> to watch 
a set of
+directories and output filenames as they arrive, you can split to subdivide the
+work of different directories. This allows the runner to split off a hot
+directory and give it additional resources.</p><p>For more information about 
Splittable <code>DoFn</code>, see the following pages:</p><ul><li><a 
href=/documentation/programming-guide/#splittable-dofns>Splittable 
DoFns</a></li><li><a href=/blog/splittable-do-fn-is-available/>Splittable DoFn 
in Apache Beam is Ready to Use</a></li></ul><div class=feedback><p 
class=update>Last updated on 2021/10/26</p><h3>Have you found everything you 
were looking for?</h3><p class=description>Was it all us [...]
 <a href=http://www.apache.org>The Apache Software Foundation</a>
 | <a href=/privacy_policy>Privacy Policy</a>
 | <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam 
logo, and the Apache feather logo are either registered trademarks or 
trademarks of The Apache Software Foundation. All other products or name brands 
are trademarks of their respective holders, including The Apache Software 
Foundation.</div></div></div></div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/documentation/index.xml 
b/website/generated-content/documentation/index.xml
index 69e5ee9..3fd4cca 100644
--- a/website/generated-content/documentation/index.xml
+++ b/website/generated-content/documentation/index.xml
@@ -3205,6 +3205,10 @@ pipeline authors build transforms, construct their 
pipelines, and submit
 them to a runner.&lt;/li>
 &lt;li>&lt;a href="#runner">&lt;em>Runner&lt;/em>&lt;/a> - A runner runs a 
Beam pipeline using the capabilities of
 your chosen data processing engine.&lt;/li>
+&lt;li>&lt;a href="#splittable-dofn">&lt;em>Splittable DoFn&lt;/em>&lt;/a> - 
Splittable DoFns let you process
+elements in a non-monolithic way. You can checkpoint the processing of an
+element, and the runner can split the remaining work to yield additional
+parallelism.&lt;/li>
 &lt;/ul>
 &lt;p>The following sections cover these concepts in more detail and provide 
links to
 additional documentation.&lt;/p>
@@ -3472,6 +3476,40 @@ model they support, see the runner
 &lt;ul>
 &lt;li>&lt;a href="/documentation/#choosing-a-runner">Choosing a 
Runner&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/runners/capability-matrix/">Beam Capability 
Matrix&lt;/a>&lt;/li>
+&lt;/ul>
+&lt;h3 id="splittable-dofn">Splittable DoFn&lt;/h3>
+&lt;p>Splittable &lt;code>DoFn&lt;/code> (SDF) is a generalization of 
&lt;code>DoFn&lt;/code> that lets you process
+elements in a non-monolithic way. Splittable &lt;code>DoFn&lt;/code> makes it 
easier to create
+complex, modular I/O connectors in Beam.&lt;/p>
+&lt;p>A regular &lt;code>ParDo&lt;/code> processes an entire element at a 
time, applying your regular
+&lt;code>DoFn&lt;/code> and waiting for the call to terminate. When you 
instead apply a
+splittable &lt;code>DoFn&lt;/code> to each element, the runner has the option 
of splitting the
+element&amp;rsquo;s processing into smaller tasks. You can checkpoint the 
processing of an
+element, and you can split the remaining work to yield additional 
parallelism.&lt;/p>
+&lt;p>For example, imagine you want to read every line from very large text 
files.
+When you write your splittable &lt;code>DoFn&lt;/code>, you can have separate 
pieces of logic to
+read a segment of a file, split a segment of a file into sub-segments, and
+report progress through the current segment. The runner can then invoke your
+splittable &lt;code>DoFn&lt;/code> intelligently to split up each input and 
read portions
+separately, in parallel.&lt;/p>
+&lt;p>A common computation pattern has the following steps:&lt;/p>
+&lt;ol>
+&lt;li>The runner splits an incoming element before starting any 
processing.&lt;/li>
+&lt;li>The runner starts running your processing logic on each 
sub-element.&lt;/li>
+&lt;li>If the runner notices that some sub-elements are taking longer than 
others,
+the runner splits those sub-elements further and repeats step 2.&lt;/li>
+&lt;li>The sub-element either finishes processing, or the user chooses to
+checkpoint the sub-element and the runner repeats step 2.&lt;/li>
+&lt;/ol>
+&lt;p>You can also write your splittable &lt;code>DoFn&lt;/code> so the runner 
can split the unbounded
+processing. For example, if you write a splittable &lt;code>DoFn&lt;/code> to 
watch a set of
+directories and output filenames as they arrive, you can split to subdivide the
+work of different directories. This allows the runner to split off a hot
+directory and give it additional resources.&lt;/p>
+&lt;p>For more information about Splittable &lt;code>DoFn&lt;/code>, see the 
following pages:&lt;/p>
+&lt;ul>
+&lt;li>&lt;a 
href="/documentation/programming-guide/#splittable-dofns">Splittable 
DoFns&lt;/a>&lt;/li>
+&lt;li>&lt;a href="/blog/splittable-do-fn-is-available/">Splittable DoFn in 
Apache Beam is Ready to Use&lt;/a>&lt;/li>
 &lt;/ul></description></item><item><title>Documentation: Beam 
glossary</title><link>/documentation/glossary/</link><pubDate>Mon, 01 Jan 0001 
00:00:00 +0000</pubDate><guid>/documentation/glossary/</guid><description>
 &lt;!--
 Licensed under the Apache License, Version 2.0 (the "License");
diff --git a/website/generated-content/sitemap.xml 
b/website/generated-content/sitemap.xml
index 32613c6..b570578 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.33.0/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/blog/b
 [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.33.0/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-10-11T18:22:03-07:00</lastmod></url><url><loc>/blog/b
 [...]
\ No newline at end of file
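
Note on the Splittable DoFn section added in this commit: the split, process, and checkpoint pattern it describes can be sketched in plain Python. This is an illustrative model only and does not use or reproduce the Beam SDK. The names `Segment`, `split_evenly`, `process_with_checkpoint`, and `run` are hypothetical, the "runner" is a simple in-process queue, and upper-casing each line stands in for real processing logic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """Half-open range [start, stop) of positions within one element (hypothetical)."""
    start: int
    stop: int

    def size(self) -> int:
        return self.stop - self.start


def split_evenly(segment: Segment, parts: int) -> List[Segment]:
    """Initial split: divide a segment into at most `parts` contiguous sub-segments."""
    step = max(1, -(-segment.size() // parts))  # ceiling division, at least 1
    return [
        Segment(pos, min(pos + step, segment.stop))
        for pos in range(segment.start, segment.stop, step)
    ]


def process_with_checkpoint(
    segment: Segment, data: List[str], budget: int
) -> Tuple[List[str], Optional[Segment]]:
    """Process at most `budget` positions, then checkpoint.

    Returns the outputs produced so far and the residual segment that still
    needs processing (None when the segment is finished).
    """
    end = min(segment.start + budget, segment.stop)
    outputs = [data[i].upper() for i in range(segment.start, end)]
    residual = Segment(end, segment.stop) if end < segment.stop else None
    return outputs, residual


def run(data: List[str], parts: int, budget: int) -> List[str]:
    """Toy runner: split the input, process sub-segments, re-queue residuals."""
    queue = split_evenly(Segment(0, len(data)), parts)
    results: List[Tuple[int, List[str]]] = []
    while queue:
        seg = queue.pop(0)
        outputs, residual = process_with_checkpoint(seg, data, budget)
        results.append((seg.start, outputs))
        if residual is not None:
            queue.append(residual)  # the remaining work is scheduled again
    # Reassemble by position for readability; this toy ordering is not a
    # statement about what a real runner guarantees.
    return [line for _, chunk in sorted(results) for line in chunk]
```

Calling `run(["a", "b", "c", "d", "e"], parts=2, budget=1)` first splits the input into two sub-segments, then processes one position per call and re-queues the remainder each time, which mirrors the "repeat step 2" wording in the list above.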
