This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 2f207cf Publishing website 2020/11/10 18:03:00 at commit 4a62f2b
2f207cf is described below
commit 2f207cfe307dee193ee7e1f95d9d7d093fbf91cd
Author: jenkins <[email protected]>
AuthorDate: Tue Nov 10 18:03:00 2020 +0000
Publishing website 2020/11/10 18:03:00 at commit 4a62f2b
---
.../documentation/runners/direct/index.html | 90 ++++++++++++----------
.../documentation/runners/flink/index.html | 2 +-
website/generated-content/sitemap.xml | 2 +-
3 files changed, 52 insertions(+), 42 deletions(-)
diff --git a/website/generated-content/documentation/runners/direct/index.html
b/website/generated-content/documentation/runners/direct/index.html
index 952a053..3bb23d4 100644
--- a/website/generated-content/documentation/runners/direct/index.html
+++ b/website/generated-content/documentation/runners/direct/index.html
@@ -9,52 +9,62 @@
<span class=o>&lt;/</span><span class=n>dependency</span><span
class=o>&gt;</span></code></pre></div></div></p><p><span class=language-py>This
section is not applicable to the Beam SDK for Python.</span></p><h2
id=pipeline-options-for-the-direct-runner>Pipeline options for the Direct
Runner</h2><p>When executing your pipeline from the command-line, set
<code>runner</code> to <code>direct</code> or <code>DirectRunner</code>. The
default values for the other pipeline options are generally [...]
<span class=language-java><a
href=https://beam.apache.org/releases/javadoc/2.25.0/index.html?org/apache/beam/runners/direct/DirectOptions.html><code>DirectOptions</code></a></span>
<span class=language-py><a
href=https://beam.apache.org/releases/pydoc/2.25.0/apache_beam.options.pipeline_options.html#apache_beam.options.pipeline_options.DirectOptions><code>DirectOptions</code></a></span>
-interface for defaults and additional pipeline configuration options.</p><h2
id=additional-information-and-caveats>Additional information and
caveats</h2><h3 id=memory-considerations>Memory considerations</h3><p>Local
execution is limited by the memory available in your local environment. It is
highly recommended that you run your pipeline with data sets small enough to
fit in local memory. You can create a small in-memory data set using a <span
class=language-java><a href=https://beam.a [...]
-From 2.22.0, <code>direct_num_workers = 0</code> is supported. When
<code>direct_num_workers</code> is set to 0, it will set the number of
threads/subprocess to the number of cores of the machine where the pipelien is
running.</p><p>There are several ways to set this option.</p><ul><li>Passing
through CLI when executing a pipeline.</li></ul><pre><code>python wordcount.py
--input xx --output xx --direct_num_workers 2
-</code></pre><ul><li>Setting with
<code>PipelineOptions</code>.</li></ul><pre><code>from
apache_beam.options.pipeline_options import PipelineOptions
-pipeline_options = PipelineOptions(['--direct_num_workers', '2'])
-</code></pre><ul><li>Adding to existing
<code>PipelineOptions</code>.</li></ul><pre><code>from
apache_beam.options.pipeline_options import DirectOptions
-pipeline_options = PipelineOptions(xxx)
-pipeline_options.view_as(DirectOptions).direct_num_workers = 2
-</code></pre><p><strong>Setting running mode</strong></p><p>From 2.19, a new
option was added to set running mode. We can use
<code>direct_running_mode</code> option to set the running mode.
-<code>direct_running_mode</code> can be one of [<code>'in_memory'</code>,
<code>'multi_threading'</code>,
<code>'multi_processing'</code>].</p><p><b>in_memory</b>: Runner and
workers’ communication happens in memory (not through gRPC). This is a
default mode.</p><p><b>multi_threading</b>: Runner and workers communicate
through gRPC and each worker runs in a thread.</p><p><b>multi_processing</b>:
Runner and workers communicate through gRPC and each worker runs in a
subprocess.</p><p [...]
+interface for defaults and additional pipeline configuration options.</p><h2
id=additional-information-and-caveats>Additional information and
caveats</h2><h3 id=memory-considerations>Memory considerations</h3><p>Local
execution is limited by the memory available in your local environment. It is
highly recommended that you run your pipeline with data sets small enough to
fit in local memory. You can create a small in-memory data set using a <span
class=language-java><a href=https://beam.a [...]
+Python <a
href=https://beam.apache.org/contribute/runner-guide/#the-fn-api>FnApiRunner</a>
supports multi-threading and multi-processing mode.</p><p>{:.language-py}
+<strong>Setting parallelism</strong></p><p>{:.language-py}
+Number of threads or subprocesses is defined by setting the
<code>direct_num_workers</code> option.
+From 2.22.0, <code>direct_num_workers = 0</code> is supported. When
<code>direct_num_workers</code> is set to 0, it will set the number of
threads/subprocess to the number of cores of the machine where the pipeline is
running.</p><p>{:.language-py}</p><ul><li>There are several ways to set this
option.</li></ul><div class=highlight><pre class=chroma><code class=language-py
data-lang=py><span class=n>python</span> <span class=n>wordcount</span><span
class=o>.</span><span class=n>py</span> [...]
+</code></pre></div><p>{:.language-py}</p><ul><li>Setting with
<code>PipelineOptions</code>.</li></ul><div class=highlight><pre
class=chroma><code class=language-py data-lang=py><span class=kn>from</span>
<span class=nn>apache_beam.options.pipeline_options</span> <span
class=kn>import</span> <span class=n>PipelineOptions</span>
+<span class=n>pipeline_options</span> <span class=o>=</span> <span
class=n>PipelineOptions</span><span class=p>([</span><span
class=s1>'--direct_num_workers'</span><span class=p>,</span> <span
class=s1>'2'</span><span class=p>])</span>
+</code></pre></div><p>{:.language-py}</p><ul><li>Adding to existing
<code>PipelineOptions</code>.</li></ul><div class=highlight><pre
class=chroma><code class=language-py data-lang=py><span class=kn>from</span>
<span class=nn>apache_beam.options.pipeline_options</span> <span
class=kn>import</span> <span class=n>DirectOptions</span>
+<span class=n>pipeline_options</span> <span class=o>=</span> <span
class=n>PipelineOptions</span><span class=p>(</span><span
class=n>xxx</span><span class=p>)</span>
+<span class=n>pipeline_options</span><span class=o>.</span><span
class=n>view_as</span><span class=p>(</span><span
class=n>DirectOptions</span><span class=p>)</span><span class=o>.</span><span
class=n>direct_num_workers</span> <span class=o>=</span> <span class=mi>2</span>
+</code></pre></div><p>{:.language-py}
+<strong>Setting running mode</strong></p><p>{:.language-py}
+From 2.19, a new option was added to set running mode. We can use
<code>direct_running_mode</code> option to set the running mode.
+<code>direct_running_mode</code> can be one of [<code>'in_memory'</code>,
<code>'multi_threading'</code>,
<code>'multi_processing'</code>].</p><p>{:.language-py}
+<b>in_memory</b>: Runner and workers’ communication happens in memory
(not through gRPC). This is a default mode.</p><p>{:.language-py}
+<b>multi_threading</b>: Runner and workers communicate through gRPC and each
worker runs in a thread.</p><p>{:.language-py}
+<b>multi_processing</b>: Runner and workers communicate through gRPC and each
worker runs in a subprocess.</p><p>{:.language-py}
+Same as other options, <code>direct_running_mode</code> can be passed through
CLI or set with <code>PipelineOptions</code>.</p><p>{:.language-py}
+For the versions before 2.19.0, the running mode should be set with
<code>FnApiRunner()</code>. Please refer following
examples.</p><p>{:.language-py}</p><h4
id=running-with-multi-threading-mode>Running with multi-threading mode</h4><div
class=highlight><pre class=chroma><code class=language-py data-lang=py><span
class=kn>import</span> <span class=nn>argparse</span>
-import apache_beam as beam
-from apache_beam.options.pipeline_options import PipelineOptions
-from apache_beam.runners.portability import fn_api_runner
-from apache_beam.portability.api import beam_runner_api_pb2
-from apache_beam.portability import python_urns
+<span class=kn>import</span> <span class=nn>apache_beam</span> <span
class=kn>as</span> <span class=nn>beam</span>
+<span class=kn>from</span> <span
class=nn>apache_beam.options.pipeline_options</span> <span
class=kn>import</span> <span class=n>PipelineOptions</span>
+<span class=kn>from</span> <span
class=nn>apache_beam.runners.portability</span> <span class=kn>import</span>
<span class=n>fn_api_runner</span>
+<span class=kn>from</span> <span class=nn>apache_beam.portability.api</span>
<span class=kn>import</span> <span class=n>beam_runner_api_pb2</span>
+<span class=kn>from</span> <span class=nn>apache_beam.portability</span> <span
class=kn>import</span> <span class=n>python_urns</span>
-parser = argparse.ArgumentParser()
-parser.add_argument(...)
-known_args, pipeline_args = parser.parse_known_args(argv)
-pipeline_options = PipelineOptions(pipeline_args)
+<span class=n>parser</span> <span class=o>=</span> <span
class=n>argparse</span><span class=o>.</span><span
class=n>ArgumentParser</span><span class=p>()</span>
+<span class=n>parser</span><span class=o>.</span><span
class=n>add_argument</span><span class=p>(</span><span class=o>...</span><span
class=p>)</span>
+<span class=n>known_args</span><span class=p>,</span> <span
class=n>pipeline_args</span> <span class=o>=</span> <span
class=n>parser</span><span class=o>.</span><span
class=n>parse_known_args</span><span class=p>(</span><span
class=n>argv</span><span class=p>)</span>
+<span class=n>pipeline_options</span> <span class=o>=</span> <span
class=n>PipelineOptions</span><span class=p>(</span><span
class=n>pipeline_args</span><span class=p>)</span>
-p = beam.Pipeline(options=pipeline_options,
- runner=fn_api_runner.FnApiRunner(
- default_environment=beam_runner_api_pb2.Environment(
- urn=python_urns.EMBEDDED_PYTHON_GRPC)))
-</code></pre><h4 id=running-with-multi-processing-mode>Running with
multi-processing mode</h4><pre><code>import argparse
-import sys
+<span class=n>p</span> <span class=o>=</span> <span class=n>beam</span><span
class=o>.</span><span class=n>Pipeline</span><span class=p>(</span><span
class=n>options</span><span class=o>=</span><span
class=n>pipeline_options</span><span class=p>,</span>
+ <span class=n>runner</span><span class=o>=</span><span
class=n>fn_api_runner</span><span class=o>.</span><span
class=n>FnApiRunner</span><span class=p>(</span>
+ <span class=n>default_environment</span><span class=o>=</span><span
class=n>beam_runner_api_pb2</span><span class=o>.</span><span
class=n>Environment</span><span class=p>(</span>
+ <span class=n>urn</span><span class=o>=</span><span
class=n>python_urns</span><span class=o>.</span><span
class=n>EMBEDDED_PYTHON_GRPC</span><span class=p>)))</span>
+</code></pre></div><p>{:.language-py}</p><h4
id=running-with-multi-processing-mode>Running with multi-processing
mode</h4><div class=highlight><pre class=chroma><code class=language-py
data-lang=py><span class=kn>import</span> <span class=nn>argparse</span>
+<span class=kn>import</span> <span class=nn>sys</span>
-import apache_beam as beam
-from apache_beam.options.pipeline_options import PipelineOptions
-from apache_beam.runners.portability import fn_api_runner
-from apache_beam.portability.api import beam_runner_api_pb2
-from apache_beam.portability import python_urns
+<span class=kn>import</span> <span class=nn>apache_beam</span> <span
class=kn>as</span> <span class=nn>beam</span>
+<span class=kn>from</span> <span
class=nn>apache_beam.options.pipeline_options</span> <span
class=kn>import</span> <span class=n>PipelineOptions</span>
+<span class=kn>from</span> <span
class=nn>apache_beam.runners.portability</span> <span class=kn>import</span>
<span class=n>fn_api_runner</span>
+<span class=kn>from</span> <span class=nn>apache_beam.portability.api</span>
<span class=kn>import</span> <span class=n>beam_runner_api_pb2</span>
+<span class=kn>from</span> <span class=nn>apache_beam.portability</span> <span
class=kn>import</span> <span class=n>python_urns</span>
-parser = argparse.ArgumentParser()
-parser.add_argument(...)
-known_args, pipeline_args = parser.parse_known_args(argv)
-pipeline_options = PipelineOptions(pipeline_args)
+<span class=n>parser</span> <span class=o>=</span> <span
class=n>argparse</span><span class=o>.</span><span
class=n>ArgumentParser</span><span class=p>()</span>
+<span class=n>parser</span><span class=o>.</span><span
class=n>add_argument</span><span class=p>(</span><span class=o>...</span><span
class=p>)</span>
+<span class=n>known_args</span><span class=p>,</span> <span
class=n>pipeline_args</span> <span class=o>=</span> <span
class=n>parser</span><span class=o>.</span><span
class=n>parse_known_args</span><span class=p>(</span><span
class=n>argv</span><span class=p>)</span>
+<span class=n>pipeline_options</span> <span class=o>=</span> <span
class=n>PipelineOptions</span><span class=p>(</span><span
class=n>pipeline_args</span><span class=p>)</span>
-p = beam.Pipeline(options=pipeline_options,
- runner=fn_api_runner.FnApiRunner(
- default_environment=beam_runner_api_pb2.Environment(
- urn=python_urns.SUBPROCESS_SDK,
- payload=b'%s -m apache_beam.runners.worker.sdk_worker_main'
- % sys.executable.encode('ascii'))))
-</code></pre></div></div><footer class=footer><div
class=footer__contained><div class=footer__cols><div
class=footer__cols__col><div class=footer__cols__col__logo><img
src=/images/beam_logo_circle.svg class=footer__logo alt="Beam logo"></div><div
class=footer__cols__col__logo><img src=/images/apache_logo_circle.svg
class=footer__logo alt="Apache logo"></div></div><div class="footer__cols__col
footer__cols__col--md"><div class=footer__cols__col__title>Start</div><div
class=footer__cols__c [...]
+<span class=n>p</span> <span class=o>=</span> <span class=n>beam</span><span
class=o>.</span><span class=n>Pipeline</span><span class=p>(</span><span
class=n>options</span><span class=o>=</span><span
class=n>pipeline_options</span><span class=p>,</span>
+ <span class=n>runner</span><span class=o>=</span><span
class=n>fn_api_runner</span><span class=o>.</span><span
class=n>FnApiRunner</span><span class=p>(</span>
+ <span class=n>default_environment</span><span class=o>=</span><span
class=n>beam_runner_api_pb2</span><span class=o>.</span><span
class=n>Environment</span><span class=p>(</span>
+ <span class=n>urn</span><span class=o>=</span><span
class=n>python_urns</span><span class=o>.</span><span
class=n>SUBPROCESS_SDK</span><span class=p>,</span>
+ <span class=n>payload</span><span class=o>=</span><span
class=sa>b</span><span class=s1>'</span><span class=si>%s</span><span
class=s1> -m apache_beam.runners.worker.sdk_worker_main'</span>
+ <span class=o>%</span> <span class=n>sys</span><span
class=o>.</span><span class=n>executable</span><span class=o>.</span><span
class=n>encode</span><span class=p>(</span><span
class=s1>'ascii'</span><span class=p>))))</span>
+</code></pre></div></div></div><footer class=footer><div
class=footer__contained><div class=footer__cols><div
class=footer__cols__col><div class=footer__cols__col__logo><img
src=/images/beam_logo_circle.svg class=footer__logo alt="Beam logo"></div><div
class=footer__cols__col__logo><img src=/images/apache_logo_circle.svg
class=footer__logo alt="Apache logo"></div></div><div class="footer__cols__col
footer__cols__col--md"><div class=footer__cols__col__title>Start</div><div
class=footer__c [...]
<a href=http://www.apache.org>The Apache Software Foundation</a>
| <a href=/privacy_policy>Privacy Policy</a>
| <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam
logo, and the Apache feather logo are either registered trademarks or
trademarks of The Apache Software Foundation. All other products or name brands
are trademarks of their respective holders, including The Apache Software
Foundation.</div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/documentation/runners/flink/index.html
b/website/generated-content/documentation/runners/flink/index.html
index eb0f927..9d61584 100644
--- a/website/generated-content/documentation/runners/flink/index.html
+++ b/website/generated-content/documentation/runners/flink/index.html
@@ -85,7 +85,7 @@ and will not work on remote clusters.
See <a href=/documentation/runtime/sdk-harness-config/>here</a> for
details.</p><h2 id=additional-information-and-caveats>Additional information
and caveats</h2><h3 id=monitoring-your-job>Monitoring your job</h3><p>You can
monitor a running Flink job using the Flink JobManager Dashboard or its Rest
interfaces. By default, this is available at port <code>8081</code> of the
JobManager node. If you have a Flink installation on your local machine that
would be <code>http://localhost:8081</co [...]
Many sources like <code>PubSubIO</code> rely on their checkpoints to be
acknowledged which can only be done when checkpointing is enabled for the
<code>FlinkRunner</code>. To enable checkpointing, please set <span
class=language-java><code>checkpointingInterval</code></span><span
class=language-py><code>checkpointing_interval</code></span> to the desired
checkpointing interval in milliseconds.</p><h2
id=pipeline-options-for-the-flink-runner>Pipeline options for the Flink
Runner</h2><p>Wh [...]
<a
href=https://beam.apache.org/releases/javadoc/2.25.0/index.html?org/apache/beam/runners/flink/FlinkPipelineOptions.html>FlinkPipelineOptions</a>
-reference class:</p><div class=language-java><table class="table
table-bordered"><tr><td><code>allowNonRestoredState</code></td><td>Flag
indicating whether non restored state is allowed if the savepoint contains
state for an operator that is no longer part of the pipeline.</td><td>Default:
<code>false</code></td></tr><tr><td><code>autoBalanceWriteFilesShardingEnabled</code></td><td>Flag
indicating whether auto-balance sharding for WriteFiles transform should be
enabled. This might prove [...]
+reference class:</p><div class=language-java><table class="table
table-bordered"><tr><td><code>allowNonRestoredState</code></td><td>Flag
indicating whether non restored state is allowed if the savepoint contains
state for an operator that is no longer part of the pipeline.</td><td>Default:
<code>false</code></td></tr><tr><td><code>autoBalanceWriteFilesShardingEnabled</code></td><td>Flag
indicating whether auto-balance sharding for WriteFiles transform should be
enabled. This might prove [...]
<a
href=https://beam.apache.org/releases/javadoc/2.25.0/index.html?org/apache/beam/sdk/options/PipelineOptions.html>PipelineOptions</a>
reference.</p><h2 id=flink-version-compatibility>Flink Version
Compatibility</h2><p>The Flink cluster version has to match the minor version
used by the FlinkRunner.
The minor version is the first two numbers in the version string, e.g. in
<code>1.8.0</code> the
diff --git a/website/generated-content/sitemap.xml
b/website/generated-content/sitemap.xml
index 51ab6a2..2c23908 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.25.0/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/blog/b
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.25.0/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/blog/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/categories/</loc><lastmod>2020-10-29T14:08:19-07:00</lastmod></url><url><loc>/blog/b
[...]
\ No newline at end of file
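
The direct runner hunk above describes two Python options, direct_num_workers and direct_running_mode. Below is a minimal, standard-library-only sketch of how those flags might be assembled before being handed to PipelineOptions. The helper names build_direct_runner_args and effective_workers are hypothetical and not part of Beam; the flag names and mode values are the ones quoted in the diff, and using os.cpu_count() to model "number of cores" is an assumption about how that count is obtained.

```python
import os

# Mode names quoted in the documentation diff above.
RUNNING_MODES = ("in_memory", "multi_threading", "multi_processing")


def build_direct_runner_args(num_workers, running_mode="in_memory"):
    """Return CLI-style flags for the two direct runner options (hypothetical helper)."""
    if running_mode not in RUNNING_MODES:
        raise ValueError("unknown direct_running_mode: %r" % (running_mode,))
    if num_workers < 0:
        raise ValueError("direct_num_workers must be >= 0")
    return [
        "--direct_num_workers", str(num_workers),
        "--direct_running_mode", running_mode,
    ]


def effective_workers(num_workers):
    """Model the documented rule that 0 means one worker per core.

    Assumption: the core count is approximated with os.cpu_count().
    """
    if num_workers == 0:
        return os.cpu_count() or 1
    return num_workers


args = build_direct_runner_args(2, "multi_threading")
# With Beam installed, this list could be passed as PipelineOptions(args).
print(args)
```

Keeping the flags as a plain list mirrors the PipelineOptions(['--direct_num_workers', '2']) form shown in the diff, so the same list works whether it comes from the command line or is built in code.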