This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push:
new af8db99 Publishing website 2022/03/16 00:01:56 at commit e2ca693
af8db99 is described below
commit af8db99b9ca9aa4ec026f0c39456d48af6b0fbc9
Author: jenkins
AuthorDate: Wed Mar 16 00:01:57 2022 +0000
Publishing website 2022/03/16 00:01:56 at commit e2ca693
---
website/generated-content/documentation/index.xml  |   4 -
.../documentation/io/built-in/snowflake/index.html |   4 +-
website/generated-content/get-started/index.xml    | 501 +++++++++++----------
.../get-started/quickstart-java/index.html         | 315 ++++++-------
website/generated-content/sitemap.xml              |   2 +-
5 files changed, 403 insertions(+), 423 deletions(-)
diff --git a/website/generated-content/documentation/index.xml b/website/generated-content/documentation/index.xml
index d743d96..5c2b7ee 100644
--- a/website/generated-content/documentation/index.xml
+++ b/website/generated-content/documentation/index.xml
@@ -1987,7 +1987,6 @@ options.getPrivateKeyPassphrase())
</div>
</li>
</ul>
-<p><strong>Important notice</strong>: Only encrypted private key are supported. Unencrypted (without pasphrase) private key are not supported. For details, see: <a href="https://issues.apache.org/jira/browse/BEAM-13818">BEAM-13818</a>.</p>
<h3 id="oauth-token">OAuth token</h3>
<p>SnowflakeIO also supports OAuth token.</p>
<p><strong>IMPORTANT</strong>: SnowflakeIO requires a valid OAuth access token. It will neither be able to refresh the token nor obtain it using a web-based flow. For information on configuring an OAuth integration and obtaining the token, see the <a href="https://docs.snowflake.com/en/user-guide/oauth-intro.html">Snowflake documentation</a>.</p>
@@ -3177,9 +3176,6 @@ You can read about Snowflake data types at <a href="https://docs.snowflake.co
<p>Streaming writing supports only pair key authentication. For details, see: <a href="https://issues.apache.org/jira/browse/BEAM-13817">BEAM-13817</a>.</p>
</li>
<li>
-<p>Only encrypted private key are supported. Unencrypted private key are not supported. For details, see: <a href="https://issues.apache.org/jira/browse/BEAM-13818">BEAM-13818</a>.</p>
-</li>
-<li>
<p>The role parameter configured in <code>SnowflakeIO.DataSourceConfiguration</code> object is ignored for streaming writing. For details, see: <a href="https://issues.apache.org/jira/browse/BEAM-13819">BEAM-13819</a></p>
</li>
</ol></description></item><item><title>Documentation: ApproximateQuantiles</title><link>/documentation/transforms/java/aggregation/approximatequantiles/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/documentation/transforms/java/aggregation/approximatequantiles/</guid><description>
diff --git a/website/generated-content/documentation/io/built-in/snowflake/index.html b/website/generated-content/documentation/io/built-in/snowflake/index.html
index c497791..707b35f 100644
--- a/website/generated-content/documentation/io/built-in/snowflake/index.html
+++ b/website/generated-content/documentation/io/built-in/snowflake/index.html
@@ -54,7 +54,7 @@ function openMenu(){addPlaceholder();blockScroll();}</script><div class="clearfi
.withWarehouse(options.getWarehouse())
.withSchema(options.getSchema());
- </code></pre></div></div></li></ul><p><strong>Important notice</strong>: Only encrypted private key are supported. Unencrypted (without pasphrase) private key are not supported. For details, see: <a href=https://issues.apache.org/jira/browse/BEAM-13818>BEAM-13818</a>.</p><h3 id=oauth-token>OAuth token</h3><p>SnowflakeIO also supports OAuth token.</p><p><strong>IMPORTANT</strong>: SnowflakeIO requires a valid OAuth access token. It will neither be able to refresh the token nor obtain it [...]
+ </code></pre></div></div></li></ul><h3 id=oauth-token>OAuth token</h3><p>SnowflakeIO also supports OAuth token.</p><p><strong>IMPORTANT</strong>: SnowflakeIO requires a valid OAuth access token. It will neither be able to refresh the token nor obtain it using a web-based flow. For information on configuring an OAuth integration and obtaining the token, see the <a href=https://docs.snowflake.com/en/user-guide/oauth-intro.html>Snowflake documentation</a>.</p><p>Once you have the token, i [...]
.create()
.withUrl(options.getUrl())
.withServerName(options.getServerName())
@@ -311,7 +311,7 @@ Example:<div class=snippet><div class="notebook-skip code-snippet without_switch
{"type":"string","length":null},
{"type":"text","length":null},
{"type":"varbinary","size":null},
-{"type":"varchar","length":100}]</code></pre></div></div>You can read about Snowflake data types at <a href=https://docs.snowflake.com/en/sql-reference/data-types.html>Snowflake data types</a>.</p></li><li><p><code>expansion_service</code> Specifies URL of expansion service.</p></li></ul><h2 id=limitations>Limitations</h2><p>SnowflakeIO currently has the following limitations.</p><ol><li><p>Streaming writing supports only pair key authentication. For details, see: [...]
+{"type":"varchar","length":100}]</code></pre></div></div>You can read about Snowflake data types at <a href=https://docs.snowflake.com/en/sql-reference/data-types.html>Snowflake data types</a>.</p></li><li><p><code>expansion_service</code> Specifies URL of expansion service.</p></li></ul><h2 id=limitations>Limitations</h2><p>SnowflakeIO currently has the following limitations.</p><ol><li><p>Streaming writing supports only pair key authentication. For details, see: [...]
<a href=http://www.apache.org>The Apache Software Foundation</a>
| <a href=/privacy_policy>Privacy Policy</a>
| <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation. All other products or name brands are trademarks of their respective holders, including The Apache Software Foundation.</div></div></div></div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/get-started/index.xml b/website/generated-content/get-started/index.xml
index 55fc830..15be962 100644
--- a/website/generated-content/get-started/index.xml
+++ b/website/generated-content/get-started/index.xml
@@ -1124,45 +1124,53 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
-<h1 id="apache-beam-java-sdk-quickstart">Apache Beam Java SDK Quickstart</h1>
-<p>This quickstart shows you how to set up a Java development environment and run an <a href="/get-started/wordcount-example">example pipeline</a> written with the <a href="/documentation/sdks/java">Apache Beam Java SDK</a>, using a <a href="/documentation#runners">runner</a> of your choice.</p>
-<p>If you&rsquo;re interested in contributing to the Apache Beam Java codebase, see the <a href="/contribute">Contribution Guide</a>.</p>
+<h1 id="apache-beam-java-sdk-quickstart">Apache Beam Java SDK quickstart</h1>
+<p>This quickstart shows you how to set up a Java development environment and run
+an <a href="/get-started/wordcount-example">example pipeline</a> written with the
+<a href="/documentation/sdks/java">Apache Beam Java SDK</a>, using a
+<a href="/documentation#runners">runner</a> of your choice.</p>
+<p>If you&rsquo;re interested in contributing to the Apache Beam Java codebase, see the
+<a href="/contribute">Contribution Guide</a>.</p>
+<p>On this page:</p>
<nav id="TableOfContents">
<ul>
-<li><a href="#set-up-your-development-environment">Set up your
Development Environment</a></li>
-<li><a href="#get-the-example-code">Get the Example Code</a></li>
-<li><a href="#optional-convert-from-maven-to-gradle-project">Optional:
Convert from Maven to Gradle Project</a></li>
+<li><a href="#set-up-your-development-environment">Set up your
development environment</a></li>
+<li><a href="#get-the-example-code">Get the example code</a></li>
+<li><a href="#optional-convert-from-maven-to-gradle">Optional: Convert
from Maven to Gradle</a></li>
<li><a href="#get-sample-text">Get sample text</a></li>
<li><a href="#run-a-pipeline">Run a pipeline</a>
<ul>
-<li><a href="#run-wordcount-using-maven">Run WordCount Using
Maven</a></li>
-<li><a href="#run-wordcount-using-gradle">Run WordCount Using
Gradle</a></li>
+<li><a href="#run-wordcount-using-maven">Run WordCount using
Maven</a></li>
+<li><a href="#run-wordcount-using-gradle">Run WordCount using
Gradle</a></li>
</ul>
</li>
<li><a href="#inspect-the-results">Inspect the results</a></li>
<li><a href="#next-steps">Next Steps</a></li>
</ul>
</nav>
-<h2 id="set-up-your-development-environment">Set up your Development
Environment</h2>
+<h2 id="set-up-your-development-environment">Set up your development
environment</h2>
<ol>
-<li>
-<p>Download and install the <a href="https://www.oracle.com/technetwork/java/javase/downloads/index.html">Java Development Kit (JDK)</a> version 8. Verify that the <a href="https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/envvars001.html">JAVA_HOME</a> environment variable is set and points to your JDK installation.</p>
-</li>
-<li>
-<p>Download and install <a href="https://maven.apache.org/download.cgi">Apache Maven</a> by following Maven&rsquo;s <a href="https://maven.apache.org/install.html">installation guide</a> for your specific operating system.</p>
-</li>
-<li>
-<p>Optional: Install <a href="https://gradle.org/install/">Gradle</a> if you would like to convert your Maven project into Gradle.</p>
-</li>
+<li>Download and install the
+<a href="https://www.oracle.com/technetwork/java/javase/downloads/index.html">Java Development Kit (JDK)</a>
+version 8, 11, or 17. Verify that the
+<a href="https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/envvars001.html">JAVA_HOME</a>
+environment variable is set and points to your JDK installation.</li>
+<li>Download and install <a href="https://maven.apache.org/download.cgi">Apache Maven</a> by
+following the <a href="https://maven.apache.org/install.html">installation guide</a>
+for your operating system.</li>
+<li>Optional: If you want to convert your Maven project to Gradle, install
+<a href="https://gradle.org/install/">Gradle</a>.</li>
</ol>
-<h2 id="get-the-example-code">Get the Example Code</h2>
-<p>Use the following command to generate a Maven project that contains
Beam&rsquo;s WordCount examples and builds against the most recent Beam
release:</p>
+<h2 id="get-the-example-code">Get the example code</h2>
+<ol>
+<li>
+<p>Generate a Maven example project that builds against the latest Beam
release:
<div class='shell-unix snippet'>
<div class="notebook-skip code-snippet">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-unix" data-lang="unix">$ mvn archetype:generate \
+<pre><code class="language-unix" data-lang="unix">mvn archetype:generate \
-DarchetypeGroupId=org.apache.beam \
-DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
-DarchetypeVersion=2.37.0 \
@@ -1170,7 +1178,8 @@ limitations under the License.
-DartifactId=word-count-beam \
-Dversion=&#34;0.1&#34; \
-Dpackage=org.apache.beam.examples \
--DinteractiveMode=false</code></pre>
+-DinteractiveMode=false
+</code></pre>
</div>
</div>
<div class='shell-powerShell snippet'>
@@ -1178,7 +1187,7 @@ limitations under the License.
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<div class="highlight"><pre class="chroma"><code
class="language-powerShell" data-lang="powerShell"><span
class="n">PS</span><span class="p">&gt;</span> <span
class="n">mvn</span> <span class="n">archetype</span><span
class="err">:</span><span class="n">generate</span> <span
class="p">`</span>
+<div class="highlight"><pre class="chroma"><code
class="language-powerShell" data-lang="powerShell"><span
class="n">mvn</span> <span class="n">archetype</span><span
class="err">:</span><span class="n">generate</span> <span
class="p">`</span>
<span class="n">-D</span> <span
class="n">archetypeGroupId</span><span class="p">=</span><span
class="n">org</span><span class="p">.</span><span
class="n">apache</span><span class="p">.</span><span
class="n">beam</span> <span class="p">`</span>
<span class="n">-D</span> <span
class="n">archetypeArtifactId</span><span class="p">=</span><span
class="n">beam-sdks-java-maven-archetypes-examples</span> <span
class="p">`</span>
<span class="n">-D</span> <span
class="n">archetypeVersion</span><span class="p">=</span><span
class="n">2</span><span class="p">.</span><span
class="n">37</span><span class="p">.</span><span
class="n">0</span> <span class="p">`</span>
@@ -1186,21 +1195,22 @@ limitations under the License.
<span class="n">-D</span> <span
class="n">artifactId</span><span class="p">=</span><span
class="n">word-count-beam</span> <span class="p">`</span>
<span class="n">-D</span> <span class="n">version</span><span
class="p">=</span><span class="s2">&#34;0.1&#34;</span>
<span class="p">`</span>
<span class="n">-D</span> <span class="n">package</span><span
class="p">=</span><span class="n">org</span><span
class="p">.</span><span class="n">apache</span><span
class="p">.</span><span class="n">beam</span><span
class="p">.</span><span class="n">examples</span> <span
class="p">`</span>
-<span class="n">-D</span> <span
class="n">interactiveMode</span><span class="p">=</span><span
class="n">false</span></code></pre></div>
+<span class="n">-D</span> <span
class="n">interactiveMode</span><span class="p">=</span><span
class="n">false</span>
+</code></pre></div>
</div>
</div>
-<p>This will create a <code>word-count-beam</code> directory that
contains a <code>pom.xml</code> and several example pipelines that count
words in text files.</p>
+</p>
+<p>Maven creates a new project in the
<strong>word-count-beam</strong> directory.</p>
+</li>
+<li>
+<p>Change into <strong>word-count-beam</strong>:
<div class='shell-unix snippet'>
<div class="notebook-skip code-snippet">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-unix" data-lang="unix">$ cd word-count-beam/
-$ ls
-pom.xml src
-$ ls src/main/java/org/apache/beam/examples/
-DebuggingWordCount.java WindowedWordCount.java common
-MinimalWordCount.java WordCount.java</code></pre>
+<pre><code class="language-unix" data-lang="unix">cd word-count-beam/
+</code></pre>
</div>
</div>
<div class='shell-powerShell snippet'>
@@ -1208,116 +1218,189 @@ MinimalWordCount.java WordCount.java</code></pre>
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<div class="highlight"><pre class="chroma"><code
class="language-powerShell" data-lang="powerShell"><span
class="n">PS</span><span class="p">&gt;</span> <span
class="nb">cd </span><span class="p">.\</span><span
class="n">word-count-beam</span>
-<span class="n">PS</span><span class="p">&gt;</span> <span
class="nb">dir
-</span><span class="nb"></span>
-<span class="p">...</span>
-<span class="n">Mode</span> <span class="n">LastWriteTime</span>
<span class="n">Length</span> <span class="n">Name</span>
-<span class="p">----</span> <span class="p">-------------</span>
<span class="p">------</span> <span class="p">----</span>
-<span class="n">d</span><span class="p">-----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">src</span>
-<span class="n">-a</span><span class="p">----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">16051</span> <span
class="n">pom</span><span class="p">.</span><span
class="n">xml</span>
-<span class="n">PS</span><span class="p">&gt;</span> <span
class="nb">dir </span><span class="p">.\</span><span
class="n">src</span><span class="p">\</span><span
class="n">main</span><span class="p">\</span><span
class="n">java</span><span class="p">\</span><span
class="n">org</span><span class="p">\</span><span
class="n">apache</span><span class="p">\</span><span
class="n">beam</span><span c [...]
-<span class="p">...</span>
-<span class="n">Mode</span> <span class="n">LastWriteTime</span>
<span class="n">Length</span> <span class="n">Name</span>
-<span class="p">----</span> <span class="p">-------------</span>
<span class="p">------</span> <span class="p">----</span>
-<span class="n">d</span><span class="p">-----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">common</span>
-<span class="n">d</span><span class="p">-----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">complete</span>
-<span class="n">d</span><span class="p">-----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">subprocess</span>
-<span class="n">-a</span><span class="p">----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">7073</span> <span
class="n">DebuggingWordCount</span><span class="p">.</span><span
class="n">java</span>
-<span class="n">-a</span><span class="p">----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">5945</span> <span
class="n">MinimalWordCount</span><span class="p">.</span><span
class="n">java</span>
-<span class="n">-a</span><span class="p">----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">9490</span> <span
class="n">WindowedWordCount</span><span class="p">.</span><span
class="n">java</span>
-<span class="n">-a</span><span class="p">----</span> <span
class="n">7</span><span class="p">/</span><span
class="n">19</span><span class="p">/</span><span
class="n">2018</span> <span class="n">11</span><span
class="err">:</span><span class="n">00</span> <span
class="n">PM</span> <span class="n">7662</span> <span
class="n">WordCount</span><span class="p">.</span><span
class="n">java</span></code> [...]
-</div>
-</div>
-<p>For a detailed introduction to the Beam concepts used in these examples,
see the <a href="/get-started/wordcount-example">WordCount Example
Walkthrough</a>. Here, we&rsquo;ll just focus on executing
<code>WordCount.java</code>.</p>
-<h2 id="optional-convert-from-maven-to-gradle-project">Optional: Convert
from Maven to Gradle Project</h2>
-<p>The steps below explain how to convert the build for the Direct Runner
from Maven to Gradle. Converting the builds for the other runners is a more
involved process and is out of scope for this guide. For additional guidance,
see <a
href="https://docs.gradle.org/current/userguide/migrating_from_maven.html">Migrating
Builds From Apache Maven</a>.</p>
+<div class="highlight"><pre class="chroma"><code
class="language-powerShell" data-lang="powerShell"><span class="nb">cd
</span><span class="p">.\</span><span
class="n">word-count-beam</span>
+</code></pre></div>
+</div>
+</div>
+The directory contains a <strong>pom.xml</strong> and a
<strong>src</strong> directory with example
+pipelines.</p>
+</li>
+<li>
+<p>List the example pipelines:
+<div class='shell-unix snippet'>
+<div class="notebook-skip code-snippet">
+<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
+<img src="/images/copy-icon.svg"/>
+</a>
+<pre><code class="language-unix" data-lang="unix">ls src/main/java/org/apache/beam/examples/
+</code></pre>
+</div>
+</div>
+<div class='shell-powerShell snippet'>
+<div class="notebook-skip code-snippet">
+<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
+<img src="/images/copy-icon.svg"/>
+</a>
+<div class="highlight"><pre class="chroma"><code
class="language-powerShell" data-lang="powerShell"><span class="nb">dir
</span><span class="p">.\</span><span
class="n">src</span><span class="p">\</span><span
class="n">main</span><span class="p">\</span><span
class="n">java</span><span class="p">\</span><span
class="n">org</span><span class="p">\</span><span
class="n">apache</span><span class="p">\</span>< [...]
+</code></pre></div>
+</div>
+</div>
+You should see the following examples:</p>
+<ul>
+<li><strong>DebuggingWordCount.java</strong> (<a
href="https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/DebuggingWordCount.java">GitHub</a>)</li>
+<li><strong>MinimalWordCount.java</strong> (<a
href="https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java">GitHub</a>)</li>
+<li><strong>WindowedWordCount.java</strong> (<a
href="https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/WindowedWordCount.java">GitHub</a>)</li>
+<li><strong>WordCount.java</strong> (<a
href="https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/WordCount.java">GitHub</a>)</li>
+</ul>
+<p>The example used in this tutorial,
<strong>WordCount.java</strong>, defines a
+Beam pipeline that counts words from an input file (by default, a
<strong>.txt</strong>
+file containing Shakespeare&rsquo;s &ldquo;King Lear&rdquo;). To
learn more about the examples,
+see the <a href="/get-started/wordcount-example">WordCount Example
Walkthrough</a>.</p>
+</li>
+</ol>
+<h2 id="optional-convert-from-maven-to-gradle">Optional: Convert from Maven
to Gradle</h2>
+<p>The steps below explain how to convert the build from Maven to Gradle
for the
+following runners:</p>
+<ul>
+<li>Direct runner</li>
+<li>Dataflow runner</li>
+</ul>
+<p>The conversion process for other runners is similar. For additional
guidance,
+see
+<a
href="https://docs.gradle.org/current/userguide/migrating_from_maven.html">Migrating
Builds From Apache Maven</a>.</p>
<ol>
-<li>Ensure you are in the same directory as the <code>pom.xml</code>
file generated from the previous step. Automatically convert your project from
Maven to Gradle by running:
+<li>In the directory with the <strong>pom.xml</strong> file, run the
automated Maven-to-Gradle
+conversion:
<div class="snippet">
<div class="notebook-skip code-snippet without_switcher">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code>$ gradle init</code></pre>
+<pre><code>gradle init
+</code></pre>
</div>
</div>
-You&rsquo;ll be asked if you want to generate a Gradle build. Enter <strong>yes</strong>. You&rsquo;ll also be prompted to choose a DSL (Groovy or Kotlin). This tutorial uses Groovy, so select that if you don&rsquo;t have a preference.</li>
-<li>After you&rsquo;ve converted the project to Gradle, open the generated <code>build.gradle</code> file, and, in the <code>repositories</code> block, replace <code>mavenLocal()</code> with <code>mavenCentral()</code>:
+You&rsquo;ll be asked if you want to generate a Gradle build. Enter <strong>yes</strong>. You&rsquo;ll
+also be prompted to choose a DSL (Groovy or Kotlin). For this tutorial, enter
+<strong>2</strong> for Kotlin.</li>
+<li>Open the generated <strong>build.gradle.kts</strong> file and make the following changes:
+<ol>
+<li>In <code>repositories</code>, replace <code>mavenLocal()</code> with <code>mavenCentral()</code>.</li>
+<li>In <code>repositories</code>, declare a repository for Confluent Kafka dependencies:
+<div class="snippet">
+<div class="notebook-skip code-snippet without_switcher">
+<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
+<img src="/images/copy-icon.svg"/>
+</a>
+<pre><code>maven {
+url = uri(&#34;https://packages.confluent.io/maven/&#34;)
+}
+</code></pre>
+</div>
+</div>
+</li>
+<li>At the end of the build script, add the following conditional
dependency:
<div class="snippet">
<div class="notebook-skip code-snippet without_switcher">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code>repositories {
-mavenCentral()
-maven {
-url = uri(&#39;https://repository.apache.org/content/repositories/snapshots/&#39;)
+<pre><code>if (project.hasProperty(&#34;dataflow-runner&#34;)) {
+dependencies {
+runtimeOnly(&#34;org.apache.beam:beam-runners-google-cloud-dataflow-java:2.37.0&#34;)
}
-maven {
-url = uri(&#39;http://repo.maven.apache.org/maven2&#39;)
}
-}</code></pre>
+</code></pre>
</div>
</div>
</li>
-<li>Add the following task in <code>build.gradle</code> to allow you
to execute pipelines with Gradle:
+<li>At the end of the build script, add the following task:
<div class="snippet">
<div class="notebook-skip code-snippet without_switcher">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code>task execute (type:JavaExec) {
-mainClass = System.getProperty(&#34;mainClass&#34;)
-classpath = sourceSets.main.runtimeClasspath
-systemProperties System.getProperties()
-args System.getProperty(&#34;exec.args&#34;, &#34;&#34;).split()
-}</code></pre>
+<pre><code>task(&#34;execute&#34;, JavaExec::class) {
+classpath = sourceSets[&#34;main&#34;].runtimeClasspath
+mainClass.set(System.getProperty(&#34;mainClass&#34;))
+}
+</code></pre>
</div>
</div>
</li>
-<li>Rebuild your project by running:
+</ol>
+</li>
+<li>Build your project:
<div class="snippet">
<div class="notebook-skip code-snippet without_switcher">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code>$ gradle build</code></pre>
+<pre><code>gradle build
+</code></pre>
</div>
</div>
</li>
</ol>
<h2 id="get-sample-text">Get sample text</h2>
<blockquote>
-<p>If you&rsquo;re planning to use the DataflowRunner, you can skip
this step. The runner will pull text directly from Google Cloud Storage.</p>
+<p>If you&rsquo;re planning to use the DataflowRunner, you can skip
this step. The
+runner will pull text directly from Google Cloud Storage.</p>
</blockquote>
<ol>
<li>In the <strong>word-count-beam</strong> directory, create a file
called <strong>sample.txt</strong>.</li>
-<li>Add some text to the file. For this example, you can use the text of Shakespeare&rsquo;s <a href="https://storage.cloud.google.com/apache-beam-samples/shakespeare/sonnets.txt">Sonnets</a>.</li>
+<li>Add some text to the file. For this example, use the text of Shakespeare&rsquo;s
+<a href="https://storage.cloud.google.com/apache-beam-samples/shakespeare/kinglear.txt">King Lear</a>.</li>
</ol>
<h2 id="run-a-pipeline">Run a pipeline</h2>
-<p>A single Beam pipeline can run on multiple Beam <a
href="/documentation#runners">runners</a>, including the <a
href="/documentation/runners/flink">FlinkRunner</a>, <a
href="/documentation/runners/spark">SparkRunner</a>, <a
href="/documentation/runners/nemo">NemoRunner</a>, <a
href="/documentation/runners/jet">JetRunner</a>, or <a
href="/documentation/runners/dataflow">DataflowRunner</a>. The <a
href="/documentation/runners/direct">DirectRunner [...]
+<p>A single Beam pipeline can run on multiple Beam
+<a href="/documentation#runners">runners</a>. The
+<a href="/documentation/runners/direct">DirectRunner</a> is useful for
getting started,
+because it runs on your machine and requires no specific setup. If
you&rsquo;re just
+trying out Beam and you&rsquo;re not sure what to use, use the
+<a href="/documentation/runners/direct">DirectRunner</a>.</p>
<p>The general process for running a pipeline goes like this:</p>
<ol>
-<li>Ensure you&rsquo;ve done any runner-specific setup.</li>
+<li>Complete any runner-specific setup.</li>
<li>Build your command line:
<ol>
-<li>Specify a runner with
<code>--runner=&lt;runner&gt;</code> (defaults to the <a
href="/documentation/runners/direct">DirectRunner</a>).</li>
+<li>Specify a runner with
<code>--runner=&lt;runner&gt;</code> (defaults to the
+<a href="/documentation/runners/direct">DirectRunner</a>).</li>
<li>Add any runner-specific required options.</li>
-<li>Choose input files and an output location that are accessible to the
runner. (For example, you can&rsquo;t access a local file if you are
running the pipeline on an external cluster.)</li>
+<li>Choose input files and an output location that are accessible to the
+runner. (For example, you can&rsquo;t access a local file if you are
running
+the pipeline on an external cluster.)</li>
</ol>
</li>
<li>Run the command.</li>
</ol>
-<p>To run the WordCount pipeline, see the Maven and Gradle examples
below.</p>
-<h3 id="run-wordcount-using-maven">Run WordCount Using Maven</h3>
+<p>To run the WordCount pipeline:</p>
+<ol>
+<li>
+<p>Follow the setup steps for your runner:</p>
+<ul>
+<li><a href="/documentation/runners/flink">FlinkRunner</a></li>
+<li><a href="/documentation/runners/spark">SparkRunner</a></li>
+<li><a
href="/documentation/runners/dataflow">DataflowRunner</a></li>
+<li><a href="/documentation/runners/samza">SamzaRunner</a></li>
+<li><a href="/documentation/runners/nemo">NemoRunner</a></li>
+<li><a href="/documentation/runners/jet">JetRunner</a></li>
+</ul>
+<p>The DirectRunner will work without additional setup.</p>
+</li>
+<li>
+<p>Run the corresponding Maven or Gradle command below.</p>
+</li>
+</ol>
+<h3 id="run-wordcount-using-maven">Run WordCount using Maven</h3>
<p>For Unix shells:</p>
+<p>
<div class='runner-direct snippet'>
<div class="notebook-skip code-snippet">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-direct" data-lang="direct">$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
+<pre><code class="language-direct" data-lang="direct">mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args=&#34;--inputFile=sample.txt --output=counts&#34; -Pdirect-runner</code></pre>
</div>
</div>
@@ -1326,7 +1409,7 @@ args System.getProperty(&#34;exec.args&#34;, &#34;&#34;).split()
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flink" data-lang="flink">$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
+<pre><code class="language-flink" data-lang="flink">mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args=&#34;--runner=FlinkRunner --inputFile=sample.txt --output=counts&#34; -Pflink-runner</code></pre>
</div>
</div>
@@ -1335,10 +1418,9 @@ args System.getProperty(&#34;exec.args&#34;,
&#34;&#34;).split()
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flinkCluster" data-lang="flinkCluster">$ mvn
package exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
+<pre><code class="language-flinkCluster" data-lang="flinkCluster">mvn
package exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args=&#34;--runner=FlinkRunner --flinkMaster=&lt;flink
master&gt; --filesToStage=target/word-count-beam-bundled-0.1.jar \
---inputFile=sample.txt --output=/tmp/counts&#34; -Pflink-runner
-You can monitor the running job by visiting the Flink dashboard at
http://&lt;flink master&gt;:8081</code></pre>
+--inputFile=sample.txt --output=/tmp/counts&#34;
-Pflink-runner</code></pre>
</div>
</div>
<div class='runner-spark snippet'>
@@ -1346,7 +1428,7 @@ You can monitor the running job by visiting the Flink
dashboard at http://&l
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-spark" data-lang="spark">$ mvn compile
exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
+<pre><code class="language-spark" data-lang="spark">mvn compile
exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args=&#34;--runner=SparkRunner --inputFile=sample.txt
--output=counts&#34; -Pspark-runner</code></pre>
</div>
</div>
@@ -1355,8 +1437,7 @@ You can monitor the running job by visiting the Flink
dashboard at http://&l
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-dataflow" data-lang="dataflow">Make sure you
complete the setup steps at /documentation/runners/dataflow/#setup
-$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
+<pre><code class="language-dataflow" data-lang="dataflow">mvn compile
exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args=&#34;--runner=DataflowRunner
--project=&lt;your-gcp-project&gt; \
--region=&lt;your-gcp-region&gt; \
--gcpTempLocation=gs://&lt;your-gcs-bucket&gt;/tmp \
@@ -1369,7 +1450,7 @@ $ mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-samza" data-lang="samza">$ mvn compile
exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
+<pre><code class="language-samza" data-lang="samza">mvn compile
exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
-Dexec.args=&#34;--inputFile=sample.txt --output=/tmp/counts
--runner=SamzaRunner&#34; -Psamza-runner</code></pre>
</div>
</div>
@@ -1378,7 +1459,7 @@ $ mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-nemo" data-lang="nemo">$ mvn package
-Pnemo-runner &amp;&amp; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examples.WordCount \
+<pre><code class="language-nemo" data-lang="nemo">mvn package
-Pnemo-runner &amp;&amp; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examples.WordCount \
--runner=NemoRunner --inputFile=`pwd`/sample.txt
--output=counts</code></pre>
</div>
</div>
@@ -1387,18 +1468,20 @@ $ mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-jet" data-lang="jet">$ mvn package
-Pjet-runner
-$ java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount \
+<pre><code class="language-jet" data-lang="jet">mvn package -Pjet-runner
+java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount \
--runner=JetRunner --jetLocalMode=3 --inputFile=`pwd`/sample.txt
--output=counts</code></pre>
</div>
</div>
+</p>
<p>For Windows PowerShell:</p>
+<p>
<div class='runner-direct snippet'>
<div class="notebook-skip code-snippet">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-direct" data-lang="direct">PS&gt; mvn
compile exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
+<pre><code class="language-direct" data-lang="direct">mvn compile
exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args=&#34;--inputFile=sample.txt --output=counts&#34; -P
direct-runner</code></pre>
</div>
</div>
@@ -1407,7 +1490,7 @@ $ java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordC
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flink" data-lang="flink">PS&gt; mvn
compile exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
+<pre><code class="language-flink" data-lang="flink">mvn compile
exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args=&#34;--runner=FlinkRunner --inputFile=sample.txt
--output=counts&#34; -P flink-runner</code></pre>
</div>
</div>
@@ -1416,10 +1499,9 @@ $ java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordC
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flinkCluster"
data-lang="flinkCluster">PS&gt; mvn package exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
+<pre><code class="language-flinkCluster" data-lang="flinkCluster">mvn
package exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args=&#34;--runner=FlinkRunner --flinkMaster=&lt;flink
master&gt; --filesToStage=.\target\word-count-beam-bundled-0.1.jar `
---inputFile=C:\path\to\quickstart\sample.txt --output=C:\tmp\counts&#34;
-P flink-runner
-You can monitor the running job by visiting the Flink dashboard at
http://&lt;flink master&gt;:8081</code></pre>
+--inputFile=C:\path\to\quickstart\sample.txt --output=C:\tmp\counts&#34;
-P flink-runner</code></pre>
</div>
</div>
<div class='runner-spark snippet'>
@@ -1427,7 +1509,7 @@ You can monitor the running job by visiting the Flink
dashboard at http://&l
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-spark" data-lang="spark">PS&gt; mvn
compile exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
+<pre><code class="language-spark" data-lang="spark">mvn compile
exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args=&#34;--runner=SparkRunner --inputFile=sample.txt
--output=counts&#34; -P spark-runner</code></pre>
</div>
</div>
@@ -1436,8 +1518,7 @@ You can monitor the running job by visiting the Flink
dashboard at http://&l
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-dataflow" data-lang="dataflow">Make sure you
complete the setup steps at /documentation/runners/dataflow/#setup
-PS&gt; mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
+<pre><code class="language-dataflow" data-lang="dataflow">mvn compile
exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args=&#34;--runner=DataflowRunner
--project=&lt;your-gcp-project&gt; `
--region=&lt;your-gcp-region&gt; `
--gcpTempLocation=gs://&lt;your-gcs-bucket&gt;/tmp `
@@ -1450,7 +1531,7 @@ PS&gt; mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.Word
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-samza" data-lang="samza">PS&gt; mvn
compile exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
+<pre><code class="language-samza" data-lang="samza">mvn compile
exec:java -D exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args=&#34;--inputFile=sample.txt --output=/tmp/counts
--runner=SamzaRunner&#34; -P samza-runner</code></pre>
</div>
</div>
@@ -1459,8 +1540,8 @@ PS&gt; mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.Word
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-nemo" data-lang="nemo">PS&gt; mvn package
-P nemo-runner -DskipTests
-PS&gt; java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
+<pre><code class="language-nemo" data-lang="nemo">mvn package -P
nemo-runner -DskipTests
+java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
--runner=NemoRunner --inputFile=$pwd/sample.txt
--output=counts</code></pre>
</div>
</div>
@@ -1469,20 +1550,22 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-jet" data-lang="jet">PS&gt; mvn package
-P jet-runner
-PS&gt; java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
+<pre><code class="language-jet" data-lang="jet">mvn package -P jet-runner
+java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
--runner=JetRunner --jetLocalMode=3 --inputFile=$pwd/sample.txt
--output=counts</code></pre>
</div>
</div>
-<h3 id="run-wordcount-using-gradle">Run WordCount Using Gradle</h3>
-<p>For Unix shells (Instructions currently only available for Direct,
Spark, and Dataflow):</p>
+</p>
+<h3 id="run-wordcount-using-gradle">Run WordCount using Gradle</h3>
+<p>For Unix shells:</p>
+<p>
<div class='runner-direct snippet'>
<div class="notebook-skip code-snippet">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-direct" data-lang="direct">$ gradle clean
execute -DmainClass=org.apache.beam.examples.WordCount \
--Dexec.args=&#34;--inputFile=sample.txt --output=counts&#34;
-Pdirect-runner</code></pre>
+<pre><code class="language-direct" data-lang="direct">gradle clean
execute -DmainClass=org.apache.beam.examples.WordCount \
+--args=&#34;--inputFile=sample.txt
--output=counts&#34;</code></pre>
</div>
</div>
<div class='runner-flink snippet'>
@@ -1490,7 +1573,7 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flink" data-lang="flink">We are working on
adding the instruction for this runner!</code></pre>
+<pre><code class="language-flink" data-lang="flink">TODO: document Flink
on Gradle: https://issues.apache.org/jira/browse/BEAM-14057</code></pre>
</div>
</div>
<div class='runner-flinkCluster snippet'>
@@ -1498,7 +1581,7 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flinkCluster" data-lang="flinkCluster">We are
working on adding the instruction for this runner!</code></pre>
+<pre><code class="language-flinkCluster" data-lang="flinkCluster">TODO:
document FlinkCluster on Gradle:
https://issues.apache.org/jira/browse/BEAM-14060</code></pre>
</div>
</div>
<div class='runner-spark snippet'>
@@ -1506,8 +1589,7 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-spark" data-lang="spark">$ gradle clean
execute -DmainClass=org.apache.beam.examples.WordCount \
--Dexec.args=&#34;--inputFile=sample.txt --output=counts&#34;
-Pspark-runner</code></pre>
+<pre><code class="language-spark" data-lang="spark">TODO: document Spark
on Gradle: https://issues.apache.org/jira/browse/BEAM-14063</code></pre>
</div>
</div>
<div class='runner-dataflow snippet'>
@@ -1515,8 +1597,8 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-dataflow" data-lang="dataflow">$ gradle clean
execute -DmainClass=org.apache.beam.examples.WordCount \
--Dexec.args=&#34;--project=&lt;your-gcp-project&gt;
--inputFile=gs://apache-beam-samples/shakespeare/* \
+<pre><code class="language-dataflow" data-lang="dataflow">gradle clean
execute -DmainClass=org.apache.beam.examples.WordCount \
+--args=&#34;--project=&lt;your-gcp-project&gt;
--inputFile=gs://apache-beam-samples/shakespeare/* \
--output=gs://&lt;your-gcs-bucket&gt;/counts&#34;
-Pdataflow-runner</code></pre>
</div>
</div>
@@ -1525,7 +1607,7 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-samza" data-lang="samza">We are working on
adding the instruction for this runner!</code></pre>
+<pre><code class="language-samza" data-lang="samza">TODO: document Samza
on Gradle: https://issues.apache.org/jira/browse/BEAM-14061</code></pre>
</div>
</div>
<div class='runner-nemo snippet'>
@@ -1533,7 +1615,7 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-nemo" data-lang="nemo">We are working on
adding the instruction for this runner!</code></pre>
+<pre><code class="language-nemo" data-lang="nemo">TODO: document Nemo on
Gradle: https://issues.apache.org/jira/browse/BEAM-14058</code></pre>
</div>
</div>
<div class='runner-jet snippet'>
@@ -1541,17 +1623,23 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-jet" data-lang="jet">We are working on adding
the instruction for this runner!</code></pre>
+<pre><code class="language-jet" data-lang="jet">TODO: document Jet on
Gradle: https://issues.apache.org/jira/browse/BEAM-14062</code></pre>
</div>
</div>
+</p>
<h2 id="inspect-the-results">Inspect the results</h2>
-<p>Once the pipeline has completed, you can view the output.
You&rsquo;ll notice that there may be multiple output files prefixed by
<code>count</code>. The exact number of these files is decided by the
runner, giving it the flexibility to do efficient, distributed execution.</p>
+<p>After the pipeline has completed, you can view the output. There might be
+multiple output files prefixed by <code>counts</code>. The runner decides the
+number of output files, which gives it the flexibility to execute efficiently
+in a distributed fashion.</p>
+<ol>
+<li>View the output files in a Unix shell:
<div class='runner-direct snippet'>
<div class="notebook-skip code-snippet">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-direct" data-lang="direct">$ ls
counts*</code></pre>
+<pre><code class="language-direct" data-lang="direct">ls counts*
+</code></pre>
</div>
</div>
<div class='runner-flink snippet'>
@@ -1559,7 +1647,8 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flink" data-lang="flink">$ ls
counts*</code></pre>
+<pre><code class="language-flink" data-lang="flink">ls counts*
+</code></pre>
</div>
</div>
<div class='runner-flinkCluster snippet'>
@@ -1567,7 +1656,8 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flinkCluster" data-lang="flinkCluster">$ ls
/tmp/counts*</code></pre>
+<pre><code class="language-flinkCluster" data-lang="flinkCluster">ls
/tmp/counts*
+</code></pre>
</div>
</div>
<div class='runner-spark snippet'>
@@ -1575,7 +1665,8 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-spark" data-lang="spark">$ ls
counts*</code></pre>
+<pre><code class="language-spark" data-lang="spark">ls counts*
+</code></pre>
</div>
</div>
<div class='runner-dataflow snippet'>
@@ -1583,7 +1674,8 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-dataflow" data-lang="dataflow">$ gsutil ls
gs://&lt;your-gcs-bucket&gt;/counts*</code></pre>
+<pre><code class="language-dataflow" data-lang="dataflow">gsutil ls
gs://&lt;your-gcs-bucket&gt;/counts*
+</code></pre>
</div>
</div>
<div class='runner-samza snippet'>
@@ -1591,7 +1683,8 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-samza" data-lang="samza">$ ls
/tmp/counts*</code></pre>
+<pre><code class="language-samza" data-lang="samza">ls /tmp/counts*
+</code></pre>
</div>
</div>
<div class='runner-nemo snippet'>
@@ -1599,7 +1692,8 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-nemo" data-lang="nemo">$ ls
counts*</code></pre>
+<pre><code class="language-nemo" data-lang="nemo">ls counts*
+</code></pre>
</div>
</div>
<div class='runner-jet snippet'>
@@ -1607,27 +1701,20 @@ PS&gt; java -cp
target/word-count-beam-bundled-0.1.jar org.apache.beam.examp
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-jet" data-lang="jet">$ ls
counts*</code></pre>
+<pre><code class="language-jet" data-lang="jet">ls counts*
+</code></pre>
</div>
</div>
-<p>When you look into the contents of the file, you&rsquo;ll see that
they contain unique words and the number of occurrences of each word. The order
of elements within the file may differ because the Beam model does not
generally guarantee ordering, again to allow runners to optimize for
efficiency.</p>
+The output files contain unique words and the number of occurrences of each
+word.</li>
+<li>View the output content in a Unix shell:
<div class='runner-direct snippet'>
<div class="notebook-skip code-snippet">
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-direct" data-lang="direct">$ more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-direct" data-lang="direct">more counts*
+</code></pre>
</div>
</div>
<div class='runner-flink snippet'>
@@ -1635,18 +1722,8 @@ single: 4
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flink" data-lang="flink">$ more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-flink" data-lang="flink">more counts*
+</code></pre>
</div>
</div>
<div class='runner-flinkCluster snippet'>
@@ -1654,18 +1731,8 @@ single: 4
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-flinkCluster" data-lang="flinkCluster">$ more
/tmp/counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-flinkCluster" data-lang="flinkCluster">more
/tmp/counts*
+</code></pre>
</div>
</div>
<div class='runner-spark snippet'>
@@ -1673,18 +1740,8 @@ single: 4
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-spark" data-lang="spark">$ more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-spark" data-lang="spark">more counts*
+</code></pre>
</div>
</div>
<div class='runner-dataflow snippet'>
@@ -1692,18 +1749,8 @@ single: 4
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-dataflow" data-lang="dataflow">$ gsutil cat
gs://&lt;your-gcs-bucket&gt;/counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-dataflow" data-lang="dataflow">gsutil cat
gs://&lt;your-gcs-bucket&gt;/counts*
+</code></pre>
</div>
</div>
<div class='runner-samza snippet'>
@@ -1711,18 +1758,8 @@ single: 4
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-samza" data-lang="samza">$ more /tmp/counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-samza" data-lang="samza">more /tmp/counts*
+</code></pre>
</div>
</div>
<div class='runner-nemo snippet'>
@@ -1730,18 +1767,8 @@ single: 4
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-nemo" data-lang="nemo">$ more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-nemo" data-lang="nemo">more counts*
+</code></pre>
</div>
</div>
<div class='runner-jet snippet'>
@@ -1749,30 +1776,40 @@ single: 4
<a class="copy" type="button" data-bs-toggle="tooltip"
data-bs-placement="bottom" title="Copy to clipboard">
<img src="/images/copy-icon.svg"/>
</a>
-<pre><code class="language-jet" data-lang="jet">$ more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre>
+<pre><code class="language-jet" data-lang="jet">more counts*
+</code></pre>
</div>
</div>
+The order of elements is not guaranteed, which allows runners to optimize for
+efficiency, but the output should look something like this:
+<pre><code>...
+Think: 3
+slower: 1
+Having: 1
+revives: 1
+these: 33
+wipe: 1
+arrives: 1
+concluded: 1
+begins: 3
+...
+</code></pre></li>
+</ol>
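<p>Because the output is unordered and may be split across several files, it can help to merge the files and sort them by count. The following is a minimal sketch, assuming the output files are local and each line has the form <code>word: count</code> as in the sample output above. The function name <code>top_words</code> is a hypothetical helper for illustration and is not part of Beam.</p>

```shell
# Hypothetical helper (not part of Beam): print the N most frequent words
# across all output files that share a prefix.
# Assumes every line looks like "word: count".
# Usage: top_words counts 5
top_words() {
  prefix="$1"      # output prefix passed to --output, e.g. "counts"
  n="${2:-10}"     # number of lines to print; defaults to 10
  # Merge all files, sort numerically on the field after ":" in
  # descending order, and keep the first N lines.
  cat "$prefix"* | sort -t: -k2 -n -r | head -n "$n"
}
```

<p>For output written to Cloud Storage by the DataflowRunner, the same idea would presumably apply after replacing <code>cat</code> with <code>gsutil cat</code>, which the steps above already use.</p>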
<h2 id="next-steps">Next Steps</h2>
<ul>
<li>Learn more about the <a href="/documentation/sdks/java/">Beam SDK
for Java</a>
-and look through the <a
href="https://beam.apache.org/releases/javadoc">Java SDK API
reference</a>.</li>
-<li>Walk through these WordCount examples in the <a
href="/get-started/wordcount-example">WordCount Example
Walkthrough</a>.</li>
-<li>Take a self-paced tour through our <a
href="/documentation/resources/learning-resources">Learning
Resources</a>.</li>
-<li>Dive in to some of our favorite <a
href="/documentation/resources/videos-and-podcasts">Videos and
Podcasts</a>.</li>
+and look through the
+<a href="https://beam.apache.org/releases/javadoc">Java SDK API
reference</a>.</li>
+<li>Walk through the WordCount examples in the
+<a href="/get-started/wordcount-example">WordCount Example
Walkthrough</a>.</li>
+<li>Take a self-paced tour through our
+<a href="/documentation/resources/learning-resources">Learning
Resources</a>.</li>
+<li>Dive into some of our favorite
+<a href="/documentation/resources/videos-and-podcasts">Videos and
Podcasts</a>.</li>
<li>Join the Beam <a href="/community/contact-us">users@</a> mailing
list.</li>
</ul>
-<p>Please don&rsquo;t hesitate to <a
href="/community/contact-us">reach out</a> if you encounter any
issues!</p></description></item><item><title>Get-Started: Beam Quickstart
for Python</title><link>/get-started/quickstart-py/</link><pubDate>Mon, 01 Jan
0001 00:00:00
+0000</pubDate><guid>/get-started/quickstart-py/</guid><description>
+<p>Please don&rsquo;t hesitate to <a
href="/community/contact-us">reach out</a> if you encounter any
+issues!</p></description></item><item><title>Get-Started: Beam Quickstart
for Python</title><link>/get-started/quickstart-py/</link><pubDate>Mon, 01 Jan
0001 00:00:00
+0000</pubDate><guid>/get-started/quickstart-py/</guid><description>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
diff --git a/website/generated-content/get-started/quickstart-java/index.html
b/website/generated-content/get-started/quickstart-java/index.html
index 5de01a2..7bc8b95 100644
--- a/website/generated-content/get-started/quickstart-java/index.html
+++ b/website/generated-content/get-started/quickstart-java/index.html
@@ -18,196 +18,143 @@
function addPlaceholder(){$('input:text').attr('placeholder',"What are you
looking for?");}
function endSearch(){var
search=document.querySelector(".searchBar");search.classList.add("disappear");var
icons=document.querySelector("#iconsBar");icons.classList.remove("disappear");}
function blockScroll(){$("body").toggleClass("fixedPosition");}
-function openMenu(){addPlaceholder();blockScroll();}</script><div
class="clearfix container-main-content"><div class="section-nav closed"
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list
data-section-nav><li><span class=section-nav-list-main-title>Get
started</span></li><li><a href=/get-started/beam-overview/>Beam
Overview</a></li><li><a href=/get-started/tour-of-beam/>Tour of
Beam</a></li><li><s [...]
- -DarchetypeGroupId=org.apache.beam \
- -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
- -DarchetypeVersion=2.37.0 \
- -DgroupId=org.example \
- -DartifactId=word-count-beam \
- -Dversion="0.1" \
- -Dpackage=org.apache.beam.examples \
- -DinteractiveMode=false</code></pre></div></div><div
class="shell-powerShell snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div
class=highlight><pre class=chroma><code class=language-powerShell
data-lang=powerShell><span class=n>PS</span><span class=p>></span> <span
class=n>mvn</span> <span class=n>archetype</span><span class=err>:</span><span
[...]
- <span class=n>-D</span> <span class=n>archetypeGroupId</span><span
class=p>=</span><span class=n>org</span><span class=p>.</span><span
class=n>apache</span><span class=p>.</span><span class=n>beam</span> <span
class=p>`</span>
- <span class=n>-D</span> <span class=n>archetypeArtifactId</span><span
class=p>=</span><span class=n>beam-sdks-java-maven-archetypes-examples</span>
<span class=p>`</span>
- <span class=n>-D</span> <span class=n>archetypeVersion</span><span
class=p>=</span><span class=n>2</span><span class=p>.</span><span
class=n>37</span><span class=p>.</span><span class=n>0</span> <span
class=p>`</span>
- <span class=n>-D</span> <span class=n>groupId</span><span
class=p>=</span><span class=n>org</span><span class=p>.</span><span
class=n>example</span> <span class=p>`</span>
- <span class=n>-D</span> <span class=n>artifactId</span><span
class=p>=</span><span class=n>word-count-beam</span> <span class=p>`</span>
- <span class=n>-D</span> <span class=n>version</span><span
class=p>=</span><span class=s2>"0.1"</span> <span class=p>`</span>
- <span class=n>-D</span> <span class=n>package</span><span
class=p>=</span><span class=n>org</span><span class=p>.</span><span
class=n>apache</span><span class=p>.</span><span class=n>beam</span><span
class=p>.</span><span class=n>examples</span> <span class=p>`</span>
- <span class=n>-D</span> <span class=n>interactiveMode</span><span
class=p>=</span><span
class=n>false</span></code></pre></div></div></div><p>This will create a
<code>word-count-beam</code> directory that contains a <code>pom.xml</code> and
several example pipelines that count words in text files.</p><div
class="shell-unix snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/ [...]
-
-$ ls
-pom.xml src
-
-$ ls src/main/java/org/apache/beam/examples/
-DebuggingWordCount.java WindowedWordCount.java common
-MinimalWordCount.java WordCount.java</code></pre></div></div><div
class="shell-powerShell snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div
class=highlight><pre class=chroma><code class=language-powerShell
data-lang=powerShell><span class=n>PS</span><span class=p>></span> <span
class=nb>cd </span><span class=p>.\</span><span class=n>word-count-beam</span>
-
-<span class=n>PS</span><span class=p>></span> <span class=nb>dir
-</span><span class=nb></span>
-<span class=p>...</span>
-
-<span class=n>Mode</span> <span class=n>LastWriteTime</span>
<span class=n>Length</span> <span class=n>Name</span>
-<span class=p>----</span> <span class=p>-------------</span>
<span class=p>------</span> <span class=p>----</span>
-<span class=n>d</span><span class=p>-----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>src</span>
-<span class=n>-a</span><span class=p>----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>16051</span> <span class=n>pom</span><span class=p>.</span><span
class=n>xml</span>
-
-<span class=n>PS</span><span class=p>></span> <span class=nb>dir
</span><span class=p>.\</span><span class=n>src</span><span
class=p>\</span><span class=n>main</span><span class=p>\</span><span
class=n>java</span><span class=p>\</span><span class=n>org</span><span
class=p>\</span><span class=n>apache</span><span class=p>\</span><span
class=n>beam</span><span class=p>\</span><span class=n>examples</span>
-
-<span class=p>...</span>
-<span class=n>Mode</span> <span class=n>LastWriteTime</span>
<span class=n>Length</span> <span class=n>Name</span>
-<span class=p>----</span> <span class=p>-------------</span>
<span class=p>------</span> <span class=p>----</span>
-<span class=n>d</span><span class=p>-----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>common</span>
-<span class=n>d</span><span class=p>-----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>complete</span>
-<span class=n>d</span><span class=p>-----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>subprocess</span>
-<span class=n>-a</span><span class=p>----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>7073</span> <span class=n>DebuggingWordCount</span><span
class=p>.</span><span class=n>java</span>
-<span class=n>-a</span><span class=p>----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>5945</span> <span class=n>MinimalWordCount</span><span
class=p>.</span><span class=n>java</span>
-<span class=n>-a</span><span class=p>----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>9490</span> <span class=n>WindowedWordCount</span><span
class=p>.</span><span class=n>java</span>
-<span class=n>-a</span><span class=p>----</span> <span
class=n>7</span><span class=p>/</span><span class=n>19</span><span
class=p>/</span><span class=n>2018</span> <span class=n>11</span><span
class=err>:</span><span class=n>00</span> <span class=n>PM</span>
<span class=n>7662</span> <span class=n>WordCount</span><span
class=p>.</span><span class=n>java</span></code></pre></div></div></div><p>For
a detailed introduction to the Beam concepts used in these examples, see t [...]
- mavenCentral()
- maven {
- url =
uri('https://repository.apache.org/content/repositories/snapshots/')
+function openMenu(){addPlaceholder();blockScroll();}</script><div
class="clearfix container-main-content"><div class="section-nav closed"
data-offset-top=90 data-offset-bottom=500><span class="section-nav-back
glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list
data-section-nav><li><span class=section-nav-list-main-title>Get
started</span></li><li><a href=/get-started/beam-overview/>Beam
Overview</a></li><li><a href=/get-started/tour-of-beam/>Tour of
Beam</a></li><li><s [...]
+an <a href=/get-started/wordcount-example>example pipeline</a> written with the
+<a href=/documentation/sdks/java>Apache Beam Java SDK</a>, using a
+<a href=/documentation#runners>runner</a> of your choice.</p><p>If
you’re interested in contributing to the Apache Beam Java codebase, see
the
+<a href=/contribute>Contribution Guide</a>.</p><p>On this page:</p><nav
id=TableOfContents><ul><li><a href=#set-up-your-development-environment>Set up
your development environment</a></li><li><a href=#get-the-example-code>Get the
example code</a></li><li><a
href=#optional-convert-from-maven-to-gradle>Optional: Convert from Maven to
Gradle</a></li><li><a href=#get-sample-text>Get sample text</a></li><li><a
href=#run-a-pipeline>Run a pipeline</a><ul><li><a
href=#run-wordcount-using-maven>R [...]
+<a
href=https://www.oracle.com/technetwork/java/javase/downloads/index.html>Java
Development Kit (JDK)</a>
+version 8, 11, or 17. Verify that the
+<a
href=https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/envvars001.html>JAVA_HOME</a>
+environment variable is set and points to your JDK
installation.</li><li>Download and install <a
href=https://maven.apache.org/download.cgi>Apache Maven</a> by
+following the <a href=https://maven.apache.org/install.html>installation
guide</a>
+for your operating system.</li><li>Optional: If you want to convert your Maven
project to Gradle, install
+<a href=https://gradle.org/install/>Gradle</a>.</li></ol><h2
id=get-the-example-code>Get the example code</h2><ol><li><p>Generate a Maven
example project that builds against the latest Beam release:<div
class="shell-unix snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-unix data-lang=unix>mvn archetype:generate \
+ -DarchetypeGroupId=org.apache.beam \
+ -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
+ -DarchetypeVersion=2.37.0 \
+ -DgroupId=org.example \
+ -DartifactId=word-count-beam \
+ -Dversion="0.1" \
+ -Dpackage=org.apache.beam.examples \
+ -DinteractiveMode=false
+ </code></pre></div></div><div class="shell-powerShell snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code
class=language-powerShell data-lang=powerShell><span class=n>mvn</span> <span
class=n>archetype</span><span class=err>:</span><span class=n>generate</span>
<span class=p>`</span>
+ <span class=n>-D</span> <span class=n>archetypeGroupId</span><span
class=p>=</span><span class=n>org</span><span class=p>.</span><span
class=n>apache</span><span class=p>.</span><span class=n>beam</span> <span
class=p>`</span>
+ <span class=n>-D</span> <span class=n>archetypeArtifactId</span><span
class=p>=</span><span class=n>beam-sdks-java-maven-archetypes-examples</span>
<span class=p>`</span>
+ <span class=n>-D</span> <span class=n>archetypeVersion</span><span
class=p>=</span><span class=n>2</span><span class=p>.</span><span
class=n>37</span><span class=p>.</span><span class=n>0</span> <span
class=p>`</span>
+ <span class=n>-D</span> <span class=n>groupId</span><span
class=p>=</span><span class=n>org</span><span class=p>.</span><span
class=n>example</span> <span class=p>`</span>
+ <span class=n>-D</span> <span class=n>artifactId</span><span
class=p>=</span><span class=n>word-count-beam</span> <span class=p>`</span>
+ <span class=n>-D</span> <span class=n>version</span><span
class=p>=</span><span class=s2>"0.1"</span> <span class=p>`</span>
+ <span class=n>-D</span> <span class=n>package</span><span
class=p>=</span><span class=n>org</span><span class=p>.</span><span
class=n>apache</span><span class=p>.</span><span class=n>beam</span><span
class=p>.</span><span class=n>examples</span> <span class=p>`</span>
+ <span class=n>-D</span> <span class=n>interactiveMode</span><span
class=p>=</span><span class=n>false</span>
+ </code></pre></div></div></div></p><p>Maven creates a new project in the
<strong>word-count-beam</strong> directory.</p></li><li><p>Change into
<strong>word-count-beam</strong>:<div class="shell-unix snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-unix data-lang=unix>cd
word-count-beam/
+ </code></pre></div></div><div class="shell-powerShell snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code
class=language-powerShell data-lang=powerShell><span class=nb>cd </span><span
class=p>.\</span><span class=n>word-count-beam</span>
+ </code></pre></div></div></div>The directory contains a
<strong>pom.xml</strong> and a <strong>src</strong> directory with example
+pipelines.</p></li><li><p>List the example pipelines:<div class="shell-unix
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-unix data-lang=unix>ls
src/main/java/org/apache/beam/examples/
+ </code></pre></div></div><div class="shell-powerShell snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code
class=language-powerShell data-lang=powerShell><span class=nb>dir </span><span
class=p>.\</span><span class=n>src</span><span class=p>\</span><span
class=n>main</span><span class=p>\</span><span class=n>jav [...]
+ </code></pre></div></div></div>You should see the following
examples:</p><ul><li><strong>DebuggingWordCount.java</strong> (<a
href=https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/DebuggingWordCount.java>GitHub</a>)</li><li><strong>MinimalWordCount.java</strong>
(<a
href=https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java>GitHub</a>)</li><li><strong>WindowedWordCount.java</
[...]
+Beam pipeline that counts words from an input file (by default, a
<strong>.txt</strong>
+file containing Shakespeare’s “King Lear”). To learn more
about the examples,
+see the <a href=/get-started/wordcount-example>WordCount Example
Walkthrough</a>.</p></li></ol><h2
id=optional-convert-from-maven-to-gradle>Optional: Convert from Maven to
Gradle</h2><p>The steps below explain how to convert the build from Maven to
Gradle for the
+following runners:</p><ul><li>Direct runner</li><li>Dataflow
runner</li></ul><p>The conversion process for other runners is similar. For
additional guidance,
+see
+<a
href=https://docs.gradle.org/current/userguide/migrating_from_maven.html>Migrating
Builds From Apache Maven</a>.</p><ol><li>In the directory with the
<strong>pom.xml</strong> file, run the automated Maven-to-Gradle
+conversion:<div class=snippet><div class="notebook-skip code-snippet
without_switcher"><a class=copy type=button data-bs-toggle=tooltip
data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code>gradle init
+ </code></pre></div></div>You’ll be asked if you want to generate a
Gradle build. Enter <strong>yes</strong>. You’ll
+also be prompted to choose a DSL (Groovy or Kotlin). For this tutorial, enter
+<strong>2</strong> for Kotlin.</li><li>Open the generated
<strong>build.gradle.kts</strong> file and make the following
changes:<ol><li>In <code>repositories</code>, replace <code>mavenLocal()</code>
with <code>mavenCentral()</code>.</li><li>In <code>repositories</code>, declare
a repository for Confluent Kafka dependencies:<div class=snippet><div
class="notebook-skip code-snippet without_switcher"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to c [...]
+ url = uri("https://packages.confluent.io/maven/")
+}
+ </code></pre></div></div></li><li>At the end of the build script, add
the following conditional dependency:<div class=snippet><div
class="notebook-skip code-snippet without_switcher"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code>if
(project.hasProperty("dataflow-runner")) {
+ dependencies {
+
runtimeOnly("org.apache.beam:beam-runners-google-cloud-dataflow-java:2.37.0")
}
-
- maven {
- url = uri('http://repo.maven.apache.org/maven2')
- }
-}</code></pre></div></div></li><li>Add the following task in
<code>build.gradle</code> to allow you to execute pipelines with Gradle:<div
class=snippet><div class="notebook-skip code-snippet without_switcher"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code>task
execute (type:JavaExec) {
- mainClass = System.getProperty("mainClass")
- classpath = sourceSets.main.runtimeClasspath
- systemProperties System.getProperties()
- args System.getProperty("exec.args", "").split()
-}</code></pre></div></div></li><li>Rebuild your project by running:<div
class=snippet><div class="notebook-skip code-snippet without_switcher"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code>$
gradle build</code></pre></div></div></li></ol><h2 id=get-sample-text>Get
sample text</h2><blockquote><p>If you’re planning to use the
DataflowRunner, you can skip this step. The runner will pull [...]
- -Dexec.args="--inputFile=sample.txt --output=counts"
-Pdirect-runner</code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink data-lang=flink>$
mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
- -Dexec.args="--runner=FlinkRunner --inputFile=sample.txt
--output=counts" -Pflink-runner</code></pre></div></div><div
class="runner-flinkCluster snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-flinkCluster data-lang=flinkCluster>$ mvn package exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
- -Dexec.args="--runner=FlinkRunner --flinkMaster=<flink master>
--filesToStage=target/word-count-beam-bundled-0.1.jar \
- --inputFile=sample.txt --output=/tmp/counts"
-Pflink-runner
-
-You can monitor the running job by visiting the Flink dashboard at
http://<flink master>:8081</code></pre></div></div><div
class="runner-spark snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-spark data-lang=spark>$ mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
- -Dexec.args="--runner=SparkRunner --inputFile=sample.txt
--output=counts" -Pspark-runner</code></pre></div></div><div
class="runner-dataflow snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-dataflow data-lang=dataflow>Make sure you complete the setup
steps at /documentation/runners/dataflow/#setup
-
-$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
- -Dexec.args="--runner=DataflowRunner
--project=<your-gcp-project> \
- --region=<your-gcp-region> \
- --gcpTempLocation=gs://<your-gcs-bucket>/tmp \
- --inputFile=gs://apache-beam-samples/shakespeare/*
--output=gs://<your-gcs-bucket>/counts" \
- -Pdataflow-runner</code></pre></div></div><div class="runner-samza
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza data-lang=samza>$
mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
- -Dexec.args="--inputFile=sample.txt --output=/tmp/counts
--runner=SamzaRunner" -Psamza-runner</code></pre></div></div><div
class="runner-nemo snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-nemo data-lang=nemo>$ mvn package -Pnemo-runner && java
-cp target/word-count-beam-bundled-0.1.jar org.apache.beam.exam [...]
- --runner=NemoRunner --inputFile=`pwd`/sample.txt
--output=counts</code></pre></div></div><div class="runner-jet snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-jet data-lang=jet>$ mvn
package -Pjet-runner
-$ java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount \
- --runner=JetRunner --jetLocalMode=3 --inputFile=`pwd`/sample.txt
--output=counts</code></pre></div></div><p>For Windows PowerShell:</p><div
class="runner-direct snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-direct data-lang=direct>PS> mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
- -D exec.args="--inputFile=sample.txt --output=counts" -P
direct-runner</code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink
data-lang=flink>PS> mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
- -D exec.args="--runner=FlinkRunner --inputFile=sample.txt
--output=counts" -P flink-runner</code></pre></div></div><div
class="runner-flinkCluster snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-flinkCluster data-lang=flinkCluster>PS> mvn package exec:java
-D exec.mainClass=org.apache.beam.examples.WordCount `
+}
+ </code></pre></div></div></li><li>At the end of the build script, add
the following task:<div class=snippet><div class="notebook-skip code-snippet
without_switcher"><a class=copy type=button data-bs-toggle=tooltip
data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code>task("execute",
JavaExec::class) {
+ classpath = sourceSets["main"].runtimeClasspath
+ mainClass.set(System.getProperty("mainClass"))
+}
+ </code></pre></div></div></li></ol></li><li>Build your project:<div
class=snippet><div class="notebook-skip code-snippet without_switcher"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code>gradle
build
+ </code></pre></div></div></li></ol><h2 id=get-sample-text>Get sample
text</h2><blockquote><p>If you’re planning to use the DataflowRunner, you
can skip this step. The
+runner will pull text directly from Google Cloud
Storage.</p></blockquote><ol><li>In the <strong>word-count-beam</strong>
directory, create a file called <strong>sample.txt</strong>.</li><li>Add some
text to the file. For this example, use the text of Shakespeare’s
+<a
href=https://storage.cloud.google.com/apache-beam-samples/shakespeare/kinglear.txt>King
Lear</a>.</li></ol><h2 id=run-a-pipeline>Run a pipeline</h2><p>A single Beam
pipeline can run on multiple Beam
+<a href=/documentation#runners>runners</a>. The
+<a href=/documentation/runners/direct>DirectRunner</a> is useful for getting
started,
+because it runs on your machine and requires no specific setup. If
you’re just
+trying out Beam and you’re not sure what to use, use the
+<a href=/documentation/runners/direct>DirectRunner</a>.</p><p>The general
process for running a pipeline goes like this:</p><ol><li>Complete any
runner-specific setup.</li><li>Build your command line:<ol><li>Specify a runner
with <code>--runner=<runner></code> (defaults to the
+<a href=/documentation/runners/direct>DirectRunner</a>).</li><li>Add any
runner-specific required options.</li><li>Choose input files and an output
location that are accessible to the
+runner. (For example, you can’t access a local file if you are running
+the pipeline on an external cluster.)</li></ol></li><li>Run the
command.</li></ol><p>To run the WordCount pipeline:</p><ol><li><p>Follow the
setup steps for your runner:</p><ul><li><a
href=/documentation/runners/flink>FlinkRunner</a></li><li><a
href=/documentation/runners/spark>SparkRunner</a></li><li><a
href=/documentation/runners/dataflow>DataflowRunner</a></li><li><a
href=/documentation/runners/samza>SamzaRunner</a></li><li><a
href=/documentation/runners/nemo>NemoRunner</a></li><li><a [...]
+ -Dexec.args="--inputFile=sample.txt --output=counts"
-Pdirect-runner</code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink
data-lang=flink>mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
+ -Dexec.args="--runner=FlinkRunner --inputFile=sample.txt
--output=counts" -Pflink-runner</code></pre></div></div><div
class="runner-flinkCluster snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-flinkCluster data-lang=flinkCluster>mvn package exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
+ -Dexec.args="--runner=FlinkRunner --flinkMaster=<flink master>
--filesToStage=target/word-count-beam-bundled-0.1.jar \
+ --inputFile=sample.txt --output=/tmp/counts"
-Pflink-runner</code></pre></div></div><div class="runner-spark snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-spark
data-lang=spark>mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
+ -Dexec.args="--runner=SparkRunner --inputFile=sample.txt
--output=counts" -Pspark-runner</code></pre></div></div><div
class="runner-dataflow snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-dataflow data-lang=dataflow>mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
+ -Dexec.args="--runner=DataflowRunner
--project=<your-gcp-project> \
+ --region=<your-gcp-region> \
+ --gcpTempLocation=gs://<your-gcs-bucket>/tmp \
+ --inputFile=gs://apache-beam-samples/shakespeare/*
--output=gs://<your-gcs-bucket>/counts" \
+ -Pdataflow-runner</code></pre></div></div><div class="runner-samza
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza
data-lang=samza>mvn compile exec:java
-Dexec.mainClass=org.apache.beam.examples.WordCount \
+ -Dexec.args="--inputFile=sample.txt --output=/tmp/counts
--runner=SamzaRunner" -Psamza-runner</code></pre></div></div><div
class="runner-nemo snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-nemo data-lang=nemo>mvn package -Pnemo-runner && java
-cp target/word-count-beam-bundled-0.1.jar org.apache.beam.example [...]
+ --runner=NemoRunner --inputFile=`pwd`/sample.txt
--output=counts</code></pre></div></div><div class="runner-jet snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-jet data-lang=jet>mvn
package -Pjet-runner
+java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount \
+ --runner=JetRunner --jetLocalMode=3 --inputFile=`pwd`/sample.txt
--output=counts</code></pre></div></div></p><p>For Windows
PowerShell:</p><p><div class="runner-direct snippet"><div class="notebook-skip
code-snippet"><a class=copy type=button data-bs-toggle=tooltip
data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-direct
data-lang=direct>mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
+ -D exec.args="--inputFile=sample.txt --output=counts" -P
direct-runner</code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink
data-lang=flink>mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
+ -D exec.args="--runner=FlinkRunner --inputFile=sample.txt
--output=counts" -P flink-runner</code></pre></div></div><div
class="runner-flinkCluster snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-flinkCluster data-lang=flinkCluster>mvn package exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args="--runner=FlinkRunner --flinkMaster=<flink master>
--filesToStage=.\target\word-count-beam-bundled-0.1.jar `
- --inputFile=C:\path\to\quickstart\sample.txt
--output=C:\tmp\counts" -P flink-runner
-
-You can monitor the running job by visiting the Flink dashboard at
http://<flink master>:8081</code></pre></div></div><div
class="runner-spark snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-spark data-lang=spark>PS> mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
- -D exec.args="--runner=SparkRunner --inputFile=sample.txt
--output=counts" -P spark-runner</code></pre></div></div><div
class="runner-dataflow snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-dataflow data-lang=dataflow>Make sure you complete the setup
steps at /documentation/runners/dataflow/#setup
-
-PS> mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
+ --inputFile=C:\path\to\quickstart\sample.txt
--output=C:\tmp\counts" -P flink-runner</code></pre></div></div><div
class="runner-spark snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-spark data-lang=spark>mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
+ -D exec.args="--runner=SparkRunner --inputFile=sample.txt
--output=counts" -P spark-runner</code></pre></div></div><div
class="runner-dataflow snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-dataflow data-lang=dataflow>mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
-D exec.args="--runner=DataflowRunner --project=<your-gcp-project> `
 --region=<your-gcp-region> `
--gcpTempLocation=gs://<your-gcs-bucket>/tmp `
--inputFile=gs://apache-beam-samples/shakespeare/*
--output=gs://<your-gcs-bucket>/counts" `
- -P dataflow-runner</code></pre></div></div><div class="runner-samza
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza
data-lang=samza>PS> mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
- -D exec.args="--inputFile=sample.txt --output=/tmp/counts
--runner=SamzaRunner" -P samza-runner</code></pre></div></div><div
class="runner-nemo snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-nemo data-lang=nemo>PS> mvn package -P nemo-runner -DskipTests
-PS> java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
- --runner=NemoRunner --inputFile=`pwd`/sample.txt
--output=counts</code></pre></div></div><div class="runner-jet snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-jet
data-lang=jet>PS> mvn package -P jet-runner
-PS> java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
- --runner=JetRunner --jetLocalMode=3 --inputFile=$pwd/sample.txt
--output=counts</code></pre></div></div><h3 id=run-wordcount-using-gradle>Run
WordCount Using Gradle</h3><p>For Unix shells (Instructions currently only
available for Direct, Spark, and Dataflow):</p><div class="runner-direct
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code c [...]
- -Dexec.args="--inputFile=sample.txt --output=counts"
-Pdirect-runner</code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink
data-lang=flink>We are working on adding the instruction for this
runner!</code></pre></div></div><div class="runner-flinkCluster snippet"><div
cl [...]
- -Dexec.args="--inputFile=sample.txt --output=counts"
-Pspark-runner</code></pre></div></div><div class="runner-dataflow
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-dataflow
data-lang=dataflow>$ gradle clean execute
-DmainClass=org.apache.beam.examples.WordCount \
- -Dexec.args="--project=<your-gcp-project>
--inputFile=gs://apache-beam-samples/shakespeare/* \
- --output=gs://<your-gcs-bucket>/counts"
-Pdataflow-runner</code></pre></div></div><div class="runner-samza
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza
data-lang=samza>We are working on adding the instruction for this
runner!</code></pre></div></div><div class="runner-nemo snippet"><div
class="notebook-ski [...]
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink data-lang=flink>$
more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><div class="runner-flinkCluster snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flinkCluster
data-lang=flinkCluster>$ more /tmp/counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><div class="runner-spark snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-spark data-lang=spark>$
more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><div class="runner-dataflow snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-dataflow
data-lang=dataflow>$ gsutil cat gs://<your-gcs-bucket>/counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><div class="runner-samza snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza data-lang=samza>$
more /tmp/counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><div class="runner-nemo snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-nemo data-lang=nemo>$
more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><div class="runner-jet snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-jet data-lang=jet>$
more counts*
-wrought: 2
-st: 32
-fresher: 1
-of: 351
-souls: 2
-CXVIII: 1
-reviewest: 1
-untold: 1
-th: 1
-single: 4
-...</code></pre></div></div><h2 id=next-steps>Next Steps</h2><ul><li>Learn
more about the <a href=/documentation/sdks/java/>Beam SDK for Java</a>
-and look through the <a href=https://beam.apache.org/releases/javadoc>Java SDK
API reference</a>.</li><li>Walk through these WordCount examples in the <a
href=/get-started/wordcount-example>WordCount Example
Walkthrough</a>.</li><li>Take a self-paced tour through our <a
href=/documentation/resources/learning-resources>Learning
Resources</a>.</li><li>Dive in to some of our favorite <a
href=/documentation/resources/videos-and-podcasts>Videos and
Podcasts</a>.</li><li>Join the Beam <a href= [...]
+ -P dataflow-runner</code></pre></div></div><div class="runner-samza
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza
data-lang=samza>mvn compile exec:java -D
exec.mainClass=org.apache.beam.examples.WordCount `
+ -D exec.args="--inputFile=sample.txt --output=/tmp/counts
--runner=SamzaRunner" -P samza-runner</code></pre></div></div><div
class="runner-nemo snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-nemo data-lang=nemo>mvn package -P nemo-runner -DskipTests
+java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
+ --runner=NemoRunner --inputFile=`pwd`/sample.txt
--output=counts</code></pre></div></div><div class="runner-jet snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-jet data-lang=jet>mvn
package -P jet-runner
+java -cp target/word-count-beam-bundled-0.1.jar
org.apache.beam.examples.WordCount `
+ --runner=JetRunner --jetLocalMode=3 --inputFile=$pwd/sample.txt
--output=counts</code></pre></div></div></p><h3
id=run-wordcount-using-gradle>Run WordCount using Gradle</h3><p>For Unix
shells:</p><p><div class="runner-direct snippet"><div class="notebook-skip
code-snippet"><a class=copy type=button data-bs-toggle=tooltip
data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-direct
data-lang=direct>gradle clean execute -DmainCl [...]
+ --args="--inputFile=sample.txt
--output=counts"</code></pre></div></div><div class="runner-flink
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink
data-lang=flink>TODO: document Flink on Gradle:
https://issues.apache.org/jira/browse/BEAM-14057</code></pre></div></div><div
class="runner-flinkCluster snippet"><div [...]
+ --args="--project=<your-gcp-project>
--inputFile=gs://apache-beam-samples/shakespeare/* \
+ --output=gs://<your-gcs-bucket>/counts"
-Pdataflow-runner</code></pre></div></div><div class="runner-samza
snippet"><div class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza
data-lang=samza>TODO: document Samza on Gradle:
https://issues.apache.org/jira/browse/BEAM-14061</code></pre></div></div><div
class="runner-nemo snippet">< [...]
+multiple output files prefixed by <code>counts</code>. The number of output
files is decided
+by the runner, giving it the flexibility to do efficient, distributed
execution.</p><ol><li>View the output files in a Unix shell:<div
class="runner-direct snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-direct data-lang=direct>ls counts*
+ </code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink
data-lang=flink>ls counts*
+ </code></pre></div></div><div class="runner-flinkCluster snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flinkCluster
data-lang=flinkCluster>ls /tmp/counts*
+ </code></pre></div></div><div class="runner-spark snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-spark
data-lang=spark>ls counts*
+ </code></pre></div></div><div class="runner-dataflow snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-dataflow
data-lang=dataflow>gsutil ls gs://<your-gcs-bucket>/counts*
+ </code></pre></div></div><div class="runner-samza snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza
data-lang=samza>ls /tmp/counts*
+ </code></pre></div></div><div class="runner-nemo snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-nemo data-lang=nemo>ls
counts*
+ </code></pre></div></div><div class="runner-jet snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-jet data-lang=jet>ls
counts*
+ </code></pre></div></div>The output files contain unique words and the
number of occurrences of each
+word.</li><li>View the output content in a Unix shell:<div
class="runner-direct snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><pre><code
class=language-direct data-lang=direct>more counts*
+ </code></pre></div></div><div class="runner-flink snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flink
data-lang=flink>more counts*
+ </code></pre></div></div><div class="runner-flinkCluster snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-flinkCluster
data-lang=flinkCluster>more /tmp/counts*
+ </code></pre></div></div><div class="runner-spark snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-spark
data-lang=spark>more counts*
+ </code></pre></div></div><div class="runner-dataflow snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-dataflow
data-lang=dataflow>gsutil cat gs://<your-gcs-bucket>/counts*
+ </code></pre></div></div><div class="runner-samza snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-samza
data-lang=samza>more /tmp/counts*
+ </code></pre></div></div><div class="runner-nemo snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-nemo
data-lang=nemo>more counts*
+ </code></pre></div></div><div class="runner-jet snippet"><div
class="notebook-skip code-snippet"><a class=copy type=button
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img
src=/images/copy-icon.svg></a><pre><code class=language-jet data-lang=jet>more
counts*
+ </code></pre></div></div>The order of elements is not guaranteed, to allow
runners to optimize for
+efficiency. But the output should look something like this:<pre><code>...
+Think: 3
+slower: 1
+Having: 1
+revives: 1
+these: 33
+wipe: 1
+arrives: 1
+concluded: 1
+begins: 3
+...
+</code></pre></li></ol><h2 id=next-steps>Next Steps</h2><ul><li>Learn more
about the <a href=/documentation/sdks/java/>Beam SDK for Java</a>
+and look through the
+<a href=https://beam.apache.org/releases/javadoc>Java SDK API
reference</a>.</li><li>Walk through the WordCount examples in the
+<a href=/get-started/wordcount-example>WordCount Example
Walkthrough</a>.</li><li>Take a self-paced tour through our
+<a href=/documentation/resources/learning-resources>Learning
Resources</a>.</li><li>Dive in to some of our favorite
+<a href=/documentation/resources/videos-and-podcasts>Videos and
Podcasts</a>.</li><li>Join the Beam <a href=/community/contact-us>users@</a>
mailing list.</li></ul><p>Please don’t hesitate to <a
href=/community/contact-us>reach out</a> if you encounter any
+issues!</p><div class=feedback><p class=update>Last updated on
2022/03/15</p><h3>Have you found everything you were looking for?</h3><p
class=description>Was it all useful and clear? Is there anything that you would
like to change? Let us know!</p><button class=load-button><a
href="mailto:[email protected]?subject=Beam Website Feedback">SEND
FEEDBACK</a></button></div></div></div><footer class=footer><div
class=footer__contained><div class=footer__cols><div class="footer__cols__col
foo [...]
<a href=http://www.apache.org>The Apache Software Foundation</a>
| <a href=/privacy_policy>Privacy Policy</a>
| <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam
logo, and the Apache feather logo are either registered trademarks or
trademarks of The Apache Software Foundation. All other products or name brands
are trademarks of their respective holders, including The Apache Software
Foundation.</div></div></div></div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/sitemap.xml
b/website/generated-content/sitemap.xml
index 45bdf60..fa548fe 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.37.0/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/blog/u
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.37.0/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2022-03-04T10:14:02-08:00</lastmod></url><url><loc>/blog/u
[...]
\ No newline at end of file