Regenerate website

Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/394bfe70
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/394bfe70
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/394bfe70

Branch: refs/heads/asf-site
Commit: 394bfe70319d92ad68d1c13a40db936445e0bd99
Parents: f23d9cb
Author: Davor Bonaci <[email protected]>
Authored: Tue Apr 18 15:45:02 2017 -0700
Committer: Davor Bonaci <[email protected]>
Committed: Tue Apr 18 15:45:02 2017 -0700

----------------------------------------------------------------------
 .../documentation/runners/dataflow/index.html   | 79 ++++++++++++++++----
 content/documentation/runners/direct/index.html | 30 ++++++--
 2 files changed, 90 insertions(+), 19 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam-site/blob/394bfe70/content/documentation/runners/dataflow/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/runners/dataflow/index.html 
b/content/documentation/runners/dataflow/index.html
index 2f3d9b0..4dda742 100644
--- a/content/documentation/runners/dataflow/index.html
+++ b/content/documentation/runners/dataflow/index.html
@@ -153,6 +153,14 @@
       <div class="row">
         <h1 id="using-the-google-cloud-dataflow-runner">Using the Google Cloud 
Dataflow Runner</h1>
 
+<nav class="language-switcher">
+  <strong>Adapt for:</strong>
+  <ul>
+    <li data-type="language-java" class="active">Java SDK</li>
+    <li data-type="language-py">Python SDK</li>
+  </ul>
+</nav>
+
 <p>The Google Cloud Dataflow Runner uses the <a 
href="https://cloud.google.com/dataflow/service/dataflow-service-desc">Cloud 
Dataflow managed service</a>. When you run your pipeline with the Cloud 
Dataflow service, the runner uploads your executable code and dependencies to a 
Google Cloud Storage bucket and creates a Cloud Dataflow job, which executes 
your pipeline on managed resources in Google Cloud Platform.</p>
 
 <p>The Cloud Dataflow Runner and service are suitable for large scale, 
continuous jobs, and provide:</p>
@@ -202,8 +210,7 @@
 
 <h3 id="specify-your-dependency">Specify your dependency</h3>
 
-<p>You must specify your dependency on the Cloud Dataflow Runner.</p>
-
+<p><span class="language-java">When using Java, you must specify your 
dependency on the Cloud Dataflow Runner in your <code 
class="highlighter-rouge">pom.xml</code>.</span></p>
 <div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="o">&lt;</span><span 
class="n">dependency</span><span class="o">&gt;</span>
   <span class="o">&lt;</span><span class="n">groupId</span><span 
class="o">&gt;</span><span class="n">org</span><span class="o">.</span><span 
class="na">apache</span><span class="o">.</span><span 
class="na">beam</span><span class="o">&lt;/</span><span 
class="n">groupId</span><span class="o">&gt;</span>
   <span class="o">&lt;</span><span class="n">artifactId</span><span 
class="o">&gt;</span><span class="n">beam</span><span class="o">-</span><span 
class="n">runners</span><span class="o">-</span><span 
class="n">google</span><span class="o">-</span><span 
class="n">cloud</span><span class="o">-</span><span 
class="n">dataflow</span><span class="o">-</span><span 
class="n">java</span><span class="o">&lt;/</span><span 
class="n">artifactId</span><span class="o">&gt;</span>
@@ -213,6 +220,8 @@
 </code></pre>
 </div>
 
+<p><span class="language-py">This section is not applicable to the Beam SDK 
for Python.</span></p>
+
 <h3 id="authentication">Authentication</h3>
 
 <p>Before running your pipeline, you must authenticate with the Google Cloud 
Platform. Run the following command to get <a 
href="https://developers.google.com/identity/protocols/application-default-credentials">Application
 Default Credentials</a>.</p>
@@ -223,7 +232,8 @@
 
 <h2 id="pipeline-options-for-the-cloud-dataflow-runner">Pipeline options for 
the Cloud Dataflow Runner</h2>
 
-<p>When executing your pipeline with the Cloud Dataflow Runner, set these 
pipeline options.</p>
+<p><span class="language-java">When executing your pipeline with the Cloud 
Dataflow Runner (Java), consider these common pipeline options.</span>
+<span class="language-py">When executing your pipeline with the Cloud Dataflow 
Runner (Python), consider these common pipeline options.</span></p>
 
 <table class="table table-bordered">
 <tr>
@@ -231,39 +241,80 @@
   <th>Description</th>
   <th>Default Value</th>
 </tr>
+
 <tr>
   <td><code>runner</code></td>
   <td>The pipeline runner to use. This option allows you to determine the 
pipeline runner at runtime.</td>
-  <td>Set to <code>dataflow</code> to run on the Cloud Dataflow Service.</td>
+  <td>Set to <code>dataflow</code> or <code>DataflowRunner</code> to run on 
the Cloud Dataflow Service.</td>
 </tr>
+
 <tr>
   <td><code>project</code></td>
   <td>The project ID for your Google Cloud Project.</td>
   <td>If not set, defaults to the default project in the current environment. 
The default project is set via <code>gcloud</code>.</td>
 </tr>
-<tr>
+
+<!-- Only show for Java -->
+<tr class="language-java">
   <td><code>streaming</code></td>
   <td>Whether streaming mode is enabled or disabled; <code>true</code> if 
enabled. Set to <code>true</code> if running pipelines with unbounded 
<code>PCollection</code>s.</td>
   <td><code>false</code></td>
 </tr>
+
 <tr>
-  <td><code>tempLocation</code></td>
-  <td>Optional. Path for temporary files. If set to a valid Google Cloud 
Storage URL that begins with <code>gs://</code>, <code>tempLocation</code> is 
used as the default value for <code>gcpTempLocation</code>.</td>
+  <td>
+    <span class="language-java"><code>tempLocation</code></span>
+    <span class="language-py"><code>temp_location</code></span>
+  </td>
+  <td>
+    <span class="language-java">Optional.</span>
+    <span class="language-py">Required.</span>
+    Path for temporary files. Must be a valid Google Cloud Storage URL that 
begins with <code>gs://</code>.
+    <span class="language-java">If set, <code>tempLocation</code> is used as 
the default value for <code>gcpTempLocation</code>.</span>
+  </td>
   <td>No default value.</td>
 </tr>
-<tr>
+
+<!-- Only show for Java -->
+<tr class="language-java">
   <td><code>gcpTempLocation</code></td>
   <td>Cloud Storage bucket path for temporary files. Must be a valid Cloud 
Storage URL that begins with <code>gs://</code>.</td>
   <td>If not set, defaults to the value of <code>tempLocation</code>, provided 
that <code>tempLocation</code> is a valid Cloud Storage URL. If 
<code>tempLocation</code> is not a valid Cloud Storage URL, you must set 
<code>gcpTempLocation</code>.</td>
 </tr>
+
 <tr>
-  <td><code>stagingLocation</code></td>
+  <td>
+    <span class="language-java"><code>stagingLocation</code></span>
+    <span class="language-py"><code>staging_location</code></span>
+  </td>
   <td>Optional. Cloud Storage bucket path for staging your binary and any 
temporary files. Must be a valid Cloud Storage URL that begins with 
<code>gs://</code>.</td>
-  <td>If not set, defaults to a staging directory within 
<code>gcpTempLocation</code>.</td>
+  <td>
+    <span class="language-java">If not set, defaults to a staging directory 
within <code>gcpTempLocation</code>.</span>
+    <span class="language-py">If not set, defaults to a staging directory 
within <code>temp_location</code>.</span>
+  </td>
+</tr>
+
+<!-- Only show for Python -->
+<tr class="language-py">
+  <td><code>save_main_session</code></td>
+  <td>Save the main session state so that pickled functions and classes 
defined in <code>__main__</code> (e.g. interactive session) can be unpickled. 
Some workflows do not need the session state if, for instance, all of their 
functions/classes are defined in proper modules (not <code>__main__</code>) and 
the modules are importable in the worker.</td>
+  <td><code>false</code></td>
 </tr>
+
+<!-- Only show for Python -->
+<tr class="language-py">
+  <td><code>sdk_location</code></td>
+  <td>Override the default location from which the Beam SDK is downloaded. 
This value can be a URL, a Cloud Storage path, or a local path to an SDK 
tarball. Workflow submissions will download or copy the SDK tarball from this 
location. If set to the string <code>default</code>, a standard SDK location is 
used. If empty, no SDK is copied.</td>
+  <td><code>default</code></td>
+</tr>
+
+
 </table>
 
-<p>See the reference documentation for the  <span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.html">DataflowPipelineOptions</a></span><span
 class="language-python"><a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/pipeline_options.py">PipelineOptions</a></span>
 interface (and its subinterfaces) for the complete list of pipeline 
configuration options.</p>
+<p>See the reference documentation for the
+<span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.html">DataflowPipelineOptions</a></span>
+<span class="language-py"><a 
href="/documentation/sdks/pydoc/0.6.0/apache_beam.utils.html#apache_beam.utils.pipeline_options.PipelineOptions"><code
 class="highlighter-rouge">PipelineOptions</code></a></span>
+interface (and any subinterfaces) for additional pipeline configuration 
options.</p>
 
 <h2 id="additional-information-and-caveats">Additional information and 
caveats</h2>
 
@@ -273,11 +324,13 @@
 
 <h3 id="blocking-execution">Blocking Execution</h3>
 
-<p>To connect to your job and block until it is completed, call <code 
class="highlighter-rouge">waitToFinish</code> on the <code 
class="highlighter-rouge">PipelineResult</code> returned from <code 
class="highlighter-rouge">pipeline.run()</code>. The Cloud Dataflow Runner 
prints job status updates and console messages while it waits. While the result 
is connected to the active job, note that pressing <strong>Ctrl+C</strong> from 
the command line does not cancel your job. To cancel the job, you can use the 
<a 
href="https://cloud.google.com/dataflow/pipelines/dataflow-monitoring-intf">Dataflow
 Monitoring Interface</a> or the <a 
href="https://cloud.google.com/dataflow/pipelines/dataflow-command-line-intf">Dataflow
 Command-line Interface</a>.</p>
+<p>To block until your job completes, call <span 
class="language-java"><code>waitToFinish</code></span><span 
class="language-py"><code>wait_until_finish</code></span> on the <code 
class="highlighter-rouge">PipelineResult</code> returned from <code 
class="highlighter-rouge">pipeline.run()</code>. The Cloud Dataflow Runner 
prints job status updates and console messages while it waits. While the result 
is connected to the active job, note that pressing <strong>Ctrl+C</strong> from 
the command line does not cancel your job. To cancel the job, you can use the 
<a 
href="https://cloud.google.com/dataflow/pipelines/dataflow-monitoring-intf">Dataflow
 Monitoring Interface</a> or the <a 
href="https://cloud.google.com/dataflow/pipelines/dataflow-command-line-intf">Dataflow
 Command-line Interface</a>.</p>
 
 <h3 id="streaming-execution">Streaming Execution</h3>
 
-<p>If your pipeline uses an unbounded data source or sink, you must set the 
<code class="highlighter-rouge">streaming</code> option to <code 
class="highlighter-rouge">true</code>.</p>
+<p><span class="language-java">If your pipeline uses an unbounded data source 
or sink, you must set the <code class="highlighter-rouge">streaming</code> 
option to <code class="highlighter-rouge">true</code>.</span>
+<span class="language-py">The Beam SDK for Python does not currently support 
streaming pipelines.</span></p>
+
 
       </div>
 

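(Editor's note, not part of the commit: the pipeline-options table in the hunk above is easiest to sanity-check with a short sketch. The flag names below come from the table; the project ID and bucket values are made up, and real parsing lives in the Beam SDK's `PipelineOptions` classes, not in this stand-alone `argparse` imitation.)

```python
import argparse

# Hypothetical re-creation of the Cloud Dataflow options listed in the
# table above; this is an illustration, not the Beam SDK's parser.
parser = argparse.ArgumentParser()
parser.add_argument("--runner")
parser.add_argument("--project")
parser.add_argument("--temp_location")
parser.add_argument("--staging_location")
parser.add_argument("--save_main_session", action="store_true")  # table default: false
parser.add_argument("--sdk_location", default="default")         # table default: "default"

args = parser.parse_args([
    "--runner", "DataflowRunner",            # or "dataflow", per the table
    "--project", "my-gcp-project",           # hypothetical project ID
    "--temp_location", "gs://my-bucket/tmp", # required for the Python SDK
])

# Per the table: staging_location, when unset, defaults to a staging
# directory within temp_location (Python SDK behavior).
staging_location = args.staging_location or args.temp_location + "/staging"
print(args.runner, staging_location)
```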
http://git-wip-us.apache.org/repos/asf/beam-site/blob/394bfe70/content/documentation/runners/direct/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/runners/direct/index.html 
b/content/documentation/runners/direct/index.html
index dd1151a..15c2d8b 100644
--- a/content/documentation/runners/direct/index.html
+++ b/content/documentation/runners/direct/index.html
@@ -153,6 +153,14 @@
       <div class="row">
         <h1 id="using-the-direct-runner">Using the Direct Runner</h1>
 
+<nav class="language-switcher">
+  <strong>Adapt for:</strong>
+  <ul>
+    <li data-type="language-java" class="active">Java SDK</li>
+    <li data-type="language-py">Python SDK</li>
+  </ul>
+</nav>
+
 <p>The Direct Runner executes pipelines on your machine and is designed to 
validate that pipelines adhere to the Apache Beam model as closely as possible. 
Instead of focusing on efficient pipeline execution, the Direct Runner performs 
additional checks to ensure that users do not rely on semantics that are not 
guaranteed by the model. Some of these checks include:</p>
 
 <ul>
@@ -166,14 +174,19 @@
 
 <p>Here are some resources with information about how to test your 
pipelines.</p>
 <ul>
-  <li><a href="/blog/2016/10/20/test-stream.html">Testing Unbounded Pipelines 
in Apache Beam</a> talks about the use of Java classes <a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/testing/PAssert.html"><code
 class="highlighter-rouge">PAssert</code></a> and <a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/testing/TestStream.html"><code
 class="highlighter-rouge">TestStream</code></a> to test your pipelines.</li>
-  <li>The <a href="/get-started/wordcount-example/">Apache Beam WordCount 
Example</a> contains an example of logging and testing a pipeline with <a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/testing/PAssert.html"><code
 class="highlighter-rouge">PAssert</code></a>.</li>
+  <!-- Java specific links -->
+  <li class="language-java"><a 
href="/blog/2016/10/20/test-stream.html">Testing Unbounded Pipelines in Apache 
Beam</a> talks about the use of Java classes <a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/testing/PAssert.html">PAssert</a>
 and <a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/testing/TestStream.html">TestStream</a>
 to test your pipelines.</li>
+  <li class="language-java">The <a 
href="/get-started/wordcount-example/#testing-your-pipeline-via-passert">Apache 
Beam WordCount Example</a> contains an example of logging and testing a 
pipeline with <a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/testing/PAssert.html"><code>PAssert</code></a>.</li>
+
+  <!-- Python specific links -->
+  <li class="language-py">You can use <a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/util.py#L206">assert_that</a>
 to test your pipeline. The Python <a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_debugging.py">WordCount
 Debugging Example</a> contains an example of logging and testing with <a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/util.py#L206"><code>assert_that</code></a>.</li>
 </ul>
 
 <h2 id="direct-runner-prerequisites-and-setup">Direct Runner prerequisites and 
setup</h2>
 
-<p>You must specify your dependency on the Direct Runner.</p>
+<h3 id="specify-your-dependency">Specify your dependency</h3>
 
+<p><span class="language-java">When using Java, you must specify your 
dependency on the Direct Runner in your <code 
class="highlighter-rouge">pom.xml</code>.</span></p>
 <div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="o">&lt;</span><span 
class="n">dependency</span><span class="o">&gt;</span>
    <span class="o">&lt;</span><span class="n">groupId</span><span 
class="o">&gt;</span><span class="n">org</span><span class="o">.</span><span 
class="na">apache</span><span class="o">.</span><span 
class="na">beam</span><span class="o">&lt;/</span><span 
class="n">groupId</span><span class="o">&gt;</span>
    <span class="o">&lt;</span><span class="n">artifactId</span><span 
class="o">&gt;</span><span class="n">beam</span><span class="o">-</span><span 
class="n">runners</span><span class="o">-</span><span 
class="n">direct</span><span class="o">-</span><span class="n">java</span><span 
class="o">&lt;/</span><span class="n">artifactId</span><span 
class="o">&gt;</span>
@@ -183,15 +196,20 @@
 </code></pre>
 </div>
 
+<p><span class="language-py">This section is not applicable to the Beam SDK 
for Python.</span></p>
+
 <h2 id="pipeline-options-for-the-direct-runner">Pipeline options for the 
Direct Runner</h2>
 
-<p>When executing your pipeline from the command-line, set <code 
class="highlighter-rouge">runner</code> to <code 
class="highlighter-rouge">direct</code>. The default values for the other 
pipeline options are generally sufficient.</p>
+<p>When executing your pipeline from the command-line, set <code 
class="highlighter-rouge">runner</code> to <code 
class="highlighter-rouge">direct</code> or <code 
class="highlighter-rouge">DirectRunner</code>. The default values for the other 
pipeline options are generally sufficient.</p>
 
-<p>See the reference documentation for the  <span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/runners/direct/DirectOptions.html"><code
 class="highlighter-rouge">DirectOptions</code></a></span><span 
class="language-python"><a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/pipeline_options.py"><code
 class="highlighter-rouge">PipelineOptions</code></a></span> interface (and its 
subinterfaces) for defaults and the complete list of pipeline configuration 
options.</p>
+<p>See the reference documentation for the
+<span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/runners/direct/DirectOptions.html"><code
 class="highlighter-rouge">DirectOptions</code></a></span>
+<span class="language-py"><a 
href="/documentation/sdks/pydoc/0.6.0/apache_beam.utils.html#apache_beam.utils.pipeline_options.DirectOptions"><code
 class="highlighter-rouge">DirectOptions</code></a></span>
+interface for defaults and additional pipeline configuration options.</p>
 
 <h2 id="additional-information-and-caveats">Additional information and 
caveats</h2>
 
-<p>Local execution is limited by the memory available in your local 
environment. It is highly recommended that you run your pipeline with data sets 
small enough to fit in local memory. You can create a small in-memory data set 
using a <span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/transforms/Create.html"><code
 class="highlighter-rouge">Create</code></a></span><span 
class="language-python"><a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py"><code
 class="highlighter-rouge">Create</code></a></span> transform, or you can use a 
<span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/io/Read.html"><code
 class="highlighter-rouge">Read</code></a></span><span 
class="language-python"><a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/iobase.py"><code
 class="highlighter-rouge">Read</code></a></span> transform to work with 
small local or remote files.</p>
+<p>Local execution is limited by the memory available in your local 
environment. It is highly recommended that you run your pipeline with data sets 
small enough to fit in local memory. You can create a small in-memory data set 
using a <span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/transforms/Create.html"><code
 class="highlighter-rouge">Create</code></a></span><span class="language-py"><a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py"><code
 class="highlighter-rouge">Create</code></a></span> transform, or you can use a 
<span class="language-java"><a 
href="/documentation/sdks/javadoc/0.6.0/index.html?org/apache/beam/sdk/io/Read.html"><code
 class="highlighter-rouge">Read</code></a></span><span class="language-py"><a 
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/iobase.py"><code
 class="highlighter-rouge">Read</code></a></span> transform to work with 
small local or remote files.</p>
 
 
       </div>

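(Editor's note, not part of the commit: the Direct Runner testing links above all reduce to the same pattern, namely build a small in-memory input, run it, and assert on the output. Below is a Beam-free sketch of that shape; the helper names mimic Beam's `assert_that` and `equal_to`, but every function here is a hypothetical stand-in so the example runs without apache_beam installed.)

```python
# Illustrative stand-ins for the Direct Runner testing pattern; these are
# NOT the real Beam APIs, just the same shape without the dependency.
def create(elements):
    # stand-in for a Create transform: a small in-memory data set
    return list(elements)

def equal_to(expected):
    # stand-in for the equal_to matcher: order-insensitive comparison
    return lambda actual: sorted(actual) == sorted(expected)

def assert_that(actual, matcher):
    # stand-in for Beam's assert_that: fail if the matcher rejects the output
    if not matcher(actual):
        raise AssertionError("pipeline output did not match expectation")

words = create(["direct", "runner", "beam"])
pairs = [(w, 1) for w in words]  # a trivial "pipeline" step
assert_that(pairs, equal_to([("beam", 1), ("direct", 1), ("runner", 1)]))
print("ok")
```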