Regenerate website

Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/00c736ce
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/00c736ce
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/00c736ce

Branch: refs/heads/asf-site
Commit: 00c736ce6dd742d2673007d68edb6920e9795f9f
Parents: 303fb26
Author: Davor Bonaci <da...@google.com>
Authored: Tue Dec 27 17:54:19 2016 -0800
Committer: Davor Bonaci <da...@google.com>
Committed: Tue Dec 27 17:54:19 2016 -0800

----------------------------------------------------------------------
 content/documentation/index.html                |  14 +++-
 content/documentation/resources/index.html      |   2 +-
 content/documentation/sdks/java/index.html      |  32 +++++++-
 content/documentation/sdks/python/index.html    |   2 -
 .../mobile-gaming-example/index.html            |  37 +++++----
 .../get-started/wordcount-example/index.html    |  77 ++++++-------------
 content/images/gaming-example-basic.png         | Bin 0 -> 63121 bytes
 7 files changed, 88 insertions(+), 76 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam-site/blob/00c736ce/content/documentation/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/index.html b/content/documentation/index.html
index 1259a66..4767f70 100644
--- a/content/documentation/index.html
+++ b/content/documentation/index.html
@@ -146,7 +146,7 @@
       <div class="row">
         <h1 id="apache-beam-documentation">Apache Beam Documentation</h1>
 
-<p>Get in-depth conceptual information and reference material for the Beam 
Model, SDKs and Runners:</p>
+<p>This section provides in-depth conceptual information and reference 
material for the Beam Model, SDKs, and Runners:</p>
 
 <h2 id="concepts">Concepts</h2>
 
@@ -157,6 +157,14 @@
   <li>Visit <a href="/documentation/resources/">Additional Resources</a> for 
some of our favorite articles and talks about Beam.</li>
 </ul>
 
+<h2 id="pipeline-fundamentals">Pipeline Fundamentals</h2>
+
+<ul>
+  <li><a href="/documentation/pipelines/design-your-pipeline/">Design Your 
Pipeline</a> by planning your pipeline’s structure, choosing transforms to 
apply to your data, and determining your input and output methods.</li>
+  <li><a href="/documentation/pipelines/create-your-pipeline/">Create Your 
Pipeline</a> using the classes in the Beam SDKs.</li>
+  <li><a href="/documentation/pipelines/test-your-pipeline/">Test Your 
Pipeline</a> to minimize debugging a pipeline’s remote execution.</li>
+</ul>
+
 <h2 id="sdks">SDKs</h2>
 
 <p>Find status and reference information on all of the available Beam SDKs.</p>
@@ -183,9 +191,9 @@
 
 <h3 id="choosing-a-runner">Choosing a Runner</h3>
 
-<p>Beam is designed to enable pipelines to be portable across different 
runners. However, given every runner has different capabilities, they also have 
different abilities to implement the core concepts in the Beam model. The <a 
href="/documentation/runners/capability-matrix">Capability Matrix</a> provides 
a detailed comparison of runner functionality.</p>
+<p>Beam is designed to enable pipelines to be portable across different 
runners. However, given every runner has different capabilities, they also have 
different abilities to implement the core concepts in the Beam model. The <a 
href="/documentation/runners/capability-matrix/">Capability Matrix</a> provides 
a detailed comparison of runner functionality.</p>
 
-<p>Once you have chosen which runner to use, see that runner’s page for more 
information about any initial runner-specific setup as well as any required or 
optional <code class="highlighter-rouge">PipelineOptions</code> for configuring 
it’s execution. You may also want to refer back to the <a 
href="/get-started/quickstart">Quickstart</a> for instructions on executing the 
sample WordCount pipeline.</p>
+<p>Once you have chosen which runner to use, see that runner’s page for more 
information about any initial runner-specific setup as well as any required or 
optional <code class="highlighter-rouge">PipelineOptions</code> for configuring 
it’s execution. You may also want to refer back to the <a 
href="/get-started/quickstart/">Quickstart</a> for instructions on executing 
the sample WordCount pipeline.</p>
 
       </div>
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/00c736ce/content/documentation/resources/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/resources/index.html 
b/content/documentation/resources/index.html
index 21a707f..9aceea2 100644
--- a/content/documentation/resources/index.html
+++ b/content/documentation/resources/index.html
@@ -187,7 +187,7 @@
 
 <p>Hadoop Summit, San Jose, CA, 2016</p>
 
-<p>Presented by Davor Bonacci, <em>Apache Beam PPMC member</em></p>
+<p>Presented by Davor Bonaci, <em>Apache Beam PPMC member</em></p>
 
 <iframe width="560" height="315" 
src="https://www.youtube.com/embed/7DZ8ONmeP5A"; frameborder="0" 
allowfullscreen=""></iframe>
 <p><br /></p>

http://git-wip-us.apache.org/repos/asf/beam-site/blob/00c736ce/content/documentation/sdks/java/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/sdks/java/index.html 
b/content/documentation/sdks/java/index.html
index 3b3ead4..bd29d00 100644
--- a/content/documentation/sdks/java/index.html
+++ b/content/documentation/sdks/java/index.html
@@ -146,11 +146,37 @@
       <div class="row">
         <h1 id="apache-beam-java-sdk">Apache Beam Java SDK</h1>
 
-<p>This page is under construction (<a 
href="https://issues.apache.org/jira/browse/BEAM-504";>BEAM-504</a>).</p>
+<p>The Java SDK for Apache Beam provides a simple, powerful API for building 
both batch and streaming parallel data processing pipelines in Java.</p>
 
-<p>Get started with the <a href="/learn/programming-guide">Beam Programming 
Guide</a> to learn the basic concepts that hold for all SDKs in the Beam 
Model.</p>
+<h2 id="get-started-with-the-java-sdk">Get Started with the Java SDK</h2>
+
+<p>Get started with the <a href="/learn/programming-guide/">Beam Programming 
Model</a> to learn the basic concepts that apply to all SDKs in Beam.</p>
+
+<p>See the <a href="/learn/sdks/javadoc/">Java API Reference</a> for more 
information on individual APIs.</p>
+
+<h2 id="supported-features">Supported Features</h2>
+
+<p>The Java SDK supports all features currently supported by the Beam 
model.</p>
+
+<h2 id="supported-io-connectors">Supported IO Connectors</h2>
+
+<ul>
+  <li>Amazon Kinesis</li>
+  <li>Apache Hadoop’s <code class="highlighter-rouge">FileInputFormat</code> 
in Hadoop Distributed File System (HDFS)</li>
+  <li>Apache Kafka</li>
+  <li>Avro Files</li>
+  <li>Google BigQuery</li>
+  <li>Google Cloud Bigtable</li>
+  <li>Google Cloud Datastore</li>
+  <li>Google Cloud Pub/Sub</li>
+  <li>Google Cloud Storage</li>
+  <li>Java Database Connectivity (JDBC)</li>
+  <li>Java Message Service (JMS)</li>
+  <li>MongoDB</li>
+  <li>Text Files</li>
+  <li>XML Files</li>
+</ul>
 
-<p>See the <a href="/learn/sdks/javadoc/">Java API Reference</a>.</p>
 
       </div>
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/00c736ce/content/documentation/sdks/python/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/sdks/python/index.html 
b/content/documentation/sdks/python/index.html
index 5c39a0d..5816dec 100644
--- a/content/documentation/sdks/python/index.html
+++ b/content/documentation/sdks/python/index.html
@@ -146,8 +146,6 @@
       <div class="row">
         <h1 id="apache-beam-python-sdk-under-development">Apache Beam Python 
SDK <em>[Under Development]</em></h1>
 
-<p>This page is under construction (<a 
href="https://issues.apache.org/jira/browse/BEAM-977";>BEAM-977</a>).</p>
-
 <p>The Beam Python SDK is currently under development on a feature branch. 
Would you like to contribute? See the Beam <a 
href="/contribute/work-in-progress/#feature-branches">Work in Progress</a> page 
to find out how you can help!</p>
 
 <p>Get started with the <a href="/learn/programming-guide">Beam Programming 
Guide</a> to learn the basic concepts that hold for all SDKs in the Beam 
Model.</p>

http://git-wip-us.apache.org/repos/asf/beam-site/blob/00c736ce/content/get-started/mobile-gaming-example/index.html
----------------------------------------------------------------------
diff --git a/content/get-started/mobile-gaming-example/index.html 
b/content/get-started/mobile-gaming-example/index.html
index ead7405..839beb6 100644
--- a/content/get-started/mobile-gaming-example/index.html
+++ b/content/get-started/mobile-gaming-example/index.html
@@ -196,9 +196,16 @@
   <li>A timestamp that records when the particular instance of play 
happened–this is the event time for each game data event.</li>
 </ul>
 
-<p>When the user completes an instance of the game, their phone sends the data 
event to a game server, where the data is logged and stored in a file. 
Generally the data is sent to the game server immediately upon completion. 
However, users can play the game “offline”, when their phones are out of 
contact with the server (such as on an airplane, or outside network coverage 
area). When the user’s phone comes back into contact with the game server, 
the phone will send all accumulated game data.</p>
+<p>When the user completes an instance of the game, their phone sends the data 
event to a game server, where the data is logged and stored in a file. 
Generally the data is sent to the game server immediately upon completion. 
However, sometimes delays happen in the network or users play the game 
“offline”, when their phones are out of contact with the server (such as on 
an airplane, or outside network coverage area). When the user’s phone comes 
back into contact with the game server, the phone will send all accumulated 
game data. This means that some data events may arrive delayed and out of 
order.</p>
 
-<p>This means that some data events might be received by the game server 
significantly later than users generate them. This time difference can have 
processing implications for pipelines that make calculations that consider when 
each score was generated. Such pipelines might track scores generated during 
each hour of a day, for example, or they calculate the length of time that 
users are continuously playing the game—both of which depend on each data 
record’s event time.</p>
+<p>The following diagram shows the ideal situation vs reality. The X-axis 
represents event time: the actual time a game event occurred. The Y-axis 
represents processing time: the time at which a game event was processed. 
Ideally, events should be processed as they occur, depicted by the dotted line 
in the diagram. However, in reality that is not the case and reality looks more 
like what is depicted by the red squiggly line.</p>
+
+<figure id="fig1">
+    <img src="/images/gaming-example-basic.png" width="264" height="260" 
alt="Score data for three users." />
+</figure>
+<p>Figure 1: Ideally, events are processed when they occur, with no delays.</p>
+
+<p>The data events might be received by the game server significantly later 
than users generate them. This time difference (called <strong>skew</strong>) 
can have processing implications for pipelines that make calculations that 
consider when each score was generated. Such pipelines might track scores 
generated during each hour of a day, for example, or they calculate the length 
of time that users are continuously playing the game—both of which depend on 
each data record’s event time.</p>
 
 <p>Because some of our example pipelines use data files (like logs from the 
game server) as input, the event timestamp for each game might be embedded in 
the data–that is, it’s a field in each data record. Those pipelines need to 
parse the event timestamp from each data record after reading it from the input 
file.</p>
 
@@ -234,12 +241,12 @@
   <li>Write the result data to a <a 
href="https://cloud.google.com/bigquery/";>Google Cloud BigQuery</a> table.</li>
 </ol>
 
-<p>The following diagram shows score data for a several users over the 
pipeline analysis period. In the diagram, each data point is an event that 
results in one user/score pair:</p>
+<p>The following diagram shows score data for several users over the pipeline 
analysis period. In the diagram, each data point is an event that results in 
one user/score pair:</p>
 
-<figure id="fig1">
+<figure id="fig2">
     <img src="/images/gaming-example.gif" width="900" height="263" alt="Score 
data for three users." />
 </figure>
-<p>Figure 1: Score data for three users.</p>
+<p>Figure 2: Score data for three users.</p>
 
 <p>This example uses batch processing, and the diagram’s Y axis represents 
processing time: the pipeline processes events lower on the Y-axis first, and 
events higher up the axis later. The diagram’s X axis represents the event 
time for each game event, as denoted by that event’s timestamp. Note that the 
individual events in the diagram are not processed by the pipeline in the same 
order as they occurred (according to their timestamps).</p>
 
@@ -355,10 +362,10 @@
 
 <p>The following diagram shows how the pipeline processes a day’s worth of a 
single team’s scoring data after applying fixed-time windowing:</p>
 
-<figure id="fig2">
+<figure id="fig3">
     <img src="/images/gaming-example-team-scores-narrow.gif" width="900" 
height="390" alt="Score data for two teams." />
 </figure>
-<p>Figure 2: Score data for two teams. Each team’s scores are divided into 
logical windows based on when those scores occurred in event time.</p>
+<p>Figure 3: Score data for two teams. Each team’s scores are divided into 
logical windows based on when those scores occurred in event time.</p>
 
 <p>Notice that as processing time advances, the sums are now <em>per 
window</em>; each window represents an hour of <em>event time</em> during the 
day in which the scores occurred.</p>
 
@@ -493,12 +500,12 @@
 
 <p>Because we want all the data that has arrived in the pipeline every time we 
update our calculation, we have the pipeline consider all of the user score 
data in a <strong>single global window</strong>. The single global window is 
unbounded, but we can specify a kind of temporary cut-off point for each 
ten-minute calculation by using a processing time <a 
href="/documentation/programming-guide/#triggers">trigger</a>.</p>
 
-<p>When we specify a ten-minute processing time trigger for the single global 
window, the effect is that the pipeline effectively takes a “snapshot” of 
the contents of the window every time the trigger fires (which it does at 
ten-minute intervals). Since we’re using a single global window, each 
snapshot contains all the data collected <em>to that point in time</em>. The 
following diagram shows the effects of using a processing time trigger on the 
single global window:</p>
+<p>When we specify a ten-minute processing time trigger for the single global 
window, the pipeline effectively takes a “snapshot” of the contents of the 
window every time the trigger fires. This snapshot happens at ten-minute 
intervals as long as data has arrived. If no data has arrived, the pipeline 
will take its next “snapshot” 10 minutes past an element arriving. Since 
we’re using a single global window, each snapshot contains all the data 
collected <em>to that point in time</em>. The following diagram shows the 
effects of using a processing time trigger on the single global window:</p>
 
-<figure id="fig3">
+<figure id="fig4">
     <img src="/images/gaming-example-proc-time-narrow.gif" width="900" 
height="263" alt="Score data for for three users." />
 </figure>
-<p>Figure 3: Score data for for three users. Each user’s scores are grouped 
together in a single global window, with a trigger that generates a snapshot 
for output every ten minutes.</p>
+<p>Figure 4: Score data for for three users. Each user’s scores are grouped 
together in a single global window, with a trigger that generates a snapshot 
for output every ten minutes.</p>
 
 <p>As processing time advances and more scores are processed, the trigger 
outputs the updated sum for each user.</p>
 
@@ -547,12 +554,12 @@
 
 <p>The following diagram shows the relationship between ongoing processing 
time and each score’s event time for two teams:</p>
 
-<figure id="fig4">
+<figure id="fig5">
     <img src="/images/gaming-example-event-time-narrow.gif" width="900" 
height="390" alt="Score data by team, windowed by event time." />
 </figure>
-<p>Figure 4: Score data by team, windowed by event time. A trigger based on 
processing time causes the window to emit speculative early results and include 
late results.</p>
+<p>Figure 5: Score data by team, windowed by event time. A trigger based on 
processing time causes the window to emit speculative early results and include 
late results.</p>
 
-<p>The dotted line in the diagram is the “ideal” 
<strong>watermark</strong>: Beam’s notion of when all data in a given window 
can reasonably considered to have arrived. The irregular solid line represents 
the actual watermark, as determined by the data source.</p>
+<p>The dotted line in the diagram is the “ideal” 
<strong>watermark</strong>: Beam’s notion of when all data in a given window 
can reasonably be considered to have arrived. The irregular solid line 
represents the actual watermark, as determined by the data source.</p>
 
 <p>Data arriving above the solid watermark line is <em>late data</em>—this 
is a score event that was delayed (perhaps generated offline) and arrived after 
the window to which it belongs had closed. Our pipeline’s late-firing trigger 
ensures that this late data is still included in the sum.</p>
 
@@ -698,10 +705,10 @@
 
 <p>The following diagram shows how data might look when grouped into session 
windows. Unlike fixed windows, session windows are <em>different for each 
user</em> and is dependent on each individual user’s play pattern:</p>
 
-<figure id="fig5">
+<figure id="fig6">
     <img src="/images/gaming-example-session-windows.png" width="662" 
height="521" alt="User sessions, with a minimum gap duration." />
 </figure>
-<p>Figure 5: User sessions, with a minimum gap duration. Note how each user 
has different sessions, according to how many instances they play and how long 
their breaks between instances are.</p>
+<p>Figure 6: User sessions, with a minimum gap duration. Note how each user 
has different sessions, according to how many instances they play and how long 
their breaks between instances are.</p>
 
 <p>We can use the session-windowed data to determine the average length of 
uninterrupted play time for all of our users, as well as the total score they 
achieve during each session. We can do this in the code by first applying 
session windows, summing the score per user and session, and then using a 
transform to calculate the length of each individual session:</p>
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/00c736ce/content/get-started/wordcount-example/index.html
----------------------------------------------------------------------
diff --git a/content/get-started/wordcount-example/index.html 
b/content/get-started/wordcount-example/index.html
index ab8723f..aadd03e 100644
--- a/content/get-started/wordcount-example/index.html
+++ b/content/get-started/wordcount-example/index.html
@@ -180,10 +180,6 @@
   </li>
 </ul>
 
-<blockquote>
-  <p><strong>Note:</strong> This walkthrough is still in progress. Detailed 
instructions for running the example pipelines across multiple runners are yet 
to be added. There is an open issue to finish the walkthrough (<a 
href="https://issues.apache.org/jira/browse/BEAM-664";>BEAM-664</a>).</p>
-</blockquote>
-
 <p>The WordCount examples demonstrate how to set up a processing pipeline that 
can read text, tokenize the text lines into individual words, and perform a 
frequency count on each of those words. The Beam SDKs contain a series of these 
four successively more detailed WordCount examples that build on each other. 
The input text for all the examples is a set of Shakespeare’s texts.</p>
 
 <p>Each WordCount example introduces different concepts in the Beam 
programming model. Begin by understanding Minimal WordCount, the simplest of 
the examples. Once you feel comfortable with the basic principles in building a 
pipeline, continue on to learn more concepts in the other examples.</p>
@@ -250,7 +246,7 @@
 
 <p>Each transform takes some kind of input (data or otherwise), and produces 
some output data. The input and output data is represented by the SDK class 
<code class="highlighter-rouge">PCollection</code>. <code 
class="highlighter-rouge">PCollection</code> is a special class, provided by 
the Beam SDK, that you can use to represent a data set of virtually any size, 
including infinite data sets.</p>
 
-<p>&lt;img src=”/images/wordcount-pipeline.png alt=”Word Count pipeline 
diagram””&gt;
+<p><img src="/images/wordcount-pipeline.png" alt="Word Count pipeline diagram" 
/>
 Figure 1: The pipeline data flow.</p>
 
 <p>The Minimal WordCount pipeline contains five transforms:</p>
@@ -305,7 +301,7 @@ Figure 1: The pipeline data flow.</p>
   <li>
     <p>A text file <code class="highlighter-rouge">Write</code>. This 
transform takes the final <code class="highlighter-rouge">PCollection</code> of 
formatted Strings as input and writes each element to an output text file. Each 
element in the input <code class="highlighter-rouge">PCollection</code> 
represents one line of text in the resulting output file.</p>
 
-    <div class="language-java highlighter-rouge"><pre class="highlight"><code> 
       <span class="o">.</span><span class="na">apply</span><span 
class="o">(</span><span class="n">TextIO</span><span class="o">.</span><span 
class="na">Write</span><span class="o">.</span><span class="na">to</span><span 
class="o">(</span><span 
class="s">"gs://YOUR_OUTPUT_BUCKET/AND_OUTPUT_PREFIX"</span><span 
class="o">));</span>
+    <div class="language-java highlighter-rouge"><pre class="highlight"><code> 
       <span class="o">.</span><span class="na">apply</span><span 
class="o">(</span><span class="n">TextIO</span><span class="o">.</span><span 
class="na">Write</span><span class="o">.</span><span class="na">to</span><span 
class="o">(</span><span class="s">"wordcounts"</span><span class="o">));</span>
 </code></pre>
     </div>
   </li>
@@ -317,10 +313,12 @@ Figure 1: The pipeline data flow.</p>
 
 <p>Run the pipeline by calling the <code class="highlighter-rouge">run</code> 
method, which sends your pipeline to be executed by the pipeline runner that 
you specified when you created your pipeline.</p>
 
-<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="n">p</span><span class="o">.</span><span 
class="na">run</span><span class="o">();</span>
+<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="n">p</span><span class="o">.</span><span 
class="na">run</span><span class="o">().</span><span 
class="na">waitUntilFinish</span><span class="o">();</span>
 </code></pre>
 </div>
 
+<p>Note that the <code class="highlighter-rouge">run</code> method is 
asynchronous. For a blocking execution instead, run your pipeline appending the 
<code class="highlighter-rouge">waitUntilFinish</code> method.</p>
+
 <h2 id="wordcount-example">WordCount Example</h2>
 
 <p>This WordCount example introduces a few recommended programming practices 
that can make your pipeline easier to read, write, and maintain. While not 
explicitly required, they can make your pipeline’s execution more flexible, 
aid in testing your pipeline, and help make your pipeline’s code reusable.</p>
@@ -465,7 +463,6 @@ Figure 1: The pipeline data flow.</p>
     <span class="o">}</span>
   <span class="o">}</span>
 <span class="o">}</span>
-
 </code></pre>
 </div>
 
@@ -527,9 +524,6 @@ Figure 1: The pipeline data flow.</p>
     <span class="n">Options</span> <span class="n">options</span> <span 
class="o">=</span> <span class="o">...</span>
     <span class="n">Pipeline</span> <span class="n">pipeline</span> <span 
class="o">=</span> <span class="n">Pipeline</span><span class="o">.</span><span 
class="na">create</span><span class="o">(</span><span 
class="n">options</span><span class="o">);</span>
 
-    <span class="cm">/**
-     * Concept #1: The Beam SDK allows running the same pipeline with a 
bounded or unbounded input source.
-     */</span>
     <span class="n">PCollection</span><span class="o">&lt;</span><span 
class="n">String</span><span class="o">&gt;</span> <span class="n">input</span> 
<span class="o">=</span> <span class="n">pipeline</span>
       <span class="o">.</span><span class="na">apply</span><span 
class="o">(</span><span class="n">TextIO</span><span class="o">.</span><span 
class="na">Read</span><span class="o">.</span><span class="na">from</span><span 
class="o">(</span><span class="n">options</span><span class="o">.</span><span 
class="na">getInputFile</span><span class="o">()))</span>
 
@@ -540,39 +534,30 @@ Figure 1: The pipeline data flow.</p>
 
 <p>Each element in a <code class="highlighter-rouge">PCollection</code> has an 
associated <strong>timestamp</strong>. The timestamp for each element is 
initially assigned by the source that creates the <code 
class="highlighter-rouge">PCollection</code> and can be adjusted by a <code 
class="highlighter-rouge">DoFn</code>. In this example the input is bounded. 
For the purpose of the example, the <code class="highlighter-rouge">DoFn</code> 
method named <code class="highlighter-rouge">AddTimestampsFn</code> (invoked by 
<code class="highlighter-rouge">ParDo</code>) will set a timestamp for each 
element in the <code class="highlighter-rouge">PCollection</code>.</p>
 
-<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="c1">// Concept #2: Add an element 
timestamp, using an artificial time just to show windowing.</span>
-<span class="c1">// See AddTimestampFn for more details on this.</span>
-<span class="o">.</span><span class="na">apply</span><span 
class="o">(</span><span class="n">ParDo</span><span class="o">.</span><span 
class="na">of</span><span class="o">(</span><span class="k">new</span> <span 
class="n">AddTimestampFn</span><span class="o">()));</span>
+<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="o">.</span><span 
class="na">apply</span><span class="o">(</span><span 
class="n">ParDo</span><span class="o">.</span><span class="na">of</span><span 
class="o">(</span><span class="k">new</span> <span 
class="n">AddTimestampFn</span><span class="o">()));</span>
 </code></pre>
 </div>
 
 <p>Below is the code for <code 
class="highlighter-rouge">AddTimestampFn</code>, a <code 
class="highlighter-rouge">DoFn</code> invoked by <code 
class="highlighter-rouge">ParDo</code>, that sets the data element of the 
timestamp given the element itself. For example, if the elements were log 
lines, this <code class="highlighter-rouge">ParDo</code> could parse the time 
out of the log string and set it as the element’s timestamp. There are no 
timestamps inherent in the works of Shakespeare, so in this case we’ve made 
up random timestamps just to illustrate the concept. Each line of the input 
text will get a random associated timestamp sometime in a 2-hour period.</p>
 
-<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="cm">/**
-   * Concept #2: A DoFn that sets the data element timestamp. This is a silly 
method, just for
-   * this example, for the bounded data case. Imagine that many ghosts of 
Shakespeare are all 
-   * typing madly at the same time to recreate his masterworks. Each line of 
the corpus will 
-   * get a random associated timestamp somewhere in a 2-hour period.
-   */</span>
-  <span class="kd">static</span> <span class="kd">class</span> <span 
class="nc">AddTimestampFn</span> <span class="kd">extends</span> <span 
class="n">DoFn</span><span class="o">&lt;</span><span 
class="n">String</span><span class="o">,</span> <span 
class="n">String</span><span class="o">&gt;</span> <span class="o">{</span>
-    <span class="kd">private</span> <span class="kd">static</span> <span 
class="kd">final</span> <span class="n">Duration</span> <span 
class="n">RAND_RANGE</span> <span class="o">=</span> <span 
class="n">Duration</span><span class="o">.</span><span 
class="na">standardHours</span><span class="o">(</span><span 
class="mi">2</span><span class="o">);</span>
-    <span class="kd">private</span> <span class="kd">final</span> <span 
class="n">Instant</span> <span class="n">minTimestamp</span><span 
class="o">;</span>
-
-    <span class="n">AddTimestampFn</span><span class="o">()</span> <span 
class="o">{</span>
-      <span class="k">this</span><span class="o">.</span><span 
class="na">minTimestamp</span> <span class="o">=</span> <span 
class="k">new</span> <span class="n">Instant</span><span 
class="o">(</span><span class="n">System</span><span class="o">.</span><span 
class="na">currentTimeMillis</span><span class="o">());</span>
-    <span class="o">}</span>
+<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="kd">static</span> <span 
class="kd">class</span> <span class="nc">AddTimestampFn</span> <span 
class="kd">extends</span> <span class="n">DoFn</span><span 
class="o">&lt;</span><span class="n">String</span><span class="o">,</span> 
<span class="n">String</span><span class="o">&gt;</span> <span 
class="o">{</span>
+  <span class="kd">private</span> <span class="kd">static</span> <span 
class="kd">final</span> <span class="n">Duration</span> <span 
class="n">RAND_RANGE</span> <span class="o">=</span> <span 
class="n">Duration</span><span class="o">.</span><span 
class="na">standardHours</span><span class="o">(</span><span 
class="mi">2</span><span class="o">);</span>
+  <span class="kd">private</span> <span class="kd">final</span> <span 
class="n">Instant</span> <span class="n">minTimestamp</span><span 
class="o">;</span>
 
-    <span class="nd">@ProcessElement</span>
-    <span class="kd">public</span> <span class="kt">void</span> <span 
class="nf">processElement</span><span class="o">(</span><span 
class="n">ProcessContext</span> <span class="n">c</span><span 
class="o">)</span> <span class="o">{</span>
-      <span class="c1">// Generate a timestamp that falls somewhere in the 
past two hours.</span>
-      <span class="kt">long</span> <span class="n">randMillis</span> <span 
class="o">=</span> <span class="o">(</span><span class="kt">long</span><span 
class="o">)</span> <span class="o">(</span><span class="n">Math</span><span 
class="o">.</span><span class="na">random</span><span class="o">()</span> <span 
class="o">*</span> <span class="n">RAND_RANGE</span><span 
class="o">.</span><span class="na">getMillis</span><span class="o">());</span>
-      <span class="n">Instant</span> <span class="n">randomTimestamp</span> 
<span class="o">=</span> <span class="n">minTimestamp</span><span 
class="o">.</span><span class="na">plus</span><span class="o">(</span><span 
class="n">randMillis</span><span class="o">);</span>
-      <span class="cm">/**
-       * Set the data element with that timestamp.
-       */</span>
-      <span class="n">c</span><span class="o">.</span><span 
class="na">outputWithTimestamp</span><span class="o">(</span><span 
class="n">c</span><span class="o">.</span><span class="na">element</span><span 
class="o">(),</span> <span class="k">new</span> <span 
class="n">Instant</span><span class="o">(</span><span 
class="n">randomTimestamp</span><span class="o">));</span>
-    <span class="o">}</span>
+  <span class="n">AddTimestampFn</span><span class="o">()</span> <span 
class="o">{</span>
+    <span class="k">this</span><span class="o">.</span><span 
class="na">minTimestamp</span> <span class="o">=</span> <span 
class="k">new</span> <span class="n">Instant</span><span 
class="o">(</span><span class="n">System</span><span class="o">.</span><span 
class="na">currentTimeMillis</span><span class="o">());</span>
+  <span class="o">}</span>
+
+  <span class="nd">@ProcessElement</span>
+  <span class="kd">public</span> <span class="kt">void</span> <span 
class="nf">processElement</span><span class="o">(</span><span 
class="n">ProcessContext</span> <span class="n">c</span><span 
class="o">)</span> <span class="o">{</span>
+    <span class="c1">// Generate a timestamp that falls somewhere in the past 
two hours.</span>
+    <span class="kt">long</span> <span class="n">randMillis</span> <span 
class="o">=</span> <span class="o">(</span><span class="kt">long</span><span 
class="o">)</span> <span class="o">(</span><span class="n">Math</span><span 
class="o">.</span><span class="na">random</span><span class="o">()</span> <span 
class="o">*</span> <span class="n">RAND_RANGE</span><span 
class="o">.</span><span class="na">getMillis</span><span class="o">());</span>
+    <span class="n">Instant</span> <span class="n">randomTimestamp</span> 
<span class="o">=</span> <span class="n">minTimestamp</span><span 
class="o">.</span><span class="na">plus</span><span class="o">(</span><span 
class="n">randMillis</span><span class="o">);</span>
+
+    <span class="c1">// Set the data element with that timestamp.</span>
+    <span class="n">c</span><span class="o">.</span><span 
class="na">outputWithTimestamp</span><span class="o">(</span><span 
class="n">c</span><span class="o">.</span><span class="na">element</span><span 
class="o">(),</span> <span class="k">new</span> <span 
class="n">Instant</span><span class="o">(</span><span 
class="n">randomTimestamp</span><span class="o">));</span>
   <span class="o">}</span>
+<span class="o">}</span>
 </code></pre>
 </div>
 
@@ -582,11 +567,7 @@ Figure 1: The pipeline data flow.</p>
 
 <p>The <code class="highlighter-rouge">WindowingWordCount</code> example 
applies fixed-time windowing, wherein each window represents a fixed time 
interval. The fixed window size for this example defaults to 1 minute (you can 
change this with a command-line option).</p>
 
-<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="cm">/**
- * Concept #3: Window into fixed windows. The fixed window size for this 
example defaults to 1
- * minute (you can change this with a command-line option).
- */</span>
-<span class="n">PCollection</span><span class="o">&lt;</span><span 
class="n">String</span><span class="o">&gt;</span> <span 
class="n">windowedWords</span> <span class="o">=</span> <span 
class="n">input</span>
+<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="n">PCollection</span><span 
class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> 
<span class="n">windowedWords</span> <span class="o">=</span> <span 
class="n">input</span>
   <span class="o">.</span><span class="na">apply</span><span 
class="o">(</span><span class="n">Window</span><span 
class="o">.&lt;</span><span class="n">String</span><span 
class="o">&gt;</span><span class="n">into</span><span class="o">(</span>
     <span class="n">FixedWindows</span><span class="o">.</span><span 
class="na">of</span><span class="o">(</span><span 
class="n">Duration</span><span class="o">.</span><span 
class="na">standardMinutes</span><span class="o">(</span><span 
class="n">options</span><span class="o">.</span><span 
class="na">getWindowSize</span><span class="o">()))));</span>
 </code></pre>
@@ -596,11 +577,7 @@ Figure 1: The pipeline data flow.</p>
 
 <p>You can reuse existing <code class="highlighter-rouge">PTransform</code>s, 
that were created for manipulating simple <code 
class="highlighter-rouge">PCollection</code>s, over windowed <code 
class="highlighter-rouge">PCollection</code>s as well.</p>
 
-<div class="highlighter-rouge"><pre class="highlight"><code>/**
- * Concept #4: Re-use our existing CountWords transform that does not have 
knowledge of
- * windows over a PCollection containing windowed values.
- */
-PCollection&lt;KV&lt;String, Long&gt;&gt; wordCounts = windowedWords.apply(new 
WordCount.CountWords());
+<div class="highlighter-rouge"><pre 
class="highlight"><code>PCollection&lt;KV&lt;String, Long&gt;&gt; wordCounts = 
windowedWords.apply(new WordCount.CountWords());
 </code></pre>
 </div>
 
@@ -610,11 +587,7 @@ PCollection&lt;KV&lt;String, Long&gt;&gt; wordCounts = 
windowedWords.apply(new W
 
 <p>In this example, we stream the results to a BigQuery table. The results are 
then formatted for a BigQuery table, and then written to BigQuery using 
BigQueryIO.Write.</p>
 
-<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="cm">/**
- * Concept #5: Format the results for a BigQuery table, then write to BigQuery.
- * The BigQuery output source supports both bounded and unbounded data.
- */</span>
-<span class="n">wordCounts</span><span class="o">.</span><span 
class="na">apply</span><span class="o">(</span><span 
class="n">ParDo</span><span class="o">.</span><span class="na">of</span><span 
class="o">(</span><span class="k">new</span> <span 
class="n">FormatAsTableRowFn</span><span class="o">()))</span>
+<div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="n">wordCounts</span><span 
class="o">.</span><span class="na">apply</span><span class="o">(</span><span 
class="n">ParDo</span><span class="o">.</span><span class="na">of</span><span 
class="o">(</span><span class="k">new</span> <span 
class="n">FormatAsTableRowFn</span><span class="o">()))</span>
     <span class="o">.</span><span class="na">apply</span><span 
class="o">(</span><span class="n">BigQueryIO</span><span 
class="o">.</span><span class="na">Write</span>
       <span class="o">.</span><span class="na">to</span><span 
class="o">(</span><span class="n">getTableReference</span><span 
class="o">(</span><span class="n">options</span><span class="o">))</span>
       <span class="o">.</span><span class="na">withSchema</span><span 
class="o">(</span><span class="n">getSchema</span><span class="o">())</span>

http://git-wip-us.apache.org/repos/asf/beam-site/blob/00c736ce/content/images/gaming-example-basic.png
----------------------------------------------------------------------
diff --git a/content/images/gaming-example-basic.png 
b/content/images/gaming-example-basic.png
new file mode 100644
index 0000000..6f48fe5
Binary files /dev/null and b/content/images/gaming-example-basic.png differ

Reply via email to