This is an automated email from the ASF dual-hosted git repository.

jkff pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 30a5df4d85ad236a7d83e05589917a9a03d2cd37
Author: Eugene Kirpichov <kirpic...@google.com>
AuthorDate: Thu Aug 10 16:54:41 2017 -0700

    Regenerates website
---
 .../contribute/ptransform-style-guide/index.html   | 79 ++++++++++++++++------
 1 file changed, 60 insertions(+), 19 deletions(-)

diff --git a/content/contribute/ptransform-style-guide/index.html 
b/content/contribute/ptransform-style-guide/index.html
index 56381bb..f351250 100644
--- a/content/contribute/ptransform-style-guide/index.html
+++ b/content/contribute/ptransform-style-guide/index.html
@@ -183,7 +183,11 @@
           <li><a href="#immutability" 
id="markdown-toc-immutability">Immutability</a></li>
           <li><a href="#serialization" 
id="markdown-toc-serialization">Serialization</a></li>
           <li><a href="#validation" 
id="markdown-toc-validation">Validation</a></li>
-          <li><a href="#coders" id="markdown-toc-coders">Coders</a></li>
+          <li><a href="#coders" id="markdown-toc-coders">Coders</a>            
<ul>
+              <li><a href="#providing-default-coders-for-types" 
id="markdown-toc-providing-default-coders-for-types">Providing default coders 
for types</a></li>
+              <li><a href="#setting-coders-on-output-collections" 
id="markdown-toc-setting-coders-on-output-collections">Setting coders on output 
collections</a></li>
+            </ul>
+          </li>
         </ul>
       </li>
     </ul>
@@ -684,32 +688,56 @@ Strive to make such incompatible behavior changes cause a 
compile error (e.g. it
 <h4 id="validation">Validation</h4>
 
 <ul>
-  <li>Validate individual parameters in <code 
class="highlighter-rouge">.withBlah()</code> methods. Error messages should 
mention the method being called, the actual value and the range of valid 
values.</li>
-  <li>Validate inter-parameter invariants in the <code 
class="highlighter-rouge">PTransform</code>’s <code 
class="highlighter-rouge">.validate()</code> method.</li>
+  <li>Validate individual parameters in <code 
class="highlighter-rouge">.withBlah()</code> methods using <code 
class="highlighter-rouge">checkArgument()</code>. Error messages should mention 
the name of the parameter, the actual value, and the range of valid values.</li>
+  <li>Validate parameter combinations and missing required parameters in the 
<code class="highlighter-rouge">PTransform</code>’s <code 
class="highlighter-rouge">.expand()</code> method.</li>
+  <li>Validate parameters that the <code 
class="highlighter-rouge">PTransform</code> takes from <code 
class="highlighter-rouge">PipelineOptions</code> in the <code 
class="highlighter-rouge">PTransform</code>’s <code 
class="highlighter-rouge">.validate(PipelineOptions)</code> method.
+These validations run once the pipeline is fully constructed and expanded, just 
before it is run with a particular <code 
class="highlighter-rouge">PipelineOptions</code>.
+Most <code class="highlighter-rouge">PTransform</code>s do not use <code 
class="highlighter-rouge">PipelineOptions</code> and thus don’t need a <code 
class="highlighter-rouge">validate()</code> method; instead, they should 
perform their validation via the two other methods above.</li>
 </ul>
 
 <div class="language-java highlighter-rouge"><pre 
class="highlight"><code><span class="nd">@AutoValue</span>
 <span class="kd">public</span> <span class="kd">abstract</span> <span 
class="kd">class</span> <span class="nc">TwiddleThumbs</span>
     <span class="kd">extends</span> <span class="n">PTransform</span><span 
class="o">&lt;</span><span class="n">PCollection</span><span 
class="o">&lt;</span><span class="n">Foo</span><span class="o">&gt;,</span> 
<span class="n">PCollection</span><span class="o">&lt;</span><span 
class="n">Bar</span><span class="o">&gt;&gt;</span> <span class="o">{</span>
   <span class="kd">abstract</span> <span class="kt">int</span> <span 
class="nf">getMoo</span><span class="o">();</span>
-  <span class="kd">abstract</span> <span class="kt">int</span> <span 
class="nf">getBoo</span><span class="o">();</span>
+  <span class="kd">abstract</span> <span class="n">String</span> <span 
class="nf">getBoo</span><span class="o">();</span>
 
   <span class="o">...</span>
   <span class="c1">// Validating individual parameters</span>
   <span class="kd">public</span> <span class="n">TwiddleThumbs</span> <span 
class="nf">withMoo</span><span class="o">(</span><span class="kt">int</span> 
<span class="n">moo</span><span class="o">)</span> <span class="o">{</span>
-    <span class="n">checkArgument</span><span class="o">(</span><span 
class="n">moo</span> <span class="o">&gt;=</span> <span class="mi">0</span> 
<span class="o">&amp;&amp;</span> <span class="n">moo</span> <span 
class="o">&lt;</span> <span class="mi">100</span><span class="o">,</span>
-      <span class="s">"TwiddleThumbs.withMoo() called with an invalid moo of 
%s. "</span>
-              <span class="o">+</span> <span class="s">"Valid values are 0 
(exclusive) to 100 (exclusive)"</span><span class="o">,</span>
-              <span class="n">moo</span><span class="o">);</span>
-        <span class="k">return</span> <span class="nf">toBuilder</span><span 
class="o">().</span><span class="na">setMoo</span><span class="o">(</span><span 
class="n">moo</span><span class="o">).</span><span class="na">build</span><span 
class="o">();</span>
+    <span class="n">checkArgument</span><span class="o">(</span>
+        <span class="n">moo</span> <span class="o">&gt;=</span> <span 
class="mi">0</span> <span class="o">&amp;&amp;</span> <span 
class="n">moo</span> <span class="o">&lt;</span> <span 
class="mi">100</span><span class="o">,</span>
+        <span class="s">"Moo must be between 0 (inclusive) and 100 
(exclusive), but was: %s"</span><span class="o">,</span>
+        <span class="n">moo</span><span class="o">);</span>
+    <span class="k">return</span> <span class="nf">toBuilder</span><span 
class="o">().</span><span class="na">setMoo</span><span class="o">(</span><span 
class="n">moo</span><span class="o">).</span><span class="na">build</span><span 
class="o">();</span>
+  <span class="o">}</span>
+
+  <span class="kd">public</span> <span class="n">TwiddleThumbs</span> <span 
class="nf">withBoo</span><span class="o">(</span><span class="n">String</span> 
<span class="n">boo</span><span class="o">)</span> <span class="o">{</span>
+    <span class="n">checkArgument</span><span class="o">(</span><span 
class="n">boo</span> <span class="o">!=</span> <span 
class="kc">null</span><span class="o">,</span> <span class="s">"Boo can not be 
null"</span><span class="o">);</span>
+    <span class="n">checkArgument</span><span class="o">(!</span><span 
class="n">boo</span><span class="o">.</span><span 
class="na">isEmpty</span><span class="o">(),</span> <span class="s">"Boo can 
not be empty"</span><span class="o">);</span>
+    <span class="k">return</span> <span class="nf">toBuilder</span><span 
class="o">().</span><span class="na">setBoo</span><span class="o">(</span><span 
class="n">boo</span><span class="o">).</span><span class="na">build</span><span 
class="o">();</span>
   <span class="o">}</span>
 
-  <span class="c1">// Validating cross-parameter invariants</span>
-  <span class="kd">public</span> <span class="kt">void</span> <span 
class="nf">validate</span><span class="o">(</span><span 
class="n">PCollection</span><span class="o">&lt;</span><span 
class="n">Foo</span><span class="o">&gt;</span> <span 
class="n">input</span><span class="o">)</span> <span class="o">{</span>
-    <span class="n">checkArgument</span><span class="o">(</span><span 
class="n">getMoo</span><span class="o">()</span> <span class="o">==</span> 
<span class="mi">0</span> <span class="o">||</span> <span 
class="n">getBoo</span><span class="o">()</span> <span class="o">==</span> 
<span class="mi">0</span><span class="o">,</span>
-      <span class="s">"TwiddleThumbs created with both .withMoo(%s) and 
.withBoo(%s). "</span>
-      <span class="o">+</span> <span class="s">"Only one of these must be 
specified."</span><span class="o">,</span>
-      <span class="n">getMoo</span><span class="o">(),</span> <span 
class="n">getBoo</span><span class="o">());</span>
+  <span class="nd">@Override</span>
+  <span class="kd">public</span> <span class="kt">void</span> <span 
class="nf">validate</span><span class="o">(</span><span 
class="n">PipelineOptions</span> <span class="n">options</span><span 
class="o">)</span> <span class="o">{</span>
+    <span class="kt">int</span> <span class="n">woo</span> <span 
class="o">=</span> <span class="n">options</span><span class="o">.</span><span 
class="na">as</span><span class="o">(</span><span 
class="n">TwiddleThumbsOptions</span><span class="o">.</span><span 
class="na">class</span><span class="o">).</span><span 
class="na">getWoo</span><span class="o">();</span>
+    <span class="n">checkArgument</span><span class="o">(</span>
+        <span class="n">woo</span> <span class="o">&lt;</span> <span 
class="n">getMoo</span><span class="o">(),</span>
+        <span class="s">"Woo (%s) must be smaller than moo (%s)"</span><span 
class="o">,</span>
+        <span class="n">woo</span><span class="o">,</span> <span 
class="n">getMoo</span><span class="o">());</span>
+  <span class="o">}</span>
+
+  <span class="nd">@Override</span>
+  <span class="kd">public</span> <span class="n">PCollection</span><span 
class="o">&lt;</span><span class="n">Bar</span><span class="o">&gt;</span> 
<span class="nf">expand</span><span class="o">(</span><span 
class="n">PCollection</span><span class="o">&lt;</span><span 
class="n">Foo</span><span class="o">&gt;</span> <span 
class="n">input</span><span class="o">)</span> <span class="o">{</span>
+    <span class="c1">// Validating that a required parameter is present</span>
+    <span class="n">checkArgument</span><span class="o">(</span><span 
class="n">getBoo</span><span class="o">()</span> <span class="o">!=</span> 
<span class="kc">null</span><span class="o">,</span> <span class="s">"Must 
specify boo"</span><span class="o">);</span>
+
+    <span class="c1">// Validating a combination of parameters</span>
+    <span class="n">checkArgument</span><span class="o">(</span>
+        <span class="n">getMoo</span><span class="o">()</span> <span 
class="o">==</span> <span class="mi">0</span> <span class="o">||</span> <span 
class="n">getBoo</span><span class="o">()</span> <span class="o">==</span> 
<span class="kc">null</span><span class="o">,</span>
+        <span class="s">"Must specify at most one of moo or boo, but was: moo 
= %s, boo = %s"</span><span class="o">,</span>
+        <span class="n">getMoo</span><span class="o">(),</span> <span 
class="n">getBoo</span><span class="o">());</span>
+
+    <span class="o">...</span>
   <span class="o">}</span>
 <span class="o">}</span>
 </code></pre>
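
Editorial sketch of the validation pattern in the hunk above, written in plain Java so it runs without Beam or Guava. Everything here is an assumption made for illustration: `checkArgument` is a local stand-in for Guava's `Preconditions.checkArgument`, and `Twiddle` is a hypothetical immutable object playing the role of `TwiddleThumbs`. Per-parameter checks live in the `with*` methods, and the missing-required-parameter check lives in the method standing in for `expand()`.

```java
class ValidationSketch {
    // Local stand-in for Guava's Preconditions.checkArgument (assumption: same
    // "throw IllegalArgumentException with a formatted message" behaviour).
    static void checkArgument(boolean ok, String template, Object... args) {
        if (!ok) {
            throw new IllegalArgumentException(String.format(template, args));
        }
    }

    // Hypothetical immutable configuration object standing in for the transform.
    static final class Twiddle {
        final int moo;
        final String boo;

        private Twiddle(int moo, String boo) {
            this.moo = moo;
            this.boo = boo;
        }

        static Twiddle create() {
            return new Twiddle(0, null);
        }

        // Individual parameter: validated as soon as it is supplied.
        Twiddle withMoo(int moo) {
            checkArgument(
                moo >= 0 && moo < 100,
                "Moo must be between 0 (inclusive) and 100 (exclusive), but was: %s",
                moo);
            return new Twiddle(moo, boo);
        }

        Twiddle withBoo(String boo) {
            checkArgument(boo != null, "Boo can not be null");
            checkArgument(!boo.isEmpty(), "Boo can not be empty");
            return new Twiddle(moo, boo);
        }

        // Stand-in for expand(): the place to detect a missing required parameter.
        String expand() {
            checkArgument(boo != null, "Must specify boo");
            return boo + ":" + moo;
        }
    }

    public static void main(String[] args) {
        System.out.println(Twiddle.create().withMoo(5).withBoo("x").expand());
    }
}
```

With this shape, a bad value fails at the call that supplied it, while a forgotten `withBoo()` fails only when the object is actually used.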
@@ -717,13 +745,26 @@ Strive to make such incompatible behavior changes cause a 
compile error (e.g. it
 
 <h4 id="coders">Coders</h4>
 
+<p><code class="highlighter-rouge">Coder</code>s are a way for a Beam runner 
to materialize intermediate data or transmit it between workers when necessary. 
<code class="highlighter-rouge">Coder</code> should not be used as a 
general-purpose API for parsing or writing binary formats because the 
particular binary encoding of a <code class="highlighter-rouge">Coder</code> is 
intended to be its private implementation detail.</p>
+
+<h5 id="providing-default-coders-for-types">Providing default coders for 
types</h5>
+
+<p>Provide default <code class="highlighter-rouge">Coder</code>s for all new 
data types. Use <code class="highlighter-rouge">@DefaultCoder</code> 
annotations or <code class="highlighter-rouge">CoderProviderRegistrar</code> 
classes annotated with <code class="highlighter-rouge">@AutoService</code>: see 
usages of these classes in the SDK for examples. If performance is not 
important, you can use <code class="highlighter-rouge">SerializableCoder</code> 
or <code class="highlighter-rouge">Avr [...]
+
+<h5 id="setting-coders-on-output-collections">Setting coders on output 
collections</h5>
+
+<p>All <code class="highlighter-rouge">PCollection</code>s created by your 
<code class="highlighter-rouge">PTransform</code> (both output and intermediate 
collections) must have a <code class="highlighter-rouge">Coder</code> set on 
them: a user should never need to call <code 
class="highlighter-rouge">.setCoder()</code> to “fix up” a coder on a <code 
class="highlighter-rouge">PCollection</code> produced by your <code 
class="highlighter-rouge">PTransform</code> (in fact, Beam intends to e [...]
+
+<p>If the collection is of a concrete type, that type usually has a 
corresponding coder. Use the most efficient coder specific to that type (e.g. <code 
class="highlighter-rouge">StringUtf8Coder.of()</code> for strings, <code 
class="highlighter-rouge">ByteArrayCoder.of()</code> for byte arrays, etc.), 
rather than a general-purpose coder like <code 
class="highlighter-rouge">SerializableCoder</code>.</p>
+
+<p>If the type of the collection involves generic type variables, the 
situation is more complex:</p>
 <ul>
-  <li>Use <code class="highlighter-rouge">Coder</code>s only for setting the 
coder on a <code class="highlighter-rouge">PCollection</code> or a mutable 
state cell.</li>
-  <li>When available, use a specific most efficient coder for the datatype 
(e.g. <code class="highlighter-rouge">StringUtf8Coder.of()</code> for strings, 
<code class="highlighter-rouge">ByteArrayCoder.of()</code> for byte arrays, 
etc.), rather than using a generic coder like <code 
class="highlighter-rouge">SerializableCoder</code>. Develop efficient coders 
for types that can be elements of <code 
class="highlighter-rouge">PCollection</code>s.</li>
-  <li>Do not use coders as a general serialization or parsing mechanism for 
arbitrary raw byte data. (anti-examples that should be fixed: <code 
class="highlighter-rouge">TextIO</code>, <code 
class="highlighter-rouge">KafkaIO</code>).</li>
-  <li>In general, any transform that outputs a user-controlled type (that is 
not its input type) needs to accept a coder in the transform configuration 
(example: the <code class="highlighter-rouge">Create.of()</code> transform). 
This gives the user the ability to control the coder no matter how the 
transform is structured: e.g., purely letting the user specify the coder on the 
output <code class="highlighter-rouge">PCollection</code> of the transform is 
insufficient in case the transform [...]
+  <li>If it coincides with the transform’s input type or is a simple wrapper 
over it, you can reuse the coder of the input <code 
class="highlighter-rouge">PCollection</code>, available via <code 
class="highlighter-rouge">input.getCoder()</code>.</li>
+  <li>Attempt to infer the coder via <code 
class="highlighter-rouge">input.getPipeline().getCoderRegistry().getCoder(TypeDescriptor)</code>.
 Use utilities in <code class="highlighter-rouge">TypeDescriptors</code> to 
obtain the <code class="highlighter-rouge">TypeDescriptor</code> for the 
generic type. For an example of this approach, see the implementation of <code 
class="highlighter-rouge">AvroIO.parseGenericRecords()</code>. However, coder 
inference for generic types is best-effort and [...]
+  <li>Always make it possible for the user to explicitly specify a <code 
class="highlighter-rouge">Coder</code> for the relevant type variable(s) as a 
configuration parameter of your <code 
class="highlighter-rouge">PTransform</code> (e.g. <code 
class="highlighter-rouge">AvroIO.&lt;T&gt;parseGenericRecords().withCoder(Coder&lt;T&gt;)</code>).
 Fall back to inference if the coder was not explicitly specified.</li>
 </ul>
 
+
     </div>
     <footer class="footer">
   <div class="footer__contained">
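
Editorial sketch of the "explicit coder first, inference as fallback" rule from the Coders hunk above, in dependency-free Java. None of these types are Beam's: `Codec` and `Registry` are hypothetical stand-ins for `Coder` and the coder registry, and throwing when inference fails is an assumption of this sketch, not something the diff specifies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class CoderFallbackSketch {
    // Hypothetical stand-in for a coder; only a name is needed for the sketch.
    interface Codec<T> {
        String name();
    }

    // Hypothetical stand-in for a coder registry: best-effort lookup by class.
    static final class Registry {
        private final Map<Class<?>, Codec<?>> byType = new HashMap<>();

        <T> void register(Class<T> type, Codec<T> codec) {
            byType.put(type, codec);
        }

        @SuppressWarnings("unchecked")
        <T> Optional<Codec<T>> infer(Class<T> type) {
            // Safe by construction: register() only stores a Codec<T> under Class<T>.
            return Optional.ofNullable((Codec<T>) byType.get(type));
        }
    }

    // Prefer the codec the user configured; otherwise try inference; otherwise
    // fail with a message that tells the user how to fix it (sketch assumption).
    static <T> Codec<T> resolve(Codec<T> explicit, Registry registry, Class<T> type) {
        if (explicit != null) {
            return explicit;
        }
        return registry.infer(type).orElseThrow(() ->
            new IllegalStateException(
                "Unable to infer a coder for " + type.getName()
                    + "; specify one explicitly, e.g. via withCoder()"));
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(String.class, () -> "inferred");
        System.out.println(resolve(null, registry, String.class).name());
    }
}
```

Resolving the coder inside the transform means the user never has to patch the output collection afterwards, which is the outcome the guide asks for.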

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" <commits@beam.apache.org>.
