This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 1f44beb4c24 Publishing website 2022/08/18 22:16:53 at commit 062a0d2
1f44beb4c24 is described below
commit 1f44beb4c2405795ebf1e3546afd87ffd4e261fc
Author: jenkins <[email protected]>
AuthorDate: Thu Aug 18 22:16:53 2022 +0000
Publishing website 2022/08/18 22:16:53 at commit 062a0d2
---
.../get-started/from-spark/index.html | 4 +-
website/generated-content/get-started/index.xml | 48 ++++++++++------------
website/generated-content/sitemap.xml | 2 +-
3 files changed, 25 insertions(+), 29 deletions(-)
diff --git a/website/generated-content/get-started/from-spark/index.html b/website/generated-content/get-started/from-spark/index.html
index ee66d4b22aa..1acd6de5c8b 100644
--- a/website/generated-content/get-started/from-spark/index.html
+++ b/website/generated-content/get-started/from-spark/index.html
@@ -69,7 +69,7 @@ This serves both as comments and makes your pipeline easier to debug.</p><p>This
<span class=o>|</span> <span class=s1>'Multiply by two'</span>
<span class=o>>></span> <span class=n>beam</span><span
class=o>.</span><span class=n>Map</span><span class=p>(</span><span
class=k>lambda</span> <span class=n>x</span><span class=p>:</span> <span
class=n>x</span> <span class=o>*</span> <span class=mi>2</span><span
class=p>)</span>
<span class=o>|</span> <span class=s1>'Sum everything'</span>
<span class=o>>></span> <span class=n>beam</span><span
class=o>.</span><span class=n>CombineGlobally</span><span class=p>(</span><span
class=nb>sum</span><span class=p>)</span>
<span class=o>|</span> <span class=s1>'Print results'</span>
<span class=o>>></span> <span class=n>beam</span><span
class=o>.</span><span class=n>Map</span><span class=p>(</span><span
class=k>print</span><span class=p>)</span>
- <span class=p>)</span></code></pre></div></div></div><h2
id=setup>Setup</h2><p>Here’s a comparison on how to get started both in
PySpark and Beam.</p><div
class=table-wrapper><table><tr><th></th><th>PySpark</th><th>Beam</th></tr><tr><td><b>Install</b></td><td><code>$
pip install pyspark</code></td><td><code>$ pip install
apache-beam</code></td></tr><tr><td><b>Imports</b></td><td><code>import
pyspark</code></td><td><code>import apache_beam as
beam</code></td></tr><tr><td><b>Crea [...]
+ <span class=p>)</span></code></pre></div></div></div><h2
id=setup>Setup</h2><p>Here’s a comparison on how to get started both in
PySpark and Beam.</p><div class=table-wrapper><table style=width:100%><tr><th
style=width:20%></th><th style=width:40%>PySpark</th><th
style=width:40%>Beam</th></tr><tr><td><b>Install</b></td><td><code>$ pip
install pyspark</code></td><td><code>$ pip install
apache-beam</code></td></tr><tr><td><b>Imports</b></td><td><code>import
pyspark</code></td><td [...]
<a href=/documentation/transforms/python/overview>Python transform
gallery</a>.</p></blockquote><h2 id=using-calculated-values>Using calculated
values</h2><p>Since we are working in potentially distributed environments,
we can’t guarantee that the results we’ve calculated are available
at any given machine.</p><p>In PySpark, we can get a result from a collection
of elements (RDD) by using
<code>data.collect()</code>, or other aggregations such as
<code>reduce()</code>, <code>count()</code>, and more.</p><p>Here’s an
example to scale numbers into a range between zero and one.</p><div
class="language-py snippet"><div class="notebook-skip code-snippet"><a
class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom
title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div
class=highlight><pre class=chroma><code class=language-py data-lang=py><span
cla [...]
@@ -108,7 +108,7 @@ and access them as an <a href=https://docs.python.org/3/glossary.html#term-itera
<span class=n>scaled_values</span> <span class=o>|</span> <span
class=n>beam</span><span class=o>.</span><span class=n>Map</span><span
class=p>(</span><span class=k>print</span><span
class=p>)</span></code></pre></div></div></div><blockquote><p>ℹ️ In Beam we
need to pass a side input explicitly, but we get the
benefit that a reduction or aggregation does <em>not</em> have to fit into
memory.
Lazily computing side inputs also allows us to compute <code>values</code>
only once,
-rather than for each distinct reduction (or requiring explicit caching of the
RDD).</p></blockquote><h2 id=next-steps>Next Steps</h2><ul><li>Take a look at
all the available transforms in the <a
href=/documentation/transforms/python/overview>Python transform
gallery</a>.</li><li>Learn how to read from and write to files in the <a
href=/documentation/programming-guide/#pipeline-io><em>Pipeline I/O</em>
section of the <em>Programming guide</em></a></li><li>Walk through additional
WordCount [...]
+rather than for each distinct reduction (or requiring explicit caching of the
RDD).</p></blockquote><h2 id=next-steps>Next Steps</h2><ul><li>Take a look at
all the available transforms in the <a
href=/documentation/transforms/python/overview>Python transform
gallery</a>.</li><li>Learn how to read from and write to files in the <a
href=/documentation/programming-guide/#pipeline-io><em>Pipeline I/O</em>
section of the <em>Programming guide</em></a></li><li>Walk through additional
WordCount [...]
<a href=http://www.apache.org>The Apache Software Foundation</a>
| <a href=/privacy_policy>Privacy Policy</a>
| <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam
logo, and the Apache feather logo are either registered trademarks or
trademarks of The Apache Software Foundation. All other products or name brands
are trademarks of their respective holders, including The Apache Software
Foundation.</div></div><div class="footer__cols__col
footer__cols__col__logos"><div class=footer__cols__col--group><div
class=footer__cols__col__logo><a href=https://github.com/apache/beam><im [...]
\ No newline at end of file
diff --git a/website/generated-content/get-started/index.xml b/website/generated-content/get-started/index.xml
index 618f639b594..c9afd0482c5 100644
--- a/website/generated-content/get-started/index.xml
+++ b/website/generated-content/get-started/index.xml
@@ -4415,11 +4415,11 @@ This serves both as comments and makes your pipeline easier to debug.</p>
</div>
<h2 id="setup">Setup</h2>
<p>Here&rsquo;s a comparison on how to get started both in PySpark and Beam.</p>
-<div class="table-wrapper"><table>
+<div class="table-wrapper"><table style="width:100%">
<tr>
-<th></th>
-<th>PySpark</th>
-<th>Beam</th>
+<th style="width:20%"></th>
+<th style="width:40%">PySpark</th>
+<th style="width:40%">Beam</th>
</tr>
<tr>
<td><b>Install</b></td>
@@ -4472,86 +4472,82 @@ This serves both as comments and makes your pipeline easier to debug.</p>
</table></div>
<h2 id="transforms">Transforms</h2>
<p>Here are the equivalents of some common transforms in both PySpark and Beam.</p>
-<div class="table-wrapper"><table>
-<thead>
+<div class="table-wrapper"><table style="width:100%">
<tr>
-<th></th>
-<th>PySpark</th>
-<th>Beam</th>
+<th style="width:20%"></th>
+<th style="width:40%">PySpark</th>
+<th style="width:40%">Beam</th>
</tr>
-</thead>
-<tbody>
<tr>
-<td><a href="/documentation/transforms/python/elementwise/map/"><strong>Map</strong></a></td>
+<td><b><a href="/documentation/transforms/python/elementwise/map/">Map</a></b></td>
<td><code>values.map(lambda x: x * 2)</code></td>
<td><code>values | beam.Map(lambda x: x * 2)</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/elementwise/filter/"><strong>Filter</strong></a></td>
+<td><b><a href="/documentation/transforms/python/elementwise/filter/">Filter</a></b></td>
<td><code>values.filter(lambda x: x % 2 == 0)</code></td>
<td><code>values | beam.Filter(lambda x: x % 2 == 0)</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/elementwise/flatmap/"><strong>FlatMap</strong></a></td>
+<td><b><a href="/documentation/transforms/python/elementwise/flatmap/">FlatMap</a></b></td>
<td><code>values.flatMap(lambda x: range(x))</code></td>
<td><code>values | beam.FlatMap(lambda x: range(x))</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/groupbykey/"><strong>Group by key</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/groupbykey/">Group by key</a></b></td>
<td><code>pairs.groupByKey()</code></td>
<td><code>pairs | beam.GroupByKey()</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/combineglobally/"><strong>Reduce</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/combineglobally/">Reduce</a></b></td>
<td><code>values.reduce(lambda x, y: x+y)</code></td>
<td><code>values | beam.CombineGlobally(sum)</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/combineperkey/"><strong>Reduce by key</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/combineperkey/">Reduce by key</a></b></td>
<td><code>pairs.reduceByKey(lambda x, y: x+y)</code></td>
<td><code>pairs | beam.CombinePerKey(sum)</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/distinct/"><strong>Distinct</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/distinct/">Distinct</a></b></td>
<td><code>values.distinct()</code></td>
<td><code>values | beam.Distinct()</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/count/"><strong>Count</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/count/">Count</a></b></td>
<td><code>values.count()</code></td>
<td><code>values | beam.combiners.Count.Globally()</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/count/"><strong>Count by key</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/count/">Count by key</a></b></td>
<td><code>pairs.countByKey()</code></td>
<td><code>pairs | beam.combiners.Count.PerKey()</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/top/"><strong>Take smallest</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/top/">Take smallest</a></b></td>
<td><code>values.takeOrdered(3)</code></td>
<td><code>values | beam.combiners.Top.Smallest(3)</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/top/"><strong>Take largest</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/top/">Take largest</a></b></td>
<td><code>values.takeOrdered(3, lambda x: -x)</code></td>
<td><code>values | beam.combiners.Top.Largest(3)</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/sample/"><strong>Random sample</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/sample/">Random sample</a></b></td>
<td><code>values.takeSample(False, 3)</code></td>
<td><code>values | beam.combiners.Sample.FixedSizeGlobally(3)</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/other/flatten/"><strong>Union</strong></a></td>
+<td><b><a href="/documentation/transforms/python/other/flatten/">Union</a></b></td>
<td><code>values.union(otherValues)</code></td>
<td><code>(values, otherValues) | beam.Flatten()</code></td>
</tr>
<tr>
-<td><a href="/documentation/transforms/python/aggregation/cogroupbykey/"><strong>Co-group</strong></a></td>
+<td><b><a href="/documentation/transforms/python/aggregation/cogroupbykey/">Co-group</a></b></td>
<td><code>pairs.cogroup(otherPairs)</code></td>
<td><code>{'Xs': pairs, 'Ys': otherPairs} | beam.CoGroupByKey()</code></td>
</tr>
-</tbody>
</table></div>
<blockquote>
<p>ℹ️ To learn more about the transforms available in Beam, check the
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index babe8bdfd75..93820975b1a 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/case-studies/intuit/</loc><lastmod>2022-08-18T01:27:08+06:00</lastmod></url><url><loc>/blog/go-2.40/</loc><lastmod>2022-07-06T14:03:32-04:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-07-06T14:03:32-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-07-06T14:03:32-04:00</lastmod></url><url><loc>/c
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/case-studies/intuit/</loc><lastmod>2022-08-18T01:27:08+06:00</lastmod></url><url><loc>/blog/go-2.40/</loc><lastmod>2022-07-06T14:03:32-04:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2022-07-06T14:03:32-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2022-07-06T14:03:32-04:00</lastmod></url><url><loc>/c
[...]
\ No newline at end of file
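
Note on the "Using calculated values" text in the from-spark page diffed above: it describes scaling numbers into a range between zero and one, and says that in Beam a calculated value has to be passed to a transform explicitly as a side input. The page's own example is truncated in this email, so the sketch below is only a plain-Python analogy of that idea, not PySpark or Beam code. The sample data and the names `scale`, `min_value`, `max_value`, and `scaled_values` are illustrative assumptions.

```python
# Plain-Python analogy only: no PySpark or Beam API is used here.
# The sample data and all names below are hypothetical.

def scale(x, minimum, maximum):
    """Scale x into the range [0, 1] given precomputed bounds."""
    return (x - minimum) / (maximum - minimum)


values = [1, 2, 3, 4]

# Compute each aggregate once, up front.
min_value = min(values)
max_value = max(values)

# Hand the aggregates to the per-element function explicitly, in the same
# spirit as passing a side input explicitly to a Beam transform.
scaled_values = [scale(x, min_value, max_value) for x in values]

print(scaled_values)
```

Unlike this sketch, which holds everything in one process, the page notes that in Beam the reduction or aggregation does not have to fit into memory.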