This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new 992049a4ae2 Publishing website 2023/07/20 16:16:04 at commit 3d501ee
992049a4ae2 is described below

commit 992049a4ae2b8fe65b464544ec280932c48dca1f
Author: jenkins <bui...@apache.org>
AuthorDate: Thu Jul 20 16:16:04 2023 +0000

    Publishing website 2023/07/20 16:16:04 at commit 3d501ee
---
 website/generated-content/blog/beam-2.33.0/index.html            | 4 +++-
 website/generated-content/blog/beam-2.34.0/index.html            | 4 +++-
 website/generated-content/blog/index.xml                         | 9 +++++++++
 website/generated-content/categories/blog/index.xml              | 9 +++++++++
 website/generated-content/categories/release/index.xml           | 9 +++++++++
 website/generated-content/documentation/runners/spark/index.html | 6 +++---
 website/generated-content/sitemap.xml                            | 2 +-
 7 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/website/generated-content/blog/beam-2.33.0/index.html b/website/generated-content/blog/beam-2.33.0/index.html
index 1cbe3d204dd..b40bed87fd9 100644
--- a/website/generated-content/blog/beam-2.33.0/index.html
+++ b/website/generated-content/blog/beam-2.33.0/index.html
@@ -31,7 +31,9 @@ TableRows to Beam Rows (Java)
 (<a href=https://issues.apache.org/jira/browse/BEAM-12479>BEAM-12479</a>).</li><li>SDFBoundedSourceReader behaves much slower compared with the original behavior
 of BoundedSource (Python)
 (<a href=https://issues.apache.org/jira/browse/BEAM-12781>BEAM-12781</a>).</li><li>ORDER BY column not in SELECT crashes (ZetaSQL)
-(<a href=https://issues.apache.org/jira/browse/BEAM-12759>BEAM-12759</a>).</li></ul><h3 id=known-issues>Known Issues</h3><ul><li>Spark 2.x users will need to update Spark’s Jackson runtime dependencies (<code>spark.jackson.version</code>) to at least version 2.9.2, due to Beam updating its dependencies.</li><li>See a full list of open <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.33.0%20ORDER%20BY%20priority%20DESC%2C%20u [...]
+(<a href=https://issues.apache.org/jira/browse/BEAM-12759>BEAM-12759</a>).</li></ul><h3 id=known-issues>Known Issues</h3><ul><li>Spark 2.x users will need to update Spark’s Jackson runtime dependencies (<code>spark.jackson.version</code>) to at least version 2.9.2, due to Beam updating its dependencies.</li><li>See a full list of open <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.33.0%20ORDER%20BY%20priority%20DESC%2C%20u [...] +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (> 2.34.0) or use another write method (e.g. <code>STORAGE_WRITE_API</code>).</li></ul><h2 id=list-of-contributors>List of Contributors</h2><p>According to git shortlog, the following people contributed to the 2.33.0 release. Thank you to all contributors!</p><p>Ahmet Altay, Alex Amato, Alexey Romanenko, Andreas Bergmeier, diff --git a/website/generated-content/blog/beam-2.34.0/index.html b/website/generated-content/blog/beam-2.34.0/index.html index 016ee04024d..2eb66e55ac6 100644 --- a/website/generated-content/blog/beam-2.34.0/index.html +++ b/website/generated-content/blog/beam-2.34.0/index.html @@ -28,7 +28,9 @@ This release includes both improvements and new functionality. See the <a href=/get-started/downloads/#2340-2021-11-11>download page</a> for this release.</p><p>For more information on changes in 2.34.0, check out the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12350405">detailed release notes</a>.</p><h2 id=highlights>Highlights</h2><ul><li>The Beam Java API for Calcite SqlTransform is no longer experimental (<a href=https://issues.apache.org/jira/browse/BEAM-12680>BEAM-12680</a>).</li><li>Python’s ParDo (Map, FlatMap, etc.) 
transforms now suport a <code>with_exception_handling</code> option for easily ignoring bad records and implementing the dead letter pattern.</li></ul><h2 id=ios>I/Os</h2><ul><li><code>ReadFromBigQuery</code> and <code>ReadAllFromBigQuery</cod [...] we’ve verified compatibility with. We now recommend installing Beam with <code>pip install apache-beam[dataframe]</code> when you intend to use the DataFrame API -(<a href=https://issues.apache.org/jira/browse/BEAM-12906>BEAM-12906</a>).</li><li>Add an <a href=https://github.com/cometta/python-apache-beam-spark>example</a> of deploying Python Apache Beam job with Spark Cluster</li></ul><h2 id=breaking-changes>Breaking Changes</h2><ul><li>SQL Rows are no longer flattened (<a href=https://issues.apache.org/jira/browse/BEAM-5505>BEAM-5505</a>).</li><li>[Go SDK] beam.TryCrossLanguage’s signature now matches beam.CrossLanguage. Like other Try fun [...] +(<a href=https://issues.apache.org/jira/browse/BEAM-12906>BEAM-12906</a>).</li><li>Add an <a href=https://github.com/cometta/python-apache-beam-spark>example</a> of deploying Python Apache Beam job with Spark Cluster</li></ul><h2 id=breaking-changes>Breaking Changes</h2><ul><li>SQL Rows are no longer flattened (<a href=https://issues.apache.org/jira/browse/BEAM-5505>BEAM-5505</a>).</li><li>[Go SDK] beam.TryCrossLanguage’s signature now matches beam.CrossLanguage. Like other Try fun [...] +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (> 2.34.0) or use another write method (e.g. <code>STORAGE_WRITE_API</code>).</li></ul><h2 id=list-of-contributors>List of Contributors</h2><p>According to git shortlog, the following people contributed to the 2.34.0 release. 
Thank you to all contributors!</p><p>Ahmet Altay, Aizhamal Nurmamat kyzy, Alex Amato, Alexander Chermenin, diff --git a/website/generated-content/blog/index.xml b/website/generated-content/blog/index.xml index 3819d6e42eb..1bf192cb4e2 100644 --- a/website/generated-content/blog/index.xml +++ b/website/generated-content/blog/index.xml @@ -3383,6 +3383,12 @@ we&rsquo;ve verified compatibility with. We now recommend installing Beam wi <li>Fixed error when importing the DataFrame API with pandas 1.0.x installed (<a href="https://issues.apache.org/jira/browse/BEAM-12945">BEAM-12945</a>).</li> <li>Fixed top.SmallestPerKey implementation in the Go SDK (<a href="https://issues.apache.org/jira/browse/BEAM-12946">BEAM-12946</a>).</li> </ul> +<h3 id="known-issues">Known Issues</h3> +<ul> +<li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used). +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (&gt; 2.34.0) or use another write method (e.g. <code>STORAGE_WRITE_API</code>).</li> +</ul> <h2 id="list-of-contributors">List of Contributors</h2> <p>According to git shortlog, the following people contributed to the 2.34.0 release. Thank you to all contributors!</p> <p>Ahmet Altay, @@ -3704,6 +3710,9 @@ of BoundedSource (Python) <li>Spark 2.x users will need to update Spark&rsquo;s Jackson runtime dependencies (<code>spark.jackson.version</code>) to at least version 2.9.2, due to Beam updating its dependencies.</li> <li>See a full list of open <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.33.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect</a> this version.</li> <li>Go SDK jobs may produce &ldquo;Failed to deduce Step from MonitoringInfo&rdquo; messages following successful job execution. 
The messages are benign and don&rsquo;t indicate job failure. These are due to not yet handling PCollection metrics.</li> +<li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used). +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (&gt; 2.34.0) or use another write method (e.g. <code>STORAGE_WRITE_API</code>).</li> </ul> <h2 id="list-of-contributors">List of Contributors</h2> <p>According to git shortlog, the following people contributed to the 2.33.0 release. Thank you to all contributors!</p> diff --git a/website/generated-content/categories/blog/index.xml b/website/generated-content/categories/blog/index.xml index 03fa6ed3778..b6c77189a0e 100644 --- a/website/generated-content/categories/blog/index.xml +++ b/website/generated-content/categories/blog/index.xml @@ -3383,6 +3383,12 @@ we&rsquo;ve verified compatibility with. We now recommend installing Beam wi <li>Fixed error when importing the DataFrame API with pandas 1.0.x installed (<a href="https://issues.apache.org/jira/browse/BEAM-12945">BEAM-12945</a>).</li> <li>Fixed top.SmallestPerKey implementation in the Go SDK (<a href="https://issues.apache.org/jira/browse/BEAM-12946">BEAM-12946</a>).</li> </ul> +<h3 id="known-issues">Known Issues</h3> +<ul> +<li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used). +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (&gt; 2.34.0) or use another write method (e.g. <code>STORAGE_WRITE_API</code>).</li> +</ul> <h2 id="list-of-contributors">List of Contributors</h2> <p>According to git shortlog, the following people contributed to the 2.34.0 release. 
Thank you to all contributors!</p> <p>Ahmet Altay, @@ -3704,6 +3710,9 @@ of BoundedSource (Python) <li>Spark 2.x users will need to update Spark&rsquo;s Jackson runtime dependencies (<code>spark.jackson.version</code>) to at least version 2.9.2, due to Beam updating its dependencies.</li> <li>See a full list of open <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.33.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect</a> this version.</li> <li>Go SDK jobs may produce &ldquo;Failed to deduce Step from MonitoringInfo&rdquo; messages following successful job execution. The messages are benign and don&rsquo;t indicate job failure. These are due to not yet handling PCollection metrics.</li> +<li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used). +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (&gt; 2.34.0) or use another write method (e.g. <code>STORAGE_WRITE_API</code>).</li> </ul> <h2 id="list-of-contributors">List of Contributors</h2> <p>According to git shortlog, the following people contributed to the 2.33.0 release. Thank you to all contributors!</p> diff --git a/website/generated-content/categories/release/index.xml b/website/generated-content/categories/release/index.xml index b3620c9bee9..145121ec1b2 100644 --- a/website/generated-content/categories/release/index.xml +++ b/website/generated-content/categories/release/index.xml @@ -2044,6 +2044,12 @@ we&rsquo;ve verified compatibility with. 
We now recommend installing Beam wi <li>Fixed error when importing the DataFrame API with pandas 1.0.x installed (<a href="https://issues.apache.org/jira/browse/BEAM-12945">BEAM-12945</a>).</li> <li>Fixed top.SmallestPerKey implementation in the Go SDK (<a href="https://issues.apache.org/jira/browse/BEAM-12946">BEAM-12946</a>).</li> </ul> +<h3 id="known-issues">Known Issues</h3> +<ul> +<li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used). +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (&gt; 2.34.0) or use another write method (e.g. <code>STORAGE_WRITE_API</code>).</li> +</ul> <h2 id="list-of-contributors">List of Contributors</h2> <p>According to git shortlog, the following people contributed to the 2.34.0 release. Thank you to all contributors!</p> <p>Ahmet Altay, @@ -2206,6 +2212,9 @@ of BoundedSource (Python) <li>Spark 2.x users will need to update Spark&rsquo;s Jackson runtime dependencies (<code>spark.jackson.version</code>) to at least version 2.9.2, due to Beam updating its dependencies.</li> <li>See a full list of open <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.33.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC">issues that affect</a> this version.</li> <li>Go SDK jobs may produce &ldquo;Failed to deduce Step from MonitoringInfo&rdquo; messages following successful job execution. The messages are benign and don&rsquo;t indicate job failure. These are due to not yet handling PCollection metrics.</li> +<li>Large Java BigQueryIO writes with the FILE_LOADS method will fail in batch mode (specifically, when copy jobs are used). +This results in the error message: <code>IllegalArgumentException: Attempting to access unknown side input</code>. +Please upgrade to a newer version (&gt; 2.34.0) or use another write method (e.g. 
<code>STORAGE_WRITE_API</code>).</li> </ul> <h2 id="list-of-contributors">List of Contributors</h2> <p>According to git shortlog, the following people contributed to the 2.33.0 release. Thank you to all contributors!</p> diff --git a/website/generated-content/documentation/runners/spark/index.html b/website/generated-content/documentation/runners/spark/index.html index a690a21768f..e5c7d2ad6d6 100644 --- a/website/generated-content/documentation/runners/spark/index.html +++ b/website/generated-content/documentation/runners/spark/index.html @@ -31,7 +31,7 @@ architecture of the Runners had to be changed significantly to support executing pipelines written in other languages.</p><p>If your applications only use Java, then you should currently go with one of the java based runners. If you want to run Python or Go pipelines with Beam on Spark, you need to use the portable Runner. For more information on portability, please visit the -<a href=/roadmap/portability/>Portability page</a>.</p><nav class=language-switcher><strong>Adapt for:</strong><ul><li data-value=java>Non portable (Java)</li><li data-value=py>Portable (Java/Python/Go)</li></ul></nav><h2 id=spark-runner-prerequisites-and-setup>Spark Runner prerequisites and setup</h2><p>The Spark runner currently supports Spark’s 3.1.x branch.</p><blockquote><p><strong>Note:</strong> Support for Spark 2.4.x was deprecated as of Beam 2.41.0 and finally dropped with [...] +<a href=/roadmap/portability/>Portability page</a>.</p><nav class=language-switcher><strong>Adapt for:</strong><ul><li data-value=java>Non portable (Java)</li><li data-value=py>Portable (Java/Python/Go)</li></ul></nav><h2 id=spark-runner-prerequisites-and-setup>Spark Runner prerequisites and setup</h2><p>The Spark runner currently supports Spark’s 3.2.x branch.</p><blockquote><p><strong>Note:</strong> Support for Spark 2.4.x was dropped with Beam 2.46.0.</p></blockquote><p class=la [...] 
<span class=o><</span><span class=n>groupId</span><span class=o>></span><span class=n>org</span><span class=o>.</span><span class=na>apache</span><span class=o>.</span><span class=na>beam</span><span class=o></</span><span class=n>groupId</span><span class=o>></span> <span class=o><</span><span class=n>artifactId</span><span class=o>></span><span class=n>beam</span><span class=o>-</span><span class=n>runners</span><span class=o>-</span><span class=n>spark</span><span class=o>-</span><span class=n>3</span><span class=o></</span><span class=n>artifactId</span><span class=o>></span> <span class=o><</span><span class=n>version</span><span class=o>></span><span class=n>2</span><span class=o>.</span><span class=na>49</span><span class=o>.</span><span class=na>0</span><span class=o></</span><span class=n>version</span><span class=o>></span> @@ -92,12 +92,12 @@ provided with the Spark master address.</p><p class=language-py><ol start=2><li> <span class=s2>"--job_endpoint=localhost:8099"</span><span class=p>,</span> <span class=s2>"--environment_type=LOOPBACK"</span> <span class=p>])</span> -<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>(</span><span class=n>options</span><span class=o>=</span><span class=n>options</span><span class=p>)</span> <span class=k>as</span> <span class=n>p</span><span class=p>:</span> +<span class=k>with</span> <span class=n>beam</span><span class=o>.</span><span class=n>Pipeline</span><span class=p>(</span><span class=n>options</span><span class=p>)</span> <span class=k>as</span> <span class=n>p</span><span class=p>:</span> <span class=o>...</span></code></pre></div></div></div><h3 id=running-on-a-pre-deployed-spark-cluster>Running on a pre-deployed Spark cluster</h3><p>Deploying your Beam pipeline on a cluster that already has a Spark deployment (Spark classes are available in container classpath) does not require any additional dependencies. 
For more details on the different deployment modes see: <a href=https://spark.apache.org/docs/latest/spark-standalone.html>Standalone</a>, <a href=https://spark.apache.org/docs/latest/running-on-yarn.html>YARN</a>, or <a href=https://spark.apache.org/docs/latest/running-on-mesos.html>Mesos</a>.</p><p class=language-py><ol><li>Start a Spark cluster which exposes the master on port 7077 by default.</li></ol></p><p class=language-py><ol start=2><li>Start JobService that will connect with th [...] Note however that <code>environment_type=LOOPBACK</code> is only intended for local testing. See <a href=/roadmap/portability/#sdk-harness-config>here</a> for details.</li></ol></p><p class=language-py>(Note that, depending on your cluster setup, you may need to change the <code>environment_type</code> option. -See <a href=/roadmap/portability/#sdk-harness-config>here</a> for details.)</p><h3 id=running-on-dataproc-cluster-yarn-backed>Running on Dataproc cluster (YARN backed)</h3><p>To run Beam jobs written in Python, Go, and other supported languages, you can use the <code>SparkRunner</code> and <code>PortableRunner</code> as described on the Beam’s <a href=/documentation/runners/spark/>Spark Runner</a> page (also see <a href=/roadmap/portability/>Portability Framework Roadmap</a>).</p>< [...] +See <a href=/roadmap/portability/#sdk-harness-config>here</a> for details.)</p><h3 id=running-on-dataproc-cluster-yarn-backed>Running on Dataproc cluster (YARN backed)</h3><p>To run Beam jobs written in Python, Go, and other supported languages, you can use the <code>SparkRunner</code> and <code>PortableRunner</code> as described on the Beam’s <a href=https://beam.apache.org/documentation/runners/spark/>Spark Runner</a> page (also see <a href=https://beam.apache.org/roadmap/portabi [...] 
 gcloud dataproc clusters create <b><i>CLUSTER_NAME</i></b> \
 --optional-components=DOCKER \
 --image-version=<b><i>DATAPROC_IMAGE_VERSION</i></b> \
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 44d729da083..9d2f96dc227 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.49.0/</loc><lastmod>2023-07-20T10:02:54+10:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2023-07-20T10:02:54+10:00</lastmod></url><url><loc>/blog/</loc><lastmod>2023-07-20T10:02:54+10:00</lastmod></url><url><loc>/categories/</loc><lastmod>2023-07-20T10:02:54+10:00</lastmod></url><url><loc>/catego [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.49.0/</loc><lastmod>2023-07-20T14:45:34+00:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2023-07-20T14:45:34+00:00</lastmod></url><url><loc>/blog/</loc><lastmod>2023-07-20T14:45:34+00:00</lastmod></url><url><loc>/categories/</loc><lastmod>2023-07-20T14:45:34+00:00</lastmod></url><url><loc>/catego [...]
\ No newline at end of file