This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push: new d90bb3bc550 Publishing website 2023/07/11 22:16:02 at commit 33518fb d90bb3bc550 is described below commit d90bb3bc5504632b0875bc75b703150a6efd9b19 Author: jenkins <bui...@apache.org> AuthorDate: Tue Jul 11 22:16:02 2023 +0000 Publishing website 2023/07/11 22:16:02 at commit 33518fb --- website/generated-content/contribute/index.xml | 5 +++- .../contribute/release-guide/index.html | 6 +++-- .../io/built-in/google-bigquery/index.html | 28 ++++++++++++++++++---- website/generated-content/sitemap.xml | 2 +- 4 files changed, 32 insertions(+), 9 deletions(-) diff --git a/website/generated-content/contribute/index.xml b/website/generated-content/contribute/index.xml index 0af6e7da522..47ea65bedc0 100644 --- a/website/generated-content/contribute/index.xml +++ b/website/generated-content/contribute/index.xml @@ -1201,6 +1201,7 @@ You don&rsquo;t need to wait for the action to complete to start running the <ol> <li>Clone the repo at the selected RC tag.</li> <li>Run gradle publish to push java artifacts into Maven staging repo.</li> +<li>Stage SDK docker images to <a href="https://hub.docker.com/search?q=apache%2Fbeam&amp;type=image">docker hub Apache organization</a>.</li> </ol> </li> </ul> @@ -1235,9 +1236,11 @@ Some additional validation should be done during the rc validation step.</li> <p><strong>The script will:</strong></p> <ol> <li>Clone the repo at the selected RC tag.</li> -<li>Stage source release into dist.apache.org dev <a href="https://dist.apache.org/repos/dist/dev/beam/">repo</a>.</li> +<li>Stage source release into dist.apache.org dev <a href="https://dist.apache.org/repos/dist/dev/beam/">repo</a>. 
+Skip this step if you already did it with the build_release_candidate GitHub Actions workflow.</li> <li>Stage, sign, and hash the Python source distribution and wheels into the dist.apache.org dev repo Python directory</li> <li>Stage SDK docker images to <a href="https://hub.docker.com/search?q=apache%2Fbeam&amp;type=image">docker hub Apache organization</a>. +Skip this step if you already did it with the build_release_candidate GitHub Actions workflow. Note: if you are not a member of the <a href="https://hub.docker.com/orgs/apache/teams/beam"><code>beam</code> DockerHub team</a> you will need help with this step. Please email <code>dev@</code> and ask a member of the <code>beam</code> DockerHub team for help.</li> <li>Create a PR to update beam-site; changes include: diff --git a/website/generated-content/contribute/release-guide/index.html b/website/generated-content/contribute/release-guide/index.html index 31108ece97c..28af8361a77 100644 --- a/website/generated-content/contribute/release-guide/index.html +++ b/website/generated-content/contribute/release-guide/index.html @@ -142,12 +142,14 @@ The final state of the repository should match this diagram:</p><p><img src=/ima adjust the version, and add the tag locally. If it looks good, run it again with <code>--push-tag</code>. If you already have a clone that includes the <code>${COMMIT_REF}</code> then you can omit <code>--clone</code>. This is perfectly safe since the script does not depend on the current working tree.</p><p>See the source of the script for more details, or to run commands manually in case of a problem.</p><h3 id=run-build_release_candidate-github-action-to-create-a-release-candidate>Run build_release_candidate GitHub Action to create a release candidate</h3><p>Note: This step is partially automated (in progress), so part of the rc creation is done by GitHub Actions and the rest is done by a script.
-You don’t need to wait for the action to complete to start running the script.</p><ul><li><p><strong>Action</strong> <a href=https://github.com/apache/beam/actions/workflows/build_release_candidate.yml>build_release_candidate</a> (click <code>run workflow</code>)</p></li><li><p><strong>The script will:</strong></p><ol><li>Clone the repo at the selected RC tag.</li><li>Run gradle publish to push java artifacts into Maven staging repo.</li></ol></li></ul><h4 id=tasks-you-need-to-do-m [...] +You don’t need to wait for the action to complete to start running the script.</p><ul><li><p><strong>Action</strong> <a href=https://github.com/apache/beam/actions/workflows/build_release_candidate.yml>build_release_candidate</a> (click <code>run workflow</code>)</p></li><li><p><strong>The script will:</strong></p><ol><li>Clone the repo at the selected RC tag.</li><li>Run gradle publish to push java artifacts into Maven staging repo.</li><li>Stage SDK docker images to <a href="http [...] They should contain all relevant parts for each module, including <code>pom.xml</code>, jar, test jar, javadoc, etc. Artifact names should follow <a href=https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.beam%22>the existing format</a> in which artifact name mirrors directory structure, e.g., <code>beam-sdks-java-io-kafka</code>. Carefully review any new artifacts. Some additional validation should be done during the rc validation step.</li></ol></li></ol><h3 id=run-build_release_candidatesh-to-create-a-release-candidate>Run build_release_candidate.sh to create a release candidate</h3><ul><li><p><strong>Script:</strong> <a href=https://github.com/apache/beam/blob/master/release/src/main/scripts/build_release_candidate.sh>build_release_candidate.sh</a></p></li><li><p><strong>Usage</strong></p><pre><code>./beam/release/src/main/scripts/build_release_ [...] 
-</code></pre></li><li><p><strong>The script will:</strong></p><ol><li>Clone the repo at the selected RC tag.</li><li>Stage source release into dist.apache.org dev <a href=https://dist.apache.org/repos/dist/dev/beam/>repo</a>.</li><li>Stage, sign, and hash the Python source distribution and wheels into the dist.apache.org dev repo Python directory</li><li>Stage SDK docker images to <a href="https://hub.docker.com/search?q=apache%2Fbeam&type=image">docker hub Apache organization</a>. +</code></pre></li><li><p><strong>The script will:</strong></p><ol><li>Clone the repo at the selected RC tag.</li><li>Stage source release into dist.apache.org dev <a href=https://dist.apache.org/repos/dist/dev/beam/>repo</a>. +Skip this step if you already did it with the build_release_candidate GitHub Actions workflow.</li><li>Stage, sign, and hash the Python source distribution and wheels into the dist.apache.org dev repo Python directory</li><li>Stage SDK docker images to <a href="https://hub.docker.com/search?q=apache%2Fbeam&type=image">docker hub Apache organization</a>. +Skip this step if you already did it with the build_release_candidate GitHub Actions workflow. Note: if you are not a member of the <a href=https://hub.docker.com/orgs/apache/teams/beam><code>beam</code> DockerHub team</a> you will need help with this step. Please email <code>dev@</code> and ask a member of the <code>beam</code> DockerHub team for help.</li><li>Create a PR to update beam-site; changes include:<ul><li>Copy Python doc into beam-site</li><li>Copy Java doc into beam-site</li><li><strong>NOTE</strong>: Do not merge this PR until after an RC has been approved (see “Finalize the Release”).</li></ul></li></ol></li></ul><h4 id=tasks-you-need-to-do-manually-1>Tasks you need to do manually</h4><ol><li [...] Please note that dependencies for the SDKs with different Python versions vary.
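The "stage, sign, and hash" step above can be sketched in Python for the hashing half. This is a minimal illustration only, not the actual logic of build_release_candidate.sh (which also signs artifacts with GPG and uploads them); the artifact filename used in the demo is hypothetical.

```python
import hashlib
import os
import tempfile


def sha512_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading in chunks so
    large release artifacts (source tarballs, wheels) fit in memory."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


if __name__ == "__main__":
    # Hypothetical stand-in for a staged artifact; the real script hashes
    # the source release and Python wheels before uploading to dist.apache.org.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".tar.gz") as f:
        f.write(b"example artifact contents")
        path = f.name
    print(sha512_of(path))
    os.remove(path)
```

The chunked read mirrors how command-line tools such as `sha512sum` process files; the resulting digest is what gets published alongside the artifact so validators can verify their downloads.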
diff --git a/website/generated-content/documentation/io/built-in/google-bigquery/index.html b/website/generated-content/documentation/io/built-in/google-bigquery/index.html index 896c92d6904..5d99d128df3 100644 --- a/website/generated-content/documentation/io/built-in/google-bigquery/index.html +++ b/website/generated-content/documentation/io/built-in/google-bigquery/index.html @@ -327,7 +327,11 @@ GitHub</a>.</p><div class="language-java snippet"><div class="notebook-skip code <span class=k>return</span> <span class=n>rows</span><span class=o>;</span> <span class=o>}</span> -<span class=o>}</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=c1># The SDK for Python does not support the BigQuery Storage API.</span></code></pre></div></div></div><p>The following code snippet reads wit [...] 
+<span class=o>}</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=n>max_temperatures</span> <span class=o>=</span> <span class=p>(</span> + <span class=n>pipeline</span> + <span class=o>|</span> <span class=s1>'ReadTableWithStorageAPI'</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>io</span><span class=o>.</span><span class=n>ReadFromBigQuery</span><span class=p>(</span> + <span class=n>table</span><span class=o>=</span><span class=n>table_spec</span><span class=p>,</span> <span class=n>method</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>io</span><span class=o>.</span><span class=n>ReadFromBigQuery</span><span class=o>.</span><span class=n>Method</span><span class=o>.</span><span class=n>DIRECT_READ</span><span class=p>)</span> + <span class=o>|</span> <span class=n>beam</span><span class=o>.</span><span class=n>Map</span><span class=p>(</span><span class=k>lambda</span> <span class=n>elem</span><span class=p>:</span> <span class=n>elem</span><span class=p>[</span><span class=s1>'max_temperature'</span><span class=p>]))</span></code></pre></div></div></div><p>The following code snippet reads with a query string.</p><div class="language-java snippet"><div class="notebook-skip code-snippet"><a class=cop [...] 
<span class=kn>import</span> <span class=nn>org.apache.beam.sdk.Pipeline</span><span class=o>;</span> <span class=kn>import</span> <span class=nn>org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO</span><span class=o>;</span> <span class=kn>import</span> <span class=nn>org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.Method</span><span class=o>;</span> @@ -623,7 +627,7 @@ be replaced.</p><div class="language-java snippet"><div class="notebook-skip cod You can either keep retrying, or return the failed records in a separate <code>PCollection</code> using the <code>WriteResult.getFailedInserts()</code> method.</p><h3 id=storage-write-api>Using the Storage Write API</h3><p>Starting with version 2.36.0 of the Beam SDK for Java, you can use the <a href=https://cloud.google.com/bigquery/docs/write-api>BigQuery Storage Write API</a> -from the BigQueryIO connector.</p><h4 id=exactly-once-semantics>Exactly-once semantics</h4><p>To write to BigQuery using the Storage Write API, set <code>withMethod</code> to +from the BigQueryIO connector.</p><p>Starting with version 2.47.0 of the Beam SDK for Python, the SDK also supports the BigQuery Storage Write API.</p><p class=language-py>The BigQuery Storage Write API in the Python SDK currently has some limitations on supported data types. As this method makes use of cross-language transforms, we are limited to the types supported at the cross-language boundary. For example, <code>apache_beam.utils.timestamp.Timestamp</code> is needed to write a <code>TIMESTAMP</code> BigQuery [...] <a href=https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.Write.Method.html#STORAGE_WRITE_API><code>Method.STORAGE_WRITE_API</code></a>.
Here’s an example transform that writes to BigQuery using the Storage Write API and exactly-once semantics:</p><p><div class="language-java snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-java data-lang=java><span class=n>WriteResult</span> <span class=n>writeResult</span> <span class=o>=</span> [...] <span class=n>BigQueryIO</span><span class=o>.</span><span class=na>writeTableRows</span><span class=o>()</span> @@ -631,7 +635,10 @@ Here’s an example transform that writes to BigQuery using the Storage Write AP <span class=o>.</span><span class=na>withWriteDisposition</span><span class=o>(</span><span class=n>WriteDisposition</span><span class=o>.</span><span class=na>WRITE_APPEND</span><span class=o>)</span> <span class=o>.</span><span class=na>withCreateDisposition</span><span class=o>(</span><span class=n>CreateDisposition</span><span class=o>.</span><span class=na>CREATE_NEVER</span><span class=o>)</span> <span class=o>.</span><span class=na>withMethod</span><span class=o>(</span><span class=n>Method</span><span class=o>.</span><span class=na>STORAGE_WRITE_API</span><span class=o>)</span> -<span class=o>);</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=c1># The SDK for Python does not support the BigQuery Storage API.</span></code></pre></div></div></div></p><p>If you want to change the behav [...] 
+<span class=o>);</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=n>quotes</span> <span class=o>|</span> <span class=s2>"WriteTableWithStorageAPI"</span> <span class=o>>></span> <span class=n>be [...] + <span class=n>table_spec</span><span class=p>,</span> + <span class=n>schema</span><span class=o>=</span><span class=n>table_schema</span><span class=p>,</span> + <span class=n>method</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>io</span><span class=o>.</span><span class=n>WriteToBigQuery</span><span class=o>.</span><span class=n>Method</span><span class=o>.</span><span class=n>STORAGE_WRITE_API</span><span class=p>)</span></code></pre></div></div></div></p><p>If you want to change the behavior of BigQueryIO so that all the BigQuery sinks for your pipeline use the Storage Write API by default, set the <a href=https://github.com/apache/beam/blob/2c18ce0ccd7705473aa9ecc443dcdbe223dd9449/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryOptions.java#L82-L86><code>UseStorageWriteApi</code> option</a>.</p><p>If your pipeline needs to create the table (in case it doesn’t exist and you specified the create disposition as <code>CREATE_IF_NEEDED</code>), you must provide a @@ -645,12 +652,23 @@ binary protocol.</p><p><div class="language-java snippet"><div class="notebook-s <span class=k>new</span> <span class=n>TableFieldSchema</span><span class=o>()</span> <span class=o>.</span><span class=na>setName</span><span class=o>(</span><span class=s>"user_name"</span><span class=o>)</span> <span class=o>.</span><span class=na>setType</span><span class=o>(</span><span class=s>"STRING"</span><span class=o>)</span> - 
<span class=o>.</span><span class=na>setMode</span><span class=o>(</span><span class=s>"REQUIRED"</span><span class=o>)));</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=c1># The SDK [...] + <span class=o>.</span><span class=na>setMode</span><span class=o>(</span><span class=s>"REQUIRED"</span><span class=o>)));</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=n>table_sche [...] + <span class=s1>'fields'</span><span class=p>:</span> <span class=p>[{</span> + <span class=s1>'name'</span><span class=p>:</span> <span class=s1>'source'</span><span class=p>,</span> <span class=s1>'type'</span><span class=p>:</span> <span class=s1>'STRING'</span><span class=p>,</span> <span class=s1>'mode'</span><span class=p>:</span> <span class=s1>'NULLABLE'</span> + <span class=p>},</span> <span class=p>{</span> + <span class=s1>'name'</span><span class=p>:</span> <span class=s1>'quote'</span><span class=p>,</span> <span class=s1>'type'</span><span class=p>:</span> <span class=s1>'STRING'</span><span class=p>,</span> <span class=s1>'mode'</span><span class=p>:</span> <span class=s1>'REQUIRED'</span> + <span class=p>}]</span> +<span class=p>}</span></code></pre></div></div></div></p><p>For streaming pipelines, you need to set two additional parameters: the number of streams and the triggering frequency.</p><p><div class="language-java snippet"><div class="notebook-skip code-snippet"><a class=copy type=button 
data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-java data-lang=java><span class=n>BigQueryIO</span><span class=o>.</span><span class=na>writeTableRows</span><span class=o>()</span> <span class=c1>// ... </span><span class=c1></span> <span class=o>.</span><span class=na>withTriggeringFrequency</span><span class=o>(</span><span class=n>Duration</span><span class=o>.</span><span class=na>standardSeconds</span><span class=o>(</span><span class=n>5</span><span class=o>))</span> <span class=o>.</span><span class=na>withNumStorageWriteApiStreams</span><span class=o>(</span><span class=n>3</span><span class=o>)</span> -<span class=o>);</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=c1># The SDK for Python does not support the BigQuery Storage API.</span></code></pre></div></div></div></p><p>The number of streams defines t [...] 
+<span class=o>);</span></code></pre></div></div></div><div class="language-py snippet"><div class="notebook-skip code-snippet"><a class=copy type=button data-bs-toggle=tooltip data-bs-placement=bottom title="Copy to clipboard"><img src=/images/copy-icon.svg></a><div class=highlight><pre class=chroma><code class=language-py data-lang=py><span class=c1># The Python SDK doesn't currently support setting the number of write streams</span> +<span class=n>quotes</span> <span class=o>|</span> <span class=s2>"StorageWriteAPIWithFrequency"</span> <span class=o>>></span> <span class=n>beam</span><span class=o>.</span><span class=n>io</span><span class=o>.</span><span class=n>WriteToBigQuery</span><span class=p>(</span> + <span class=n>table_spec</span><span class=p>,</span> + <span class=n>schema</span><span class=o>=</span><span class=n>table_schema</span><span class=p>,</span> + <span class=n>method</span><span class=o>=</span><span class=n>beam</span><span class=o>.</span><span class=n>io</span><span class=o>.</span><span class=n>WriteToBigQuery</span><span class=o>.</span><span class=n>Method</span><span class=o>.</span><span class=n>STORAGE_WRITE_API</span><span class=p>,</span> + <span class=n>triggering_frequency</span><span class=o>=</span><span class=mi>5</span><span class=p>)</span></code></pre></div></div></div></p><p>The number of streams defines the parallelism of the BigQueryIO Write transform and roughly corresponds to the number of Storage Write API streams that the pipeline uses. 
You can set it explicitly on the transform via <a href=https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.Write.html#withNumStorageWriteApiStreams-int-><code>withNumStorageWriteApiStreams</code></a> diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml index 75596dc1504..43a9b580b16 100644 --- a/website/generated-content/sitemap.xml +++ b/website/generated-content/sitemap.xml @@ -1 +1 @@ -<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2023-07-11T11:31:17-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2023-07-11T11:31:17-04:00</lastmod></url><url><loc>/categories/</loc><lastmod>2023-07-11T11:31:17-04:00</lastmod></url><url><loc>/blog/managing-beam-dependencies-in-java/</loc><lastmod>2023-07-11T11:31:17-04:00</lastmod> [...] \ No newline at end of file +<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2023-07-11T15:34:39-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2023-07-11T15:34:39-04:00</lastmod></url><url><loc>/categories/</loc><lastmod>2023-07-11T15:34:39-04:00</lastmod></url><url><loc>/blog/managing-beam-dependencies-in-java/</loc><lastmod>2023-07-11T15:34:39-04:00</lastmod> [...] \ No newline at end of file
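The schema passed to <code>beam.io.WriteToBigQuery</code> in the snippets diffed above is a plain Python dictionary. A minimal sketch of building and sanity-checking such a schema follows; the field names mirror the quotes example in the doc, while the <code>validate_schema</code> helper is purely illustrative and not part of the Beam API.

```python
# Table schema in the dictionary form accepted by beam.io.WriteToBigQuery:
# a top-level 'fields' list, each entry carrying 'name', 'type', and 'mode'.
table_schema = {
    'fields': [
        {'name': 'source', 'type': 'STRING', 'mode': 'NULLABLE'},
        {'name': 'quote', 'type': 'STRING', 'mode': 'REQUIRED'},
    ]
}


def validate_schema(schema: dict) -> None:
    """Illustrative sanity check (not a Beam API): each field entry must
    name a column, give it a BigQuery type, and set a valid mode."""
    for field in schema['fields']:
        assert {'name', 'type', 'mode'} <= field.keys()
        assert field['mode'] in {'NULLABLE', 'REQUIRED', 'REPEATED'}


validate_schema(table_schema)
```

Such a dictionary would then be passed as the <code>schema=</code> argument of <code>WriteToBigQuery</code> alongside <code>method=beam.io.WriteToBigQuery.Method.STORAGE_WRITE_API</code>, exactly as the diffed snippets show.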