This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new b5db0e5368d Publishing website 2025/07/16 17:46:04 at commit 84d423f
b5db0e5368d is described below

commit b5db0e5368df00e42f699ee7118346f3f038e45b
Author: runner <runner@main-runner-frrkx-7cnqr>
AuthorDate: Wed Jul 16 17:46:05 2025 +0000

    Publishing website 2025/07/16 17:46:04 at commit 84d423f
---
 .../extensions/create-external-table/index.html    | 63 +++++++++++++++++++++-
 website/generated-content/sitemap.xml              |  2 +-
 2 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html b/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
index c562b194beb..1c5a5122d8a 100644
--- a/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
+++ b/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
@@ -35,7 +35,7 @@
 <img class=banner-img-mobile 
src=/images/banners/tour-of-beam/tour-of-beam-mobile.png alt="Start Tour of 
Beam"></a></div><div class=swiper-slide><a 
href=https://beam.apache.org/documentation/ml/overview/><img 
class=banner-img-desktop 
src=/images/banners/machine-learning/machine-learning-desktop.jpg alt="Machine 
Learning">
 <img class=banner-img-mobile 
src=/images/banners/machine-learning/machine-learning-mobile.jpg alt="Machine 
Learning"></a></div></div><div class=swiper-pagination></div><div 
class=swiper-button-prev></div><div 
class=swiper-button-next></div></div><script 
src=/js/swiper-bundle.min.min.e0e8f81b0b15728d35ff73c07f42ddbb17a108d6f23df4953cb3e60df7ade675.js></script>
 <script 
src=/js/sliders/top-banners.min.afa7d0a19acf7a3b28ca369490b3d401a619562a2a4c9612577be2f66a4b9855.js></script>
-<script>function showSearch(){addPlaceholder();var 
e,t=document.querySelector(".searchBar");t.classList.remove("disappear"),e=document.querySelector("#iconsBar"),e.classList.add("disappear")}function
 addPlaceholder(){$("input:text").attr("placeholder","What are you looking 
for?")}function endSearch(){var 
e,t=document.querySelector(".searchBar");t.classList.add("disappear"),e=document.querySelector("#iconsBar"),e.classList.remove("disappear")}function
 blockScroll(){$("body").toggleClass(" [...]
+<script>function showSearch(){addPlaceholder();var 
e,t=document.querySelector(".searchBar");t.classList.remove("disappear"),e=document.querySelector("#iconsBar"),e.classList.add("disappear")}function
 addPlaceholder(){$("input:text").attr("placeholder","What are you looking 
for?")}function endSearch(){var 
e,t=document.querySelector(".searchBar");t.classList.add("disappear"),e=document.querySelector("#iconsBar"),e.classList.remove("disappear")}function
 blockScroll(){$("body").toggleClass(" [...]
 <a href=/documentation/io/built-in/>external storage system</a>.
 For some storage systems, <code>CREATE EXTERNAL TABLE</code> does not create a 
physical table until
 a write occurs. After the physical table exists, you can access the table with
@@ -256,7 +256,66 @@ See the following table:</li></ul></li></ul><div 
class=table-container-wrapper><
 types specified in the schema using 
org.apache.commons.csv.</li></ul></li></ul><h3 id=schema-5>Schema</h3><p>Only 
simple types are supported.</p><h3 id=example-6>Example</h3><pre 
tabindex=0><code>CREATE EXTERNAL TABLE orders (id INTEGER, price INTEGER)
 TYPE text
 LOCATION &#39;/home/admin/orders&#39;
-</code></pre><h2 id=generic-payload-handling>Generic Payload 
Handling</h2><p>Certain data sources and sinks support generic payload 
handling. This handling
+</code></pre><h2 id=datagen>DataGen</h2><p>The <strong>DataGen</strong> 
connector allows for creating tables based on in-memory data generation. This 
is useful for developing and testing queries locally without requiring access 
to external systems. The DataGen connector is built-in; no additional 
dependencies are required. It is available in Beam 2.67.0 and later.</p><p>Tables can be 
either <strong>bounded</strong> (generating a fixed number of rows) or 
<strong>unbounded</strong> (generating a str [...]
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=k>TYPE</span><span class=w> </span><span 
class=n>datagen</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=p>[</span><span class=n>TBLPROPERTIES</span><span 
class=w> </span><span class=n>tblProperties</span><span class=p>]</span><span 
class=w>
+</span></span></span></code></pre></div><h3 
id=table-properties-tblproperties>Table Properties 
(<code>TBLPROPERTIES</code>)</h3><p>The <code>TBLPROPERTIES</code> JSON object 
is used to configure the generator&rsquo;s behavior.</p><h4 
id=general-options>General Options</h4><table><thead><tr><th 
style=text-align:left>Key</th><th style=text-align:left>Required</th><th 
style=text-align:left>Description</th></tr></thead><tbody><tr><td 
style=text-align:left><code>number-of-rows</code></td><td  [...]
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>id</span><span class=w> </span><span 
class=nb>BIGINT</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>product_name</span><span class=w> </span><span 
class=nb>VARCHAR</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=k>TYPE</span><span class=w> </span><span 
class=n>datagen</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span 
class=s1>&#39;{
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;number-of-rows&#34;: &#34;1000&#34;
+</span></span></span><span class=line><span class=cl><span 
class=s1>}&#39;</span><span class=w>
+</span></span></span></code></pre></div><h4 
id=unbounded-streaming-table>Unbounded Streaming Table</h4><p>This example 
creates a streaming table that generates 10 rows per second.</p><div 
class=highlight><pre tabindex=0 class=chroma><code class=language-sql 
data-lang=sql><span class=line><span class=cl><span class=k>CREATE</span><span 
class=w> </span><span class=k>EXTERNAL</span><span class=w> </span><span 
class=k>TABLE</span><span class=w> </span><span 
class=n>user_impressions</span><sp [...]
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>user_id</span><span class=w> </span><span 
class=nb>VARCHAR</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>impression_time</span><span class=w> </span><span 
class=k>TIMESTAMP</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=k>TYPE</span><span class=w> </span><span 
class=n>datagen</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span 
class=s1>&#39;{
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;rows-per-second&#34;: &#34;10&#34;
+</span></span></span><span class=line><span class=cl><span 
class=s1>}&#39;</span><span class=w>
+</span></span></span></code></pre></div><hr><h4 
id=bounded-table-with-custom-field-generation>Bounded Table with Custom Field 
Generation</h4><p>This is a comprehensive example demonstrating various 
field-level customizations. The table is bounded because a sequence generator 
is used.</p><div class=highlight><pre tabindex=0 class=chroma><code 
class=language-sql data-lang=sql><span class=line><span class=cl><span 
class=k>CREATE</span><span class=w> </span><span class=k>EXTERNAL</span><span 
[...]
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>event_id</span><span class=w> </span><span 
class=nb>BIGINT</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>user_id</span><span class=w> </span><span 
class=nb>VARCHAR</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>click_timestamp</span><span class=w> </span><span 
class=k>TIMESTAMP</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>score</span><span class=w> </span><span 
class=n>DOUBLE</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=k>TYPE</span><span class=w> </span><span 
class=s1>&#39;datagen&#39;</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span 
class=s1>&#39;{
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;number-of-rows&#34;: &#34;1000000&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.event_id.kind&#34;: &#34;sequence&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.event_id.start&#34;: &#34;1&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.event_id.end&#34;: &#34;1000000&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.user_id.kind&#34;: &#34;random&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.user_id.length&#34;: &#34;12&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.click_timestamp.kind&#34;: &#34;random&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.click_timestamp.max-past&#34;: &#34;60000&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.score.kind&#34;: &#34;random&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.score.min&#34;: &#34;0.0&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.score.max&#34;: &#34;1.0&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.score.null-rate&#34;: &#34;0.1&#34;
+</span></span></span><span class=line><span class=cl><span 
class=s1>}&#39;</span><span class=w>
+</span></span></span></code></pre></div><h4 
id=unbounded-streaming-table-with-event-time>Unbounded Streaming Table with 
Event Time</h4><p>This example creates a streaming table that generates 10 rows 
per second. It uses the <code>click_timestamp</code> column to drive the 
event-time watermark, allowing for up to 5 seconds of out-of-order data. The 
<code>ingestion_timestamp</code> column is populated separately with the 
processing time.</p><div class=highlight><pre tabindex=0 class=chroma [...]
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>event_id</span><span class=w> </span><span 
class=nb>BIGINT</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>user_id</span><span class=w> </span><span 
class=nb>VARCHAR</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>click_timestamp</span><span class=w> </span><span 
class=k>TIMESTAMP</span><span class=p>,</span><span class=w>
+</span></span></span><span class=line><span class=cl><span class=w>    
</span><span class=n>ingestion_timestamp</span><span class=w> </span><span 
class=k>TIMESTAMP</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=p>)</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=k>TYPE</span><span class=w> </span><span 
class=s1>&#39;datagen&#39;</span><span class=w>
+</span></span></span><span class=line><span class=cl><span 
class=w></span><span class=n>TBLPROPERTIES</span><span class=w> </span><span 
class=s1>&#39;{
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;rows-per-second&#34;: &#34;10&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;timestamp.behavior&#34;: &#34;event-time&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;event-time.timestamp-column&#34;: &#34;click_timestamp&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;event-time.max-out-of-orderness&#34;: &#34;5000&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.event_id.kind&#34;: &#34;sequence&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.event_id.start&#34;: &#34;1&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.event_id.end&#34;: &#34;1000000&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.user_id.kind&#34;: &#34;random&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.user_id.length&#34;: &#34;12&#34;,
+</span></span></span><span class=line><span class=cl><span class=s1>  
&#34;fields.ingestion_timestamp.kind&#34;: &#34;timestamp&#34;
+</span></span></span><span class=line><span class=cl><span 
class=s1>}&#39;</span><span class=w>
+</span></span></span></code></pre></div><h2 
id=generic-payload-handling>Generic Payload Handling</h2><p>Certain data 
sources and sinks support generic payload handling. This handling
 parses a byte array payload field into a table schema. The following schemas 
are
 supported by this handling. All require at least setting <code>"format": 
"&lt;type>"</code>,
 and may require other properties.</p><ul><li><code>avro</code>: Avro<ul><li>An 
Avro schema is automatically generated from the specified field
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 8441193e28c..e3ef8e8b1c2 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2025-07-16T07:22:40-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2025-07-16T07:22:40-04:00</lastmod></url><url><loc>/categories/</loc><lastmod>2025-07-16T07:22:40-04:00</lastmod></url><url><loc>/blog/beam-summit-2025-hackathon-pcollectors-blog/</loc><lastmod>2025-07-16T07:22:40-04:00<
 [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/categories/blog/</loc><lastmod>2025-07-16T12:42:51-04:00</lastmod></url><url><loc>/blog/</loc><lastmod>2025-07-16T12:42:51-04:00</lastmod></url><url><loc>/categories/</loc><lastmod>2025-07-16T12:42:51-04:00</lastmod></url><url><loc>/blog/beam-summit-2025-hackathon-pcollectors-blog/</loc><lastmod>2025-07-16T12:42:51-04:00<
 [...]
\ No newline at end of file
