This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 6ca65070cf2 Publishing website 2024/02/22 23:37:08 at commit 11f9bce
6ca65070cf2 is described below

commit 6ca65070cf243fdd5cf0591ac6609a59a893c7d9
Author: runner <runner@main-runner-zt478-cqkl7>
AuthorDate: Thu Feb 22 23:37:08 2024 +0000

    Publishing website 2024/02/22 23:37:08 at commit 11f9bce
---
 .../sdks/yaml-inline-python/index.html             | 128 ++++++++++++++++++++-
 .../documentation/sdks/yaml/index.html             |   8 +-
 website/generated-content/sitemap.xml              |   2 +-
 3 files changed, 132 insertions(+), 6 deletions(-)

diff --git 
a/website/generated-content/documentation/sdks/yaml-inline-python/index.html 
b/website/generated-content/documentation/sdks/yaml-inline-python/index.html
index 86b5625fce2..5afa8335c32 100644
--- a/website/generated-content/documentation/sdks/yaml-inline-python/index.html
+++ b/website/generated-content/documentation/sdks/yaml-inline-python/index.html
@@ -36,8 +36,132 @@
 <img class=banner-img-mobile 
src=/images/banners/tour-of-beam/tour-of-beam-mobile.png alt="Start Tour of 
Beam"></a></div><div class=swiper-slide><a 
href=https://beam.apache.org/documentation/ml/overview/><img 
class=banner-img-desktop 
src=/images/banners/machine-learning/machine-learning-desktop.jpg alt="Machine 
Learning">
 <img class=banner-img-mobile 
src=/images/banners/machine-learning/machine-learning-mobile.jpg alt="Machine 
Learning"></a></div></div><div class=swiper-pagination></div></div><script 
src=/js/swiper-bundle.min.min.e0e8f81b0b15728d35ff73c07f42ddbb17a108d6f23df4953cb3e60df7ade675.js></script>
 <script 
src=/js/sliders/top-banners.min.91104c476b3d8123ebee5ed9a8168556ec546abb698549551b38a0cee187ee1c.js></script>
-<script>function showSearch(){addPlaceholder();var 
e,t=document.querySelector(".searchBar");t.classList.remove("disappear"),e=document.querySelector("#iconsBar"),e.classList.add("disappear")}function
 addPlaceholder(){$("input:text").attr("placeholder","What are you looking 
for?")}function endSearch(){var 
e,t=document.querySelector(".searchBar");t.classList.add("disappear"),e=document.querySelector("#iconsBar"),e.classList.remove("disappear")}function
 blockScroll(){$("body").toggleClass(" [...]
-<a 
href=https://beam.apache.org/documentation/sdks/yaml-inline-python/>https://beam.apache.org/documentation/sdks/yaml-inline-python/</a></p></div></div><footer
 class=footer><div class=footer__contained><div class=footer__cols><div 
class="footer__cols__col footer__cols__col__logos"><div 
class=footer__cols__col__logo><img src=/images/beam_logo_circle.svg 
class=footer__logo alt="Beam logo"></div><div 
class=footer__cols__col__logo><img src=/images/apache_logo_circle.svg 
class=footer__logo a [...]
+<script>function showSearch(){addPlaceholder();var 
e,t=document.querySelector(".searchBar");t.classList.remove("disappear"),e=document.querySelector("#iconsBar"),e.classList.add("disappear")}function
 addPlaceholder(){$("input:text").attr("placeholder","What are you looking 
for?")}function endSearch(){var 
e,t=document.querySelector(".searchBar");t.classList.add("disappear"),e=document.querySelector("#iconsBar"),e.classList.remove("disappear")}function
 blockScroll(){$("body").toggleClass(" [...]
+<code>PyTransform</code> type, simply referencing them by fully qualified name.
+For example,</p><pre tabindex=0><code>- type: PyTransform
+  config:
+    constructor: apache_beam.pkg.module.SomeTransform
+    args: [1, &#39;foo&#39;]
+    kwargs:
+       baz: 3
+</code></pre><p>will invoke the transform 
<code>apache_beam.pkg.module.SomeTransform(1, 'foo', baz=3)</code>.
+This fully qualified name can be any PTransform class or other callable that
+returns a PTransform. Note, however, that PTransforms that do not accept or
+return schema&rsquo;d data may not be as usable from YAML.
+Restoring the schema-ness after a non-schema returning transform can be done
+by using the <code>callable</code> option on <code>MapToFields</code> which 
takes the entire element
+as an input, e.g.</p><pre tabindex=0><code>- type: PyTransform
+  config:
+    constructor: apache_beam.pkg.module.SomeTransform
+    args: [1, &#39;foo&#39;]
+    kwargs:
+       baz: 3
+- type: MapToFields
+  config:
+    language: python
+    fields:
+      col1:
+        callable: &#39;lambda element: element.col1&#39;
+        output_type: string
+      col2:
+        callable: &#39;lambda element: element.col2&#39;
+        output_type: integer
+</code></pre><p>This can be used to call arbitrary transforms in the Beam SDK, 
e.g.</p><pre tabindex=0><code>pipeline:
+  transforms:
+    - type: PyTransform
+      name: ReadFromTsv
+      input: {}
+      config:
+        constructor: apache_beam.io.ReadFromCsv
+        kwargs:
+           path: &#39;/path/to/*.tsv&#39;
+           sep: &#39;\t&#39;
+           skip_blank_lines: True
+           true_values: [&#39;yes&#39;]
+           false_values: [&#39;no&#39;]
+           comment: &#39;#&#39;
+           on_bad_lines: &#39;skip&#39;
+           binary: False
+           splittable: False
+</code></pre><h2 id=defining-a-transform-inline-using-__constructor__>Defining 
a transform inline using <code>__constructor__</code></h2><p>If the desired 
transform does not exist, one can define it inline as well.
+This is done with the special <code>__constructor__</code> keyword,
+similar to how cross-language transforms are done.</p><p>With the 
<code>__constructor__</code> keyword, one defines a Python callable that, on
+invocation, <em>returns</em> the desired transform. The first argument (or 
<code>source</code>
+keyword argument, if there are no positional arguments)
+is interpreted as the Python code. For example</p><pre tabindex=0><code>- 
type: PyTransform
+  config:
+    constructor: __constructor__
+    kwargs:
+      source: |
+        import apache_beam as beam
+
+        def create_my_transform(inc):
+          return beam.Map(lambda x: beam.Row(a=x.col2 + inc))
+
+      inc: 10
+</code></pre><p>will apply <code>beam.Map(lambda x: beam.Row(a=x.col2 + 
10))</code> to the incoming
+PCollection.</p><p>As a class object can be invoked as its own constructor, 
this allows one to
+define a <code>beam.PTransform</code> inline, e.g.</p><pre tabindex=0><code>- 
type: PyTransform
+  config:
+    constructor: __constructor__
+    kwargs:
+      source: |
+        class MyPTransform(beam.PTransform):
+          def __init__(self, inc):
+            self._inc = inc
+          def expand(self, pcoll):
+            return pcoll | beam.Map(lambda x: beam.Row(a=x.col2 + self._inc))
+
+      inc: 10
+</code></pre><p>which applies <code>MyPTransform(inc=10)</code> to the incoming PCollection.</p><h2 
id=defining-a-transform-inline-using-__callable__>Defining a transform inline 
using <code>__callable__</code></h2><p>The <code>__callable__</code> keyword 
works similarly, but instead of defining a
+callable that returns an applicable <code>PTransform</code> one simply defines 
the
+expansion to be performed as a callable. This is analogous to 
Beam Python&rsquo;s
+<code>ptransform.ptransform_fn</code> decorator.</p><p>In this case one can 
simply write</p><pre tabindex=0><code>- type: PyTransform
+  config:
+    constructor: __callable__
+    kwargs:
+      source: |
+        def my_ptransform(pcoll, inc):
+          return pcoll | beam.Map(lambda x: beam.Row(a=x.col2 + inc))
+
+      inc: 10
+</code></pre><h1 id=external-transforms>External transforms</h1><p>One can 
also invoke PTransforms defined elsewhere via a <code>python</code> provider,
+for example</p><pre tabindex=0><code>pipeline:
+  transforms:
+    - ...
+    - type: MyTransform
+      config:
+        kwarg: whatever
+
+providers:
+  - ...
+  - type: python
+    input: ...
+    config:
+      packages:
+        - &#39;some_pypi_package&gt;=version&#39;
+    transforms:
+      MyTransform: &#39;pkg.module.MyTransform&#39;
+</code></pre><p>These can be defined inline as well, with or without 
dependencies, e.g.</p><pre tabindex=0><code>pipeline:
+  transforms:
+    - ...
+    - type: ToCase
+      input: ...
+      config:
+        upper: True
+
+providers:
+  - type: python
+    config: {}
+    transforms:
+      &#39;ToCase&#39;: |
+        @beam.ptransform_fn
+        def ToCase(pcoll, upper):
+          if upper:
+            return pcoll | beam.Map(lambda x: str(x).upper())
+          else:
+            return pcoll | beam.Map(lambda x: str(x).lower())
+</code></pre></div></div><footer class=footer><div 
class=footer__contained><div class=footer__cols><div class="footer__cols__col 
footer__cols__col__logos"><div class=footer__cols__col__logo><img 
src=/images/beam_logo_circle.svg class=footer__logo alt="Beam logo"></div><div 
class=footer__cols__col__logo><img src=/images/apache_logo_circle.svg 
class=footer__logo alt="Apache logo"></div></div><div class=footer-wrapper><div 
class=wrapper-grid><div class=footer__cols__col><div class=footer__c [...]
 <a href=https://www.apache.org>The Apache Software Foundation</a>
 | <a href=/privacy_policy>Privacy Policy</a>
 | <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam 
logo, and the Apache feather logo are either registered trademarks or 
trademarks of The Apache Software Foundation. All other products or name brands 
are trademarks of their respective holders, including The Apache Software 
Foundation.</div></div><div class="footer__cols__col 
footer__cols__col__logos"><div class=footer__cols__col--group><div 
class=footer__cols__col__logo><a href=https://github.com/apache/beam><im [...]
\ No newline at end of file
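
Note on the yaml-inline-python hunk above: the added text says a PyTransform `constructor` is a fully qualified name that is invoked with the configured `args` and `kwargs`. A minimal stand-alone sketch of that mapping follows. It is not Beam's implementation; `resolve_fully_qualified` and `invoke_from_config` are hypothetical helpers, and `fractions.Fraction` stands in for a transform class so the snippet runs without Beam installed.

```python
import importlib


def resolve_fully_qualified(name):
    """Resolve 'pkg.module.Attr' to the object it names (hypothetical helper)."""
    module_name, _, attr = name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted name, got {name!r}")
    return getattr(importlib.import_module(module_name), attr)


def invoke_from_config(config):
    """Call config['constructor'] with the optional 'args' and 'kwargs'."""
    constructor = resolve_fully_qualified(config["constructor"])
    return constructor(*config.get("args", []), **config.get("kwargs", {}))


# Stand-in for a PyTransform config block. Fraction plays the role of the
# transform class purely for illustration.
config = {
    "constructor": "fractions.Fraction",
    "args": [1],
    "kwargs": {"denominator": 3},
}
result = invoke_from_config(config)
```

Here `result` is what `fractions.Fraction(1, denominator=3)` returns, mirroring how the documented example maps to `SomeTransform(1, 'foo', baz=3)`.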
diff --git a/website/generated-content/documentation/sdks/yaml/index.html 
b/website/generated-content/documentation/sdks/yaml/index.html
index a05b94675ac..ff48cf71c28 100644
--- a/website/generated-content/documentation/sdks/yaml/index.html
+++ b/website/generated-content/documentation/sdks/yaml/index.html
@@ -69,8 +69,10 @@ runner such as Flink or Dataflow.</p><p>Once the 
prerequisites are installed, yo
 in a yaml file as</p><pre tabindex=0><code>python -m apache_beam.yaml.main 
--yaml_pipeline_file=/path/to/pipeline.yaml [other pipeline options such as the 
runner]
 </code></pre><p>You can do a dry-run of your pipeline using the render runner 
to see what the
 execution graph is, e.g.</p><pre tabindex=0><code>python -m 
apache_beam.yaml.main --yaml_pipeline_file=/path/to/pipeline.yaml 
--runner=apache_beam.runners.render.RenderRunner --render_output=out.png 
[--render_port=0]
-</code></pre><p>(This requires <a 
href=https://graphviz.org/download/>Graphviz</a> to be installed to render the 
pipeline.)</p><p>We intend to support running a pipeline on Dataflow by 
directly passing the
-yaml specification to a template, no local installation of the Beam SDKs 
required.</p><h2 id=example-pipelines>Example pipelines</h2><p>Here is a simple 
pipeline that reads some data from csv files and
+</code></pre><p>(This requires <a 
href=https://graphviz.org/download/>Graphviz</a> to be installed to render the 
pipeline.)</p><p>You can also submit a YAML pipeline directly by using the 
Dataflow CLI command
+<a 
href=https://cloud.google.com/sdk/gcloud/reference/beta/dataflow/yaml/run><code>gcloud
 beta dataflow yaml run</code></a>.
+When you use the <code>gcloud</code> CLI, you don&rsquo;t need to install the 
Beam SDKs locally.</p><pre tabindex=0><code>gcloud beta dataflow yaml run 
job_name --yaml-pipeline-file=/path/to/pipeline.yaml --region=europe-west1
+</code></pre><h2 id=example-pipelines>Example pipelines</h2><p>Here is a 
simple pipeline that reads some data from csv files and
 writes it out in json format.</p><pre tabindex=0><code>pipeline:
   transforms:
     - type: ReadFromCsv
@@ -463,7 +465,7 @@ providers:
         - /path/to/local/package.zip
     transforms:
        MyCustomTransform: &#34;pkg.subpkg.PTransformClassOrCallable&#34;
-</code></pre><h2 id=other-resources>Other Resources</h2><ul><li><a 
href=https://gist.github.com/robertwb/2cb26973f1b1203e8f5f8f88c5764da0>Example 
pipelines</a></li><li><a 
href=https://github.com/Polber/beam/tree/jkinard/bug-bash/sdks/python/apache_beam/yaml/examples>More
 examples</a></li><li><a 
href=https://gist.github.com/robertwb/64e2f51ff88320eeb6ffd96634202df7>Transform
 glossary</a></li></ul><p>Additional documentation in this 
directory</p><ul><li><a href=yaml_mapping.md>Mapping</a>< [...]
+</code></pre><h2 id=other-resources>Other Resources</h2><ul><li><a 
href=https://gist.github.com/robertwb/2cb26973f1b1203e8f5f8f88c5764da0>Example 
pipelines</a></li><li><a 
href=https://github.com/Polber/beam/tree/jkinard/bug-bash/sdks/python/apache_beam/yaml/examples>More
 examples</a></li><li><a 
href=https://gist.github.com/robertwb/64e2f51ff88320eeb6ffd96634202df7>Transform
 glossary</a></li></ul></div></div><footer class=footer><div 
class=footer__contained><div class=footer__cols><div cl [...]
 <a href=https://www.apache.org>The Apache Software Foundation</a>
 | <a href=/privacy_policy>Privacy Policy</a>
 | <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam 
logo, and the Apache feather logo are either registered trademarks or 
trademarks of The Apache Software Foundation. All other products or name brands 
are trademarks of their respective holders, including The Apache Software 
Foundation.</div></div><div class="footer__cols__col 
footer__cols__col__logos"><div class=footer__cols__col--group><div 
class=footer__cols__col__logo><a href=https://github.com/apache/beam><im [...]
\ No newline at end of file
diff --git a/website/generated-content/sitemap.xml 
b/website/generated-content/sitemap.xml
index af889f77d2b..622e7a7e6ca 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.54.0/</loc><lastmod>2024-02-22T16:51:24+01:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2024-02-22T16:51:24+01:00</lastmod></url><url><loc>/blog/</loc><lastmod>2024-02-22T16:51:24+01:00</lastmod></url><url><loc>/categories/</loc><lastmod>2024-02-22T16:51:24+01:00</lastmod></url><url><loc>/catego
 [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.54.0/</loc><lastmod>2024-02-22T16:13:55-05:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2024-02-22T16:13:55-05:00</lastmod></url><url><loc>/blog/</loc><lastmod>2024-02-22T16:13:55-05:00</lastmod></url><url><loc>/categories/</loc><lastmod>2024-02-22T16:13:55-05:00</lastmod></url><url><loc>/catego
 [...]
\ No newline at end of file
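
Note on the `__constructor__` and `__callable__` sections added to yaml-inline-python/index.html in this commit: the difference between the two keywords can be pictured without Beam. In the sketch below, plain lists of dicts stand in for a PCollection, and `load_last_callable` is a hypothetical helper that assumes the callable of interest is the last function or class defined in the source. None of this is taken from Beam's code.

```python
def load_last_callable(source):
    """Exec the source and return the last callable it defines.

    Assumption for illustration only: the object to use is the last
    top-level function or class in the snippet.
    """
    namespace = {}
    exec(source, namespace)
    candidates = [
        value
        for key, value in namespace.items()
        if callable(value) and not key.startswith("__")
    ]
    if not candidates:
        raise ValueError("source defines no callable")
    return candidates[-1]


CONSTRUCTOR_SOURCE = """
def create_my_transform(inc):
    return lambda rows: [{"a": row["col2"] + inc} for row in rows]
"""

CALLABLE_SOURCE = """
def my_ptransform(rows, inc):
    return [{"a": row["col2"] + inc} for row in rows]
"""

rows = [{"col2": 1}, {"col2": 5}]

# __constructor__ style: calling the loaded object with the kwargs
# returns the transform, which is then applied to the input.
transform = load_last_callable(CONSTRUCTOR_SOURCE)(inc=10)
via_constructor = transform(rows)

# __callable__ style: the loaded function is the expansion itself and
# receives the input first, followed by the kwargs.
via_callable = load_last_callable(CALLABLE_SOURCE)(rows, inc=10)
```

Both paths produce the same rows. The `__callable__` form saves one level of nesting, which is the convenience the added documentation compares to `ptransform_fn`.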
