This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 8802c4d9a6f Publishing website 2024/03/09 05:37:34 at commit 00526dd
8802c4d9a6f is described below
commit 8802c4d9a6f444b7f7e7b5f54ffbaf79d4a2dffb
Author: runner <runner@main-runner-zt478-27v85>
AuthorDate: Sat Mar 9 05:37:34 2024 +0000
Publishing website 2024/03/09 05:37:34 at commit 00526dd
---
.../sdks/python-pipeline-dependencies/index.html | 15 +++++++++++++--
website/generated-content/sitemap.xml | 2 +-
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/website/generated-content/documentation/sdks/python-pipeline-dependencies/index.html b/website/generated-content/documentation/sdks/python-pipeline-dependencies/index.html
index 222bd7e30d6..2a6bc936d4c 100644
--- a/website/generated-content/documentation/sdks/python-pipeline-dependencies/index.html
+++ b/website/generated-content/documentation/sdks/python-pipeline-dependencies/index.html
@@ -39,7 +39,15 @@
<script>function showSearch(){addPlaceholder();var
e,t=document.querySelector(".searchBar");t.classList.remove("disappear"),e=document.querySelector("#iconsBar"),e.classList.add("disappear")}function
addPlaceholder(){$("input:text").attr("placeholder","What are you looking
for?")}function endSearch(){var
e,t=document.querySelector(".searchBar");t.classList.add("disappear"),e=document.querySelector("#iconsBar"),e.classList.remove("disappear")}function
blockScroll(){$("body").toggleClass(" [...]
<code>requirements.txt</code> file. For more complex scenarios, define the <a
href=#multiple-file-dependencies>pipeline in a package</a> and consider
installing your dependencies in a <a href=#custom-containers>custom
container</a>.</p><p>To supply a requirements.txt file:</p><ol><li><p>Find out
which packages are installed on your machine. Run the following
command:</p><pre><code> pip freeze > requirements.txt
</code></pre><p>This command creates a <code>requirements.txt</code> file that
lists all packages that are installed on your machine, regardless of where they
were installed from.</p></li><li><p>Edit the <code>requirements.txt</code> file
and delete all packages that are not relevant to your code.</p></li><li><p>Run
your pipeline with the following command-line option:</p><pre><code>
--requirements_file requirements.txt
-</code></pre><p>The runner will use the <code>requirements.txt</code> file to
install your additional dependencies onto the remote
workers.</p></li></ol><blockquote><p><strong>NOTE</strong>: An alternative to
<code>pip freeze</code> is to use a library like <a
href=https://github.com/jazzband/pip-tools>pip-tools</a> to compile all the
dependencies required for the pipeline from a <code>--requirements_file</code>,
where only top-level dependencies are mentioned.</p></blockquote><h2 id=cus
[...]
+</code></pre><p>The runner will use the <code>requirements.txt</code> file to
install your additional dependencies onto the remote
workers.</p></li></ol><blockquote><p><strong>NOTE</strong>: As an alternative
to <code>pip freeze</code>, use a library like <a
href=https://github.com/jazzband/pip-tools>pip-tools</a> to compile all of the
dependencies required for the pipeline from a <code>requirements.in</code>
file. In the <code>requirements.in</code> file, only the top-level dependencies
[...]
+the specified packages locally into a requirements cache directory,
+and then stages the requirements cache directory to the runner.
+At runtime, when available, Beam installs packages from the requirements cache.
+This mechanism makes it possible to stage the dependency packages to the runner
+at submission. At runtime, the runner workers might be able to install the
+packages from the cache without needing a connection to PyPI. To disable staging the
+requirements, use the <code>--requirements_cache=skip</code> pipeline option.
+For more information, see the <a
href=https://beam.apache.org/releases/pydoc/current/_modules/apache_beam/options/pipeline_options.html#SetupOptions>help
descriptions of these pipeline options</a>.</p><h2 id=custom-containers>Custom
Containers</h2><p>You can pass a <a
href="https://hub.docker.com/search?q=apache%2Fbeam&type=image">container</a>
image with all the dependencies that are needed for the pipeline. <a
href=/documentation/runtime/environments/#running-pipelines>Follow the i [...]
COPY <path to requirements.txt> /tmp/requirements.txt
RUN python -m pip install -r /tmp/requirements.txt
</code></pre></li></ol><h2 id=local-or-nonpypi>Local Python packages or
non-public Python Dependencies</h2><p>If your pipeline uses packages that are
not available publicly (e.g. packages that you’ve downloaded from a
GitHub repo), make these packages available remotely by performing the
following steps:</p><ol><li><p>Identify which packages are installed on your
machine and are not public. Run the following command:</p><p>pip
freeze</p><p>This command lists all packages that are i [...]
@@ -65,7 +73,10 @@ RUN python -m pip install -r /tmp/requirements.txt
other_utils_and_helpers.py
</code></pre><p>See <a
href=https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/complete/juliaset>Juliaset</a>
for an example that follows this project structure.</p></li><li><p>Install
your package in the submission environment, for example by using the following
command:</p><pre><code> pip install -e .
</code></pre></li><li><p>Run your pipeline with the following command-line
option:</p><pre><code> --setup_file /path/to/setup.py
-</code></pre></li></ol><p><strong>Note:</strong> If you <a
href=#pypi-dependencies>created a requirements.txt file</a> and your project
spans multiple files, you can get rid of the <code>requirements.txt</code> file
and instead, add all packages contained in <code>requirements.txt</code> to the
<code>install_requires</code> field of the setup call (in step 1).</p><h2
id=nonpython>Non-Python Dependencies or PyPI Dependencies with Non-Python
Dependencies</h2><p>If your pipeline uses non-Py [...]
+</code></pre></li></ol><p><strong>Note:</strong> It is not necessary to supply the <code>--requirements_file</code> <a href=#pypi-dependencies>option</a> if the dependencies of your package are defined in the <code>install_requires</code> field of the <code>setup.py</code> file (see step 1).
+However unlike with the <code>--requirements_file</code> option, when you use the <code>--setup_file</code> option, Beam doesn’t stage the dependent packages to the runner.
+Only the pipeline package is staged. If they aren’t already provided in the runtime environment,
+the package dependencies are installed from PyPI at runtime.</p><h2
id=nonpython>Non-Python Dependencies or PyPI Dependencies with Non-Python
Dependencies</h2><p>If your pipeline uses non-Python packages, such as packages
that require installation using the <code>apt install</code> command, or uses a
PyPI package that depends on non-Python dependencies during package
installation, we recommend installing them using a <a
href=#custom-containers>custom container</a>.
Otherwise, you must perform the following steps.</p><ol><li><p><a
href=#multiple-file-dependencies>Structure your pipeline as a
package</a>.</p></li><li><p>Add the required installation commands for the
non-Python dependencies, such as the <code>apt install</code> commands, to the
list of <code>CUSTOM_COMMANDS</code> in your <code>setup.py</code> file. See
the <a
href=https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py>Juliaset
setup.py [...]
</code></pre></li></ol><p><strong>Note:</strong> Because custom commands
execute after the dependencies for your workflow are installed (by
<code>pip</code>), you should omit the PyPI package dependency from the
pipeline’s <code>requirements.txt</code> file and from the
<code>install_requires</code> parameter in the <code>setuptools.setup()</code>
call of your <code>setup.py</code> file.</p><h2
id=pre-building-sdk-container-image>Pre-building SDK Container Image</h2><p>In
pipeline [...]
However, it may be possible to pre-build the SDK containers and perform the
dependency installation once before the workers start with
<code>--prebuild_sdk_container_engine</code>. For instructions of how to use
pre-building with Google Cloud
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 4d8d1aaf775..ccc486b2b37 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.54.0/</loc><lastmod>2024-03-08T17:55:34-05:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2024-03-08T17:55:34-05:00</lastmod></url><url><loc>/blog/</loc><lastmod>2024-03-08T17:55:34-05:00</lastmod></url><url><loc>/categories/</loc><lastmod>2024-03-08T17:55:34-05:00</lastmod></url><url><loc>/catego
[...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.54.0/</loc><lastmod>2024-03-08T20:15:04-08:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2024-03-08T20:15:04-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2024-03-08T20:15:04-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2024-03-08T20:15:04-08:00</lastmod></url><url><loc>/catego
[...]
\ No newline at end of file