rszper commented on code in PR #30493:
URL: https://github.com/apache/beam/pull/30493#discussion_r1514988361
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
Review Comment:
```suggestion
> **NOTE**: As an alternative to `pip freeze`, use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all of the
dependencies required for the pipeline from a `requirements.in` file. In the
`requirements.in` file, only the top-level dependencies are mentioned.
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
+At pipeline runtime, Beam prefers to install packages from requirements cache
+if available. This mechanism allows staging dependency packages to the runner
+at submission, and at runtime the runner workers might be able to install the
+packages from cache, without a connection to PyPI. To disable staging the
+requirements, supply the `--requirements_cache=skip` pipeline option.
Review Comment:
```suggestion
requirements, use the `--requirements_cache=skip` pipeline option.
```
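For illustration only, here is a minimal sketch of how the options discussed in this thread might be assembled before submitting a pipeline. The helper name `dependency_flags` and its parameters are hypothetical and not part of Beam. Only the flag names `--requirements_file`, `--requirements_cache=skip`, and `--setup_file` come from the documentation under review.

```python
from typing import List, Optional


def dependency_flags(
    requirements_file: Optional[str] = None,
    stage_requirements: bool = True,
    setup_file: Optional[str] = None,
) -> List[str]:
    """Build the dependency-related pipeline flags (hypothetical helper)."""
    flags: List[str] = []
    if requirements_file:
        # With this option, the listed packages are downloaded into a
        # requirements cache at submission and staged to the runner.
        flags.append(f"--requirements_file={requirements_file}")
        if not stage_requirements:
            # Opt out of staging the requirements cache.
            flags.append("--requirements_cache=skip")
    if setup_file:
        # With this option, only the pipeline package itself is staged.
        flags.append(f"--setup_file={setup_file}")
    return flags


if __name__ == "__main__":
    print(dependency_flags("requirements.txt", stage_requirements=False))
```

The resulting list could then be passed to `PipelineOptions` from `apache_beam.options.pipeline_options`, assuming the Beam SDK is installed in the submission environment.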
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -118,7 +128,10 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
--setup_file /path/to/setup.py
-**Note:** If you [created a requirements.txt file](#pypi-dependencies) and
your project spans multiple files, you can get rid of the `requirements.txt`
file and instead, add all packages contained in `requirements.txt` to the
`install_requires` field of the setup call (in step 1).
+**Note:** It is not necessary to supply the `--requirements_file`
[option](#pypi-dependencies) if the dependenices of your package are defined in
the `install_requires` field of the `setup.py` file (see step 1).
+However unlike the `--requirements_file` option, when using the
`--setup_file` option, Beam does not stage the dependent packages to the Runner,
+only the pipeline package is staged and its dependencies are installed from
PyPI
Review Comment:
```suggestion
Only the pipeline package is staged. If they aren't already provided in the
runtime environment,
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
Review Comment:
```suggestion
the specified packages locally into a requirements cache directory,
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
Review Comment:
```suggestion
and then stages the requirements cache directory to the runner.
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
+At pipeline runtime, Beam prefers to install packages from requirements cache
+if available. This mechanism allows staging dependency packages to the runner
+at submission, and at runtime the runner workers might be able to install the
Review Comment:
```suggestion
at submission. At runtime, the runner workers might be able to install the
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
Review Comment:
```suggestion
When you supply the `--requirements_file` pipeline option, during pipeline
submission, Beam downloads
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
+At pipeline runtime, Beam prefers to install packages from requirements cache
+if available. This mechanism allows staging dependency packages to the runner
Review Comment:
```suggestion
This mechanism makes it possible to stage the dependency packages to the
runner
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
+At pipeline runtime, Beam prefers to install packages from requirements cache
Review Comment:
```suggestion
At runtime, when available, Beam installs packages from the requirements
cache.
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
+At pipeline runtime, Beam prefers to install packages from requirements cache
+if available. This mechanism allows staging dependency packages to the runner
+at submission, and at runtime the runner workers might be able to install the
+packages from cache, without a connection to PyPI. To disable staging the
Review Comment:
```suggestion
packages from the cache without needing a connection to PyPI. To disable
staging the
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -46,11 +46,21 @@ To supply a requirements.txt file:
The runner will use the `requirements.txt` file to install your additional
dependencies onto the remote workers.
-> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `--requirements_file`, where only
top-level dependencies are mentioned.
+> **NOTE**: An alternative to `pip freeze` is to use a library like
[pip-tools](https://github.com/jazzband/pip-tools) to compile all the
dependencies required for the pipeline from a `requirements.in` file, where
only the top-level dependencies are mentioned.
+
+When you supply the `--requirements_file` pipeline option, Beam downloads
+specified packages locally into a requirements cache directory during pipeline
+submission, and stages the requirements cache directory to the runner.
+At pipeline runtime, Beam prefers to install packages from requirements cache
+if available. This mechanism allows staging dependency packages to the runner
+at submission, and at runtime the runner workers might be able to install the
+packages from cache, without a connection to PyPI. To disable staging the
+requirements, supply the `--requirements_cache=skip` pipeline option.
+For more information, see the [help descriptions of these pipeline
options](https://beam.apache.org/releases/pydoc/current/_modules/apache_beam/options/pipeline_options.html#SetupOptions).
## Custom Containers {#custom-containers}
-You can pass a
[container](https://hub.docker.com/search?q=apache%2Fbeam&type=image) image
with all the dependencies that are needed for the pipeline instead of
`requirements.txt`. [Follow the instructions on how to run pipeline with Custom
Container images](/documentation/runtime/environments/#running-pipelines).
+You can pass a
[container](https://hub.docker.com/search?q=apache%2Fbeam&type=image) image
with all the dependencies that are needed for the pipeline. [Follow the
instructions on how to run pipeline with Custom Container
images](/documentation/runtime/environments/#running-pipelines).
Review Comment:
```suggestion
You can pass a
[container](https://hub.docker.com/search?q=apache%2Fbeam&type=image) image
with all the dependencies that are needed for the pipeline. [Follow the
instructions that show how to run the pipeline with custom container
images](/documentation/runtime/environments/#running-pipelines).
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -118,7 +128,10 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
--setup_file /path/to/setup.py
-**Note:** If you [created a requirements.txt file](#pypi-dependencies) and
your project spans multiple files, you can get rid of the `requirements.txt`
file and instead, add all packages contained in `requirements.txt` to the
`install_requires` field of the setup call (in step 1).
+**Note:** It is not necessary to supply the `--requirements_file`
[option](#pypi-dependencies) if the dependenices of your package are defined in
the `install_requires` field of the `setup.py` file (see step 1).
+However unlike the `--requirements_file` option, when using the
`--setup_file` option, Beam does not stage the dependent packages to the Runner,
+only the pipeline package is staged and its dependencies are installed from
PyPI
+at runtime if not already provided in the runtime environment.
Review Comment:
```suggestion
the package dependencies are installed from PyPI at runtime.
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -118,7 +128,10 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
--setup_file /path/to/setup.py
-**Note:** If you [created a requirements.txt file](#pypi-dependencies) and
your project spans multiple files, you can get rid of the `requirements.txt`
file and instead, add all packages contained in `requirements.txt` to the
`install_requires` field of the setup call (in step 1).
+**Note:** It is not necessary to supply the `--requirements_file`
[option](#pypi-dependencies) if the dependenices of your package are defined in
the `install_requires` field of the `setup.py` file (see step 1).
Review Comment:
```suggestion
**Note:** It is not necessary to supply the `--requirements_file`
[option](#pypi-dependencies) if the dependencies of your package are defined in
the `install_requires` field of the `setup.py` file (see step 1).
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -118,7 +128,10 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
--setup_file /path/to/setup.py
-**Note:** If you [created a requirements.txt file](#pypi-dependencies) and
your project spans multiple files, you can get rid of the `requirements.txt`
file and instead, add all packages contained in `requirements.txt` to the
`install_requires` field of the setup call (in step 1).
+**Note:** It is not necessary to supply the `--requirements_file`
[option](#pypi-dependencies) if the dependenices of your package are defined in
the `install_requires` field of the `setup.py` file (see step 1).
+However unlike the `--requirements_file` option, when using the
`--setup_file` option, Beam does not stage the dependent packages to the Runner,
Review Comment:
```suggestion
However, unlike with the `--requirements_file` option, when you use the
`--setup_file` option, Beam doesn't stage the dependent packages to the runner.
```