rszper commented on code in PR #30450:
URL: https://github.com/apache/beam/pull/30450#discussion_r1506812879
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -111,22 +123,14 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
## Non-Python Dependencies or PyPI Dependencies with Non-Python Dependencies
{#nonpython}
-If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt-get install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, you must
perform the following steps.
-
-1. Add the required installation commands (e.g. the `apt-get install`
commands) for the non-Python dependencies to the list of `CUSTOM_COMMANDS` in
your `setup.py` file. See the [Juliaset
setup.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py)
for an example.
+If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, we recommend
installing them using a (custom container){#custom-containers}.
Review Comment:
```suggestion
If your pipeline uses non-Python packages, such as packages that require
installation using the `apt install` command, or uses a PyPI package that
depends on non-Python dependencies during package installation, we recommend
installing them using a [custom container](#custom-containers).
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -111,22 +123,14 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
## Non-Python Dependencies or PyPI Dependencies with Non-Python Dependencies
{#nonpython}
-If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt-get install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, you must
perform the following steps.
-
-1. Add the required installation commands (e.g. the `apt-get install`
commands) for the non-Python dependencies to the list of `CUSTOM_COMMANDS` in
your `setup.py` file. See the [Juliaset
setup.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py)
for an example.
+If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, we recommend
installing them using a (custom container){#custom-containers}.
+Alternatively you must perform the following steps.
Review Comment:
```suggestion
Otherwise, you must perform the following steps.
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -89,20 +92,29 @@ Often, your pipeline code spans multiple files. To run your
project remotely, yo
setuptools.setup(
name='PACKAGE-NAME',
version='PACKAGE-VERSION',
- install_requires=[],
+ install_requires=[
+ # List Python packages your pipeline depends on.
+ ],
packages=setuptools.find_packages(),
)
-2. Structure your project so that the root directory contains the `setup.py`
file, the main workflow file, and a directory with the rest of the files.
+2. Structure your project so that the root directory contains the `setup.py`
file, the main workflow file, and a directory with the rest of the files, for
example:
root_dir/
setup.py
main.py
- other_files_dir/
+ my_package/
+ my_pipeline_launcher.py
+ my_custom_dofns_and_transforms.py
+ other_utils_and_helpers.py
- See
[Juliaset](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/complete/juliaset)
for an example that follows this required project structure.
+ See
[Juliaset](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/complete/juliaset)
for an example that follows this project structure.
-3. Run your pipeline with the following command-line option:
+3. Install your package in the submission environment, for example via:
Review Comment:
```suggestion
3. Install your package in the submission environment, for example by using
the following command:
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -111,22 +123,14 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
## Non-Python Dependencies or PyPI Dependencies with Non-Python Dependencies
{#nonpython}
-If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt-get install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, you must
perform the following steps.
-
-1. Add the required installation commands (e.g. the `apt-get install`
commands) for the non-Python dependencies to the list of `CUSTOM_COMMANDS` in
your `setup.py` file. See the [Juliaset
setup.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py)
for an example.
+If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, we recommend
installing them using a (custom container){#custom-containers}.
+Alternatively you must perform the following steps.
- **Note:** You must make sure that these commands are runnable on the
remote worker (e.g. if you use `apt-get`, the remote worker needs `apt-get`
support).
+1. [Structure your pipeline as a package](#multiple-file-dependencies).
-2. If you are using a PyPI package that depends on non-Python dependencies,
add `['pip', 'install', '<your PyPI package>']` to the list of
`CUSTOM_COMMANDS` in your `setup.py` file.
-
-3. Structure your project so that the root directory contains the `setup.py`
file, the main workflow file, and a directory with the rest of the files.
-
- root_dir/
- setup.py
- main.py
- other_files_dir/
+2. Add the required installation commands (e.g. the `apt install` commands)
for the non-Python dependencies to the list of `CUSTOM_COMMANDS` in your
`setup.py` file. See the [Juliaset
setup.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py)
for an example.
- See the
[Juliaset](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/complete/juliaset)
project for an example that follows this required project structure.
+ **Note:** You must make sure that these commands are runnable on the
remote worker (e.g. if you use `apt`, the remote worker needs `apt` support).
Review Comment:
```suggestion
**Note:** You must verify that these commands run on the remote worker.
For example, if you use `apt`, the remote worker needs `apt` support.
```
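One way to act on the note above is to check, before submitting, that every executable named in `CUSTOM_COMMANDS` exists in the environment where the commands will run. The sketch below is only an illustration: `missing_executables` is a hypothetical helper, not part of Beam or of the Juliaset example, and it assumes each command is a list whose first element is the executable name.

```python
import shutil


def missing_executables(commands, which=shutil.which):
    """Return the executables named in `commands` that cannot be found on PATH.

    `commands` is a list of argument lists, in the same shape as the
    CUSTOM_COMMANDS list discussed above. `which` is injectable so that the
    lookup can be replaced in tests; by default it is shutil.which.
    """
    missing = []
    for command in commands:
        executable = command[0]
        # shutil.which returns None when the executable is not on PATH.
        if which(executable) is None and executable not in missing:
            missing.append(executable)
    return missing
```

Calling `missing_executables(CUSTOM_COMMANDS)` inside the worker image (for example, in a container built from the same base image) gives an early signal that a command such as `apt` is unavailable there.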
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -25,9 +25,12 @@ Dependency management is about specifying dependencies that
your pipeline requir
## PyPI Dependencies {#pypi-dependencies}
-If your pipeline uses public packages from the [Python Package
Index](https://pypi.python.org/), make these packages available remotely by
performing the following steps:
+If your pipeline uses public packages from the [Python Package
Index](https://pypi.python.org/), you must make these packages available
remotely on the workers.
-**Note:** If your PyPI package depends on a non-Python package (e.g. a package
that requires installation on Linux using the `apt-get install` command), see
the [PyPI Dependencies with Non-Python Dependencies](#nonpython) section
instead.
+For simplest pipelines where a pipeline consists only of a single Python file
or a notebook, the easiest way to supply dependencies is to provide a
Review Comment:
```suggestion
For pipelines that consist only of a single Python file or a notebook, the
most straightforward way to supply dependencies is to provide a
```
##########
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md:
##########
@@ -111,22 +123,14 @@ Often, your pipeline code spans multiple files. To run
your project remotely, yo
## Non-Python Dependencies or PyPI Dependencies with Non-Python Dependencies
{#nonpython}
-If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt-get install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, you must
perform the following steps.
-
-1. Add the required installation commands (e.g. the `apt-get install`
commands) for the non-Python dependencies to the list of `CUSTOM_COMMANDS` in
your `setup.py` file. See the [Juliaset
setup.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py)
for an example.
+If your pipeline uses non-Python packages (e.g. packages that require
installation using the `apt install` command), or uses a PyPI package that
depends on non-Python dependencies during package installation, we recommend
installing them using a (custom container){#custom-containers}.
+Alternatively you must perform the following steps.
- **Note:** You must make sure that these commands are runnable on the
remote worker (e.g. if you use `apt-get`, the remote worker needs `apt-get`
support).
+1. [Structure your pipeline as a package](#multiple-file-dependencies).
-2. If you are using a PyPI package that depends on non-Python dependencies,
add `['pip', 'install', '<your PyPI package>']` to the list of
`CUSTOM_COMMANDS` in your `setup.py` file.
-
-3. Structure your project so that the root directory contains the `setup.py`
file, the main workflow file, and a directory with the rest of the files.
-
- root_dir/
- setup.py
- main.py
- other_files_dir/
+2. Add the required installation commands (e.g. the `apt install` commands)
for the non-Python dependencies to the list of `CUSTOM_COMMANDS` in your
`setup.py` file. See the [Juliaset
setup.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py)
for an example.
Review Comment:
```suggestion
2. Add the required installation commands for the non-Python dependencies,
such as the `apt install` commands, to the list of `CUSTOM_COMMANDS` in your
`setup.py` file. See the [Juliaset setup.py
file](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py)
for an example.
```
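For readers unfamiliar with the pattern, here is a minimal sketch of what a `CUSTOM_COMMANDS` list and the loop that executes it might look like. It is not the Juliaset implementation, which wires the commands into the build through a `setuptools` command class. The helper name `run_custom_commands` and the package placeholder `some-system-package` are assumptions made for illustration.

```python
import subprocess

# Each entry is one command expressed as an argument list.
# 'some-system-package' is a placeholder, not a real package name.
CUSTOM_COMMANDS = [
    ['echo', 'Installing non-Python dependencies'],
    ['apt', 'update'],
    ['apt', '-y', 'install', 'some-system-package'],
]


def run_custom_commands(commands, runner=subprocess.check_call):
    """Run each command in order, stopping at the first failure.

    With the default runner, subprocess.check_call raises CalledProcessError
    when a command exits with a non-zero status. `runner` is injectable so
    the loop can be exercised without executing anything.
    """
    for command in commands:
        runner(command)
```

Keeping the commands as data in one list makes it easy to review which system-level changes the package installation performs on the worker.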