[
https://issues.apache.org/jira/browse/BEAM-13314?focusedWorklogId=743427&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-743427
]
ASF GitHub Bot logged work on BEAM-13314:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Mar/22 20:49
Start Date: 17/Mar/22 20:49
Worklog Time Spent: 10m
Work Description: tvalentyn commented on a change in pull request #16938:
URL: https://github.com/apache/beam/pull/16938#discussion_r829493051
##########
File path:
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md
##########
@@ -123,3 +133,20 @@ If your pipeline uses non-Python packages (e.g. packages
that require installati
--setup_file /path/to/setup.py
**Note:** Because custom commands execute after the dependencies for your
workflow are installed (by `pip`), you should omit the PyPI package dependency
from the pipeline's `requirements.txt` file and from the `install_requires`
parameter in the `setuptools.setup()` call of your `setup.py` file.
+
+## Pre-building SDK container image
+
+In the pre-building step, we install pipeline dependencies on the container image prior to job submission. This speeds up pipeline execution.\
+To pre-build the dependencies from `requirements.txt` into the container image, follow the steps below.
+1. Provide the container engine. We support `local_docker` and `cloud_build` (requires a GCP project with the Cloud Build API enabled).
+
+ --prebuild_sdk_container_engine <execution_environment>
+2. To pass a base image for pre-building dependencies, set this flag. If it is not set, Apache Beam's base image is used.
+
+ --sdk_container_image <location_to_base_image>
+3. To push the container image, pre-built locally with `local_docker`, to a remote repository (e.g. a Docker registry), provide the URL of the remote registry by passing
+
+ --docker_registry_push_url <IMAGE_URL>
Review comment:
Can we give an example of the expected value? As a user reading this doc
it is still not obvious.
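
One possible way to make the expected values concrete is sketched below. It only assembles the three flags quoted in the diff into an argument list and does not call Beam itself. The helper name `build_prebuild_args`, the base image tag, and the registry path are hypothetical placeholders chosen for illustration, not values confirmed by the pull request.

```python
# Sketch only: assembles the pre-building flags from the diff into an argv list.
# The helper name and all example values below are illustrative assumptions.

SUPPORTED_ENGINES = ("local_docker", "cloud_build")  # the two engines named in the diff


def build_prebuild_args(engine, base_image=None, push_url=None):
    """Return the command-line flags for SDK container pre-building."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported container engine: {engine!r}")
    args = ["--prebuild_sdk_container_engine", engine]
    if base_image:
        # Optional: a custom base image instead of the default Beam image.
        args += ["--sdk_container_image", base_image]
    if push_url:
        # Optional: where to push an image that was pre-built with local_docker.
        args += ["--docker_registry_push_url", push_url]
    return args


# Hypothetical example values; substitute your own image and registry.
example_args = build_prebuild_args(
    "local_docker",
    base_image="apache/beam_python3.8_sdk:2.37.0",
    push_url="gcr.io/example-project/beam-prebuilt",
)
print(" ".join(example_args))
```

The resulting list could be appended to the other pipeline arguments at job submission. Whether `<IMAGE_URL>` is meant to be a registry prefix or a full image name is exactly the ambiguity the comment raises, so the value shown for it is a guess.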
Issue Time Tracking
-------------------
Worklog Id: (was: 743427)
Time Spent: 6h (was: 5h 50m)
> Revise recommendations to manage Python pipeline dependencies.
> ---------------------------------------------------------------
>
> Key: BEAM-13314
> URL: https://issues.apache.org/jira/browse/BEAM-13314
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core, website
> Reporter: Valentyn Tymofieiev
> Assignee: Anand Inguva
> Priority: P2
> Labels: usability
> Time Spent: 6h
> Remaining Estimate: 0h
>
> The page
> https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
> recommends managing Python dependencies via requirements files.
> This approach is currently inefficient in light of introduction and adoption
> of PEP-517 by some packages, see:
> https://lists.apache.org/thread/trljnxo39c0cmff790yff3h8n5okqt3q and the
> rest of the thread, and does not mention Custom Containers or SDK prebuilding
> workflows.
>
> We should revise it and document best practices.