[ 
https://issues.apache.org/jira/browse/BEAM-13314?focusedWorklogId=743437&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-743437
 ]

ASF GitHub Bot logged work on BEAM-13314:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Mar/22 21:21
            Start Date: 17/Mar/22 21:21
    Worklog Time Spent: 10m 
      Work Description: AnandInguva commented on a change in pull request #16938:
URL: https://github.com/apache/beam/pull/16938#discussion_r829521957



##########
File path: 
website/www/site/content/en/documentation/sdks/python-pipeline-dependencies.md
##########
@@ -123,3 +133,20 @@ If your pipeline uses non-Python packages (e.g. packages that require installati
         --setup_file /path/to/setup.py
 
 **Note:** Because custom commands execute after the dependencies for your workflow are installed (by `pip`), you should omit the PyPI package dependency from the pipeline's `requirements.txt` file and from the `install_requires` parameter in the `setuptools.setup()` call of your `setup.py` file.
+
+## Pre-building SDK container image
+
+In the pre-building step, we install pipeline dependencies on the container image prior to the job submission. This would speed up the pipeline execution.\
+To use pre-building the dependencies from `requirements.txt` on the container image. Follow the steps below.
+1. Provide the container engine. We support `local_docker` and `cloud_build`(requires a GCP project with Cloud Build API enabled).
+
+       --prebuild_sdk_container_engine <execution_environment>
+2. To pass a base image for pre-building dependencies, enable this flag. If not, apache beam's base image would be used.
+
+       --sdk_container_image <location_to_base_image>
+3. To push the container image, pre-built locally with `local_docker` , to a remote repository(eg: docker registry), provide URL to the remote registry by passing
+
+       --docker_registry_push_url <IMAGE_URL>

Review comment:
       We can add an example. Also, let me see if I can make the wording simpler.
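
As a rough illustration of what such an example could look like (a sketch only, not the text that was added in the pull request), the three steps in the diff above can be expressed as a small helper that assembles the command-line flags. The flag names `--prebuild_sdk_container_engine`, `--sdk_container_image` and `--docker_registry_push_url` come from the diff. The `--requirements_file` flag, the helper name `prebuild_flags`, and the image and registry values are assumptions made for illustration.

```python
from typing import List, Optional

# Engines named in the diff above.
SUPPORTED_ENGINES = ("local_docker", "cloud_build")


def prebuild_flags(
    engine: str,
    requirements_file: str = "requirements.txt",
    base_image: Optional[str] = None,
    push_url: Optional[str] = None,
) -> List[str]:
    """Assemble pipeline flags for pre-building the SDK container image.

    Hypothetical helper: it only builds an argument list and does not
    call Beam or Docker itself.
    """
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported container engine: {engine!r}")

    # Step 1: choose the container engine (plus the requirements file
    # whose dependencies should be installed into the image).
    flags = [
        "--requirements_file", requirements_file,
        "--prebuild_sdk_container_engine", engine,
    ]

    # Step 2: optional base image; when omitted, the default Beam
    # image is used, as described in the diff.
    if base_image is not None:
        flags += ["--sdk_container_image", base_image]

    # Step 3: optional remote registry to push the pre-built image to.
    if push_url is not None:
        flags += ["--docker_registry_push_url", push_url]

    return flags


if __name__ == "__main__":
    # Placeholder values, not real image locations.
    print(" ".join(prebuild_flags(
        "local_docker",
        base_image="example.com/my-base-image:latest",
        push_url="example.com/my-registry",
    )))
```

The resulting list could be appended to the usual pipeline arguments or, assuming the Beam Python SDK is installed, passed to `PipelineOptions(flags)`.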






Issue Time Tracking
-------------------

    Worklog Id:     (was: 743437)
    Time Spent: 6h 10m  (was: 6h)

> Revise recommendations to manage Python pipeline dependencies. 
> ---------------------------------------------------------------
>
>                 Key: BEAM-13314
>                 URL: https://issues.apache.org/jira/browse/BEAM-13314
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core, website
>            Reporter: Valentyn Tymofieiev
>            Assignee: Anand Inguva
>            Priority: P2
>              Labels: usability
>          Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> The page 
> https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/ 
> recommends managing Python dependencies via requirements files.
> This approach is currently inefficient in light of the introduction and 
> adoption of PEP 517 by some packages; see 
> https://lists.apache.org/thread/trljnxo39c0cmff790yff3h8n5okqt3q and the 
> rest of the thread. The page also does not mention custom containers or SDK 
> prebuilding workflows.
>  
> We should revise it and document best practices.



