[ 
https://issues.apache.org/jira/browse/AIRFLOW-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5520.
-----------------------------------
    Fix Version/s: 2.0.0
       Resolution: Fixed

> DataflowPythonOperator dependency management requires side effects
> ------------------------------------------------------------------
>
>                 Key: AIRFLOW-5520
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5520
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: gcp
>    Affects Versions: 1.10.2
>            Reporter: Jacob Ferriero
>            Priority: Major
>             Fix For: 2.0.0
>
>
> When using DataflowPythonOperator it is difficult to manage the Apache Beam 
> version (and other Python dependencies) without affecting your entire 
> airflow environment. It seems the Dataflow hook just submits a subprocess and 
> python 
> The operator / hook should be improved to isolate Python dependencies for 
> running py_file.
> Perhaps this could be achieved in a virtual environment (similar to 
> PythonVirtualenvOperator).
> For Beam it's often customary to specify a --requirements_file or 
> --setup_file to manage Python dependencies; we could run one of these in the 
> venv to set it up.
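
A minimal sketch of the isolation idea proposed above, using only the Python standard library. It is not the fix that was merged for this issue. The helper names (venv_python, build_commands, run_isolated) are hypothetical, and it assumes the requirements file lists apache-beam together with the pipeline's other dependencies, so the Airflow environment itself is never modified.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path
from typing import List, Optional


def venv_python(venv_dir: str) -> str:
    """Return the path of the interpreter inside a virtual environment."""
    on_windows = os.name == "nt"
    bin_dir = "Scripts" if on_windows else "bin"
    exe = "python.exe" if on_windows else "python"
    return str(Path(venv_dir) / bin_dir / exe)


def build_commands(
    venv_dir: str,
    py_file: str,
    requirements_file: Optional[str] = None,
    pipeline_args: Optional[List[str]] = None,
) -> List[List[str]]:
    """Build the commands to run: an optional pip install, then the pipeline.

    Every command uses the venv's interpreter, so nothing is installed into
    (or imported from) the environment that runs Airflow.
    """
    python = venv_python(venv_dir)
    commands = []
    if requirements_file:
        commands.append([python, "-m", "pip", "install", "-r", requirements_file])
    commands.append([python, py_file] + list(pipeline_args or []))
    return commands


def run_isolated(
    py_file: str,
    requirements_file: Optional[str] = None,
    pipeline_args: Optional[List[str]] = None,
) -> None:
    """Create a throwaway venv, install dependencies, and run py_file in it."""
    with tempfile.TemporaryDirectory(prefix="dataflow-venv-") as venv_dir:
        # with_pip=True bootstraps pip so the requirements can be installed.
        venv.EnvBuilder(with_pip=True).create(venv_dir)
        for command in build_commands(
            venv_dir, py_file, requirements_file, pipeline_args
        ):
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
```

A call such as run_isolated("pipeline.py", "requirements.txt", ["--runner=DataflowRunner"]) would then launch the pipeline from the temporary environment; the file names here are placeholders. Building a fresh venv on every run is slow, so a real operator would probably want to cache or reuse it.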



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
