Hi Alexander,

Thanks for the input. The docker runtime is an add-on feature controlled by a feature flag; it does not force users to use docker to run tasks.
Best wishes,
Ping Zhang

On Fri, Dec 17, 2021 at 2:53 AM Alexander Shorin <[email protected]> wrote:

> How should your idea work on systems without docker, like FreeBSD? And why
> did you build such leaky tasks that they couldn't be isolated with common
> tools like system packages, venv, etc.?
>
> --
> ,,,^..^,,,
>
>
> On Fri, Dec 17, 2021 at 2:53 AM Ping Zhang <[email protected]> wrote:
>
>> Hi Airflow Community,
>>
>> This is Ping Zhang from the Airbnb Airflow team. We would like to open
>> source our internal feature: docker runtime isolation for Airflow tasks.
>> It has been in our production for close to 1 year and is very stable.
>>
>> I will create an AIP after the discussion.
>>
>> Thanks,
>>
>> Ping
>>
>>
>> Motivation
>>
>> An Airflow worker host is a shared resource among all tasks running on
>> it. The host therefore has to provision dependencies for all tasks,
>> including system-level and Python application-level dependencies. This
>> leads to a very fat runtime, and thus long host provision times and low
>> elasticity in worker resources, which makes it challenging to prepare
>> for unexpected burst load such as a large backfill or a rerun of large
>> DAGs.
>>
>> The lack of runtime isolation makes it challenging and risky to add or
>> upgrade system and Python dependencies, and almost impossible to remove
>> any. It also incurs significant additional operating cost for the team:
>> users do not have permission to add or upgrade Python dependencies, so
>> they must coordinate with us, and package version conflicts prevent
>> installing some packages directly on the host. Users have to fall back
>> on PythonVirtualenvOperator, which slows down their development cycle.
>>
>> What change do you propose to make?
>>
>> To solve these problems, we propose introducing runtime isolation for
>> Airflow tasks, leveraging docker as the task runtime environment. There
>> are several benefits:
>>
>> 1. Runtime isolation at the task level.
>> 2. Customizable runtime for parsing DAG files.
>> 3. A lean runtime on the Airflow host, which enables high worker
>>    resource elasticity.
>> 4. An immutable and portable task execution runtime.
>> 5. Process isolation, which ensures that all subprocesses of a task are
>>    cleaned up when docker exits (we have seen orphaned hive and spark
>>    subprocesses left behind after the airflow run process exits).
>>
>> Changes
>>
>> Airflow Worker
>>
>> In the new design, the `airflow run local` and `airflow run raw`
>> processes run inside a docker container launched by the airflow worker.
>> This way, the airflow worker runtime only needs the minimum
>> requirements to run airflow core and docker.
>>
>> Airflow Scheduler
>>
>> Instead of processing a DAG file directly, the DagFileProcessor process:
>>
>> 1. launches the docker container required by that DAG file to process
>>    it, and persists the serializable DAGs (SimpleDags) to a file so
>>    that the result can be read outside the docker container;
>> 2. reads the file persisted from the docker container, deserializes it,
>>    and puts the result onto the multiprocessing queue.
>>
>> This ensures the DAG parsing runtime is exactly the same as the DAG
>> execution runtime.
>>
>> This requires a DAG definition file to tell the DAG file processing
>> loop which docker image to use to process it. We can achieve this by
>> placing a metadata file alongside the DAG definition file to define the
>> docker runtime. To ease the burden on users, a default docker image is
>> used when a DAG definition file does not require a customized runtime.
>>
>> As a Whole
>>
>> Best wishes,
>>
>> Ping Zhang
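The proposal mentions a metadata file placed alongside the DAG definition file to select the docker runtime, with a default image when none is given. The proposal does not specify the file's format; the filename and keys below are purely illustrative assumptions of what such a sidecar file could look like.

```yaml
# example_dag.runtime.yaml -- hypothetical metadata file next to example_dag.py
# (filename convention and keys are assumptions, not part of the proposal)
docker_image: my-team/airflow-task-runtime:stable
```

A DAG file without such a sidecar would be parsed and executed in the provided default image.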
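The two DagFileProcessor steps in the quoted proposal can be sketched as a file-based handoff across the container boundary: the in-container step persists the parsed, serializable DAG summaries (SimpleDags) to a shared file, and the out-of-container step reads them back before pushing them onto the multiprocessing queue. This is a minimal sketch only; the function names and the JSON serialization format are assumptions, not the proposal's actual implementation.

```python
import json
import os
import tempfile


def persist_simple_dags(simple_dags, path):
    # Hypothetical in-container step: write the parsed, serializable DAG
    # summaries to a file on a shared volume so the scheduler can read
    # them outside the docker container.
    with open(path, "w") as f:
        json.dump(simple_dags, f)


def load_simple_dags(path):
    # Hypothetical out-of-container step: deserialize the persisted
    # summaries before putting them onto the multiprocessing queue.
    with open(path) as f:
        return json.load(f)


# Round-trip demo standing in for the container boundary.
handoff = os.path.join(tempfile.mkdtemp(), "simple_dags.json")
persist_simple_dags([{"dag_id": "example_dag", "tasks": ["t1", "t2"]}], handoff)
recovered = load_simple_dags(handoff)
```

The key design point from the proposal is that only serializable summaries cross the boundary, so the host never needs the DAG's own dependencies to consume the parse result.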
