How would your idea work on systems without Docker, like FreeBSD? And why did you write such leaky tasks that they can't be isolated with common tools like system packages, venvs, etc.?
--
,,,^..^,,,

On Fri, Dec 17, 2021 at 2:53 AM Ping Zhang <[email protected]> wrote:

> Hi Airflow Community,
>
> This is Ping Zhang from the Airbnb Airflow team. We would like to open
> source our internal feature: Docker runtime isolation for Airflow tasks.
> It has been in our production for close to a year and is very stable.
>
> I will create an AIP after the discussion.
>
> Thanks,
>
> Ping
>
> Motivation
>
> An Airflow worker host is a shared resource among all tasks running on
> it. It therefore has to be provisioned with the dependencies of every
> task, including system-level and Python application-level dependencies.
> This leads to a very fat runtime, and thus long host provision time and
> low elasticity in the worker resource, which makes it challenging to
> prepare for unexpected burst load such as a large backfill or a rerun of
> large DAGs.
>
> The lack of runtime isolation also makes it challenging and risky to add
> or upgrade system and Python dependencies, and almost impossible to
> remove any. It incurs significant additional operating cost for the
> team, because users do not have permission to add or upgrade Python
> dependencies and must coordinate with us. When package versions
> conflict, they cannot be installed directly on the host; users have to
> fall back to PythonVirtualenvOperator, which slows down their
> development cycle.
>
> What change do you propose to make?
>
> To solve these problems, we propose introducing runtime isolation for
> Airflow tasks, using Docker as the task runtime environment. There are
> several benefits:
>
> 1. Runtime isolation at the task level
> 2. A customizable runtime for parsing DAG files
> 3. A lean runtime on the Airflow host, which enables high worker
>    resource elasticity
> 4. An immutable and portable task execution runtime
> 5. Process isolation, ensuring that all subprocesses of a task are
>    cleaned up after the container exits (we have seen orphaned Hive and
>    Spark subprocesses left behind after the airflow run process exits)
>
> Changes
>
> Airflow Worker
>
> In the new design, the `airflow run local` and `airflow run raw`
> processes run inside a Docker container launched by the Airflow worker.
> The worker runtime therefore only needs the minimum requirements to run
> Airflow core and Docker.
>
> Airflow Scheduler
>
> Instead of processing a DAG file directly, the DagFileProcessor process
>
> 1. launches the Docker container required by that DAG file, processes
>    the file inside it, and persists the serializable DAGs (SimpleDags)
>    to a file so that the result can be read outside the container
> 2. reads the file persisted by the container, deserializes it, and puts
>    the result into the multiprocess queue
>
> This ensures that the DAG parsing runtime is exactly the same as the DAG
> execution runtime.
>
> This requires a DAG definition file to tell the DAG file processing loop
> which Docker image to use for processing it. We can achieve this by
> placing a metadata file alongside the DAG definition file that defines
> the Docker runtime. To ease the burden on users, a default Docker image
> is used when a DAG definition file does not require a customized runtime.
>
> As a Whole
>
> (overall architecture diagram omitted in plain-text mail)
>
> Best wishes
>
> Ping Zhang
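To make the worker-side change in the quoted proposal concrete, here is a minimal sketch of how a worker could wrap the per-task command in `docker run` so that `airflow run local` executes inside a container. The image name, mount paths, and the `run_task_in_container` wrapper are illustrative assumptions, not the proposal's actual implementation.

```python
# Minimal sketch (assumptions noted): wrap "airflow run --local" in
# "docker run" so the task process tree lives inside the container.
import shlex
import subprocess

DEFAULT_IMAGE = "airflow-task-runtime:latest"  # assumed default runtime image


def run_task_in_container(dag_id: str, task_id: str, execution_date: str,
                          image: str = DEFAULT_IMAGE) -> int:
    """Launch `airflow run --local` for one task inside a Docker container."""
    airflow_cmd = ["airflow", "run", "--local", dag_id, task_id, execution_date]
    docker_cmd = [
        "docker", "run", "--rm",
        # Share DAGs and logs with the host; paths are assumptions for the sketch.
        "-v", "/opt/airflow/dags:/opt/airflow/dags:ro",
        "-v", "/opt/airflow/logs:/opt/airflow/logs",
        image,
    ] + airflow_cmd
    print("Launching:", " ".join(shlex.quote(c) for c in docker_cmd))
    return subprocess.call(docker_cmd)
```

Because the container exits when `airflow run` exits, any subprocesses the task spawned go away with it, which is the process-isolation benefit listed in the proposal.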
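The scheduler-side change can be sketched the same way: parse the DAG file inside a container, persist the serialized result to a shared file, then read it back on the host and hand it to the scheduler's queue. The file layout, the `process_dag_file` entrypoint, and the use of pickle are assumptions for illustration; the proposal only says SimpleDags are persisted to a file and read outside the container.

```python
# Minimal sketch (assumptions noted) of the two DagFileProcessor steps:
# (1) parse inside a container and persist serialized DAGs to a shared file,
# (2) deserialize on the host and put the result on the multiprocess queue.
import pickle
import subprocess
from multiprocessing import Queue
from pathlib import Path


def parse_dag_file_in_container(dag_file: str, image: str, out_dir: str) -> str:
    """Run DAG-file parsing inside `image`; return the path of the result file."""
    result_file = str(Path(out_dir) / (Path(dag_file).stem + ".pkl"))
    subprocess.check_call([
        "docker", "run", "--rm",
        "-v", f"{Path(dag_file).parent}:/dags:ro",
        "-v", f"{out_dir}:/out",
        image,
        # Hypothetical entrypoint that parses /dags/<file> and pickles the
        # serializable DAGs to /out/<file>.pkl.
        "python", "-m", "process_dag_file",
        f"/dags/{Path(dag_file).name}", f"/out/{Path(result_file).name}",
    ])
    return result_file


def collect_result(result_file: str, queue: Queue) -> None:
    """Deserialize the persisted DAGs and put them on the scheduler queue."""
    with open(result_file, "rb") as f:
        simple_dags = pickle.load(f)
    queue.put(simple_dags)
```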
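Finally, the per-DAG metadata file could be resolved along these lines: look for a sidecar file next to the DAG definition and fall back to the default image when none exists. The naming convention (`<dag>.runtime.yaml`), its schema, and the default image name are assumptions; the proposal only states that a metadata file sits alongside the DAG file and that a default image is provided.

```python
# Minimal sketch (assumptions noted): resolve the runtime image for a DAG file
# from a sidecar metadata file, with a default when none is present.
from pathlib import Path

import yaml  # PyYAML

DEFAULT_IMAGE = "airflow-task-runtime:latest"  # assumed default


def resolve_runtime_image(dag_file: str) -> str:
    """Return the Docker image declared next to `dag_file`, or the default."""
    dag_path = Path(dag_file)
    meta_path = dag_path.parent / (dag_path.stem + ".runtime.yaml")
    if not meta_path.exists():
        return DEFAULT_IMAGE
    meta = yaml.safe_load(meta_path.read_text()) or {}
    # Expected sidecar content (illustrative):
    #   image: my-team/spark-runtime:2021-12-01
    return meta.get("image", DEFAULT_IMAGE)
```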
