dimberman opened a new pull request #15330:
URL: https://github.com/apache/airflow/pull/15330
Add a @task.docker decorator that turns a Python function into a
DockerOperator task, so that the function runs remotely inside a container.
```
from airflow.decorators import task


@task.docker(
    image="quay.io/bitnami/python:3.8.8",
    force_pull=True,
    docker_url="unix://var/run/docker.sock",
    network_mode="bridge",
    api_version="auto",
)
def f():
    # Import inside the function, since the body executes in the container.
    import random

    return [random.random() for _ in range(10000000)]
```
One notable aspect of this architecture is that it had to make as few
assumptions about user setups as possible. We could not share a volume
between the worker and the container, as this would break if the user runs the
Airflow worker in a Docker container. Nor could we assume that users would have
any specialized system libraries on their images (this implementation only
requires Python 3 and bash).
To meet these requirements, we base64-encode a Jinja-generated Python file and
its inputs (both generated using the same functions used by the
PythonVirtualenvOperator) and pass them to the container as environment
variables. Once the container starts, it decodes and deserializes these
strings, runs the function, and stores the result in a file located at
/tmp/script.out.
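The encoding step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the environment variable names and the `encode_payload`/`decode_payload` helpers are hypothetical, and the real script is rendered from a Jinja template rather than passed in as a literal string.

```python
import base64
import pickle

# Hypothetical variable names, chosen for illustration only.
SCRIPT_VAR = "EXAMPLE_PYTHON_SCRIPT"
INPUT_VAR = "EXAMPLE_PYTHON_INPUT"


def encode_payload(script_source, args, kwargs):
    """Worker side: pack the script and its pickled inputs into env-safe strings."""
    pickled_inputs = pickle.dumps({"args": args, "kwargs": kwargs})
    return {
        SCRIPT_VAR: base64.b64encode(script_source.encode("utf-8")).decode("ascii"),
        INPUT_VAR: base64.b64encode(pickled_inputs).decode("ascii"),
    }


def decode_payload(environ):
    """Container side: recover the script text and the call arguments."""
    script_source = base64.b64decode(environ[SCRIPT_VAR]).decode("utf-8")
    inputs = pickle.loads(base64.b64decode(environ[INPUT_VAR]))
    return script_source, inputs["args"], inputs["kwargs"]


# Round trip: encode on the "worker", decode and run in the "container".
env = encode_payload("def f(x):\n    return x * 2\n", args=(21,), kwargs={})
script, args, kwargs = decode_payload(env)
namespace = {}
exec(script, namespace)
result = namespace["f"](*args, **kwargs)
```

Base64 keeps the payload free of newlines and shell-sensitive characters, which is presumably why it suits environment variables on an image that only has Python 3 and bash.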
Once the function completes, the container enters a sleep loop until the
DockerOperator retrieves the result via Docker's get_archive API. The result
is then deserialized using pickle and pushed to XCom in the same way as a
python or python_virtualenv result.
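For readers unfamiliar with get_archive: in the Docker SDK for Python it returns a stream of tar-archive chunks together with file metadata, so the result file has to be unpacked from a tar before it can be unpickled. The sketch below shows that unpacking step under stated assumptions; `load_result_from_archive` and `make_fake_archive` are hypothetical helpers, and the fake archive stands in for a real Docker daemon so that the example runs on its own.

```python
import io
import pickle
import tarfile


def load_result_from_archive(chunks, member_name="script.out"):
    """Join tar chunks, extract one member, and unpickle its contents."""
    buffer = io.BytesIO(b"".join(chunks))
    with tarfile.open(fileobj=buffer, mode="r") as archive:
        extracted = archive.extractfile(member_name)
        if extracted is None:
            raise ValueError(f"{member_name} is not a regular file")
        return pickle.loads(extracted.read())


def make_fake_archive(value, member_name="script.out"):
    """Stand-in for the Docker daemon: tar up a pickled value in memory."""
    payload = pickle.dumps(value)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    data = buffer.getvalue()
    # Split into chunks to mimic a streamed response.
    return [data[i:i + 512] for i in range(0, len(data), 512)]


restored = load_result_from_archive(make_fake_archive([0.1, 0.2, 0.3]))
```

Against a real container, the chunks would come from something like `chunks, _stat = container.get_archive("/tmp/script.out")`. Because the payload is unpickled, it should only ever be read from containers that the operator itself started.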