sanje2v opened a new issue #21832:
URL: https://github.com/apache/airflow/issues/21832
### Apache Airflow version
2.2.4 (latest released)
### What happened
I am using the Airflow 2.2.4 Docker image to run a DAG, `test_dag.py`,
defined as follows:
```
from airflow.decorators import dag, task
from airflow.utils import dates


@dag(schedule_interval=None,
     start_date=dates.days_ago(1),
     catchup=False)
def test_dag():
    @task.docker(image='company/my-repo',
                 api_version='auto',
                 docker_url='tcp://docker-socket-proxy:2375/',
                 auto_remove=True)
    def docker_task(inp):
        print(inp)
        return inp + 1

    @task.python()
    def python_task(inp):
        print(inp)

    out = docker_task(10)
    python_task(out)


_ = test_dag()
```
The Dockerfile for `company/my-repo` is as follows:
```
FROM nvidia/cuda:11.2.2-runtime-ubuntu20.04
USER root
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y python3 python3-pip
```
### What you expected to happen
I expected the DAG logs for `docker_task()` and `python_task()` to have 10
and 11 as output respectively.
Instead, the internal Airflow unmarshaller that is supposed to unpickle the
function definition of `docker_task()` inside the `company/my-repo` container
(passed in via the `__PYTHON_SCRIPT` environment variable) makes an
**incorrect assumption** that the symbol `python` is defined as an alias for
either `/usr/bin/python2` or `/usr/bin/python3`. Most Linux Python
installations require users to explicitly invoke `python2` or `python3` when
running their scripts; `python` is NOT defined even when `python3` is
installed via the APT package manager.
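For illustration, the decode step the container runs can be sketched in plain Python (the script body below is a hypothetical stand-in; the real payload is generated by the `@task.docker` decorator):

```python
import base64
import os

# Hypothetical stand-in for the serialized docker_task() body; the real
# payload is produced by Airflow's @task.docker decorator.
script = b"print(10)\n"

# Airflow ships the script into the container through the __PYTHON_SCRIPT
# environment variable...
os.environ["__PYTHON_SCRIPT"] = base64.b64encode(script).decode("ascii")

# ...and then decodes it inside the container with a `python -c` one-liner,
# which is exactly the step that fails when no `python` alias exists.
decoded = base64.b64decode(os.environ["__PYTHON_SCRIPT"])
with open("/tmp/script.py", "wb") as f:
    f.write(decoded)
```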
This error can be worked around for now by adding the following to the
`Dockerfile` after the `python3` package installation:
`RUN apt-get install -y python-is-python3`
But this should NOT be a requirement. `Dockerfile`s based on the official
Python images do not suffer from this problem because they define the
`python` alias.
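For reference, the full `Dockerfile` with the workaround applied would look like this (same base image as above, with only `python-is-python3` added):

```
FROM nvidia/cuda:11.2.2-runtime-ubuntu20.04
USER root
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y python3 python3-pip python-is-python3
```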
The error logged is:
```
[2022-02-26, 11:30:47 UTC] {docker.py:258} INFO - Starting docker container from image company/my-repo
[2022-02-26, 11:30:48 UTC] {docker.py:320} INFO - + python -c 'import base64, os;x = base64.b64decode(os.environ["__PYTHON_SCRIPT"]);f = open("/tmp/script.py", "wb"); f.write(x);'
[2022-02-26, 11:30:48 UTC] {docker.py:320} INFO - bash: python: command not found
[2022-02-26, 11:30:48 UTC] {taskinstance.py:1700} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1329, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1455, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1511, in _execute_task
    result = execute_callable(context=context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/docker/decorators/docker.py", line 117, in execute
    return super().execute(context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/decorators/base.py", line 134, in execute
    return_value = super().execute(context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/docker/operators/docker.py", line 390, in execute
    return self._run_image()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/docker/operators/docker.py", line 265, in _run_image
    return self._run_image_with_mounts(self.mounts + [tmp_mount], add_tmp_variable=True)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/docker/operators/docker.py", line 324, in _run_image_with_mounts
    raise AirflowException('docker container failed: ' + repr(result) + f"lines {res_lines}")
airflow.exceptions.AirflowException: docker container failed: {'Error': None, 'StatusCode': 127}lines + python -c 'import base64, os;x = base64.b64decode(os.environ["__PYTHON_SCRIPT"]);f = open("/tmp/script.py", "wb"); f.write(x);'
bash: python: command not found
```
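The `StatusCode: 127` in the exception is bash's conventional "command not found" exit status, which can be confirmed with a quick check (the command name below is deliberately nonexistent):

```python
import subprocess

# Running a nonexistent command through bash yields exit status 127 --
# the same StatusCode the Docker daemon reports in the traceback above.
result = subprocess.run(["bash", "-c", "no-such-command-xyz"],
                        capture_output=True)
print(result.returncode)  # 127
```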
### How to reproduce
_No response_
### Operating System
Ubuntu 20.04 WSL 2
### Versions of Apache Airflow Providers
_No response_
### Deployment
Docker-Compose
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)