PKJonas opened a new issue #13905:
URL: https://github.com/apache/airflow/issues/13905
### Bug
**Expected behavior**:
`DockerOperator` should attempt to pull an image when it is not present
locally.
**Actual behavior**:
`DockerOperator` does not attempt to pull an image unless `force_pull` is
set to `True`, instead displaying a misleading 404 error.
**Package versions**:
`apache-airflow-providers-docker 1.0.0`
`docker 3.7.3`
**Workaround**:
Set `force_pull` to `True`
### Minimal repro
Make sure you don't have an image tagged `ubuntu:latest` present locally.
```
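from airflow.providers.docker.operators.docker import DockerOperator

# No force_pull here: the operator is expected to pull the missing image itself.
DockerOperator(
    task_id='try_to_pull_ubuntu',
    image='ubuntu:latest',
    command='echo hello',
)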
```
prints: `{taskinstance.py:1396} ERROR - 404 Client Error: Not Found ("No such image: ubuntu:latest")`
This, on the other hand:
```
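from airflow.providers.docker.operators.docker import DockerOperator

# Same task, but force_pull=True makes the operator pull before running.
DockerOperator(
    task_id='try_to_pull_ubuntu',
    image='ubuntu:latest',
    command='echo hello',
    force_pull=True,
)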
```
pulls the image and prints `{docker.py:263} INFO - hello`
### Source of the bug
When trying to run an image that's not present locally,
`self.cli.images(name=self.image)` in the line:
https://github.com/apache/airflow/blob/8723b1feb82339d7a4ba5b40a6c4d4bbb995a4f9/airflow/providers/docker/operators/docker.py#L286
returns a non-empty list even when the image has been deleted from the local
machine. Because the result is non-empty, the operator appears to treat the
image as present, so unless `force_pull` is set no pull is attempted and a
misleading `404` error is raised downstream.
In fact, `self.cli.images` appears to return non-empty arrays even when
supplied with nonsense image names.
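
To make the failure mode concrete, here is a minimal sketch of the decision I believe the operator is making. It is not the operator's actual code: `FakeClient` and `should_pull` are hypothetical names, and the stub only mimics the behavior described above, where `images(name=...)` returns a non-empty list for an image that is not present.

```python
class FakeClient:
    """Hypothetical stand-in for the Docker client used by the operator."""

    def images(self, name=None):
        # Mimics the behavior reported in this issue: a non-empty list is
        # returned even though the requested image is not present locally.
        return [{"Id": "sha256:unrelated"}]


def should_pull(cli, image, force_pull=False):
    # Assumed shape of the check: pull only when forced, or when the
    # image listing comes back empty.
    return force_pull or not cli.images(name=image)


cli = FakeClient()
skip = should_pull(cli, "ubuntu:latest")
forced = should_pull(cli, "ubuntu:latest", force_pull=True)
print(skip, forced)
```

With a listing that is never empty, the check can only succeed through `force_pull`, which would match both the missing pull and the workaround.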