thomasniebler opened a new pull request #12104:
URL: https://github.com/apache/airflow/pull/12104


   **Apache Airflow version**: 1.10.12
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:  CentOS 7/Python 3.7
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release):
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   When DockerOperator starts a Docker container that exits immediately, the 
Airflow task hangs indefinitely. This is related to 
https://github.com/docker/docker-py/issues/2087
   
   
   **What you expected to happen**:
   
   The DockerOperator should terminate if the corresponding Docker container 
has already stopped.
   
   **How to reproduce it**:
   
   ```python
   from datetime import timedelta
   
   from airflow import DAG
   from airflow.operators.docker_operator import DockerOperator
   
   dummy_DAG = DAG("Dummy", default_args={
       "start_date": "2020-11-04 12:00:00"
   }, schedule_interval="* * * * *")
   
   python_sleep_operator = DockerOperator(
       image="python:3.7",
       # The "taime" typo below is intended: it makes the container exit immediately.
       command="python3 -c \"import taime; time.sleep(5); print('SUCCESS')\"",
       task_id="sleepy",
       dag=dummy_DAG,
   )
   ```
   
   **Anything else we need to know**:
   
   In my opinion, the concrete issue is that `APIClient.attach(stream=True)` 
waits for output, but since the container has already stopped, there will never 
be any. Curiously, `docker attach someStoppedContainer` on the command line 
fails with a corresponding error message, whereas the Python client's call does 
not fail; it simply waits, causing the Airflow task to hang indefinitely.
   
   My proposal is simply to replace `APIClient.attach` with `APIClient.logs`. 
We do not need to attach to the container (which allows bidirectional data 
exchange); we only want to read `stdout` and `stderr`. Furthermore, 
`APIClient.logs` is able to "stream" all output logs from an exited container, 
whereas `APIClient.attach(stream=True)` waits for output from 
that *exited* container.
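
To illustrate the idea, here is a minimal sketch of reading container output 
via `logs` instead of `attach`. It is not the actual DockerOperator code: the 
helper name `run_and_collect_logs` is hypothetical, and `client` is assumed to 
behave like docker-py's `APIClient` (only `create_container`, `start`, `logs` 
and `wait` are used).

```python
def run_and_collect_logs(client, image, command):
    """Start a container and return (exit_code, log_lines).

    `client` is assumed to expose the docker-py APIClient methods used below.
    This helper is illustrative only, not part of Airflow or docker-py.
    """
    container = client.create_container(image=image, command=command)
    container_id = container["Id"]
    client.start(container_id)

    lines = []
    # Read stdout/stderr through the logs endpoint. With follow=True the
    # generator ends once the container has exited, so an already-stopped
    # container does not block the loop.
    for chunk in client.logs(
        container_id, stdout=True, stderr=True, stream=True, follow=True
    ):
        if isinstance(chunk, bytes):
            chunk = chunk.decode("utf-8", errors="replace")
        lines.append(chunk.rstrip())

    result = client.wait(container_id)
    # Newer docker-py versions return a dict, older ones a plain int.
    exit_code = result["StatusCode"] if isinstance(result, dict) else result
    return exit_code, lines
```

Against a real daemon, `client` would be something like 
`docker.APIClient(base_url="unix://var/run/docker.sock")`. For the reproduction 
case above, the expectation is that the loop finishes after the container's 
traceback has been read and the returned exit code is non-zero.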
   
   



