pasalkarsachin1 opened a new issue, #24681:
URL: https://github.com/apache/airflow/issues/24681

   ### Apache Airflow Provider(s)
   
   docker
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-docker==2.7.0
   docker==5.0.3
   
   ### Apache Airflow version
   
   2.3.2 (latest released)
   
   ### Operating System
   
   Ubuntu 20.04.4 LTS (Focal Fossa)
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Deployed using docker compose command
   
   ### What happened
   
   Below is my `DockerOperator` code:
   ```
   extract_data_from_presto = DockerOperator(
               task_id='download_data',
               image=IMAGE_NAME,
               api_version='auto',
               auto_remove=True,
               mount_tmp_dir=False,
               docker_url='unix://var/run/docker.sock',
               network_mode="host",
               tty=True,
               xcom_all=False,
               mounts=MOUNTS,
               environment={
                   "PYTHONPATH": "/opt",
               },
               command="test.py",
               retries=3,
               dag=dag,
           )
   ```
   The last line printed in the container is not pushed to XCom. In my case, the 
last line printed in the container is:
   
   `[2022-06-27, 08:31:34 UTC] {docker.py:312} INFO - {"day": 20220627, 
"batch": 1656318682, "source": "all",  "os": "ubuntu"}`
   
   However, the XCom value shown in the UI is empty:
   <img width="1329" alt="image" 
src="https://user-images.githubusercontent.com/25153155/175916850-8f50c579-9d26-44bc-94ae-6d072701ff0b.png">
   
   
   
   ### What you think should happen instead
   
   It should have returned `{"day": 20220627, "batch": 1656318682, "source": 
"all",  "os": "ubuntu"}` as the value of `return_value`.
   
   ### How to reproduce
   
   I am not able to reproduce it with a minimal example, but it fails with my 
application. So I extended the `DockerOperator` class in my code, copied the 
`_run_image_with_mounts` method, and added two print statements:
   ```
                   print(f"log lines from attach {log_lines}")
                   try:
                       if self.xcom_all:
                           return [stringify(line).strip() for line in 
self.cli.logs(**log_parameters)]
                       else:
                           lines = [stringify(line).strip() for line in 
self.cli.logs(**log_parameters, tail=1)]
                           print(f"lines from logs: {lines}")
   ```
   The value of `log_lines` comes from this 
[line](https://github.com/apache/airflow/blob/main/airflow/providers/docker/operators/docker.py#L309).
   
   The output is shown below. The first line is the last print statement in my container code.
   ```
   [2022-06-27, 14:43:26 UTC] {pipeline.py:103} INFO - {"day": 20220627, 
"batch": 1656340990, "os": "ubuntu", "source": "all"}
   [2022-06-27, 14:43:27 UTC] {logging_mixin.py:115} INFO - log lines from 
attach ['2022-06-27, 14:43:15 UTC - root - read_from_presto - INFO - Processing 
datetime is 2022-06-27 14:43:10.755685', '2022-06-27, 14:43:15 UTC - 
pyhive.presto - presto - INFO - SHOW COLUMNS FROM <truncated data as it's too 
long>, '{"day": 20220627, "batch": 1656340990, "os": "ubuntu", "source": 
"all"}']
   [2022-06-27, 14:43:27 UTC] {logging_mixin.py:115} INFO - lines from logs: 
['{', '"', 'd', 'a', 'y', '"', ':', '', '2', '0', '2', '2', '0', '6', '2', '7', 
',', '', '"', 'b', 'a', 't', 'c', 'h', '"', ':', '', '1', '6', '5', '6', '3', 
'4', '0', '9', '9', '0', ',', '', '"', 'o', 's', '"', ':', '', '"', 'u', 'b', 
'u', 'n', 't', 'u', '"', ',', '', '"', 's', 'o', 'u', 'r', 'c', 'e', '"', ':', 
'', '"', 'a', 'l', 'l', '"', '}', '', '']
   
   ``` 
   
   From the above you can see that, for some unknown reason, 
`self.cli.logs(**log_parameters, tail=1)` returns the last line as individual 
characters rather than as one string. This behavior was introduced by this 
[change](https://github.com/apache/airflow/commit/2f4a3d4d4008a95fc36971802c514fef68e8a5d4). 
Before that, the operator returned the data from `log_lines`.
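
To illustrate the symptom without a Docker daemon, here is a minimal sketch. It is not the provider's code, and I have not confirmed why the stream is fragmented. It only assumes that the log stream can yield pieces smaller than a line (down to one character), and shows that joining the raw pieces before splitting into lines recovers the expected value. The function name `last_log_line` is hypothetical.

```python
from typing import Iterable, Optional, Union


def last_log_line(chunks: Iterable[Union[bytes, str]]) -> Optional[str]:
    """Return the last non-empty line from a stream of arbitrarily sized chunks.

    Hypothetical helper for illustration only; it is not part of Airflow
    or of the docker Python package.
    """
    # Join the raw chunks first so that whitespace and multi-byte
    # characters are not lost by processing each fragment on its own.
    raw = b"".join(
        chunk if isinstance(chunk, bytes) else chunk.encode("utf-8")
        for chunk in chunks
    )
    text = raw.decode("utf-8", errors="replace")
    # Only now split into lines and drop the empty ones.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[-1] if lines else None


# Simulate a stream that delivers one character per chunk.
payload = 'first line\r\n{"day": 20220627, "os": "ubuntu"}\r\n'
one_char_chunks = [char.encode("utf-8") for char in payload]
print(last_log_line(one_char_chunks))
```

Stripping each fragment separately, as in the output above, turns every space into an empty string, so the original line cannot be rebuilt afterwards.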
   
   My suggestion is to modify the code as below:
   ```
                       if self.xcom_all:
                           return [stringify(line).strip() for line in 
log_lines]
                       else:
                           lines = [stringify(line).strip() for line in 
log_lines]
                           return lines[-1] if lines else None
   
   ```
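
The same suggestion as a standalone function, so the intended behavior can be checked outside the operator. This is a sketch under assumptions: `select_xcom_value` is a hypothetical name, and `stringify` here is a simplified stand-in for the helper used in `docker.py`, not a copy of it.

```python
from typing import Iterable, List, Optional, Union


def stringify(line: Union[bytes, str]) -> str:
    """Simplified stand-in for the provider's helper: decode bytes, pass str through."""
    if isinstance(line, bytes):
        return line.decode("utf-8", errors="replace")
    return line


def select_xcom_value(
    log_lines: Iterable[Union[bytes, str]], xcom_all: bool
) -> Union[List[str], Optional[str]]:
    """Pick the XCom value from lines already collected while attached to the container."""
    lines = [stringify(line).strip() for line in log_lines]
    if xcom_all:
        # Push every log line.
        return lines
    # Push only the last line, or None when the container printed nothing.
    return lines[-1] if lines else None


collected = [
    "2022-06-27, 14:43:15 UTC - root - read_from_presto - INFO - Processing",
    b'{"day": 20220627, "batch": 1656340990, "os": "ubuntu", "source": "all"}',
]
print(select_xcom_value(collected, xcom_all=False))
```

Reusing `log_lines` avoids a second call to the logs API, so the result does not depend on how that call splits its output.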
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

