melle2 opened a new issue #11085:
URL: https://github.com/apache/airflow/issues/11085


   **Context**
   What I'm going to describe here is more or less related to #7141, but the 
context is broader than what is described there, which is why I'm opening a 
separate bug report.
   I'm struggling with this issue too. I have tried a lot to get the log 
entries to show up in the web GUI. Maybe I'm wrong, but in my view the whole 
Elasticsearch logging process cannot be used at all as-is. I followed the 
entire discussion in #7141, but it seems some of the points are still unclear. 
I'll try to explain the issues at a higher level instead of talking Python.
   
   My configuration:
   ```
   [elasticsearch]
   host = http://elasticsearch_de:9200
   log_id_template = {{dag_id}}-{{task_id}}-{{execution_date}}-{{try_number}}
   end_of_log_mark = end_of_log
   write_stdout = True
   json_format = True
   json_fields = execution_date, asctime, filename, lineno, levelname, message
   ```
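   For reference, the log_id the web GUI looks up should be exactly what this 
template renders to. A minimal sketch (the field values come from the example 
task below; Airflow does the substitution itself, so the `str.format` call 
here only illustrates the expected result):

   ```python
   # Sketch: render the configured log_id_template the way the webserver
   # builds the log_id it queries Elasticsearch for. The concrete field
   # values are taken from the example task in this report.
   log_id_template = "{dag_id}-{task_id}-{execution_date}-{try_number}"

   log_id = log_id_template.format(
       dag_id="stg_cam_asm_load",
       task_id="asm_stage",
       execution_date="2020-09-21T14:00:00+00:00",
       try_number=1,
   )
   print(log_id)
   # stg_cam_asm_load-asm_stage-2020-09-21T14:00:00+00:00-1
   ```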
   Here is the raw (snipped) console log output after a DAG has been 
executed:
   ```
   {"execution_date": "2020_09_21T14_00_00_000000", "asctime": null, 
"filename": "taskinstance.py", "lineno": 670, "levelname": "INFO", "message": 
"Dependencies all met for <TaskInstance: stg_cam_asm_load.asm_stage 
2020-09-21T14:00:00+00:00 [queued]>", "dag_id": "stg_cam_asm_load", "task_id": 
"asm_stage", "try_number": "1"}
   
   ...
   
   {"execution_date": "2020_09_21T14_00_00_000000", "asctime": null, 
"filename": "taskinstance.py", "lineno": 1187, "levelname": "INFO", "message": 
"Marking task as FAILED. dag_id=stg_cam_asm_load, task_id=asm_stage, 
execution_date=20200921T140000, start_date=20200921T150124, 
end_date=20200921T150125", "dag_id": "stg_cam_asm_load", "task_id": 
"asm_stage", "try_number": "1"}
   {"execution_date": "2020_08_29T14_00_00_000000", "asctime": null, 
"filename": "local_task_job.py", "lineno": 102, "levelname": "INFO", "message": 
"Task exited with return code 1", "dag_id": "stg_cam_asm_load", "task_id": 
"mail_on_failure_production_support", "try_number": "1"}
   end_of_log[2020-09-21 17:01:30,274: INFO/ForkPoolWorker-15] Task 
airflow.executors.celery_executor.execute_command[1cd84830-e24b-459a-9fc9-46bb4102709b]
 succeeded in 8.639222555000742s: None
   ```
   I sent all of these entries to Elasticsearch manually (exactly as they 
appear in the output):
   
   1. ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data '{"execution_date": 
"2020_08_29T14_00_00_000000", "asctime": null, "filename": "taskinstance.py", 
"lineno": 670, "levelname": "INFO", "message": "Dependencies all met for 
<TaskInstance: stg_cam_asm_load.mail_on_failure_production_support 
2020-08-29T14:00:00+00:00 [queued]>", "dag_id": "stg_cam_asm_load", "task_id": 
"mail_on_failure_production_support", "try_number": "1"}'```
   2. ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data '{"execution_date": 
"2020_09_21T14_00_00_000000", "asctime": null, "filename": "taskinstance.py", 
"lineno": 1187, "levelname": "INFO", "message": "Marking task as FAILED. 
dag_id=stg_cam_asm_load, task_id=asm_stage, execution_date=20200921T140000, 
start_date=20200921T150124, end_date=20200921T150125", "dag_id": 
"stg_cam_asm_load", "task_id": "asm_stage", "try_number": "1"}'```
   3. ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data '{"execution_date": 
"2020_08_29T14_00_00_000000", "asctime": null, "filename": "local_task_job.py", 
"lineno": 102, "levelname": "INFO", "message": "Task exited with return code 
1", "dag_id": "stg_cam_asm_load", "task_id": 
"mail_on_failure_production_support", "try_number": "1"}'```
   4. ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data 'end_of_log[2020-09-21 
17:01:30,274: INFO/ForkPoolWorker-15] Task 
airflow.executors.celery_executor.execute_command[1cd84830-e24b-459a-9fc9-46bb4102709b]
 succeeded in 8.639222555000742s: None'```
   
   **Problem Description**
   Based on these entries the web server cannot find ANY log entry, because 
there are several issues.
   1. Most obviously, there is no linefeed after "end_of_log", so the 
complete line, including the trailing Celery output, is sent to Elasticsearch 
as a single entry.
   1.1 Even if I add a "\n" to the configuration, I see the literal string 
"end_of_log\n" in the output.
   1.2 This output is plain text, not JSON.
   2. The "log_id" defined by the "log_id_template" configuration option is 
missing from every log entry.
   3. The "offset" is missing. The offset defines the order of the log 
entries; without it, nothing is displayed.
   4. In addition, the "execution_date" in the log output does not match the 
execution date format the web GUI uses when querying Elasticsearch:
   ```
   [2020-09-21 16:00:30,922] {base.py:81} INFO - GET 
http://elasticsearch_de:9200/_count [status:200 request:0.003s]
   192.168.105.1 - - [21/Sep/2020:16:00:30 +0000] "GET 
/admin/airflow/get_logs_with_metadata?dag_id=stg_cam_asm_load&task_id=asm_stage&execution_date=2020-09-21T14%3A00%3A00%2B00%3A00&try_number=1&metadata=%7B%22end_of_log%22%3Afalse%2C%22last_log_timestamp%22%3A%222020-09-21T15%3A55%3A31.254697%2B00%3A00%22%2C%22offset%22%3A%220%22%7D
 HTTP/1.1" 200 116 
"http://192.168.105.100:8080/admin/airflow/log?task_id=asm_stage&dag_id=stg_cam_asm_load&execution_date=2020-09-21T14%3A00%3A00%2B00%3A00&format=json";
 "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 
Firefox/80.0"
   ```
   As you can see, 2020_09_21T14_00_00_000000 != 2020-09-21T14:00:00+00:00 
(URL-encoded: 2020-09-21T14%3A00%3A00%2B00%3A00).
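   To make the mismatch concrete, here is a small sketch that parses the 
underscored timestamp from the log output and re-renders it in the ISO form 
the GUI queries with (assuming the logged timestamp is UTC, as the query 
suggests):

   ```python
   from datetime import datetime, timezone

   # The two formats side by side: the underscored form appears in the
   # JSON log output; the ISO form is what the web GUI sends in its query.
   logged = "2020_09_21T14_00_00_000000"
   queried = "2020-09-21T14:00:00+00:00"

   # Parse the underscored variant and re-render it as the ISO form.
   parsed = datetime.strptime(logged, "%Y_%m_%dT%H_%M_%S_%f").replace(
       tzinfo=timezone.utc
   )
   print(parsed.isoformat())
   # 2020-09-21T14:00:00+00:00
   ```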
   
   **Expected Behavior**
   Only if I fix everything described in issues 1-4, i.e. add [log_id] and 
[offset], fix the [end_of_log] output, and adjust the [execution_date]:
   1. ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data '{"log_id": 
"stg_cam_asm_load-asm_stage-2020-09-21T14:00:00+00:00-1", "execution_date": 
"2020_09_21T14_00_00_000000", "asctime": null, "filename": "taskinstance.py", 
"lineno": 670, "levelname": "INFO", "message": "Dependencies all met for 
<TaskInstance: stg_cam_asm_load.asm_stage 2020-09-21T14:00:00+00:00 [queued]>", 
"dag_id": "stg_cam_asm_load", "task_id": "asm_stage", "try_number": "1", 
"offset": "1"}'```
   2.  ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data '{"log_id": 
"stg_cam_asm_load-asm_stage-2020-09-21T14:00:00+00:00-1", "execution_date": 
"2020_09_21T14_00_00_000000", "asctime": null, "filename": "taskinstance.py", 
"lineno": 1187, "levelname": "INFO", "message": "Marking task as FAILED. 
dag_id=stg_cam_asm_load, task_id=asm_stage, execution_date=20200921T140000, 
start_date=20200921T150124, end_date=20200921T150125", "dag_id": 
"stg_cam_asm_load", "task_id": "asm_stage", "try_number": "1", "offset": 
"2"}'```
   3.  ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data '{"log_id": 
"stg_cam_asm_load-asm_stage-2020-09-21T14:00:00+00:00-1", "execution_date": 
"2020_09_21T14_00_00_000000", "asctime": null, "filename": "local_task_job.py", 
"lineno": 102, "levelname": "INFO", "message": "Task exited with return code 
1", "dag_id": "stg_cam_asm_load", "task_id": "asm_stage", "try_number": "1", 
"offset": "3"}'```
   4. ```curl -X POST -H 'Content-Type: application/json' -i 
'http://elasticsearch_de:9200/airflow/_doc' --data '{"log_id": 
"stg_cam_asm_load-asm_stage-2020-09-21T14:00:00+00:00-1", "message": 
"end_of_log", "offset": "4"}'```
   
   ...all log entries are displayed in the web GUI:
   
![grafik](https://user-images.githubusercontent.com/71716485/93911149-553c5280-fd02-11ea-9285-d756398f62c7.png)
   
   What would be a helpful bugfix:
   1. Ensure that every log output is enriched with log_id and offset.
   2. Log [end_of_log] as valid JSON (or think about a different mechanism 
altogether).
   3. Make the [execution_date] in the log output match the format the web 
GUI uses when querying Elasticsearch.
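   Until this is fixed, a post-processing step that enriches the raw worker 
stdout lines before indexing could look roughly like this. This is only a 
sketch under the assumptions above: the log_id format, the offset as a simple 
running counter, and wrapping the bare end_of_log marker in JSON are my own 
choices for illustration, not Airflow's behavior.

   ```python
   import json


   def enrich(lines, dag_id, task_id, execution_date, try_number):
       """Yield Elasticsearch-ready documents from raw worker stdout lines.

       Adds the log_id and a running offset to every entry, and wraps the
       bare "end_of_log" marker line in a JSON document of its own.
       """
       log_id = f"{dag_id}-{task_id}-{execution_date}-{try_number}"
       for offset, line in enumerate(lines, start=1):
           if line.startswith("end_of_log"):
               # Drop the trailing Celery output that follows the marker.
               doc = {"message": "end_of_log"}
           else:
               doc = json.loads(line)
           doc["log_id"] = log_id
           doc["offset"] = str(offset)
           yield doc
   ```

   Each yielded document could then be POSTed to the index as in the curl 
commands above.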
   
   **Note**
   Maybe my understanding of the whole process is wrong. Any solution that 
does not require a bugfix is highly welcome.
   
   
   **Apache Airflow version**:
   1.10.12
   
   **Environment**:
   - **Cloud provider or hardware configuration**:
   Docker Image in Oracle VirtualBox
   - **OS** (e.g. from /etc/os-release):
   RHEL8
   - **Others**:
   Python 3.8
   

