eyalzek opened a new issue #10406:
URL: https://github.com/apache/airflow/issues/10406


   **Apache Airflow version**:
   apache/airflow:1.10.11
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   v1.16.11-gke.5
   
   **Environment**:
   GKE
   
   
   **What happened**:
   The webserver doesn't fetch task logs from Elasticsearch.
   
   **What you expected to happen**:
   Task logs are displayed in the webserver UI.
   
   It seems like the webserver is trying to query task logs by the `log_id` 
field:
   
https://github.com/apache/airflow/blob/1.10.11/airflow/utils/log/es_task_handler.py#L175
   
   This field is missing from every log line written to stdout when tasks run 
under the KubernetesExecutor. Example log line:
   `{"asctime": null, "filename": "standard_task_runner.py", "lineno": 77, 
"levelname": "INFO", "message": "Running: ['airflow', 'run', 'hello_world', 
'hello_task_3', '2020-08-19T14:26:07.226064+00:00', '--job_id', '158', 
'--pool', 'default_pool', '--raw', '-sd', 
'/opt/airflow/dags/repo/dags/hello_world.py', '--cfg_path', 
'/tmp/tmpt7lafkaf']", "dag_id": "hello_world", "task_id": "hello_task_3", 
"execution_date": "2020_08_19T14_26_07_226064", "try_number": "1"}`
   
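For reference, the key the webserver looks up can be sketched as below. This is a minimal sketch, not the handler's actual code: it assumes the default `log_id_template` of `{dag_id}-{task_id}-{execution_date}-{try_number}` and the underscore date format visible in the example line above. `build_log_id` and `build_query` are hypothetical helper names, and the query body only approximates the `match_phrase` lookup on `log_id` that the handler issues.

```python
from datetime import datetime, timezone

# Assumption: default value of [elasticsearch] log_id_template in Airflow 1.10.x.
LOG_ID_TEMPLATE = "{dag_id}-{task_id}-{execution_date}-{try_number}"


def clean_execution_date(execution_date):
    """Format the date the way it appears in the JSON log line (underscores only)."""
    return execution_date.strftime("%Y_%m_%dT%H_%M_%S_%f")


def build_log_id(dag_id, task_id, execution_date, try_number):
    """Hypothetical helper: render the log_id for one task try."""
    return LOG_ID_TEMPLATE.format(
        dag_id=dag_id,
        task_id=task_id,
        execution_date=clean_execution_date(execution_date),
        try_number=try_number,
    )


def build_query(log_id):
    """Hypothetical helper: approximate Elasticsearch query body for that log_id."""
    return {"query": {"match_phrase": {"log_id": log_id}}, "sort": ["offset"]}


log_id = build_log_id(
    "hello_world",
    "hello_task_3",
    datetime(2020, 8, 19, 14, 26, 7, 226064, tzinfo=timezone.utc),
    1,
)
print(log_id)  # hello_world-hello_task_3-2020_08_19T14_26_07_226064-1
print(build_query(log_id))
```

A document that carries `dag_id`, `task_id`, `execution_date` and `try_number` but no `log_id` field, like the example line above, would not match this query.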
   
   **How to reproduce it**:
   This is the relevant configuration we have. The scheduler and webserver run 
separately, and tasks run using the KubernetesExecutor (all in the same 
cluster/namespace):
   ```
   AIRFLOW__CORE__LOGGING_LEVEL: INFO
   AIRFLOW__CORE__REMOTE_LOGGING: "True"
   AIRFLOW__ELASTICSEARCH__HOST: http://elasticsearch.logging:9200
   AIRFLOW__ELASTICSEARCH__JSON_FORMAT: "True"
   AIRFLOW__ELASTICSEARCH__WRITE_STDOUT: "True"
   ```
   
   We are using fluentd 
(https://github.com/fluent/fluentd-kubernetes-daemonset) to forward log lines 
to Elasticsearch. All task logs are written to stdout and arrive in 
Elasticsearch as expected.
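
To illustrate what the forwarded documents lack, here is a small sketch of the enrichment a log pipeline would have to apply before indexing. It is an assumption-laden illustration, not a fluentd configuration and not part of Airflow: `add_log_id` is a hypothetical name, and the template is the assumed default `{dag_id}-{task_id}-{execution_date}-{try_number}`.

```python
import json

# Assumption: default log_id_template; adjust if the deployment overrides it.
LOG_ID_TEMPLATE = "{dag_id}-{task_id}-{execution_date}-{try_number}"


def add_log_id(line):
    """Hypothetical helper: parse one JSON log line and add a log_id field.

    The four fields the template needs are already present in each record,
    and execution_date is already in its underscore form.
    """
    record = json.loads(line)
    record["log_id"] = LOG_ID_TEMPLATE.format(
        dag_id=record["dag_id"],
        task_id=record["task_id"],
        execution_date=record["execution_date"],
        try_number=record["try_number"],
    )
    return json.dumps(record)


sample = json.dumps({
    "levelname": "INFO",
    "message": "example message",
    "dag_id": "hello_world",
    "task_id": "hello_task_3",
    "execution_date": "2020_08_19T14_26_07_226064",
    "try_number": "1",
})
enriched = json.loads(add_log_id(sample))
print(enriched["log_id"])  # hello_world-hello_task_3-2020_08_19T14_26_07_226064-1
```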



