streetmapp opened a new issue #10080:
URL: https://github.com/apache/airflow/issues/10080


   **Summary**: While troubleshooting why my tasks weren't firing correctly in my Airflow cluster running on Kubernetes with the Kubernetes Executor and KubernetesPodOperator, I needed to get at the task logs. By default in the chart in this repo, they are written to local files rather than to stdout. I was initially able to view the logs by configuring remote logging to GCS, but that wasn't satisfactory for my use case. I did eventually find documentation for the behavior I wanted, oddly enough under the Elasticsearch logging configuration. However, getting that working in turn broke the Logs tab in the UI.
   
   **Apache Airflow version**: 1.10.11
   
   **Kubernetes version**: 1.16.9
   
   **Cloud provider or hardware configuration**: GKE
   
   **What happened**:
   When I try to view the logs for a task in the UI, the Logs tab just shows a loading spinner indefinitely. No errors appear in the webserver logs when I access the page.
   
   **What you expected to happen**:
   To be able to view the logs that exist in the pod from the UI. I think the issue lies in how logging to stdout has to be configured. To get it working I effectively had to configure Elasticsearch logging, but with values chosen so that the logs go to the pod's stdout. That lets me use my own Kubernetes logging mechanism to view the logs, but it in turn leaves the Airflow UI unable to display them.
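
   As a side note, since this runs on Kubernetes, the same options can also be supplied as environment variables (Airflow maps `AIRFLOW__{SECTION}__{KEY}` onto config entries). A rough equivalent of the `airflow.cfg` shown in the reproduction steps below:
   
   ```
   AIRFLOW__CORE__REMOTE_LOGGING=True
   AIRFLOW__ELASTICSEARCH__HOST=localhost
   AIRFLOW__ELASTICSEARCH__WRITE_STDOUT=True
   AIRFLOW__ELASTICSEARCH__JSON_FORMAT=False
   ```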
   
   
   **How to reproduce it**:
   1. Set up Airflow to run with the Kubernetes Executor.
   2. Follow the configuration steps [here](https://airflow.apache.org/docs/stable/howto/write-logs.html#writing-logs-to-elasticsearch) to configure Elasticsearch logging to stdout.
   
   The configuration from the link above wasn't enough on its own, so I added the `host` key to the configuration, and that got me the results I wanted. The `json_format` output was too noisy, so I turned that off as well.
   ```
   [core]
   # Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elastic Search.
   # Users must supply an Airflow connection id that provides access to the storage
   # location. If remote_logging is set to true, see UPDATING.md for additional
   # configuration requirements.
   remote_logging = True
   
   [elasticsearch]
   # There is no actual Elasticsearch cluster; the host key just had to be set to get this working.
   host = localhost
   log_id_template = {{dag_id}}-{{task_id}}-{{execution_date}}-{{try_number}}
   end_of_log_mark = end_of_log
   # Write task logs to the worker pod's stdout instead of shipping them to Elasticsearch.
   write_stdout = True
   # Plain-text output; the JSON format was too noisy.
   json_format = False
   ```
   3. Execute a DAG with a KubernetesPodOperator (a minimal example is sketched after the screenshot below).
   4. In the UI, go to a task and try to view its logs. Upon doing so, I get this:
   
   
![image](https://user-images.githubusercontent.com/2141830/89043991-5ad77680-d317-11ea-8003-1a078aa9ae94.png)
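
   For step 3, a minimal sketch of the kind of DAG involved; the DAG id, image, and command below are illustrative placeholders, not my actual workload:
   
   ```python
   from datetime import datetime
   
   from airflow import DAG
   from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
   
   with DAG(
       dag_id="stdout_logging_repro",  # placeholder name
       start_date=datetime(2020, 7, 1),
       schedule_interval=None,
   ) as dag:
       KubernetesPodOperator(
           task_id="hello",
           name="hello",
           namespace="default",  # assumes tasks run in the default namespace
           image="busybox",
           cmds=["sh", "-c", "echo hello from the pod"],
           get_logs=True,  # stream the pod's stdout back through the task log handler
       )
   ```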
   
   **Anything else we need to know**:
   This happens with the logs of every task I try to view. I would expect to be able to view the logs in the UI despite having configured them to also go somewhere else; when I did this with GCS remote logging, the UI kept working, so I would expect the same here. That said, I'm not entirely surprised, given that the configuration needed to get this working at all seems rather backwards: I shouldn't have to configure Elasticsearch logging just to get logs to show up on stdout when I have no intention of using Elasticsearch. One of the big takeaways is that more work needs to be done to improve the integration of Airflow with Kubernetes.
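
   For comparison, the GCS remote logging setup that did render in the UI looked roughly like the following; the bucket and connection id here are placeholders rather than my actual values:
   
   ```
   [core]
   remote_logging = True
   # Connection with access to the log bucket; google_cloud_default is only an example id.
   remote_log_conn_id = google_cloud_default
   remote_base_log_folder = gs://my-log-bucket/logs
   ```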

