rhwang10 commented on issue #4303: [AIRFLOW-3370] Elasticsearch log task 
handler additional features
URL: https://github.com/apache/airflow/pull/4303#issuecomment-460654056
 
 
   > Really appreciate all the detailed docs in the PR and comments 👍 Pardon me 
for being unfamiliar with EFK stack, if we are not in a k8s env and multiple 
tasks were run under the same main airflow worker process, can we easily 
separate those logs from different tis? If in a k8s env and we have 1 ti per 
pod, will this include the logs from the main process into the ti log?
   
   @KevinYang21 Thanks for the very detailed review! I loved reading through 
your comments and am looking into addressing all of them. Each TI that runs 
on a worker process has its logs tagged with a `log_id` field, but the logs 
from the main process are not tagged. When a log entry is read from 
Elasticsearch, the handler expects that `log_id` to be part of the HTTP 
request, so the logs from the main process won't be part of the response, 
since their `log_id` is empty/null. In the EFK stack, we tag every TI log 
with a `log_id` in the Fluentd pipeline before sending it off to 
Elasticsearch 
(https://github.com/astronomer/helm.astronomer.io/blob/master/charts/fluentd/templates/fluentd-configmap.yaml#L135-L143), 
but a user can do it any way they'd like (Logstash, Beats) as long as that 
`log_id` field is present.
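
   To make the `log_id` separation concrete, here is a minimal sketch in 
plain Python. It is an illustration only, not the handler code from this PR: 
the `LOG_ID_TEMPLATE` format, the helper names (`render_log_id`, 
`build_query`, `select_ti_logs`) and the sample records are all assumptions 
made up for the example.

```python
from typing import Any, Dict, Iterable, List

# Assumed log_id format for illustration; the real format may differ.
LOG_ID_TEMPLATE = "{dag_id}-{task_id}-{execution_date}-{try_number}"


def render_log_id(dag_id: str, task_id: str, execution_date: str, try_number: int) -> str:
    """Build the identifier that every log line of one TI would be tagged with."""
    return LOG_ID_TEMPLATE.format(
        dag_id=dag_id,
        task_id=task_id,
        execution_date=execution_date,
        try_number=try_number,
    )


def build_query(log_id: str) -> Dict[str, Any]:
    """Elasticsearch query body that matches only documents carrying this log_id."""
    return {"query": {"match_phrase": {"log_id": log_id}}}


def select_ti_logs(records: Iterable[Dict[str, Any]], log_id: str) -> List[Dict[str, Any]]:
    """In-memory stand-in for the search: keep records whose log_id matches.

    Records with no log_id (e.g. main process logs) never match.
    """
    return [r for r in records if r.get("log_id") == log_id]


# Hypothetical documents as they might look after the tagging pipeline.
records = [
    {"log_id": "example_dag-task_a-2019-02-05T00:00:00-1", "message": "task_a started"},
    {"log_id": "example_dag-task_b-2019-02-05T00:00:00-1", "message": "task_b started"},
    {"message": "worker heartbeat"},  # main process log: no log_id field
]

wanted = render_log_id("example_dag", "task_a", "2019-02-05T00:00:00", 1)
ti_logs = select_ti_logs(records, wanted)
print([r["message"] for r in ti_logs])
```

   Two TIs sharing a worker process end up with different `log_id` values, 
so their logs stay separate, and the untagged main process entry is never 
returned for either of them.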

