Taragolis commented on issue #23832:
URL: https://github.com/apache/airflow/issues/23832#issuecomment-1137572804

   @Dark-Knight11 I would like to warn about a side effect of fetching AWS Glue logs 
into Airflow task logs (already mentioned in an almost identical issue: 
https://github.com/apache/airflow/issues/23900#issuecomment-1137357273).
   
   If your Glue job uses 40 DPUs, it spawns at least 1 driver and 39 
workers/executors, and all of them write error and output logs. As a result:
   1. You need to find all CloudWatch Logs prefixes that start with `job_id` in 
the correct log group. By default Glue uses:
      - `/aws-glue/jobs/output` - for output logs
      - `/aws-glue/jobs/error` - for error logs
      - `/aws-glue/jobs/logs-v2` - (optional) for continuous logging
   2. You need to be sure that you fetch from all prefixes, keeping in mind that these 
prefixes are not all created at the same time
   3. Be sure that the log-fetching processes/threads do not use all the CPU/memory/IO 
of the Airflow worker
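
A minimal sketch of points 1 and 2, not a definitive implementation. It assumes a boto3 CloudWatch Logs client (`boto3.client("logs")`) is passed in and relies only on its `describe_log_streams` call. The helper names `list_streams` and `poll_new_streams` are hypothetical, and the prefix argument is whatever value the comment above calls `job_id`. Calling `poll_new_streams` repeatedly with the same `seen` set picks up prefixes that appear later in the job run.

```python
from typing import Any, List, Set, Tuple

# Default Glue log groups listed above; logs-v2 exists only when
# continuous logging is enabled.
GLUE_LOG_GROUPS = [
    "/aws-glue/jobs/output",
    "/aws-glue/jobs/error",
    "/aws-glue/jobs/logs-v2",
]


def list_streams(logs_client: Any, log_group: str, prefix: str) -> List[str]:
    """Return every log stream name in log_group that starts with prefix."""
    names: List[str] = []
    kwargs = {"logGroupName": log_group, "logStreamNamePrefix": prefix}
    while True:
        page = logs_client.describe_log_streams(**kwargs)
        names.extend(s["logStreamName"] for s in page.get("logStreams", []))
        token = page.get("nextToken")
        if not token:
            return names
        # More pages remain: request the next one.
        kwargs["nextToken"] = token


def poll_new_streams(
    logs_client: Any,
    prefix: str,
    seen: Set[Tuple[str, str]],
    log_groups: List[str] = GLUE_LOG_GROUPS,
) -> List[Tuple[str, str]]:
    """Return (log_group, stream) pairs not seen in earlier polls.

    The caller keeps `seen` between polls, so streams created after the
    first poll are reported the next time this function runs.
    """
    new: List[Tuple[str, str]] = []
    for group in log_groups:
        for name in list_streams(logs_client, group, prefix):
            key = (group, name)
            if key not in seen:
                seen.add(key)
                new.append(key)
    return new
```

The sketch does not handle a missing log group (for example, `/aws-glue/jobs/logs-v2` when continuous logging is disabled), and it does nothing about point 3: running the poll on a fixed interval from a single thread, rather than one thread per prefix, is one way to bound the load on the Airflow worker.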
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
