Taragolis commented on issue #23832: URL: https://github.com/apache/airflow/issues/23832#issuecomment-1137572804
@Dark-Knight11 I would like to warn about a side effect of fetching AWS Glue logs into Airflow task logs (already mentioned in an almost identical issue: https://github.com/apache/airflow/issues/23900#issuecomment-1137357273).

If your Glue Job uses 40 DPU, it spawns at least 1 driver and 39 workers/executors, and all of them write both error and output logs. As a result:

1. You need to find every CloudWatch Logs stream whose name starts with `job_id` in the correct log group. By default Glue uses:
   - `/aws-glue/jobs/output` for output logs
   - `/aws-glue/jobs/error` for error logs
   - `/aws-glue/jobs/logs-v2` (optional) for continuous logging
2. You need to make sure you fetch from all of these prefixes, and keep in mind that they are not all created at the same time.
3. You need to make sure the log-fetching processes/threads do not use up all the CPU/memory/IO of the Airflow worker.
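The lookup in step 1 could be sketched roughly as follows. This is a minimal illustration, not the Airflow provider's implementation: `find_glue_log_streams` and `DEFAULT_GLUE_LOG_GROUPS` are hypothetical names, and the code assumes `logs_client` behaves like a boto3 CloudWatch Logs client (it only relies on the `describe_log_streams` paginator). It does not handle a missing log group, which is an assumption worth revisiting since continuous logging is optional.

```python
from typing import Dict, Iterable, List

# Default Glue log groups listed in the comment above.
DEFAULT_GLUE_LOG_GROUPS = (
    "/aws-glue/jobs/output",
    "/aws-glue/jobs/error",
    "/aws-glue/jobs/logs-v2",  # only present when continuous logging is enabled
)


def find_glue_log_streams(
    logs_client,
    stream_prefix: str,
    log_groups: Iterable[str] = DEFAULT_GLUE_LOG_GROUPS,
) -> Dict[str, List[str]]:
    """Hypothetical helper: map each log group to the names of the
    streams that start with ``stream_prefix``.

    Assumption: ``logs_client`` exposes ``get_paginator`` the way a
    boto3 CloudWatch Logs client does.
    """
    paginator = logs_client.get_paginator("describe_log_streams")
    found: Dict[str, List[str]] = {}
    for group in log_groups:
        names: List[str] = []
        # Paginate: a 40 DPU job can produce many streams per group.
        for page in paginator.paginate(
            logGroupName=group, logStreamNamePrefix=stream_prefix
        ):
            names.extend(s["logStreamName"] for s in page.get("logStreams", []))
        found[group] = names
    return found
```

With AWS credentials configured, `boto3.client("logs")` could be passed as `logs_client`. Because the streams do not all appear at once (step 2), a caller would need to repeat this lookup while the job runs and compare against the streams already seen, ideally with a pause between polls so the worker is not overloaded (step 3).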
