ak-arun commented on issue #23832:
URL: https://github.com/apache/airflow/issues/23832#issuecomment-1274369187

   Hi Andrey,
   
   Point taken. I am wondering whether we could publish the logs only when the
   job state changes to Error/Failed.
   
   Also, instead of mining the driver log plus every executor log, I think we
   can read the two recently added job insights streams:
   https://docs.aws.amazon.com/glue/latest/dg/monitor-job-insights.html
   
   Admittedly, this is a new feature in Glue, and older jobs may not have these
   streams enabled. We could document that it works only if job insights are
   enabled. Thoughts?
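
A rough sketch of what I have in mind, not a final implementation. The
stream-name suffixes and the default log group below are my assumptions from
reading the job insights page and should be verified; `logs_client` is
expected to behave like a boto3 CloudWatch Logs client, and the helper names
are made up for illustration.

```python
# Glue job-run states that would trigger log publishing (per the proposal above).
FAILURE_STATES = {"FAILED", "ERROR"}

# ASSUMPTION: suffixes of the two job insights streams; verify against the Glue docs.
INSIGHTS_SUFFIXES = ("job-insights-rca-driver", "job-insights-rule-driver")


def insights_stream_names(job_run_id):
    """Build the two expected job insights stream names for one job run."""
    return [f"{job_run_id}-{suffix}" for suffix in INSIGHTS_SUFFIXES]


def fetch_insights_on_failure(
    logs_client,
    job_run_id,
    job_state,
    log_group="/aws-glue/jobs/error",  # ASSUMPTION: insights streams live in this group
):
    """Return insights log messages, but only when the job run has failed."""
    if job_state not in FAILURE_STATES:
        return []

    messages = []
    for stream in insights_stream_names(job_run_id):
        try:
            # Reads only the first page of events; pagination is left out for brevity.
            response = logs_client.get_log_events(
                logGroupName=log_group,
                logStreamName=stream,
                startFromHead=True,
            )
        except logs_client.exceptions.ResourceNotFoundException:
            # Older jobs may not have job insights enabled, so the stream is absent.
            continue
        messages.extend(event["message"] for event in response.get("events", []))
    return messages
```

With this shape, a successful run costs no CloudWatch calls at all, and a failed
run reads at most two streams regardless of how many executors the job used.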
   
   -- 
   Thanks & Regards
   *Arun A K*
   
   
   
   
   On Wed, May 25, 2022 at 1:10 PM Andrey Anshin ***@***.***>
   wrote:
   
   > @Dark-Knight11 <https://github.com/Dark-Knight11> I would like to warn
   > about a side effect of fetching AWS Glue logs into Airflow task logs
   > (already mentioned in almost the same issue, #23900 (comment)
   > <https://github.com/apache/airflow/issues/23900#issuecomment-1137357273>).
   >
   > If your Glue job uses 40 DPUs, it spawns at least 1 driver and 39
   > workers/executors, and all of them will create error and output logs. As a
   > result:
   >
   >    1. You need to find all CloudWatch Logs prefixes starting with job_id
   >    in the correct log group. By default Glue uses:
   >       - /aws-glue/jobs/output - for output logs
   >       - /aws-glue/jobs/error - for error logs
   >       - /aws-glue/jobs/logs-v2 - (optional) for continuous logging
   >    2. You need to be sure that you fetch from all prefixes, and these
   >    prefixes are not all created at the same time
   >    3. Be sure that the log-fetching processes/threads do not use all the
   >    CPU/memory/IO of the Airflow worker
   >
   >
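
For reference, the prefix discovery described in the quoted list could be
sketched roughly as below. This is only an illustration under assumptions:
`logs_client` stands in for a boto3 CloudWatch Logs client,
`list_job_run_streams` and `max_streams` are hypothetical names, and the cap is
one simple way to keep the Airflow worker from being overwhelmed.

```python
# Default Glue log groups, as listed in the quoted comment.
DEFAULT_GLUE_LOG_GROUPS = (
    "/aws-glue/jobs/output",
    "/aws-glue/jobs/error",
    "/aws-glue/jobs/logs-v2",  # optional, only present with continuous logging
)


def list_job_run_streams(
    logs_client,
    job_run_id,
    log_groups=DEFAULT_GLUE_LOG_GROUPS,
    max_streams=100,  # hypothetical safety cap to bound work on the worker
):
    """Collect (log_group, stream_name) pairs whose stream name starts with job_run_id."""
    found = []
    for group in log_groups:
        kwargs = {"logGroupName": group, "logStreamNamePrefix": job_run_id}
        while True:
            try:
                page = logs_client.describe_log_streams(**kwargs)
            except logs_client.exceptions.ResourceNotFoundException:
                # The log group does not exist (for example logs-v2 is optional).
                break
            for stream in page.get("logStreams", []):
                found.append((group, stream["logStreamName"]))
                if len(found) >= max_streams:
                    return found
            token = page.get("nextToken")
            if not token:
                break
            # Continue with the next page of streams in the same group.
            kwargs["nextToken"] = token
    return found
```

Because executors start at different times, a single call may miss streams that
appear later, so a caller would need to repeat the listing while the job runs.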
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
