omkar-foss commented on issue #43171: URL: https://github.com/apache/airflow/issues/43171#issuecomment-2444113421
I have a suggestion for issues with multiple possible root causes: we can print an Airflow error code with the error message, e.g. `AERR055: Job 10 was killed before it finished`, and maintain an error code mapping with possible root causes like this (just examples, not real causes):

| Error Code | Possible Commonly Observed Causes                       |
|------------|---------------------------------------------------------|
| AERR055    | 1) Ran out of memory                                    |
|            | 2) Job was stuck and killed after timeout               |
|            | 3) Job being run on Spot Instance Node (K8S on EKS)     |

Since error codes are shareable and easily searchable, they would be useful for team collaboration as well (e.g. instead of saying "I'm looking into the error `Job 10 was killed before it finished`", I can just say "I'm looking into AERR055"), much like how we use JIRA ticket numbers or GitHub issue/PR numbers.
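To make the idea concrete, here is a minimal sketch of what such a mapping could look like. All names (`ErrorInfo`, `ERROR_CODES`, `format_error`, `possible_causes`) are hypothetical and not an existing Airflow API, and the causes are the illustrative ones from the table, not real ones.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorInfo:
    """Hypothetical record for one error code (not an Airflow class)."""

    message_template: str
    possible_causes: tuple[str, ...]


# Hypothetical registry; the code and causes are the illustrative
# examples from the comment, not real Airflow error codes.
ERROR_CODES: dict[str, ErrorInfo] = {
    "AERR055": ErrorInfo(
        message_template="Job {job_id} was killed before it finished",
        possible_causes=(
            "Ran out of memory",
            "Job was stuck and killed after timeout",
            "Job being run on Spot Instance Node (K8S on EKS)",
        ),
    ),
}


def format_error(code: str, **params: object) -> str:
    """Render the message for a known code, prefixed with the code itself."""
    info = ERROR_CODES[code]
    return f"{code}: {info.message_template.format(**params)}"


def possible_causes(code: str) -> tuple[str, ...]:
    """Return commonly observed causes; unknown codes give an empty tuple."""
    info = ERROR_CODES.get(code)
    return info.possible_causes if info else ()
```

With this sketch, `format_error("AERR055", job_id=10)` returns `AERR055: Job 10 was killed before it finished`, and `possible_causes("AERR055")` returns the three example causes, which could be printed next to the error or rendered into a docs page.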
