houqp commented on pull request #10917: URL: https://github.com/apache/airflow/pull/10917#issuecomment-723260002
> Sorry, not happy about removing the exception from the context.

We need to make a call here, then. The trade-off boils down to:

* Handle the state update from the external process to cover hard crashes like OOM and segfaults, but pass only a generic "state changed to failed" exception to failure_callback.
* Handle the state update within run_raw_task, and live with the fact that a hard crash will skip task instance state cleanup and callbacks.

In theory, we can pass exception info from run_raw_task into the external monitoring process, but this will require a much bigger refactor if we want to use shared memory. Alternatively, we can write the exception info from run_raw_task to a local temp file, to be picked up by the external monitoring process.

What do you think? I am leaning towards the temp file solution, which seems to cover all the requirements we have so far.
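
To make the temp file option concrete, here is a rough sketch of what it could look like. It is not the actual Airflow implementation: `write_error_file`, `read_error_file`, `run_raw_task_sketch`, the JSON layout, and the `TaskFailedExternally` fallback name are all hypothetical and chosen only for illustration. The task side records the exception before re-raising it; the monitoring side reads the file and falls back to a generic error when the file is missing or unreadable, which is what a hard crash such as an OOM kill would leave behind.

```python
import json
import traceback


def write_error_file(path, exc):
    """Task side (hypothetical helper): persist exception details for another process."""
    payload = {
        "type": type(exc).__name__,
        "message": str(exc),
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }
    with open(path, "w") as f:
        json.dump(payload, f)


def read_error_file(path):
    """Monitor side (hypothetical helper): return recorded details or a generic fallback."""
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, ValueError):
        # Missing or truncated file: the task likely died before it could
        # report anything (for example OOM or segfault).
        return {
            "type": "TaskFailedExternally",
            "message": "task state changed to failed",
            "traceback": "",
        }


def run_raw_task_sketch(task_fn, error_path):
    """Stand-in for the task runner: record any exception, then re-raise it."""
    try:
        task_fn()
    except Exception as exc:
        write_error_file(error_path, exc)
        raise
```

The monitoring process would call `read_error_file` after the task process exits and hand the result to the failure callback, so the callback still sees the real exception details in the normal case and a generic one after a hard crash.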
