dstandish commented on PR #23658: URL: https://github.com/apache/airflow/pull/23658#issuecomment-1154232958
> Yep. The thread could hang indefinitely, as the kube logs api stream could do, so I need to terminate the process. > > However, i am not very knowledgeable on python's api for this kind of thing, so open to suggestions. ok yeah, thing i learned: killing threads is not well-supported. interestingly, [something similar is done for ECS](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/operators/ecs.py#L98), and threads are used there, but they don't deal with indefinite hanging problem (and therefore don't have to forcibly terminate it). one alternative i'll just mention is we could handle this by doing our own "following" instead of relying on the stream. i.e. we do our own loop and consume the latest logs, then sleep a second and do the same. this might allow for a cleaner solution. but, multiprocessing seems ok. i'll just mention a few things. one is I think you need to kill the logging process when `on_kill` is called. the other is, if you get an error in `await_container_completion`, your subprocess won't be killed. so, you probably should modify so that if you get an unexpected error after process has started, that the process will be killed. also, i would try to make note somewhere that this is due to a problem with kubernetes, and what we are doing here is a hack that should be removed once the issue has been resolved upstream (and propagated sufficiently, which of course could be a long time from now). the upstream issue is documented here https://github.com/kubernetes/kubernetes/issues/59902. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
