uranusjr opened a new pull request #15037: URL: https://github.com/apache/airflow/pull/15037
This breaks the loops that send callbacks to the multiprocessing pipe into batches of 100, and calls `DagFileProcessorAgent.heartbeat()` after each batch (which in turn calls `Pipe.recv()`) to drain the pipe. This prevents the pipe from becoming full, which would make `Pipe.send()` block and deadlock the process. `Pipe.send()` is called in two code paths, (interestingly) represented exactly by the two py-spy traces available in #7935.

The way I do this is pretty naive, but it represents the direction in which I think the issue should be resolved. I don’t really understand what the database calls do in `_do_scheduling` and `_process_executor_events`, and therefore have no idea whether it’s OK to call `self.processor_agent.heartbeat()` interleaved with those database calls (previously, `heartbeat()` was only called after all those database calls were done).

Resolves #7935? There’s actually another, separate issue described there regarding the Redis worker being deadlocked. But this is no longer an issue according to @ashb (https://github.com/apache/airflow/issues/7935#issuecomment-784950133), and indeed all reports of that one are against 1.10.x, so I’m putting it off (and honestly I’m not sure how that one should be handled; it may need to involve some Redis internals).
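To make the batching idea above concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `batched`, `send_in_batches` and `BoundedFakePipe` are hypothetical names, and the fake pipe raises an error where a real `Pipe.send()` would block, so the failure mode is visible without actually deadlocking. The `drain` callable stands in for `heartbeat()`.

```python
from collections import deque
from itertools import islice
from typing import Callable, Deque, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def send_in_batches(
    items: Iterable[T],
    send: Callable[[T], None],
    drain: Callable[[], None],
    batch_size: int = 100,
) -> int:
    """Send items in batches and call `drain()` after each batch.

    `drain` plays the role of `heartbeat()` in the description above:
    it consumes whatever is waiting so the next batch has room.
    Returns the number of times `drain()` was called.
    """
    drains = 0
    for batch in batched(items, batch_size):
        for item in batch:
            send(item)
        drain()
        drains += 1
    return drains


class BoundedFakePipe:
    """Hypothetical stand-in for a pipe with finite capacity.

    It raises instead of blocking when full, purely for illustration.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffer: Deque[int] = deque()
        self.received: List[int] = []

    def send(self, item: int) -> None:
        if len(self.buffer) >= self.capacity:
            raise RuntimeError("pipe full: a real Pipe.send() would block here")
        self.buffer.append(item)

    def drain(self) -> None:
        # Consume everything currently buffered, like a reader calling recv().
        while self.buffer:
            self.received.append(self.buffer.popleft())


if __name__ == "__main__":
    pipe = BoundedFakePipe(capacity=100)
    drains = send_in_batches(range(250), pipe.send, pipe.drain, batch_size=100)
    print(drains, len(pipe.received))
```

In this sketch the batch size must not exceed what the pipe can buffer; a real pipe's capacity is measured in bytes and depends on the platform, so a fixed batch of 100 is a heuristic rather than a guarantee.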
