ashb opened a new pull request #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627

Make sure you have checked _all_ steps below.

### Jira

- [x] https://issues.apache.org/jira/browse/AIRFLOW-5931

### Description

- [x] Rather than running a fresh Python interpreter, which then has to reload all of Airflow and its dependencies, we should use `os.fork` when it is available and suitable. This should speed up task running, especially for short-lived tasks.

  I've profiled this, and it took the task duration (as measured by the `duration` column in the TI table) from an average of 14.063s down to just 0.932s!

  I _could_ make this change deeper and bypass the `CLIFactory`, going directly to `_run_raw_task`, but this keeps the change to the minimum needed to work.

### Tests

- [x] No unit tests added. Hopefully the existing tests are good enough. Manual testing shows this working.

  Other tests I need to perform:

  - [ ] Check if `os._exit` is right (it doesn't run atexit callbacks), so I need to check whether logging in the subprocess is tidied up properly.
  - [ ] Test if this leaves "dangling"/broken DB connections.
  - [ ] Check remote log uploading.

### Commits

- [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation

- [x] In case of new functionality, my PR adds documentation that describes how to use it.
  - All the public functions and classes in the PR contain docstrings that explain what they do
  - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to an appropriate release
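
For readers unfamiliar with the trade-off discussed in the Description and Tests sections, here is a minimal sketch of the general fork-or-spawn pattern. It is not the code in this PR: `run_task`, `func`, and `fallback_cmd` are hypothetical names chosen for illustration, and the sketch assumes a POSIX platform for the fork path. It shows why a forked child starts quickly (it inherits the modules the parent has already imported) and why `os._exit` needs care (it skips `atexit` callbacks and does not flush stdio buffers).

```python
import os
import subprocess
import sys
import traceback


def run_task(func, fallback_cmd):
    """Run func in a forked child if os.fork exists, else spawn fallback_cmd.

    Hypothetical helper for illustration. Returns the child's exit code,
    or the negative signal number if the child was killed by a signal.
    """
    if not hasattr(os, "fork"):
        # No fork available (e.g. Windows): start a fresh interpreter
        # and pay the cost of re-importing everything.
        return subprocess.call(fallback_cmd)

    # Flush buffered output so the child does not inherit it and
    # write it a second time.
    sys.stdout.flush()
    sys.stderr.flush()

    pid = os.fork()
    if pid == 0:
        # Child: everything the parent imported is already loaded.
        code = 1
        try:
            func()
            code = 0
        except Exception:
            traceback.print_exc()
        finally:
            # os._exit skips atexit callbacks and stdio flushing,
            # so flush explicitly before leaving.
            sys.stdout.flush()
            sys.stderr.flush()
            os._exit(code)

    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

In this sketch the child leaves through `os._exit` so that it never unwinds into the parent's call stack or runs cleanup handlers registered by the parent. Anything that normally relies on `atexit` (such as closing log handlers) would have to be done explicitly before that call, which is the concern raised in the open test items above.
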
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
