potiuk opened a new pull request, #23866:
URL: https://github.com/apache/airflow/pull/23866

   This change should significantly speed up the Breeze experience
   (and especially iterating over a change in Breeze) for MacOS users,
   regardless of whether you are using x86 or ARM architecture.
   
   The problem with Docker on MacOS is the particularly slow filesystem
   used to map sources from the host to the Docker VM. It is especially
   bad when many small files are involved.
   
   The improvements come from two areas:
   * removing duplicate pycache cleaning
   * moving MyPy cache to docker volume
   
   When entering Breeze we clean, just in case, the .pyc files and
   __pycache__ directories potentially generated outside of the Docker
   container. This is particularly useful if you use a local IDE
   and do not have bytecode generation disabled (we have it
   disabled in Breeze). Generated Python bytecode might lead to
   various problems when you are switching branches and Python
   versions, so for Breeze development, where the files change
   often anyway, disabling bytecode generation and removing the
   files when they are found is important. This happens when entering
   Breeze and might take a second or two, depending on whether you
   have locally generated bytecode files.
   
   It could happen that the __init script was called twice (depending
   on which script was called), so the time could be double what
   was actually needed. Also, if you ever generated provider
   packages, the time could be much longer, because node_modules
   generated in provider sources were not excluded from the search
   (and on MacOS it takes a LOT of time).
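
As an illustration of the cleanup described above, here is a minimal Python sketch of removing bytecode while pruning excluded directories from the walk. It is not the code from this change: the function name `clean_bytecode` and the `SKIP_DIRS` set are assumptions made for the example.

```python
import os
import shutil

# Hypothetical exclusion list: directories that are never descended into.
SKIP_DIRS = {"node_modules", ".git"}


def clean_bytecode(root):
    """Remove __pycache__ directories and stray .pyc files under root.

    Returns the number of removed entries (directories and files).
    """
    removed = 0
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        if "__pycache__" in dirnames:
            shutil.rmtree(os.path.join(dirpath, "__pycache__"))
            removed += 1
        # Editing dirnames in place tells os.walk not to descend into
        # the excluded (or just removed) directories at all.
        dirnames[:] = [
            d for d in dirnames if d not in SKIP_DIRS and d != "__pycache__"
        ]
        for name in filenames:
            if name.endswith(".pyc"):
                os.remove(os.path.join(dirpath, name))
                removed += 1
    return removed
```

Pruning matters more than the deletion itself here: skipping a large directory tree avoids listing its many small files, which is the slow part on a mapped filesystem.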
   
   This also doubled the time spent on exit, as the initialization code
   installed traps that were then also run twice. The traps, however,
   were rather fast, so they had no noticeable impact on performance.
   
   The change adds a guard so that initialization is only ever executed
   once.
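
The guard idea can be sketched as follows. This is a Python illustration of the pattern only, not the actual change: the names `initialize_once` and `_INITIALIZED` are hypothetical, and `atexit` stands in for the exit traps mentioned above.

```python
import atexit

# Module-level flag recording whether initialization has already run.
_INITIALIZED = False


def initialize_once(setup, cleanup):
    """Run setup and register cleanup at most once per process.

    Returns True if initialization ran, False if it was skipped.
    """
    global _INITIALIZED
    if _INITIALIZED:
        return False
    _INITIALIZED = True
    setup()
    # The exit handler is registered only on the first call, so it
    # also runs only once when the process exits.
    atexit.register(cleanup)
    return True
```

Because the flag is set before `setup()` runs, a nested or repeated call becomes a no-op, and the exit handler is never registered twice.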
   
   The second part of the change is moving the mypy cache to a Docker
   volume rather than using it from the local source folder (the default
   when complete sources are mounted). We were already using selective
   mounts to make sure MacOS filesystem slowness affects us as little
   as possible, but with this change the cache is stored in a Docker
   volume that does not suffer from the same problems as mounting
   volumes from the host. The Docker volume is preserved until the
   `docker stop` command is run, which means that iterating over
   a change should be WAY faster now: the observed speed-up was around
   5x for the MyPy pre-commit.
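
To make the difference between the two kinds of mounts concrete, here is a hedged Python sketch that only builds a `docker run` argument list. The volume name `mypy-cache-volume`, the container paths under `/opt/airflow`, and the function name are assumptions for illustration and may not match what Breeze actually uses.

```python
def mypy_docker_command(image, host_sources, cache_volume="mypy-cache-volume"):
    """Build a docker command that keeps the mypy cache in a named volume."""
    cache_dir = "/opt/airflow/.mypy_cache"  # assumed container path
    return [
        "docker", "run", "--rm",
        # Bind mount: the source starts with "/", so files are mapped
        # from the host filesystem (the slow path on MacOS).
        "-v", f"{host_sources}/airflow:/opt/airflow/airflow",
        # Named volume: the source is a plain name, so the data lives
        # inside the Docker VM and is not mapped from the host.
        "-v", f"{cache_volume}:{cache_dir}",
        image,
        "mypy", "--cache-dir", cache_dir, "/opt/airflow/airflow",
    ]
```

The list can be passed to `subprocess.run` as-is. The design point is that only the sources are mapped from the host, while the cache, which consists of many small files, stays in the volume.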
   

