potiuk opened a new pull request #18356:
URL: https://github.com/apache/airflow/pull/18356


   A lot of users expect that the Airflow Scheduler will
   `just work` and deliver `optimal performance` for them, without
   realising that with a system as complex as Airflow you
   often have to decide what to optimise for, accept some
   trade-offs, or increase hardware capacity if you are not willing to
   make those trade-offs.
   
   Also, it's not clear where the responsibility
   lies: should it `just work`, or should the user be responsible for
   understanding and fine-tuning their system? Both approaches are
   possible. Some complex systems use a lot of
   automation/AI etc. to fine-tune and optimise their behaviour, but
   Airflow expects users to know a bit more about how
   scheduling works, and Airflow maintainers deliver a lot of
   knobs that can be turned to fine-tune the system and to make
   trade-off decisions. This was not explicitly stated in our
   documentation, and users could have different expectations about
   it (and they often did, judging from the issues they raised).
   
   This PR adds a "fine-tuning" chapter that aims to set
   users' expectations at the right level. It explains what
   Airflow provides, but also what the user is responsible for:
   deciding what they are optimising for, finding where their bottlenecks
   are, and deciding whether they need to change the configuration,
   increase hardware capacity, or make appropriate trade-offs.
   
   It also brings more of the fine-tuning parameters into the
   `tuneables` section of the scheduler documentation, based on some
   recent questions asked by users. It seems that having a specific
   overview of all performance-impacting parameters is a good idea,
   and we only had a very limited subset of those.
   
   Some users prefer to `watch` rather than read, which is why this PR
   also adds a link to the recording of the talk from
   Airflow Summit 2021 where Ash describes, in a very concise
   and easy-to-grasp way, all the whys and hows of the scheduler.
   If you understand why and how the scheduler does what it does,
   fine-tuning decisions are much easier.
   



