potiuk commented on PR #33117:
URL: https://github.com/apache/airflow/pull/33117#issuecomment-1677144580

   Can you perform some calculations and realistic simulations showing how this 
would help to achieve better performance? I think there is a big risk that 
your idea of horizontal scalability is there only to implement the idea of 
horizontal scalability. But does it solve any real problem, enable some use 
cases, and make things better and more performant? With S3 and Celery as the 
broker of the messages, I am not sure the overhead connected with it would 
justify the potential gains. 
   
   Only some realistic cases and performance benchmarks could justify it.
   
   Also, if you want to propose it, you have to consider alternatives and see 
how they compare, both performance-wise and complexity-wise.
   
   And it's not an abstract ask.
   
   This is precisely what we did when we implemented horizontal scalability 
for the Scheduler. Read the post 
https://www.astronomer.io/blog/airflow-2-scheduler/ which summarizes all the 
effort done there. It's just the icing on the cake: the result of the numerous 
discussions we had on the devlist, of discussing draft Airflow Improvement 
Proposals, and of going through prototype code walkthroughs with Ash, where 
he explained to us how the idea works.
   
   When you look at the blog post, it explains why we implemented it and why 
we made the decisions we did. It was based on observations and benchmarks 
showing that we had a real bottleneck in some realistic cases. Then several 
approaches we could take were prototyped and benchmarked. Finally, the current 
"Database locking" mechanism with SKIP_LOCKED was chosen based on those 
benchmarks and analysis as a good balance between simplicity and achieved gains.
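
   To illustrate the general idea behind skip-locked row claiming, here is a 
minimal in-memory sketch. It is not Airflow's scheduler code: the names `Row`, 
`claim_batch` and `commit` are hypothetical, and per-row `threading.Lock` 
objects only stand in for the row locks a database takes for 
`SELECT ... FOR UPDATE SKIP LOCKED`. The point is that a worker never waits on 
rows another worker already holds; it simply skips them and takes the next 
free ones.

```python
import threading


class Row:
    """Hypothetical stand-in for one database row that needs processing."""

    def __init__(self, row_id):
        self.row_id = row_id
        self.lock = threading.Lock()  # stands in for a DB row lock
        self.done = False


def claim_batch(rows, limit):
    """Emulate `SELECT ... LIMIT n FOR UPDATE SKIP LOCKED` in memory.

    Rows locked by another worker are skipped instead of waited on.
    """
    claimed = []
    for row in rows:
        if len(claimed) >= limit:
            break
        # Non-blocking acquire: fails immediately if the row is already locked.
        if not row.lock.acquire(blocking=False):
            continue
        if row.done:
            # Already processed earlier; release and look at the next row.
            row.lock.release()
            continue
        claimed.append(row)
    return claimed


def commit(claimed):
    """Mark the claimed rows as processed and release their locks."""
    for row in claimed:
        row.done = True
        row.lock.release()


if __name__ == "__main__":
    rows = [Row(i) for i in range(5)]
    first = claim_batch(rows, 2)   # one worker claims two rows
    second = claim_batch(rows, 2)  # another claim skips the locked rows
    print([r.row_id for r in first], [r.row_id for r in second])
    commit(first)
    commit(second)
```

   In a real deployment the database provides the locking, so no extra 
coordination component is needed, which is the trade-off discussed here.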
   
   HINT: we chose the DB because it did not require adding any component: 
messaging/communication queues, ZooKeeper, and the many other things we could 
have chosen that would awfully complicate Airflow deployments.
   
   So, by the sheer look of it, your proposal goes in the opposite direction 
compared to what we decided for the Scheduler. That does not mean it is wrong, 
but it's a hint that it is likely not the preferred one. But if you perform an 
analysis of the performance/gains/risks/complexity of this and compare it with 
the alternative solutions that you considered, discuss it on the devlist, turn 
it into an Airflow Improvement Proposal, pass it through voting and implement 
it, then of course there is a possibility this might be something that the 
community will be willing to accept.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

