houqp commented on pull request #1560:
URL: 
https://github.com/apache/arrow-datafusion/pull/1560#issuecomment-1016165577


   @realno to your question on poll vs. push, I think you are spot on with 
regard to poll being simpler in design. @edrevo and @j4ckcyw had some good 
prior discussions on this topic as well at 
https://github.com/ballista-compute/ballista/issues/463.
   
   My current thinking is that the scheduler state complexity introduced 
by the push model might be needed in the long run for optimal task 
scheduling, once we take more resource-related factors into account instead 
of just available task slots. Having a global view of the system generally 
yields better task placements.
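
   To make the placement point concrete, here is a minimal sketch of choosing an executor from a global view that tracks free memory in addition to free task slots. `ExecutorState`, `TaskRequirement`, and `place_task` are hypothetical names made up for illustration, not existing Ballista types, and "most free memory wins" is just one possible policy.

```rust
/// Hypothetical snapshot of one executor as tracked by the scheduler.
#[derive(Debug, Clone, PartialEq)]
pub struct ExecutorState {
    pub id: String,
    pub free_task_slots: u32,
    pub free_memory_mb: u64,
}

/// Hypothetical resource requirement attached to a task.
#[derive(Debug, Clone)]
pub struct TaskRequirement {
    pub memory_mb: u64,
}

/// Pick an executor that has a free slot and enough memory for the task,
/// preferring the one with the most free memory. Returns `None` when no
/// executor qualifies.
pub fn place_task<'a>(
    executors: &'a [ExecutorState],
    task: &TaskRequirement,
) -> Option<&'a ExecutorState> {
    executors
        .iter()
        // A slot-only scheduler would stop at the first half of this check.
        .filter(|e| e.free_task_slots > 0 && e.free_memory_mb >= task.memory_mb)
        .max_by_key(|e| e.free_memory_mb)
}
```

   With slots alone, an executor that has free slots but too little memory looks just as good as any other. The extra filter is only possible because the scheduler keeps per-executor state.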
   
   On the scaling front, even though the push-based model incurs more state 
management overhead on the scheduler side, it has its own scaling strengths as 
well. For example, active message exchanges between the scheduler and executors 
scale much better with a larger executor pool because heartbeat messages 
are much easier to process than task poll requests. It is also easier 
for the scheduler to reduce its own load by proactively scheduling fewer 
tasks than by rate limiting executor poll requests in a polling model.
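
   As a rough sketch of the load-shedding point: in a push model the scheduler can simply cap how many tasks it hands out per scheduling round. `PushScheduler` and its methods are hypothetical and only illustrate the idea; they are not the scheduler in this PR.

```rust
use std::collections::VecDeque;

/// Hypothetical push-based scheduler that limits its own dispatch rate.
pub struct PushScheduler {
    /// Pending task ids, oldest first.
    pending: VecDeque<u64>,
    /// Upper bound on tasks pushed to executors in one round.
    max_dispatch_per_round: usize,
}

impl PushScheduler {
    pub fn new(max_dispatch_per_round: usize) -> Self {
        Self { pending: VecDeque::new(), max_dispatch_per_round }
    }

    pub fn submit(&mut self, task_id: u64) {
        self.pending.push_back(task_id);
    }

    /// Lower (or raise) the cap, e.g. when the scheduler itself is overloaded.
    pub fn set_max_dispatch(&mut self, max_dispatch_per_round: usize) {
        self.max_dispatch_per_round = max_dispatch_per_round;
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// Dispatch at most `min(cap, free_slots, pending)` tasks, oldest first.
    pub fn next_round(&mut self, free_slots: usize) -> Vec<u64> {
        let n = self
            .max_dispatch_per_round
            .min(free_slots)
            .min(self.pending.len());
        self.pending.drain(..n).collect()
    }
}
```

   The scheduler decides how much work it takes on per round, whereas in a polling model it would have to reject or delay poll requests that executors have already sent.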
   
   So I think there is no clear-cut answer here. Perhaps another angle would be a 
hybrid model where we still use the base model as the foundation, but try to 
move as much state management logic as possible into the executors to reduce 
the overhead on the scheduler side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.