itskals commented on pull request #29413:
URL: https://github.com/apache/spark/pull/29413#issuecomment-673475255


   
   I was thinking that although Spark has many queues, in many cases not all 
of them are loaded to the same level at the same time. That is, while some 
queues are heavily used, others may consume their events faster.
   
   If this is true, instead of giving each queue a separate, rigid size, what 
if there were a pool from which queues could borrow event holders? When the 
events are processed, the holders are returned to the pool.
   
   This pool is not actual memory; it is just a counter (probably atomic). 
Say that for a driver memory of X GB we allocate N event holders, shared by 
all queues; N is then the pool size. 
   When an event needs to be enqueued, the queue asks the pool whether an 
event holder is available. If yes (based on the current usage count), the 
queue enqueues the event; if not, the event is dropped.
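
   A minimal sketch of that shared-counter idea, in Java. The names here 
(`EventHolderPool`, `tryAcquire`, `release`) are illustrative, not existing 
Spark APIs; a real implementation would live inside the listener bus queues:

   ```java
   import java.util.concurrent.atomic.AtomicInteger;

   /** Hypothetical pool of N event "holders", tracked as a single atomic counter. */
   class EventHolderPool {
       private final int capacity;                           // N, shared by all queues
       private final AtomicInteger inUse = new AtomicInteger(0);

       EventHolderPool(int capacity) { this.capacity = capacity; }

       /** Try to borrow one holder; false means the pool is exhausted and the event is dropped. */
       boolean tryAcquire() {
           while (true) {
               int current = inUse.get();
               if (current >= capacity) return false;        // soft high-water mark N reached
               if (inUse.compareAndSet(current, current + 1)) return true;
           }
       }

       /** Return a holder to the pool once the event has been processed. */
       void release() { inUse.decrementAndGet(); }
   }

   public class PoolDemo {
       public static void main(String[] args) {
           EventHolderPool pool = new EventHolderPool(2);
           System.out.println(pool.tryAcquire());   // true  - first holder borrowed
           System.out.println(pool.tryAcquire());   // true  - second holder borrowed
           System.out.println(pool.tryAcquire());   // false - pool exhausted, event dropped
           pool.release();                          // event processed, holder returned
           System.out.println(pool.tryAcquire());   // true  - capacity available again
       }
   }
   ```

   The compare-and-set loop keeps acquisition lock-free, so a busy queue 
borrowing many holders does not block a lightly loaded one.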
   
   
   The idea is that this is a middle ground between a restricted queue size 
and an infinite-capacity queue. The queues are not statically bound to a size 
but have more flexibility to grow, while there is still a softer high-water 
mark (N) beyond which they cannot grow.
   
   Let me know what you think...  @SaurabhChawla100  @Ngone51  @tgravescs 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
