SaurabhChawla100 commented on pull request #29413:
URL: https://github.com/apache/spark/pull/29413#issuecomment-672931880


   > so I definitely get the point here and it would definitely be nice to 
handle this better somehow, but if you are setting it to be queue size + some 
threshold then the application has to have the memory to handle it, so how is 
that different than just setting the queue size to be larger? The one benefit 
is the user doesn't have to try to pick an exact size, but the downside is they 
don't necessarily know how it affects the memory, so you ideally add memory for 
the worst case.
   > 
   > The reason I asked which queue dropped is that normally the event log 
queue fills up but it should only affect the history. If the executor 
management queue filled up then it could affect dynamic allocation, which is 
definitely bad, but in my opinion we should change that not to rely on events. 
The other queues ideally don't affect the running application but I am 
wondering if something is.
   
   So the main issue is that there is no fixed queue size that works for 
all Spark jobs. Once the queue size is set at the start of the application, 
the LinkedBlockingQueue can hold at most that many events; when the queue is 
full, any further events are dropped. The idea here is to handle this scenario 
more gracefully: the queue size can be increased when events overflow, and 
once the overflow has passed, the queue size can be set back to its initial 
capacity.
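
   As a rough illustration of the grow-then-shrink idea (a minimal sketch, 
not the code in this PR and not Spark's listener bus), the class below keeps 
a logical capacity that is raised once on overflow and reset when the backlog 
drains. The names `ElasticEventQueue`, `maxExtraCapacity`, `post` and `poll` 
are hypothetical, and the sketch uses a plain `ArrayDeque` with synchronized 
methods instead of a `LinkedBlockingQueue`, whose capacity is fixed at 
construction.

```java
import java.util.ArrayDeque;

/** Hypothetical sketch of a queue whose capacity can grow temporarily. */
final class ElasticEventQueue<E> {
    private final ArrayDeque<E> events = new ArrayDeque<>();
    private final int initialCapacity;
    private final int maxExtraCapacity; // assumed headroom ("threshold") on top of the initial size
    private int currentCapacity;
    private long droppedEvents = 0;

    ElasticEventQueue(int initialCapacity, int maxExtraCapacity) {
        this.initialCapacity = initialCapacity;
        this.maxExtraCapacity = maxExtraCapacity;
        this.currentCapacity = initialCapacity;
    }

    /** Returns true if the event was queued, false if it was dropped. */
    synchronized boolean post(E event) {
        if (events.size() >= currentCapacity) {
            if (currentCapacity < initialCapacity + maxExtraCapacity) {
                // Overflow: raise the capacity instead of dropping right away.
                currentCapacity = initialCapacity + maxExtraCapacity;
            } else {
                // Already at the upper bound: the event is dropped, as today.
                droppedEvents++;
                return false;
            }
        }
        events.addLast(event);
        return true;
    }

    /** Removes the oldest event, or returns null if the queue is empty. */
    synchronized E poll() {
        E event = events.pollFirst();
        // Once the backlog fits in the initial size again, shrink back.
        if (currentCapacity > initialCapacity && events.size() <= initialCapacity) {
            currentCapacity = initialCapacity;
        }
        return event;
    }

    synchronized int capacity() { return currentCapacity; }

    synchronized long dropped() { return droppedEvents; }
}
```

   With `maxExtraCapacity` set to 0 this behaves like a fixed-size queue, 
so the extra memory cost is bounded by the headroom that is configured.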
   
   This is handled per queue: if there is a problem with one queue (e.g. the 
appStatus queue), then only that queue's size will change at run time.
   There is also a future plan to make this logic dependent on the driver 
memory, deciding whether to increase the size of the queue based on a certain 
threshold of the driver's max heap size. This could help prevent an OOM.
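
   A possible shape for that heap check, only as a sketch under assumptions: 
the class `HeapHeadroom`, its methods and the `maxUsedFraction` parameter are 
hypothetical names, and the actual threshold and where the check would live 
are not decided. It only uses the standard `java.lang.Runtime` memory 
accessors.

```java
/** Hypothetical helper: decide whether a queue may grow, based on heap usage. */
final class HeapHeadroom {
    private HeapHeadroom() {}

    /** True if used heap is below the given fraction of the max heap. */
    static boolean mayGrow(long usedBytes, long maxBytes, double maxUsedFraction) {
        return maxBytes > 0 && (double) usedBytes / maxBytes < maxUsedFraction;
    }

    /** Same check against the current JVM (the driver, in this discussion). */
    static boolean mayGrowNow(double maxUsedFraction) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // approximate heap currently in use
        return mayGrow(used, rt.maxMemory(), maxUsedFraction);
    }
}
```

   The queue would then only be allowed to grow while `mayGrowNow(...)` 
returns true, and would fall back to dropping events otherwise.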


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


