itskals commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-675199584
> I don't understand this, how does it prevent the user from changing the conf? They either change the percentage to be 20% or the event queue size to 36000. Either way they change a conf.

@tgravescs what I understand from the discussion here is this: assume 30,000 is a good queue size for a given workload in most cases. In some scenarios, such as a slight change in the input data pattern, the number of events generated goes up. That is where the 20% helps: it absorbs the extra burden, hopefully without dropping events and without the user manually raising the size to 36,000. (Put another way: instead of changing one value twice, change two values once.)

A few other things I see here:

- The queue size need not grow to the full configured headroom (20%) in one step; it can grow gradually, in multiple steps.
- It can shrink back to the original size once queue usage returns to an "acceptable" level.
- The extra capacity is memory-pressure aware and best effort, so setting the value higher, say 100%, should not hurt.

A rough sketch of what that grow/shrink logic could look like is below. Overall, it feels good if the system detects that the event queue size needs to increase and does it automatically, while staying resource aware and still giving the user control if needed.

@Ngone51 @SaurabhChawla100
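To make the idea concrete, here is a minimal, self-contained Scala sketch of a bounded capacity that grows in steps up to a configured percentage above the base size and shrinks back when usage drops. This is only an illustration of the behavior described above; `AdaptiveCapacity`, the `steps` parameter, and the shrink threshold are hypothetical, not Spark's actual implementation in this PR.

```scala
// Hypothetical sketch: capacity that can grow up to `extraPercent` above
// `baseCapacity`, in `steps` increments, and shrink back when usage is low.
class AdaptiveCapacity(baseCapacity: Int, extraPercent: Int, steps: Int) {
  private val maxCapacity = baseCapacity + baseCapacity * extraPercent / 100
  private val stepSize = math.max(1, (maxCapacity - baseCapacity) / steps)
  private var capacity = baseCapacity

  /** Grow one step when the queue is full, up to the bounded maximum. */
  def maybeGrow(queued: Int): Int = {
    if (queued >= capacity && capacity < maxCapacity) {
      capacity = math.min(capacity + stepSize, maxCapacity)
    }
    capacity
  }

  /** Shrink one step once usage falls back to an "acceptable" level
   *  (here, arbitrarily, below half of the base capacity). */
  def maybeShrink(queued: Int): Int = {
    if (queued < baseCapacity / 2 && capacity > baseCapacity) {
      capacity = math.max(capacity - stepSize, baseCapacity)
    }
    capacity
  }
}

// Example with the numbers discussed above: base 30,000 and 20% headroom
// gives a maximum of 36,000, reached in 1,500-event steps rather than at once.
object Example extends App {
  val cap = new AdaptiveCapacity(baseCapacity = 30000, extraPercent = 20, steps = 4)
  println(cap.maybeGrow(30000))   // 31500: first step toward 36000
  println(cap.maybeGrow(31500))   // 33000: second step
  println(cap.maybeShrink(10000)) // 31500: usage back to acceptable, shrink
}
```

A real implementation would also need to consult memory pressure before growing, as discussed above; that part is omitted here.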
