SaurabhChawla100 commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-675235066
> what I understand from the discussions here is that, let us assume 30,000 is a good number in many cases for a certain workload. But in some scenarios, like a slight change in input data pattern etc., the number of events generated increases. In that case, this 20% can be helpful. How? This 20% will accommodate, hopefully, the extra burden w/o losing the events and the user manually changing it to 36,000. (could be seen this way, "instead of changing the value twice, change two values once")

@itskals - Thanks for explaining this in detail.

> The problem is why you would like to set it to 20% in the first place? Why not 10% or 30%? If one knows exactly that he/she would expect 20% more size, I think he/she would/could also set it to 36,000, especially when he/she has no background of this PR.

We can ship a default value, like 10%, in the conf while the user keeps the capacity fixed at 30,000. Suppose there is a slight increase in load while processing the job: 32,000 tasks complete in a small interval of time, overflowing the queue even at a capacity of 30,000. With this extra threshold at its default of 10%, the effective capacity becomes 33,000, so those event drops are prevented and the job is protected from the abrupt behaviour that event drops cause. It also means the user has to change the value from 30,000 less frequently. This is the best-case scenario example, and the proposal is to handle exactly this scenario, which prevents the event drop.
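The arithmetic behind the proposal can be sketched as follows. This is a minimal illustration of the headroom calculation discussed above, not Spark's actual implementation; the function name and parameters are hypothetical.

```python
def effective_capacity(configured_capacity: int, extra_fraction: float = 0.10) -> int:
    """Hypothetical helper: queue capacity plus a percentage of headroom,
    mirroring the buffer this PR proposes (names are illustrative only)."""
    return round(configured_capacity * (1 + extra_fraction))

# With a configured capacity of 30,000 and the default 10% buffer,
# a burst of 32,000 events still fits under the 33,000 effective limit,
# so no events are dropped.
print(effective_capacity(30000))        # 33000
print(effective_capacity(30000, 0.20))  # 36000
```

With a 20% buffer the effective capacity is 36,000, which is exactly the value a user would otherwise have to set by hand after observing drops.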
