Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/19233
  
    If you know you won't need more than 10, then set the max to 10.
    
    If you don't necessarily know that, then I think you're complaining that 
dynamic allocation doesn't 'know' in advance how many executors will be needed. 
Yes, in general the load goes up and down and can't be predicted, so dynamic 
allocation is always adapting: it adds executors, and eventually times out idle 
ones, to match the load. This is just how it works.
    
    I think you're suggesting a specific strategy for Spark Streaming jobs 
only. While I understand it, because you do know more about the load in this 
type of job, that knowledge is also a reason to simply set the max yourself, 
since you know what it should be, or to not use dynamic allocation at all. It's 
often not used in streaming anyway, because the lag of adapting to a new load 
of tasks increases latency and variability.
    
    Just set your max to 10, or perhaps configure dynamic allocation to time 
out idle executors more rapidly.
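    
    To sketch what I mean (this isn't from the PR itself; these are just the 
standard dynamic allocation properties, and the values are purely 
illustrative), capping the pool at 10 executors and shortening the idle 
timeout would look roughly like:
    
    ```scala
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession
    
    // Illustrative values: cap dynamic allocation at 10 executors and
    // reclaim executors that sit idle for more than 30 seconds.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true") // needed for dynamic allocation on YARN
      .set("spark.dynamicAllocation.maxExecutors", "10")
      .set("spark.dynamicAllocation.executorIdleTimeout", "30s")
    
    val spark = SparkSession.builder().config(conf).getOrCreate()
    ```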

