[GitHub] [spark] viirya commented on pull request #30770: [SPARK-33783][SS] Unload State Store Provider after configured keep alive time

GitBox Wed, 16 Dec 2020 17:20:20 -0800


viirya commented on pull request #30770:
URL: https://github.com/apache/spark/pull/30770#issuecomment-747142154



   > The problem is, this is completely relying on luck - this doesn't give any 
help on physical plan. Again the problem exists even without the PR, but then 
shouldn't we fix the root cause instead of extending the possibility of luck? 
At least Spark should be able to know there're other executors still keeping 
the state, and taking into account while planning.
   
   We already have preferred locations for stateful operations. This is how 
Spark takes state location into account when planning stateful physical 
operators. I think users can increase the locality wait to make Spark honor 
those preferred locations.
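
To make the locality-wait point concrete, here is a toy model of delay scheduling. It is only a sketch under simplifying assumptions, not Spark's scheduler: `pick_executor`, `free_at` and the time values are hypothetical names for illustration, and the "locality wait" mentioned above presumably maps to the `spark.locality.wait` family of settings.

```python
from typing import Dict


def pick_executor(preferred: str,
                  free_at: Dict[str, float],
                  submitted_at: float,
                  locality_wait_s: float) -> str:
    """Pick an executor under a simplified delay-scheduling rule.

    free_at maps executor id -> time at which it has a free slot
    (assumed non-empty). The task waits for its preferred executor
    for at most locality_wait_s seconds after submission.
    """
    deadline = submitted_at + locality_wait_s
    # The preferred executor (the one already holding the state) wins
    # if it frees up before the locality wait expires.
    if free_at.get(preferred, float("inf")) <= deadline:
        return preferred
    # Otherwise fall back to whichever executor frees up first;
    # ties are broken by executor id to keep the result deterministic.
    return min(free_at, key=lambda e: (free_at[e], e))


slots = {"exec-1": 5.0, "exec-2": 1.0}
short_wait = pick_executor("exec-1", slots, submitted_at=0.0, locality_wait_s=3.0)
long_wait = pick_executor("exec-1", slots, submitted_at=0.0, locality_wait_s=10.0)
```

In this model a short wait sends the task to `exec-2`, which would have to load the state again, while a longer wait keeps it on `exec-1`.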
   
   The purpose of this change is to stabilize the unloading behavior, so that 
some stores are not unloaded earlier and others later. That variation makes it 
harder to estimate query behavior: a query may work when its stores happen to 
be unloaded earlier, and sometimes fail when they are unloaded later.
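
As a rough illustration of a keep-alive check, the sketch below unloads a provider only once it has been idle for at least a fixed time. It is not the code in this PR: `ProviderRegistry`, `report_active` and `run_maintenance` are hypothetical names, and times are plain seconds passed in by the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProviderRegistry:
    """Toy registry of loaded state store providers (not Spark's API)."""
    keep_alive_s: float
    # provider id -> time at which it was last reported active
    last_active: Dict[str, float] = field(default_factory=dict)

    def report_active(self, provider_id: str, now: float) -> None:
        """Record that a provider is still in use at time `now`."""
        self.last_active[provider_id] = now

    def run_maintenance(self, now: float) -> List[str]:
        """Unload providers idle for at least keep_alive_s; return their ids."""
        expired = sorted(
            pid for pid, seen in self.last_active.items()
            if now - seen >= self.keep_alive_s
        )
        for pid in expired:
            del self.last_active[pid]
        return expired


registry = ProviderRegistry(keep_alive_s=60.0)
registry.report_active("store-a", now=0.0)
registry.report_active("store-b", now=50.0)
first_pass = registry.run_maintenance(now=60.0)    # store-a idle 60s, store-b idle 10s
second_pass = registry.run_maintenance(now=100.0)  # store-b idle 50s
third_pass = registry.run_maintenance(now=110.0)   # store-b idle 60s
```

With a threshold like this, every store stays loaded for at least the keep-alive time regardless of when the maintenance pass happens to run.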
   
   If you think we should not make it a configurable item, I can remove the 
configuration and only check whether the alive time exceeds the maintenance 
interval.
   
   

