HeartSaVioR commented on PR #42567: URL: https://github.com/apache/spark/pull/42567#issuecomment-1687172602
Thanks for clarifying. So the capped memory usage can be temporarily exceeded, but RocksDB will try hard to get back under the cap as soon as it can. That explains the rationale for the capped memory well enough, and I'm OK with a soft limit. I'd say a hard limit is very useful for restricting the blast radius, and the streaming query should simply fail if it cannot live within the hard limit despite proper rebalancing of state. But given that we don't have proper rebalancing of state, a hard limit may manifest as random failures depending on the scheduling of stateful partitions. We should probably document the behavior, though, so that users can plan for some margin. Could you please update the doc?

> They might still need to restart the cluster to allow for updated memory limit values to take effect, correct ?

Yes, that's what I meant.
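For readers following along: the soft cap under discussion is configured through the RocksDB state store provider options. A minimal sketch, assuming the bounded-memory settings documented for the Spark 3.5 RocksDB state store (config names taken from the Structured Streaming docs, not from this PR; the values shown are illustrative only):

```properties
# Enable a single shared memory cap across all RocksDB state store
# instances on an executor (soft limit: can be transiently exceeded).
spark.sql.streaming.stateStore.rocksdb.boundedMemoryUsage=true

# Total memory cap, in MB, shared by block cache and write buffers.
spark.sql.streaming.stateStore.rocksdb.maxMemoryUsageMB=500

# Fraction of the cap reserved for write buffers (memtables).
spark.sql.streaming.stateStore.rocksdb.writeBufferCacheRatio=0.5
```

Since the cap is a soft limit, users sizing executors should leave some headroom above `maxMemoryUsageMB`, and (per the discussion above) changing these values requires a cluster restart to take effect.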
