HeartSaVioR commented on PR #44542: URL: https://github.com/apache/spark/pull/44542#issuecomment-1882621196
I believe the real problem here is that a failure of the maintenance task hammers all active state store providers, effectively impacting all stateful tasks on the executor.

Let's look back at what we do in the maintenance task: mostly snapshotting and cleaning up orphaned files. If that task fails, would the state store (provider) be impacted? From what I understand, no, it would not.

This is reflected in the HDFS-backed state store provider: its maintenance task swallows non-fatal exceptions. If we agree that a failure of the maintenance task in the RocksDB state store provider does not impact the actual state store (provider), we can do the same in the RocksDB state store provider.
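To make the idea concrete, here is a minimal sketch of "swallow non-fatal exceptions in maintenance". It is not the actual Spark code: `SketchProvider`, `MaintenanceSketch`, and `runAll` are hypothetical names made up for illustration, and `println` stands in for real logging. The only real API used is `scala.util.control.NonFatal`.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for a state store provider; NOT Spark's StateStoreProvider API.
// `snapshotAndCleanup` represents the maintenance work (snapshotting, removing orphaned files).
final class SketchProvider(val id: String, snapshotAndCleanup: () => Unit) {

  // Returns true on success. A non-fatal failure is logged and swallowed, so the
  // caller never sees it; fatal errors (e.g. OutOfMemoryError, InterruptedException)
  // are not matched by NonFatal and still propagate.
  def doMaintenance(): Boolean =
    try {
      snapshotAndCleanup()
      true
    } catch {
      case NonFatal(e) =>
        println(s"WARN maintenance failed for provider $id: ${e.getMessage}")
        false
    }
}

object MaintenanceSketch {
  // Runs maintenance on every provider; one provider's failure does not
  // prevent the others from running and does not escape to the caller.
  def runAll(providers: Seq[SketchProvider]): Map[String, Boolean] =
    providers.map(p => p.id -> p.doMaintenance()).toMap
}
```

With this shape, a provider whose maintenance throws an `IOException` is reported as failed while the remaining providers are still maintained, which is the isolation argued for above.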
