Kimahriman commented on PR #38853: URL: https://github.com/apache/spark/pull/38853#issuecomment-1333651886
> Just to give you more context for your previous comment ([#38853 (comment)](https://github.com/apache/spark/pull/38853#issuecomment-1333073885))...
>
> We have two different set of code path, 1) two physical nodes for one stateful operator (streaming aggregation) 2) one physical node for one stateful operator (others). For the latter, it only initializes read-write state store, and at the task completion, it only calls abort if the task failed to do commit. For the former, it initializes read only state store as well, which we call abort at the task completion to clean up the resource (NOTE: not to rollback as this is read-only. The name is unfortunately due to compatibility issue during the addition of new interface. See [21413b7](https://github.com/apache/spark/commit/21413b7dd4e19f725b21b92cddfbe73d1b381a05)).
>
> It is safe for read-only state store to call abort() even there is another read-write store referring the same, because read-write store would have completed to call commit() if the task works correctly and we expect (effectively) no-op from calling abort() after commit().

So the abort is basically there to make sure any writes to the read-only state store get reverted unless a downstream read-write instance has already committed changes.

I know there are a few different code paths for stateful things (sessions, `flatMapGroupsWithState`, etc.). Do we know if that abort call covers all the cases for https://issues.apache.org/jira/browse/SPARK-38277? Or do some of the other code paths not end up using that read-only path at all? Or would it be better to just explicitly clear the write batch anyway, even if it is redundant with the abort call?
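To make the commit/abort ordering concrete, here is a toy Python model of it. This is only a sketch under assumptions: `ToyStateStore`, its fields, and `run_task` are hypothetical names invented for illustration and are not Spark's `StateStore` API or the RocksDB provider's actual behavior. It only shows why an `abort()` after a successful `commit()` would be expected to be effectively a no-op, while an `abort()` without a commit discards the pending write batch.

```python
class ToyStateStore:
    """Hypothetical stand-in for a state store; not Spark's real interface."""

    def __init__(self):
        self.committed = {}    # durable state visible to later batches
        self.write_batch = {}  # pending, uncommitted writes

    def put(self, key, value):
        # Writes are staged in the batch until commit() is called.
        self.write_batch[key] = value

    def commit(self):
        # Apply the staged writes, then empty the batch.
        self.committed.update(self.write_batch)
        self.write_batch.clear()

    def abort(self):
        # Drop whatever is still staged. After a commit the batch is
        # already empty, so this leaves the committed state untouched.
        self.write_batch.clear()


def run_task(succeeds):
    """Simulate one task: stage a write, commit only on success, always abort."""
    store = ToyStateStore()
    store.put("k", 1)
    if succeeds:
        store.commit()
    store.abort()  # task-completion cleanup, called in both cases
    return store


ok = run_task(succeeds=True)
failed = run_task(succeeds=False)
print(ok.committed, ok.write_batch)
print(failed.committed, failed.write_batch)
```

In this model an explicit clear of the write batch would be redundant with `abort()`; whether that holds for every stateful code path in Spark is exactly the open question above.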
