HeartSaVioR commented on PR #38853:
URL: https://github.com/apache/spark/pull/38853#issuecomment-1333166532

   Just to give you more context for your previous comment 
(https://github.com/apache/spark/pull/38853#issuecomment-1333073885)...
   
   We have two different set of code path, 1) two physical nodes for one 
stateful operator (streaming aggregation) 2) one physical node for one stateful 
operator (others). For the latter, it only initializes read-write state store, 
and at the task completion, it only calls abort if the task failed to do 
commit. For the former, it initializes read only state store as well, which we 
call abort at the task completion to clean up the resource (NOTE: not to 
rollback as this is read-only. The name is unfortunately due to compatibility 
issue during the addition of new interface. See 
https://github.com/apache/spark/commit/21413b7dd4e19f725b21b92cddfbe73d1b381a05).
   
   It is safe for read-only state store to call abort() even there is another 
read-write store referring the same, because read-write store would have 
completed to call commit() if the task works correctly and we expect 
(effectively) no-op from calling abort() after commit().


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to