databricks-david-lewis commented on issue #27377: [SPARK-30666][Core] Reliable 
single-stage accumulators
URL: https://github.com/apache/spark/pull/27377#issuecomment-587968381
 
 
   @EnricoMi Thank you for continuing to work on this! I appreciate all the 
time and thought you've put into it.
   
   I worry that your solution will lead to lots of duplicated work. Is there 
some way to move the `AccumulatorMode` logic out of the accumulator itself? It 
seems like most accumulators always want to do the correct thing, which is to 
act only on the data that is passed on to the later stages.
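
To illustrate what moving that logic out of the accumulator might look like, here is a minimal sketch in plain Python. It is not Spark's API: `SumAccumulator`, `Driver`, `on_task_success`, and `run_task` are hypothetical names, and the sketch assumes the framework can identify which partition a finished task attempt belongs to. The accumulator knows nothing about retries; the driver-side bookkeeping merges at most one successful attempt per partition.

```python
class SumAccumulator:
    """Plain accumulator with no knowledge of retries or modes (hypothetical)."""

    def __init__(self):
        self.value = 0

    def add(self, v):
        self.value += v

    def merge(self, other):
        self.value += other.value


class Driver:
    """Hypothetical driver-side bookkeeping, kept outside the accumulator."""

    def __init__(self, acc):
        self.acc = acc
        self.merged_partitions = set()

    def on_task_success(self, partition, task_acc):
        # Merge only the first successful attempt for each partition;
        # later attempts (retries, speculative copies) are ignored.
        if partition in self.merged_partitions:
            return False
        self.merged_partitions.add(partition)
        self.acc.merge(task_acc)
        return True


def run_task(rows):
    # Each task attempt fills its own local accumulator copy.
    local = SumAccumulator()
    for r in rows:
        local.add(r)
    return local


driver = Driver(SumAccumulator())
partitions = {0: [1, 2, 3], 1: [4, 5]}

for pid, rows in partitions.items():
    driver.on_task_success(pid, run_task(rows))

# A duplicate attempt for partition 0 finishes later and is not merged.
duplicate_merged = driver.on_task_success(0, run_task(partitions[0]))
total = driver.acc.value
```

With this split, each accumulator implementation stays a simple `add`/`merge` pair, and the deduplication rule lives in one place.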
   
   The only exception I can think of is counting the total number of bytes read 
or written, which is unreliable anyway because certain failures mean that 
information never makes it back to the driver.
