databricks-david-lewis commented on issue #27377: [SPARK-30666][Core] Reliable single-stage accumulators
URL: https://github.com/apache/spark/pull/27377#issuecomment-587968381

@EnricoMi Thank you for continuing to work on this! I appreciate all the time and thought you've put into it. I worry that your solution will lead to a lot of duplicated work. Is there some way to move the `AccumulatorMode` logic out of the accumulator itself? It seems that most accumulators always want the correct behavior, which is to act only on the data that is passed on to the rest of the stages. The only exception I can think of is counting the total number of bytes read or written, which is unreliable anyway, because certain failures mean that information never makes it back to the driver.
