utkarsh39 commented on PR #44321:
URL: https://github.com/apache/spark/pull/44321#issuecomment-1864719908

   **Proposal To Gain Consensus**
   The PR alleviates memory pressure on the driver, although at the cost of 
introducing a breaking change, as identified by @JoshRosen in 
https://github.com/apache/spark/pull/44321#pullrequestreview-1785137821. I 
propose that we disable the feature by default and, when it is enabled, accept 
a breaking change wherein `TaskInfo.accumulables()` is empty for `Resubmitted` 
tasks upon the loss of an executor. The behavior change would be to return 
**empty** `Accumulables`, as opposed to today's behavior of returning the 
`Accumulables` of an earlier successful task attempt. When this change is 
enabled, it will affect the following consumers:
   1. `EventLoggingListener` where task accumulables are serialized to JSON 
upon task completion ([code 
link](https://github.com/apache/spark/blob/aa1ff3789e492545b07d84ac095fc4c39f7446c6/core/src/main/scala/org/apache/spark/util/JsonProtocol.scala#L159)).
   2. Custom Spark Listeners installed by Spark users
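   For consumers in category 2, one defensive pattern is to tolerate empty 
accumulables rather than assume updates are always present. A minimal sketch, 
using a hypothetical simplified model of `TaskInfo` rather than Spark's actual 
classes (`summarize` is an illustrative name, not a Spark API):

   ```java
   import java.util.List;

   // Hypothetical stand-ins for Spark's TaskInfo/AccumulableInfo, for illustration only.
   record AccumulableInfo(long id, String name, String value) {}
   record TaskInfo(long taskId, List<AccumulableInfo> accumulables) {}

   class ListenerSketch {
       // A listener-style consumer that handles the empty case explicitly,
       // since Resubmitted tasks would report no accumulables under this proposal.
       static String summarize(TaskInfo info) {
           if (info.accumulables().isEmpty()) {
               return "task " + info.taskId() + ": no accumulator updates (possibly Resubmitted)";
           }
           return "task " + info.taskId() + ": " + info.accumulables().size() + " accumulator update(s)";
       }
   }
   ```

   A custom `SparkListener` that only ever iterates the accumulables (rather 
than indexing into them or assuming they are non-empty) would keep working 
unchanged under the proposed behavior.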
   
   
   What do the reviewers think of the proposal?
   
   Note that the current design in the PR does not yet implement this 
proposal: accessing the empty accumulables currently results in a crash. I 
will refactor the change if we agree on this proposal.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

