mridulm edited a comment on pull request #30876:
URL: https://github.com/apache/spark/pull/30876#issuecomment-751186157


   @dongjoon-hyun Proactive replication only applies to persisted RDD blocks, 
not shuffle blocks; I am not sure if I am missing something here.
   
   Even for persisted RDD blocks, it specifically applies when an RDD is persisted 
with storage levels where `replication > 1` [1].
   I view the loss of all replicas of an RDD blockId the same way, whether 
replication is 1 or higher.
   Having said that, for use cases where the Spark cluster might be the 
source of truth (or the cost of recomputation is prohibitive), applications can 
of course enable proactive replication via this flag.
   I am not sure I see a concrete reason to turn this on for all 
applications.
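   For reference, a minimal sketch of how an application would opt in (this assumes 
a standard Spark deployment; `spark.storage.replication.proactive` and 
`StorageLevel.MEMORY_AND_DISK_2` are from Spark's public API, and the app name and 
data are placeholders):

   ```scala
   import org.apache.spark.SparkConf
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.storage.StorageLevel

   // Opt in to proactive replication of persisted RDD blocks on executor loss.
   val conf = new SparkConf()
     .setAppName("proactive-replication-example") // placeholder name
     .set("spark.storage.replication.proactive", "true")

   val spark = SparkSession.builder.config(conf).getOrCreate()

   // As noted above, this only matters for blocks persisted with
   // replication > 1, e.g. MEMORY_AND_DISK_2 keeps two replicas.
   val rdd = spark.sparkContext.parallelize(1 to 1000)
   rdd.persist(StorageLevel.MEMORY_AND_DISK_2)
   rdd.count() // materialize (and replicate) the blocks
   ```

   With `replication = 1` (e.g. plain `MEMORY_AND_DISK`), losing the only copy of a 
block means recomputation from lineage either way, which is the sense in which 
the two cases look similar.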
   
   Please let me know if I am missing something in my understanding.
   
   [1] ESS serving disk-backed blocks might have some corner cases in this flow 
which I have not thought through.

