Ngone51 commented on pull request #30763:
URL: https://github.com/apache/spark/pull/30763#issuecomment-792861035


   >  I just have my doubts about the amount of the efforts we need to do
   
   As far as I can see:
   
   1) in @hiboyang's PR, he added `getAllMapOutputStatusMetadata` to 
`MapOutputTracker`. IIUC, this requires a corresponding change on the reader 
side to handle the metadata, which would be a new code path.
   
   2) in this PR, we added `ShuffleOutputTracker`, which is very similar 
to `MapOutputTracker`. Also, `MapOutputTracker` recently gained a new method, 
`updateMapOutput`, to support node decommissioning, but 
`ShuffleOutputTracker` doesn't have it. Do we want to support decommissioning 
for custom storages too, or only for the BlockManager? With the 
`ShuffleOutputTracker` approach, I think we would need extra effort to support 
it in custom storages. However, if we had a generic `Location`, we could 
reuse `MapOutputTracker` directly.
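
To illustrate the last point, here is a minimal sketch of what a generic `Location` could look like. Everything in it is hypothetical: the `Location` interface, the two implementations, and `SimpleTracker` are made up for illustration and are not the actual Spark classes or signatures. `SimpleTracker` only stands in for `MapOutputTracker` to show that one `updateMapOutput` path could serve both BlockManager-backed and custom storage.

```java
import java.util.HashMap;
import java.util.Map;

public class GenericLocationSketch {

    // Hypothetical abstraction: where a map output lives, independent of the storage backend.
    interface Location {
        String describe();
    }

    // Hypothetical location for outputs served by an executor's BlockManager.
    static final class BlockManagerLocation implements Location {
        private final String host;
        private final int port;

        BlockManagerLocation(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String describe() {
            return "blockmanager:" + host + ":" + port;
        }
    }

    // Hypothetical location for outputs kept in a custom (e.g. remote) storage.
    static final class RemoteStorageLocation implements Location {
        private final String uri;

        RemoteStorageLocation(String uri) {
            this.uri = uri;
        }

        @Override
        public String describe() {
            return "remote:" + uri;
        }
    }

    // Simplified stand-in for a tracker; it only ever sees the generic Location type.
    static final class SimpleTracker {
        private final Map<Long, Location> outputs = new HashMap<>();

        void registerMapOutput(long mapId, Location location) {
            outputs.put(mapId, location);
        }

        // The same update path works for any Location, e.g. when a node is decommissioned.
        void updateMapOutput(long mapId, Location newLocation) {
            if (!outputs.containsKey(mapId)) {
                throw new IllegalArgumentException("Unknown map output: " + mapId);
            }
            outputs.put(mapId, newLocation);
        }

        Location getLocation(long mapId) {
            return outputs.get(mapId);
        }
    }

    public static void main(String[] args) {
        SimpleTracker tracker = new SimpleTracker();
        tracker.registerMapOutput(0L, new BlockManagerLocation("host-a", 7337));
        // Decommission-style migration: the output moves to a different kind of storage.
        tracker.updateMapOutput(0L, new RemoteStorageLocation("example://shuffle/0"));
        System.out.println(tracker.getLocation(0L).describe());
    }
}
```

In this sketch the tracker needs no storage-specific branch, which is the reuse argument above; a separate tracker per storage type would have to reimplement the update path.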




