Ngone51 commented on pull request #30763: URL: https://github.com/apache/spark/pull/30763#issuecomment-792861035
> I just have my doubts about the amount of the efforts we need to do

As far as I can see:

1) In @hiboyang's PR, he added `getAllMapOutputStatusMetadata` to `MapOutputTracker`. IIUC, this needs a corresponding change on the reader side to handle the metadata, which would be a new code path.

2) In this PR, I see we added `ShuffleOutputTracker`, which is very similar to `MapOutputTracker`. `MapOutputTracker` recently added a new interface, `updateMapOutput`, to support node decommission, but `ShuffleOutputTracker` doesn't have it. Do we want to support decommission for custom storages too, or only for the BlockManager? With the `ShuffleOutputTracker` approach, I think we would need extra effort to support it in custom storages. However, if we have the generic `Location`, we can reuse `MapOutputTracker` directly.
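
To illustrate the last point, here is a rough, self-contained sketch of why a generic `Location` would let the existing tracker cover custom storages. Everything in it is hypothetical: `ToyMapOutputTracker`, `BlockManagerLocation`, and `CustomStorageLocation` are toy stand-ins, not Spark's classes, and the method shapes only loosely mirror `MapOutputTracker`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only: none of these types are Spark's real classes.
public class GenericLocationSketch {

    /** Assumed generic abstraction over "where a map output lives". */
    public interface Location {
        String describe();
    }

    /** Toy stand-in for a BlockManager-backed location. */
    public static final class BlockManagerLocation implements Location {
        private final String host;
        private final int port;
        public BlockManagerLocation(String host, int port) { this.host = host; this.port = port; }
        @Override public String describe() { return "bm://" + host + ":" + port; }
    }

    /** Toy stand-in for a location in some custom shuffle storage. */
    public static final class CustomStorageLocation implements Location {
        private final String uri;
        public CustomStorageLocation(String uri) { this.uri = uri; }
        @Override public String describe() { return uri; }
    }

    /** Toy tracker: one code path for every Location implementation. */
    public static final class ToyMapOutputTracker {
        private final Map<String, Location> outputs = new HashMap<>();

        private static String key(int shuffleId, long mapId) {
            return shuffleId + ":" + mapId;
        }

        public void registerMapOutput(int shuffleId, long mapId, Location loc) {
            outputs.put(key(shuffleId, mapId), loc);
        }

        /** Repoints an existing output, e.g. after decommission; false if the output is unknown. */
        public boolean updateMapOutput(int shuffleId, long mapId, Location newLoc) {
            return outputs.replace(key(shuffleId, mapId), newLoc) != null;
        }

        public Optional<Location> getLocation(int shuffleId, long mapId) {
            return Optional.ofNullable(outputs.get(key(shuffleId, mapId)));
        }
    }

    public static void main(String[] args) {
        ToyMapOutputTracker tracker = new ToyMapOutputTracker();
        tracker.registerMapOutput(0, 1L, new BlockManagerLocation("host-a", 7337));
        // The same update call works whether the output moves to another
        // executor or to a custom storage.
        tracker.updateMapOutput(0, 1L, new CustomStorageLocation("s3://bucket/shuffle/0/1"));
        System.out.println(tracker.getLocation(0, 1L).map(Location::describe).orElse("missing"));
    }
}
```

With a separate `ShuffleOutputTracker`, the equivalent of `updateMapOutput` would have to be added and maintained a second time for custom storages.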
