Ngone51 commented on pull request #30763: URL: https://github.com/apache/spark/pull/30763#issuecomment-792850010
> So if I understand correctly BlockManagerId would extend the Location class, right? And here MapStatus#location would be a generic Location?

Yes.

> As the current reader uses MapOutputTracker#getMapSizesByExecutorId you would like to keep that method and runtime throw an exception when it's called and location is not BlockManagerId?

No, we wouldn't. If `Location` is introduced, `MapOutputTracker` should be refactored to work with `Location` instead of the specific `BlockManagerId`. Accordingly, `blocksByAddress` would be refactored to store the unique "address" generated by `Location`. That also means we'd always keep the generic `Location` inside `ShuffleBlockFetcherIterator` instead of a specific `Location`, so we don't need casting there.

I think it also answers this question:

> In this case we should check the references of this MapStatus#location and based on that decide where we are safe to cast Location to BlockManagerId or where else we would pass the location further as a Location (or at least what else the generic location should contain to have the existing things working...).

Actually, I only found one reference that needs a cast:

https://github.com/apache/spark/blob/f340857757034ac955862b34e60322ca5ee81758/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1661

And yes, a custom reader has to care more about casting. It should cast the generic `Location` to its own implementation whenever it needs implementation-specific information. That cast should always succeed, because Spark would only use one type of storage at a time.
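To make the proposal easier to picture, here is a minimal Scala sketch of the idea. It is not Spark's actual API: the `Location` trait, its `address` member, the simplified three-field `BlockManagerId`, the `RemoteStoreLocation` class, the cut-down `MapStatus`, and the `LocationSketch` helpers are all hypothetical stand-ins chosen for illustration.

```scala
// Hypothetical sketch only; none of these definitions are Spark's real classes.

// Generic shuffle output location. `address` is an assumed name for the
// unique key that the comment says a Location would generate.
trait Location {
  def address: String
}

// Simplified stand-in for Spark's BlockManagerId, now extending Location.
final case class BlockManagerId(executorId: String, host: String, port: Int)
    extends Location {
  override def address: String = s"$executorId@$host:$port"
}

// A location type that a custom shuffle storage might define (assumption).
final case class RemoteStoreLocation(bucket: String, prefix: String)
    extends Location {
  override def address: String = s"store://$bucket/$prefix"
}

// Cut-down MapStatus: carries a generic Location plus per-reducer sizes.
final case class MapStatus(location: Location, mapId: Long, sizes: Vector[Long])

object LocationSketch {
  // Groups map outputs for one reducer by address. Only the generic
  // Location interface is used, so no cast is needed on this path.
  def blocksByAddress(
      statuses: Seq[MapStatus],
      reduceId: Int): Map[String, Seq[(Long, Long)]] =
    statuses.groupBy(_.location.address).map { case (addr, group) =>
      addr -> group.map(s => (s.mapId, s.sizes(reduceId)))
    }

  // A custom reader narrows the generic Location to its own type when it
  // needs implementation-specific fields. With a single storage type in
  // use, the first case is expected to match every time.
  def bucketOf(location: Location): String = location match {
    case remote: RemoteStoreLocation => remote.bucket
    case other =>
      throw new IllegalStateException(s"Unexpected location type: $other")
  }
}
```

In this sketch the tracker-side grouping never inspects the concrete type, while the narrowing lives entirely in the custom reader, which matches the split of responsibilities described above.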
