Ngone51 commented on pull request #30763:
URL: https://github.com/apache/spark/pull/30763#issuecomment-792850010


   > So if I understand correctly BlockManagerId would extend the Location 
class, right?
   And here MapStatus#location would be a generic Location?
   
   Yes
   
   > As the current reader uses MapOutputTracker#getMapSizesByExecutorId you 
would like to keep that method and runtime throw an exception when it's called 
and location is not BlockManagerId?
   
   We don't. Actually, `MapOutputTracker` should be refactored to work with the 
generic `Location` instead of the specific `BlockManagerId` if `Location` is 
introduced. Accordingly, `blocksByAddress` would be refactored to store the 
unique "address" generated by `Location`. That also means we'd always keep the 
generic `Location` inside `ShuffleBlockFetcherIterator` instead of a specific 
`Location` implementation, so we don't need casting.
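   
   To illustrate the idea, here is a minimal sketch. It is not the actual Spark 
code: the `address` method, the simplified `BlockManagerId` and `MapStatus` 
stand-ins, `RemoteStoreLocation`, and the shape of `blocksByAddress` are all 
assumptions made for illustration only.

```scala
// Hypothetical sketch; none of these definitions are the real Spark classes.
object LocationSketch {
  // Generic location. The `address` method name is an assumption.
  trait Location {
    // Unique "address" used to group shuffle blocks.
    def address: String
  }

  // Simplified stand-in for Spark's BlockManagerId.
  final case class BlockManagerId(executorId: String, host: String, port: Int)
      extends Location {
    override def address: String = s"$executorId@$host:$port"
  }

  // Hypothetical custom location, e.g. for an external shuffle store.
  final case class RemoteStoreLocation(uri: String) extends Location {
    override def address: String = uri
  }

  // Simplified stand-in for MapStatus that holds a generic Location.
  final case class MapStatus(location: Location, mapId: Long, sizes: Array[Long])

  // Groups (mapId, reduceId, size) by the location's address.
  // Only the generic Location is used here, so no cast is needed.
  def blocksByAddress(
      statuses: Seq[MapStatus],
      reduceId: Int): Map[String, Seq[(Long, Int, Long)]] =
    statuses
      .groupBy(_.location.address)
      .map { case (addr, ss) =>
        addr -> ss.map(s => (s.mapId, reduceId, s.sizes(reduceId)))
      }
}
```

   With this shape, the tracker and the fetch iterator only ever see 
`Location` and its address, regardless of which implementation is plugged in.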
   
   I think it also answers this question: 
   
   > In this case we should check the references of this MapStatus#location and 
based on that decide where we are safe to cast Location to BlockManagerId or 
where else we would pass the location further as a Location (or at least what 
else the generic location should contain to have the existing things 
working...).
   
   Actually, I can only find one reference that needs a cast:
   
   
https://github.com/apache/spark/blob/f340857757034ac955862b34e60322ca5ee81758/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1661
   
   
   And yes, the custom reader should care more about casting. It should 
definitely cast the generic `Location` to its own implementation if it wants to 
get the implementation-specific information. But the cast should always succeed 
because Spark would only use one type of storage at a time.
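   
   For example, a custom reader could do the cast with a pattern match. This is 
only a sketch under assumptions: `RemoteStoreLocation`, `resolveUri`, and the 
simplified `Location` and `BlockManagerId` types are hypothetical and not part 
of Spark.

```scala
// Hypothetical sketch of the cast inside a custom shuffle reader.
object CustomReaderSketch {
  // Simplified stand-ins; not the real Spark definitions.
  trait Location
  final case class BlockManagerId(executorId: String, host: String, port: Int)
      extends Location
  final case class RemoteStoreLocation(uri: String) extends Location

  // The custom reader narrows the generic Location to its own type.
  // If only one storage type is in use at a time, the first case always
  // matches; the fallback just makes a misconfiguration fail loudly.
  def resolveUri(location: Location): String = location match {
    case RemoteStoreLocation(uri) => uri
    case other =>
      throw new IllegalStateException(
        s"Unexpected location type: ${other.getClass.getName}")
  }
}
```

   The fallback branch is a design choice in this sketch, not a requirement: 
it turns a wrong location type into a clear error instead of a 
`ClassCastException`.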
   
   
   
   
   
   
   

