attilapiros commented on a change in pull request #31876:
URL: https://github.com/apache/spark/pull/31876#discussion_r599413719



##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -502,7 +502,7 @@ private[spark] class MapOutputTrackerMaster(
   }
 
   /** Unregister map output information of the given shuffle, mapper and block 
manager */
-  def unregisterMapOutput(shuffleId: Int, mapIndex: Int, bmAddress: 
BlockManagerId): Unit = {
+  def unregisterMapOutput(shuffleId: Int, mapIndex: Int, bmAddress: Location): 
Unit = {
     shuffleStatuses.get(shuffleId) match {
       case Some(shuffleStatus) =>
         shuffleStatus.removeMapOutput(mapIndex, bmAddress)

Review comment:
       ```suggestion
           shuffleStatus.removeMapOutput(mapIndex, loc)
   ```

##########
File path: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala
##########
@@ -28,16 +28,23 @@ import org.apache.spark.internal.config
 import org.apache.spark.storage.BlockManagerId
 import org.apache.spark.util.Utils
 
+trait Location extends Externalizable {

Review comment:
       Using`Externalizable` here is totally fine.

##########
File path: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala
##########
@@ -28,16 +28,23 @@ import org.apache.spark.internal.config
 import org.apache.spark.storage.BlockManagerId
 import org.apache.spark.util.Utils
 
+trait Location extends Externalizable {
+  def host: String
+  def port: Int
+  def hostPort: String
+  def executorId: String = "unknown"

Review comment:
       I think in `Location` to specify these methods violates the abstraction. 
I know this convenient and avoids casting but still we should avoid meaningless 
methods. In a disaggregated storage solution the `executorId` has no value and 
probably host and port is not enough and never needed as separate entities but 
an URL like construct will be more useful (or something else but it is the 
responsibility of the specific subclass).

##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -502,7 +502,7 @@ private[spark] class MapOutputTrackerMaster(
   }
 
   /** Unregister map output information of the given shuffle, mapper and block 
manager */
-  def unregisterMapOutput(shuffleId: Int, mapIndex: Int, bmAddress: 
BlockManagerId): Unit = {
+  def unregisterMapOutput(shuffleId: Int, mapIndex: Int, bmAddress: Location): 
Unit = {

Review comment:
       Nit: to be consistent with the other changes:
   
   ```suggestion
     def unregisterMapOutput(shuffleId: Int, mapIndex: Int, loc: Location): 
Unit = {
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to