[GitHub] [spark] attilapiros commented on a change in pull request #30004: [SPARK-33114][CORE] Add metadata in MapStatus to support custom shuffle manager

GitBox Fri, 05 Mar 2021 20:49:04 -0800


attilapiros commented on a change in pull request #30004:
URL: https://github.com/apache/spark/pull/30004#discussion_r588832427




##########
File path: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala
##########
@@ -63,16 +63,37 @@ class MapOutputTrackerSuite extends SparkFunSuite {
     assert(tracker.containsShuffle(10))
     val size1000 = MapStatus.decompressSize(MapStatus.compressSize(1000L))
     val size10000 = MapStatus.decompressSize(MapStatus.compressSize(10000L))
-    tracker.registerMapOutput(10, 0, MapStatus(BlockManagerId("a", "hostA", 
1000),
-        Array(1000L, 10000L), 5))
-    tracker.registerMapOutput(10, 1, MapStatus(BlockManagerId("b", "hostB", 
1000),
-        Array(10000L, 1000L), 6))
+    val mapStatus1 = MapStatus(BlockManagerId("a", "hostA", 1000), 
Array(1000L, 10000L), 5)
+    val mapStatus2 = MapStatus(BlockManagerId("b", "hostB", 1000), 
Array(10000L, 1000L), 6)
+    tracker.registerMapOutput(10, 0, mapStatus1)
+    tracker.registerMapOutput(10, 1, mapStatus2)
     val statuses = tracker.getMapSizesByExecutorId(10, 0)
     assert(statuses.toSet ===
       Seq((BlockManagerId("a", "hostA", 1000),
         ArrayBuffer((ShuffleBlockId(10, 5, 0), size1000, 0))),
           (BlockManagerId("b", "hostB", 1000),
             ArrayBuffer((ShuffleBlockId(10, 6, 0), size10000, 1)))).toSet)
+    val allStatuses = tracker.getAllMapOutputStatuses(10)
+    assert(allStatuses.toSet === Set(mapStatus1, mapStatus2))

Review comment:
       With the `toSet` you lose the ordering meanwhile ordering can be 
important (that is the mapIndex) so it should be tested.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] attilapiros commented on a change in pull request #30004: [SPARK-33114][CORE] Add metadata in MapStatus to support custom shuffle manager

Reply via email to