mridulm commented on a change in pull request #34156:
URL: https://github.com/apache/spark/pull/34156#discussion_r720616373
##########
File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala
##########
@@ -519,17 +521,19 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf) extends Logging
    * but endMapIndex is excluded). If endMapIndex=Int.MaxValue, the actual endMapIndex will be
    * changed to the length of total map outputs.
    *
-   * @return A sequence of 2-item tuples, where the first item in the tuple is a BlockManagerId,
-   *         and the second item is a sequence of (shuffle block id, shuffle block size, map index)
-   *         tuples describing the shuffle blocks that are stored at that block manager.
-   *         Note that zero-sized blocks are excluded in the result.
+   * @return A case class object which includes two attributes. The first attribute is a sequence
+   *         of 2-item tuples, where the first item in the tuple is a BlockManagerId, and the
+   *         second item is a sequence of (shuffle block id, shuffle block size, map index) tuples
+   *         tuples describing the shuffle blocks that are stored at that block manager. Note that
+   *         zero-sized blocks are excluded in the result. The second attribute is a boolean flag,
+   *         indicating whether batch fetch can be enabled.
    */
   def getMapSizesByExecutorId(
       shuffleId: Int,
       startMapIndex: Int,
       endMapIndex: Int,
       startPartition: Int,
-      endPartition: Int): Iterator[(BlockManagerId, Seq[(BlockId, Long, Int)])]
+      endPartition: Int): MapSizesByExecutorId
Review comment:
Given how close we are to RC, I am fine with this approach.
In general though, we should be very careful about setting the expectation that
there are compatibility guarantees for `private[spark]` classes; they are
explicitly marked that way to make it very clear that nothing should depend on
them. In spite of that, if there are projects/users depending on them, it is up
to them to ensure compatibility, not the Spark project.
@zhouyejoe Can you evaluate the change that @Ngone51 proposed, please?
Internally, the existing method could delegate to the same
`getMapSizesByExecutorIdImpl` (which would be what you have here) and simply
return `response.iter`.
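
A minimal sketch of the delegation suggested above, to make the shape concrete. It is not the actual Spark code: `BlockManagerId` and `BlockId` are simplified stand-ins, `TrackerSketch` and `DemoTracker` are hypothetical names, and the field name `enableBatchFetch` is an assumption (only `iter` is implied by the `response.iter` wording).

```scala
object DelegationSketch {
  // Simplified stand-ins for Spark's real types, so the sketch compiles on its own.
  final case class BlockManagerId(executorId: String)
  final case class BlockId(name: String)

  type MapSizes = Iterator[(BlockManagerId, Seq[(BlockId, Long, Int)])]

  // Assumed shape of the new result type: the original iterator plus a boolean flag.
  // `enableBatchFetch` is a hypothetical field name.
  final case class MapSizesByExecutorId(iter: MapSizes, enableBatchFetch: Boolean)

  abstract class TrackerSketch {
    // New entry point that returns the richer result.
    def getMapSizesByExecutorIdImpl(
        shuffleId: Int,
        startMapIndex: Int,
        endMapIndex: Int,
        startPartition: Int,
        endPartition: Int): MapSizesByExecutorId

    // Existing signature kept as-is: delegate, then drop the flag.
    def getMapSizesByExecutorId(
        shuffleId: Int,
        startMapIndex: Int,
        endMapIndex: Int,
        startPartition: Int,
        endPartition: Int): MapSizes = {
      val response = getMapSizesByExecutorIdImpl(
        shuffleId, startMapIndex, endMapIndex, startPartition, endPartition)
      response.iter
    }
  }

  // Toy implementation with made-up data, only to exercise the delegation.
  object DemoTracker extends TrackerSketch {
    override def getMapSizesByExecutorIdImpl(
        shuffleId: Int,
        startMapIndex: Int,
        endMapIndex: Int,
        startPartition: Int,
        endPartition: Int): MapSizesByExecutorId = {
      val blocks = Seq((BlockId(s"shuffle_${shuffleId}_0_0"), 42L, 0))
      MapSizesByExecutorId(Iterator((BlockManagerId("exec-1"), blocks)), enableBatchFetch = true)
    }
  }
}
```

With this shape, callers of the old method keep receiving the same `Iterator[...]` type, while new callers that need the flag can use the `Impl` variant directly.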
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]