otterc commented on a change in pull request #33446:
URL: https://github.com/apache/spark/pull/33446#discussion_r674418174
##########
File path:
core/src/main/scala/org/apache/spark/storage/PushBasedFetchHelper.scala
##########
@@ -138,10 +140,19 @@ private class PushBasedFetchHelper(
((shuffleBlockId.shuffleId, shuffleBlockId.reduceId), size)
}.toMap
val address = req.address
+ var remainingBlockCount = sizeMap.size
+ val requestStartTime = clock.nanoTime()
+
val mergedBlocksMetaListener = new MergedBlocksMetaListener {
override def onSuccess(shuffleId: Int, reduceId: Int, meta: MergedBlockMeta): Unit = {
logInfo(s"Received the meta of push-merged block for ($shuffleId, $reduceId) " +
s"from ${req.address.host}:${req.address.port}")
+ remainingBlockCount -= 1
+ if (remainingBlockCount == 0) {
+ iterator.logFetchIfSlow(
+ TimeUnit.NANOSECONDS.toMillis(clock.nanoTime() - requestStartTime),
+ sizeMap.values.sum, sizeMap.size, address)
+ }
Review comment:
For FetchShuffleBlocks, I think we check `remainingBlock==0` to know
when the network request (which contains multiple blocks) is done. @xkrogen,
please correct me if I am wrong.
Here we don't need that check, because each network request asks for the meta
of just one merged block. However, the size of the meta information is not
known to the client. The size used here is the size of the merged block, not
the size of the meta information being fetched from the server, so I don't
think we can use it to decide whether the fetch of this meta information is
slow. Hmm, then maybe we should not log it for the meta request at all. Sorry,
I should have thought about this earlier.
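To make the distinction concrete, here is a minimal sketch. It is not Spark code: `FetchTimingSketch`, `MultiBlockRequestTimer`, and `metaRequestElapsedMs` are hypothetical names used only for illustration. It assumes a multi-block request can be timed against the total bytes it carries, while a meta request only has a meaningful latency because the size of the meta payload is unknown to the client.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical illustration only; none of these names exist in Spark.
object FetchTimingSketch {

  // Tracks one network request that carries several blocks. The request is
  // done only when the last block arrives, so a countdown is needed.
  // Not thread-safe: a real listener may be called from several threads.
  final class MultiBlockRequestTimer(blockSizes: Map[String, Long], startNanos: Long) {
    private var remaining = blockSizes.size

    // Returns Some((elapsedMs, totalBytes, blockCount)) for the final block,
    // and None while blocks are still outstanding.
    def onBlockDone(nowNanos: Long): Option[(Long, Long, Int)] = {
      remaining -= 1
      if (remaining == 0) {
        val elapsedMs = TimeUnit.NANOSECONDS.toMillis(nowNanos - startNanos)
        Some((elapsedMs, blockSizes.values.sum, blockSizes.size))
      } else {
        None
      }
    }
  }

  // A meta request covers exactly one merged block, so no countdown is needed.
  // Only the elapsed time is known; the merged block's size says nothing
  // about how many bytes of meta information were transferred.
  def metaRequestElapsedMs(startNanos: Long, nowNanos: Long): Long =
    TimeUnit.NANOSECONDS.toMillis(nowNanos - startNanos)
}
```

In this sketch, only the multi-block timer yields a byte count that matches what was actually transferred, which is why a size-based slowness check fits that case and not the meta request.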
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]