otterc commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r658159431
##########
File path:
common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java
##########
@@ -222,7 +223,7 @@ public void sendMergedBlockMetaReq(
handler.addRpcRequest(requestId, callback);
RpcChannelListener listener = new RpcChannelListener(requestId, callback);
channel.writeAndFlush(
- new MergedBlockMetaRequest(requestId, appId, shuffleId, reduceId)).addListener(listener);
+ new MergedBlockMetaRequest(requestId, appId, shuffleId, shuffleSequenceId, reduceId)).addListener(listener);
Review comment:
> I still prefer passing shuffleSequenceId in the fetch protocol as it
> feels simpler and straightforward to me. If there are other strong reasons for
> not doing so other than the ones mentioned earlier,

As I mentioned in my earlier
[comment](https://github.com/apache/spark/pull/33034#discussion_r657623667),
IMO these are strong enough reasons not to modify the existing protocol:
- It adds a field that is not really needed. The change also trickles down
to various other classes and APIs such as `BlockStoreClient`,
`ExternalBlockStoreClient`, `OneForOneBlockFetcher`, `ExternalBlockHandler`, etc.
The alternative is a simple, self-contained change that tracks this
metadata in `RemoteBlockPushResolver`.
- Adding this field to the message suggests that a fetch request could be
for an older `shuffleSequenceId`. You said we can add a clarifying comment, but
if there is no need to make such a change, what is the point of adding a
comment to explain all this?
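
A minimal sketch of the server-side alternative described above, assuming the
server only ever serves the latest `shuffleSequenceId` for a given
`(appId, shuffleId)`. The class `LatestShuffleSequenceTracker` and its methods
are hypothetical names used for illustration; this is not the actual
`RemoteBlockPushResolver` code. It shows that the fetch path can resolve the
sequence id from state recorded on the push path, so the fetch message itself
does not need the extra field.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the real RemoteBlockPushResolver.
final class LatestShuffleSequenceTracker {

  // Identifies one shuffle of one application.
  private static final class AppShuffleKey {
    private final String appId;
    private final int shuffleId;

    AppShuffleKey(String appId, int shuffleId) {
      this.appId = Objects.requireNonNull(appId);
      this.shuffleId = shuffleId;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof AppShuffleKey)) return false;
      AppShuffleKey other = (AppShuffleKey) o;
      return shuffleId == other.shuffleId && appId.equals(other.appId);
    }

    @Override
    public int hashCode() {
      return Objects.hash(appId, shuffleId);
    }
  }

  // Latest sequence id seen on the push path for each (appId, shuffleId).
  private final Map<AppShuffleKey, Integer> latest = new ConcurrentHashMap<>();

  // Push path: records the sequence id and returns false if the push is stale,
  // i.e. a newer sequence id has already been seen for this shuffle.
  boolean registerPush(String appId, int shuffleId, int shuffleSequenceId) {
    int current =
        latest.merge(new AppShuffleKey(appId, shuffleId), shuffleSequenceId, Integer::max);
    return current == shuffleSequenceId;
  }

  // Fetch path: the request carries only appId and shuffleId; the sequence id
  // is resolved from server-side state instead of from the message.
  int resolveForFetch(String appId, int shuffleId) {
    Integer seq = latest.get(new AppShuffleKey(appId, shuffleId));
    if (seq == null) {
      throw new IllegalArgumentException(
          "No pushed data tracked for app " + appId + ", shuffle " + shuffleId);
    }
    return seq;
  }
}
```

With this shape, a stale push is rejected at registration time, and the fetch
call has no way to ask for an older sequence id, which is the property the
second bullet argues the protocol should keep.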