vanzin commented on a change in pull request #25299: [SPARK-27651][Core] Avoid
the network when shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#discussion_r348107457
##########
File path:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -369,6 +371,16 @@ public int removeBlocks(String appId, String execId, String[] blockIds) {
     return numRemovedBlocks;
   }
 
+  public Map<String, String[]> getLocalDirs(String appId, String[] execIds) {
+    return Arrays.stream(execIds)
+      .map(exec -> {
+        ExecutorShuffleInfo info = executors.get(new AppExecId(appId, exec));
+        return Pair.of(exec, (info != null) ? info.localDirs : null);
+      })
+      .filter(pair -> pair.getValue() != null)
Review comment:
After going through the block manager code again, I think it's better to
throw an exception here than to filter out the missing entries. This is
unlikely to happen, but if it does (probably because of a bug somewhere rather
than an environmental issue), returning a different set of executors than the
one requested will probably cause hard-to-debug issues in the application. A
useful exception at this point (instead of just an NPE) would be better for
debugging.
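
To make the suggestion concrete, here is a minimal sketch of what failing
fast could look like. It is not the code from the PR: the class
LocalDirsLookup, its register method, the string key "appId/execId" and the
choice of IllegalStateException are all assumptions made for illustration,
standing in for the resolver's executors map, AppExecId and
ExecutorShuffleInfo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the resolver; not part of Spark.
class LocalDirsLookup {
  // Assumed simplification: maps "appId/execId" to that executor's local dirs.
  private final Map<String, String[]> executors = new LinkedHashMap<>();

  void register(String appId, String execId, String[] localDirs) {
    executors.put(appId + "/" + execId, localDirs);
  }

  // Returns the local dirs for every requested executor, or throws if any
  // of them is unknown, so the result always matches the request.
  Map<String, String[]> getLocalDirs(String appId, String[] execIds) {
    Map<String, String[]> result = new LinkedHashMap<>();
    for (String exec : execIds) {
      String[] dirs = executors.get(appId + "/" + exec);
      if (dirs == null) {
        // Name the missing executor instead of dropping it silently.
        throw new IllegalStateException(String.format(
          "Executor is not registered (appId=%s, execId=%s)", appId, exec));
      }
      result.put(exec, dirs);
    }
    return result;
  }
}
```

With this shape the caller either gets an entry for every executor it asked
about or an exception that names the missing one, rather than a smaller map
that fails somewhere else later.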