cryptoe commented on code in PR #13062:
URL: https://github.com/apache/druid/pull/13062#discussion_r973916496
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/shuffle/DurableStorageInputChannelFactory.java:
##########
@@ -112,11 +119,33 @@ public ReadableFrameChannel openChannel(StageId stageId, int workerNumber, int p
catch (Exception e) {
throw new IOE(
e,
- "Could not find remote output of worker task[%s] stage[%d] partition[%d]",
- workerTaskId,
+ "Could not find remote output of worker task[%d] stage[%d] partition[%d]",
+ workerNumber,
stageId.getStageNumber(),
partitionNumber
);
}
}
+
+ @Nullable
+ public String findSuccessfulPartitionOutput(
+ final String controllerTaskId,
+ final int workerNo,
+ final int stageNumber,
+ final int partitionNumber
+ ) throws IOException
+ {
+ List<String> fileNames = storageConnector.lsFiles(
+ DurableStorageOutputChannelFactory.getPartitionOutputsFolderName(
+ controllerTaskId,
+ workerNo,
+ stageNumber,
+ partitionNumber
+ )
+ );
+ Optional<String> maybeFileName = fileNames.stream()
Review Comment:
Consider the following scenario:
1. A file whose task id starts with B is written at t0.
2. A file for the same partition, whose task id starts with A, is written at t5.
A selection based on alphabetically sorted order then gives different results
at t4 (only the B file exists) and at t6 (the A file now sorts first), which
is what we want to avoid.
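
A minimal sketch of the scenario, for illustration only. It assumes the selection sorts the listed names and takes the first one; the diff above is cut off after `fileNames.stream()`, so that is a guess about the code under review. The class `SortedPickDemo`, the helper `pickAlphabeticallyFirst`, and the file names `A-task` and `B-task` are hypothetical and not part of the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class SortedPickDemo
{
  // Hypothetical stand-in for the selection being reviewed:
  // sort the listed file names and take the alphabetically first one.
  static Optional<String> pickAlphabeticallyFirst(List<String> fileNames)
  {
    return fileNames.stream().sorted().findFirst();
  }

  public static void main(String[] args)
  {
    // In-memory stand-in for the partition output folder listing.
    List<String> listing = new ArrayList<>();

    // t0: the task whose id starts with B writes its output.
    listing.add("B-task");
    // t4: a reader lists the folder and selects a file.
    System.out.println("t4: " + pickAlphabeticallyFirst(listing).orElse(null));

    // t5: the task whose id starts with A writes the same partition.
    listing.add("A-task");
    // t6: the same selection now returns a different file.
    System.out.println("t6: " + pickAlphabeticallyFirst(listing).orElse(null));
  }
}
```

Under that assumption the reader sees `B-task` at t4 and `A-task` at t6, so two reads of the same partition can resolve to different files depending on when they happen.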
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]