LakshSingla commented on code in PR #13062:
URL: https://github.com/apache/druid/pull/13062#discussion_r972600501


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/shuffle/DurableStorageInputChannelFactory.java:
##########
@@ -112,11 +119,33 @@ public ReadableFrameChannel openChannel(StageId stageId, int workerNumber, int p
     catch (Exception e) {
       throw new IOE(
           e,
-          "Could not find remote output of worker task[%s] stage[%d] partition[%d]",
-          workerTaskId,
+          "Could not find remote output of worker task[%d] stage[%d] partition[%d]",
+          workerNumber,
+          workerNumber,
           stageId.getStageNumber(),
           partitionNumber
       );
     }
   }
+
+  @Nullable
+  public String findSuccessfulPartitionOutput(
+      final String controllerTaskId,
+      final int workerNo,
+      final int stageNumber,
+      final int partitionNumber
+  ) throws IOException
+  {
+    List<String> fileNames = storageConnector.lsFiles(
+        DurableStorageOutputChannelFactory.getPartitionOutputsFolderName(
+            controllerTaskId,
+            workerNo,
+            stageNumber,
+            partitionNumber
+        )
+    );
+    Optional<String> maybeFileName = fileNames.stream()

Review Comment:
   > As rows in both of them may not follow the same order we might be in a soup if worker one reads zombie_task files and worker 2 reads good_id files.
   
   I was under the assumption that the row ordering in both of the successful writes should be the same.
   
   In the non-shuffling case, processing doesn't change the row order, so the output of a successful write should be identical to the input (which, if we trace back to the original input source, has a fixed row order).
   
   In the shuffling case, we may sort the rows, so the output must match the sort that was performed. I think there might be some non-determinism there (which I doubt, since the `FrameChannelMerger` for both workers should produce the same output), but when we read the sorted data we pass it through the `FrameChannelMerger`, so as long as the rows are identical across the files (even if their order differs), we shouldn't have an issue.
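   To illustrate that last point, here is a minimal standalone sketch (not Druid's actual `FrameChannelMerger`, and using plain integers in place of frame rows): merging sorted inputs that hold identical rows produces the same result regardless of which attempt's file supplied the input.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   
   public class MergeDeterminismSketch
   {
     // Standard two-way merge of sorted lists, standing in for the sorting merge.
     static List<Integer> mergeSorted(List<Integer> a, List<Integer> b)
     {
       List<Integer> out = new ArrayList<>();
       int i = 0, j = 0;
       while (i < a.size() && j < b.size()) {
         if (a.get(i) <= b.get(j)) {
           out.add(a.get(i++));
         } else {
           out.add(b.get(j++));
         }
       }
       while (i < a.size()) {
         out.add(a.get(i++));
       }
       while (j < b.size()) {
         out.add(b.get(j++));
       }
       return out;
     }
   
     public static void main(String[] args)
     {
       // Two successful writes of the same partition: identical rows.
       List<Integer> goodTaskFile = List.of(1, 3, 5);
       List<Integer> zombieTaskFile = List.of(1, 3, 5);
       List<Integer> otherPartition = List.of(2, 4, 6);
   
       // Whichever file each worker happens to read, the merged output agrees.
       System.out.println(
           mergeSorted(goodTaskFile, otherPartition)
               .equals(mergeSorted(zombieTaskFile, otherPartition))
       );
     }
   }
   ```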
   
   All of this assumes the successful files could have different outputs, which I don't think should be the case 🤔. WDYT?  
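   For reference, the selection step that the diff truncates could look roughly like the sketch below. The `__success` suffix, class name, and helper are invented for illustration only and are not MSQ's actual file-naming convention.
   
   ```java
   import java.util.List;
   import java.util.Optional;
   
   public class SuccessfulOutputPicker
   {
     // Pick any file written by a successful attempt. Any successful write is
     // acceptable under the assumption discussed above: all successful writes
     // of the same partition are expected to hold the same rows.
     static Optional<String> pick(List<String> fileNames)
     {
       return fileNames.stream()
                       .filter(name -> name.endsWith("__success"))
                       .findFirst();
     }
   
     public static void main(String[] args)
     {
       List<String> listed = List.of(
           "attempt_zombie__failed",
           "attempt_good__success"
       );
       System.out.println(pick(listed).orElse("none"));
     }
   }
   ```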



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

