Re: [PR] Add `aggregatorMergeStrategy` property in SegmentMetadata queries (druid)

via GitHub Thu, 13 Jul 2023 15:19:05 -0700


abhishekrb19 commented on code in PR #14560:
URL: https://github.com/apache/druid/pull/14560#discussion_r1263100788



##########
processing/src/main/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChest.java:
##########
@@ -268,16 +270,30 @@ public static SegmentAnalysis mergeAnalyses(
       return arg1;
     }
 
-    // Swap arg1, arg2 so the later-ending interval is first. This ensures we 
prefer the latest column order.
-    // We're preserving it so callers can see columns in their natural order.
-    if (dataSource != null) {
+    // This is a defensive check since SegementMetadata query instantiation 
guarantees this
+    if (CollectionUtils.isNullOrEmpty(dataSources)) {
+      throw InvalidInput.exception("SegementMetadata queries require at least 
one datasource.");
+    }
+
+    SegmentId mergedSegmentId = null;
+
+    for (String dataSource : dataSources) {
       final SegmentId id1 = SegmentId.tryParse(dataSource, arg1.getId());
       final SegmentId id2 = SegmentId.tryParse(dataSource, arg2.getId());
 
-      if (id1 != null && id2 != null && 
id2.getIntervalEnd().isAfter(id1.getIntervalEnd())) {
-        final SegmentAnalysis tmp = arg1;
-        arg1 = arg2;
-        arg2 = tmp;
+      // Swap arg1, arg2 so the later-ending interval is first. This ensures 
we prefer the latest column order.
+      // We're preserving it so callers can see columns in their natural order.
+      if (id1 != null && id2 != null) {
+        if (id2.getIntervalEnd().isAfter(id1.getIntervalEnd()) ||
+            (id2.getIntervalEnd().isEqual(id1.getIntervalEnd()) && 
id2.getPartitionNum() > id1.getPartitionNum())) {
+          mergedSegmentId = SegmentId.merged(dataSource, id2.getInterval(), 
id2.getPartitionNum());
+          final SegmentAnalysis tmp = arg1;
+          arg1 = arg2;
+          arg2 = tmp;
+        } else {
+          mergedSegmentId = SegmentId.merged(dataSource, id1.getInterval(), 
id1.getPartitionNum());
+        }
+        break;
       }

Review Comment:
   @abhishekagarwal87, this is for Union datasources that can have multiple 
table sources. Segment metadata queries already 
[support](https://github.com/apache/druid/pull/14560/files#diff-232a53270a761b5a581cf2e575511bc7981cff11bf2f08c7bf019eb1fb310e4fR110)
 table and union datasources. Previously the caller was passing in only the 
[first](https://github.com/apache/druid/pull/14560/files#diff-acc53e0e891879963c30d5c8b349347f6b7d96c0560992a3c8fe1f9f3d2ddd3bL142)
 datasource to this function to align the supplied args. So 
`SegmentId.tryParse()` will not succeed if the datasource name wasn't the first 
datasource in the the union datasource.
   
   With this change, all the table names (in the case of union datasources) are 
passed in to parse the segment id, sort, and align the segment analyses - 
`arg1` and `arg2`. I will add a Javadoc and maybe a unit test to this effect in 
a follow on when I add the `earliest` strategy



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Add `aggregatorMergeStrategy` property in SegmentMetadata queries (druid)

Reply via email to