This is an automated email from the ASF dual-hosted git repository.

zachjsh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new f4d0ea7bc8 Add support for earliest `aggregatorMergeStrategy` (#14598)
f4d0ea7bc8 is described below

commit f4d0ea7bc82bb185a467f7f9fa2a085068995573
Author: Abhishek Radhakrishnan <[email protected]>
AuthorDate: Tue Jul 18 12:37:10 2023 -0700

    Add support for earliest `aggregatorMergeStrategy` (#14598)
    
    * Add EARLIEST aggregator merge strategy.
    
    - More unit tests.
    - Include the aggregators analysis type by default in tests.
    
    * Docs.
    
    * Some comments and a test
    
    * Collapse into individual code blocks.
---
 docs/querying/segmentmetadataquery.md              |  12 +-
 .../SegmentMetadataQueryQueryToolChest.java        |  17 +-
 .../metadata/metadata/AggregatorMergeStrategy.java |   1 +
 .../SegmentMetadataQueryQueryToolChestTest.java    | 416 ++++++++++++++++++++-
 .../query/metadata/SegmentMetadataQueryTest.java   |  71 ++--
 5 files changed, 459 insertions(+), 58 deletions(-)

diff --git a/docs/querying/segmentmetadataquery.md 
b/docs/querying/segmentmetadataquery.md
index 22176ee264..3e1b4a5a24 100644
--- a/docs/querying/segmentmetadataquery.md
+++ b/docs/querying/segmentmetadataquery.md
@@ -62,7 +62,7 @@ There are several main parts to a segment metadata query:
 |merge|Merge all individual segment metadata results into a single result|no|
 |context|See [Context](../querying/query-context.md)|no|
 |analysisTypes|A list of Strings specifying what column properties (e.g. 
cardinality, size) should be calculated and returned in the result. Defaults to 
["cardinality", "interval", "minmax"], but can be overridden with using the 
[segment metadata query 
config](../configuration/index.md#segmentmetadata-query-config). See section 
[analysisTypes](#analysistypes) for more details.|no|
-|aggregatorMergeStrategy| The strategy Druid uses to merge aggregators across 
segments. If true and if the `aggregators` analysis type is enabled, 
`aggregatorMergeStrategy` defaults to `strict`. Possible values include 
`strict`, `lenient`, and `latest`. See 
[`aggregatorMergeStrategy`](#aggregatormergestrategy) for details.|no|
+|aggregatorMergeStrategy| The strategy Druid uses to merge aggregators across 
segments. If the `aggregators` analysis type is enabled, 
`aggregatorMergeStrategy` defaults to `strict`. Possible values include 
`strict`, `lenient`, `earliest`, and `latest`. See 
[`aggregatorMergeStrategy`](#aggregatormergestrategy) for details.|no|
 |lenientAggregatorMerge|Deprecated. Use `aggregatorMergeStrategy` property 
instead. If true, and if the `aggregators` analysis type is enabled, Druid 
merges aggregators leniently.|no|
 
 The format of the result is:
@@ -186,7 +186,7 @@ Currently, there is no API for retrieving this information.
 * `aggregators` in the result will contain the list of aggregators usable for 
querying metric columns. This may be
 null if the aggregators are unknown or unmergeable (if merging is enabled).
 
-* Merging can be `strict`, `lenient`, or `latest`. See 
[`aggregatorMergeStrategy`](#aggregatormergestrategy) for details.
+* Merging can be `strict`, `lenient`, `earliest`, or `latest`. See 
[`aggregatorMergeStrategy`](#aggregatormergestrategy) for details.
 
 * The form of the result is a map of column name to aggregator.
 
@@ -201,10 +201,12 @@ Conflicts between aggregator metadata across segments can 
occur if some segments
 two segments use incompatible aggregators for the same column, such as 
`longSum` changed to `doubleSum`.
 Druid supports the following aggregator merge strategies:
 
-- `strict`:  If there are any segments with unknown aggregators or any 
conflicts of any kind, the merged aggregators
+- `strict`: If there are any segments with unknown aggregators or any 
conflicts of any kind, the merged aggregators
   list is `null`.
-- `lenient`:  Druid ignores segments with unknown aggregators. Conflicts 
between aggregators set the aggregator for that particular column to null.
-- the aggregator for that particular column.
+- `lenient`: Druid ignores segments with unknown aggregators. Conflicts 
between aggregators set the aggregator for
+  that particular column to null.
+- `earliest`: In the event of conflicts between segments, Druid selects the 
aggregator from the earliest segment
+  for that particular column.
 - `latest`: In the event of conflicts between segments, Druid selects the 
aggregator from the most recent segment
    for that particular column.
 
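To make the docs change above concrete: a segment metadata query that opts into the new strategy could look like the following. This is a sketch assembled from the parameter table in this hunk; the datasource name and interval are placeholder values, not part of the commit.

```json
{
  "queryType": "segmentMetadata",
  "dataSource": "wikipedia",
  "intervals": ["2023-01-01/2023-01-03"],
  "merge": true,
  "analysisTypes": ["aggregators"],
  "aggregatorMergeStrategy": "earliest"
}
```

With `merge` set to true, a conflicting aggregator for a column resolves to the one from the earliest segment.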
diff --git 
a/processing/src/main/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChest.java
 
b/processing/src/main/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChest.java
index 655b95e503..4bb73d845e 100644
--- 
a/processing/src/main/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChest.java
+++ 
b/processing/src/main/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChest.java
@@ -277,6 +277,7 @@ public class SegmentMetadataQueryQueryToolChest extends 
QueryToolChest<SegmentAn
 
     SegmentId mergedSegmentId = null;
 
+    // A union datasource can span multiple table datasources, so we try each 
datasource name until the segment id parses.
     for (String dataSource : dataSources) {
       final SegmentId id1 = SegmentId.tryParse(dataSource, arg1.getId());
       final SegmentId id2 = SegmentId.tryParse(dataSource, arg2.getId());
@@ -364,15 +365,23 @@ public class SegmentMetadataQueryQueryToolChest extends 
QueryToolChest<SegmentAn
           aggregators.put(aggregator.getName(), aggregator);
         }
       }
+    } else if (AggregatorMergeStrategy.EARLIEST == aggregatorMergeStrategy) {
+      // The segment analyses are already ordered above, where arg1 is the 
analysis pertaining to the latest interval
+      // followed by arg2. So for the earliest strategy, iterate arg2 first, 
then arg1.
+      for (SegmentAnalysis analysis : ImmutableList.of(arg2, arg1)) {
+        if (analysis.getAggregators() != null) {
+          for (Map.Entry<String, AggregatorFactory> entry : 
analysis.getAggregators().entrySet()) {
+            aggregators.putIfAbsent(entry.getKey(), entry.getValue());
+          }
+        }
+      }
     } else if (AggregatorMergeStrategy.LATEST == aggregatorMergeStrategy) {
       // The segment analyses are already ordered above, where arg1 is the 
analysis pertaining to the latest interval
-      // followed by arg2.
+      // followed by arg2. So for the latest strategy, iterate arg1 first, 
then arg2.
       for (SegmentAnalysis analysis : ImmutableList.of(arg1, arg2)) {
         if (analysis.getAggregators() != null) {
           for (Map.Entry<String, AggregatorFactory> entry : 
analysis.getAggregators().entrySet()) {
-            final String aggregatorName = entry.getKey();
-            final AggregatorFactory aggregator = entry.getValue();
-            aggregators.putIfAbsent(aggregatorName, aggregator);
+            aggregators.putIfAbsent(entry.getKey(), entry.getValue());
           }
         }
       }
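Both the `earliest` and `latest` branches above rely on iteration order plus `Map.putIfAbsent`: whichever analysis is visited first claims each column name. A self-contained sketch of that pattern (plain `String`s stand in for `AggregatorFactory`; the class and method names here are hypothetical, not Druid's):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeOrderSketch
{
  // First-writer-wins merge: the analysis visited first claims each column name.
  static Map<String, String> merge(List<Map<String, String>> orderedAnalyses)
  {
    final Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> aggregators : orderedAnalyses) {
      for (Map.Entry<String, String> entry : aggregators.entrySet()) {
        merged.putIfAbsent(entry.getKey(), entry.getValue());
      }
    }
    return merged;
  }

  public static void main(String[] args)
  {
    // "bar" conflicts across the two segments; "baz" exists only in the later one.
    Map<String, String> earlier = Map.of("bar", "doubleSum");
    Map<String, String> later = Map.of("bar", "doubleMax", "baz", "longMax");

    // earliest strategy: visit the earlier analysis first.
    System.out.println(merge(List.of(earlier, later)).get("bar")); // doubleSum
    // latest strategy: visit the later analysis first.
    System.out.println(merge(List.of(later, earlier)).get("bar")); // doubleMax
  }
}
```

Either visiting order yields the full union of column names; only the columns with conflicting aggregators differ between the two strategies.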
diff --git 
a/processing/src/main/java/org/apache/druid/query/metadata/metadata/AggregatorMergeStrategy.java
 
b/processing/src/main/java/org/apache/druid/query/metadata/metadata/AggregatorMergeStrategy.java
index d7c013131a..9155bdab34 100644
--- 
a/processing/src/main/java/org/apache/druid/query/metadata/metadata/AggregatorMergeStrategy.java
+++ 
b/processing/src/main/java/org/apache/druid/query/metadata/metadata/AggregatorMergeStrategy.java
@@ -28,6 +28,7 @@ public enum AggregatorMergeStrategy
 {
   STRICT,
   LENIENT,
+  EARLIEST,
   LATEST;
 
   @JsonValue
diff --git 
a/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChestTest.java
 
b/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChestTest.java
index 4e77087c79..1926e14eae 100644
--- 
a/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChestTest.java
+++ 
b/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryQueryToolChestTest.java
@@ -32,6 +32,7 @@ import org.apache.druid.query.CacheStrategy;
 import org.apache.druid.query.DataSource;
 import org.apache.druid.query.Druids;
 import org.apache.druid.query.TableDataSource;
+import org.apache.druid.query.UnionDataSource;
 import org.apache.druid.query.aggregation.AggregatorFactory;
 import org.apache.druid.query.aggregation.DoubleMaxAggregatorFactory;
 import org.apache.druid.query.aggregation.DoubleSumAggregatorFactory;
@@ -62,8 +63,18 @@ import java.util.stream.Collectors;
 public class SegmentMetadataQueryQueryToolChestTest
 {
   private static final DataSource TEST_DATASOURCE = new 
TableDataSource("dummy");
-  private static final SegmentId TEST_SEGMENT_ID1 = 
SegmentId.of(TEST_DATASOURCE.toString(), Intervals.of("2020-01-01/2020-01-02"), 
"test", 0);
-  private static final SegmentId TEST_SEGMENT_ID2 = 
SegmentId.of(TEST_DATASOURCE.toString(), Intervals.of("2021-01-01/2021-01-02"), 
"test", 0);
+  private static final SegmentId TEST_SEGMENT_ID1 = SegmentId.of(
+      TEST_DATASOURCE.toString(),
+      Intervals.of("2020-01-01/2020-01-02"),
+      "test",
+      0
+  );
+  private static final SegmentId TEST_SEGMENT_ID2 = SegmentId.of(
+      TEST_DATASOURCE.toString(),
+      Intervals.of("2021-01-01/2021-01-02"),
+      "test",
+      0
+  );
 
   @Test
   public void testCacheStrategy() throws Exception
@@ -162,19 +173,19 @@ public class SegmentMetadataQueryQueryToolChestTest
 
     Assert.assertEquals(
         new SegmentAnalysis(
-        "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
-        null,
-        new LinkedHashMap<>(),
-        0,
-        0,
-        ImmutableMap.of(
-            "foo", new LongSumAggregatorFactory("foo", "foo"),
-            "bar", new DoubleSumAggregatorFactory("bar", "bar"),
-            "baz", new DoubleSumAggregatorFactory("baz", "baz")
-        ),
-        null,
-        null,
-        null
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar"),
+                "baz", new DoubleSumAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
         ),
         mergeStrict(analysis1, analysis2)
     );
@@ -198,6 +209,25 @@ public class SegmentMetadataQueryQueryToolChestTest
         mergeLenient(analysis1, analysis2)
     );
 
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar"),
+                "baz", new DoubleSumAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(analysis1, analysis2)
+    );
+
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -292,6 +322,25 @@ public class SegmentMetadataQueryQueryToolChestTest
         mergeLenient(analysis1, analysis2)
     );
 
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            expectedIntervals,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "baz", new DoubleSumAggregatorFactory("baz", "baz"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(analysis1, analysis2)
+    );
+
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -374,6 +423,24 @@ public class SegmentMetadataQueryQueryToolChestTest
         mergeLenient(analysis1, analysis2)
     );
 
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(analysis1, analysis2)
+    );
+
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -449,6 +516,21 @@ public class SegmentMetadataQueryQueryToolChestTest
         mergeLenient(analysis1, analysis2)
     );
 
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            null,
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(analysis1, analysis2)
+    );
+
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -552,6 +634,48 @@ public class SegmentMetadataQueryQueryToolChestTest
         )
     );
 
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar"),
+                "baz", new LongMaxAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(analysis1, analysis2)
+    );
+
+    // Simulate multi-level earliest merge
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar"),
+                "baz", new LongMaxAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(
+            mergeEarliest(analysis1, analysis2),
+            mergeEarliest(analysis1, analysis2)
+        )
+    );
+
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -571,7 +695,7 @@ public class SegmentMetadataQueryQueryToolChestTest
         mergeLatest(analysis1, analysis2)
     );
 
-    // Simulate multi-level lenient merge
+    // Simulate multi-level latest merge
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -683,6 +807,48 @@ public class SegmentMetadataQueryQueryToolChestTest
         )
     );
 
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleMaxAggregatorFactory("bar", "bar"),
+                "baz", new LongMaxAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(analysis1, analysis2)
+    );
+
+    // Simulate multi-level earliest merge
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleMaxAggregatorFactory("bar", "bar"),
+                "baz", new LongMaxAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(
+            mergeEarliest(analysis1, analysis2),
+            mergeEarliest(analysis1, analysis2)
+        )
+    );
+
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -702,7 +868,7 @@ public class SegmentMetadataQueryQueryToolChestTest
         mergeLatest(analysis1, analysis2)
     );
 
-    // Simulate multi-level lenient merge
+    // Simulate multi-level latest merge
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
@@ -729,8 +895,18 @@ public class SegmentMetadataQueryQueryToolChestTest
   @Test
   public void 
testMergeAggregatorsConflictWithEqualSegmentIntervalsAndDifferentPartitions()
   {
-    final SegmentId segmentId1 = SegmentId.of(TEST_DATASOURCE.toString(), 
Intervals.of("2023-01-01/2023-01-02"), "test", 1);
-    final SegmentId segmentId2 = SegmentId.of(TEST_DATASOURCE.toString(), 
Intervals.of("2023-01-01/2023-01-02"), "test", 2);
+    final SegmentId segmentId1 = SegmentId.of(
+        TEST_DATASOURCE.toString(),
+        Intervals.of("2023-01-01/2023-01-02"),
+        "test",
+        1
+    );
+    final SegmentId segmentId2 = SegmentId.of(
+        TEST_DATASOURCE.toString(),
+        Intervals.of("2023-01-01/2023-01-02"),
+        "test",
+        2
+    );
 
     final SegmentAnalysis analysis1 = new SegmentAnalysis(
         segmentId1.toString(),
@@ -817,6 +993,48 @@ public class SegmentMetadataQueryQueryToolChestTest
         )
     );
 
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2023-01-01T00:00:00.000Z_2023-01-02T00:00:00.000Z_merged_2",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar"),
+                "baz", new LongMaxAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(analysis1, analysis2)
+    );
+
+    // Simulate multi-level earliest merge
+    Assert.assertEquals(
+        new SegmentAnalysis(
+            "dummy_2023-01-01T00:00:00.000Z_2023-01-02T00:00:00.000Z_merged_2",
+            null,
+            new LinkedHashMap<>(),
+            0,
+            0,
+            ImmutableMap.of(
+                "foo", new LongSumAggregatorFactory("foo", "foo"),
+                "bar", new DoubleSumAggregatorFactory("bar", "bar"),
+                "baz", new LongMaxAggregatorFactory("baz", "baz")
+            ),
+            null,
+            null,
+            null
+        ),
+        mergeEarliest(
+            mergeEarliest(analysis1, analysis2),
+            mergeEarliest(analysis1, analysis2)
+        )
+    );
+
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2023-01-01T00:00:00.000Z_2023-01-02T00:00:00.000Z_merged_2",
@@ -836,7 +1054,7 @@ public class SegmentMetadataQueryQueryToolChestTest
         mergeLatest(analysis1, analysis2)
     );
 
-    // Simulate multi-level lenient merge
+    // Simulate multi-level latest merge
     Assert.assertEquals(
         new SegmentAnalysis(
             "dummy_2023-01-01T00:00:00.000Z_2023-01-02T00:00:00.000Z_merged_2",
@@ -976,6 +1194,12 @@ public class SegmentMetadataQueryQueryToolChestTest
     Assert.assertFalse(mergeLenient(analysis2, analysis3).isRollup());
     Assert.assertTrue(mergeLenient(analysis4, analysis5).isRollup());
 
+    Assert.assertNull(mergeEarliest(analysis1, analysis2).isRollup());
+    Assert.assertNull(mergeEarliest(analysis1, analysis4).isRollup());
+    Assert.assertNull(mergeEarliest(analysis2, analysis4).isRollup());
+    Assert.assertFalse(mergeEarliest(analysis2, analysis3).isRollup());
+    Assert.assertTrue(mergeEarliest(analysis4, analysis5).isRollup());
+
     Assert.assertNull(mergeLatest(analysis1, analysis2).isRollup());
     Assert.assertNull(mergeLatest(analysis1, analysis4).isRollup());
     Assert.assertNull(mergeLatest(analysis2, analysis4).isRollup());
@@ -1042,6 +1266,146 @@ public class SegmentMetadataQueryQueryToolChestTest
     );
   }
 
+
+  @Test
+  public void testMergeWithUnionDatasource()
+  {
+    final SegmentAnalysis analysis1 = new SegmentAnalysis(
+        TEST_SEGMENT_ID1.toString(),
+        null,
+        new LinkedHashMap<>(),
+        0,
+        0,
+        ImmutableMap.of(
+            "foo", new LongSumAggregatorFactory("foo", "foo"),
+            "bar", new DoubleSumAggregatorFactory("bar", "bar")
+        ),
+        null,
+        null,
+        null
+    );
+    final SegmentAnalysis analysis2 = new SegmentAnalysis(
+        TEST_SEGMENT_ID2.toString(),
+        null,
+        new LinkedHashMap<>(),
+        0,
+        0,
+        ImmutableMap.of(
+            "foo", new LongSumAggregatorFactory("foo", "foo"),
+            "bar", new DoubleMaxAggregatorFactory("bar", "bar"),
+            "baz", new LongMaxAggregatorFactory("baz", "baz")
+        ),
+        null,
+        null,
+        false
+    );
+
+    final SegmentAnalysis expectedMergedAnalysis = new SegmentAnalysis(
+        "dummy_2021-01-01T00:00:00.000Z_2021-01-02T00:00:00.000Z_merged",
+        null,
+        new LinkedHashMap<>(),
+        0,
+        0,
+        ImmutableMap.of(
+            "foo", new LongSumAggregatorFactory("foo", "foo"),
+            "bar", new DoubleMaxAggregatorFactory("bar", "bar"),
+            "baz", new LongMaxAggregatorFactory("baz", "baz")
+        ),
+        null,
+        null,
+        null
+    );
+
+    Assert.assertEquals(
+        expectedMergedAnalysis,
+        SegmentMetadataQueryQueryToolChest.finalizeAnalysis(
+            SegmentMetadataQueryQueryToolChest.mergeAnalyses(
+                new UnionDataSource(
+                    ImmutableList.of(
+                        new TableDataSource("foo"),
+                        new TableDataSource("dummy")
+                    )
+                ).getTableNames(),
+                analysis1,
+                analysis2,
+                AggregatorMergeStrategy.LATEST
+            )
+        )
+    );
+
+    Assert.assertEquals(
+        expectedMergedAnalysis,
+        SegmentMetadataQueryQueryToolChest.finalizeAnalysis(
+            SegmentMetadataQueryQueryToolChest.mergeAnalyses(
+                new UnionDataSource(
+                    ImmutableList.of(
+                        new TableDataSource("dummy"),
+                        new TableDataSource("foo"),
+                        new TableDataSource("bar")
+                    )
+                ).getTableNames(),
+                analysis1,
+                analysis2,
+                AggregatorMergeStrategy.LATEST
+            )
+        )
+    );
+  }
+
+  @Test
+  public void testMergeWithNullAnalyses()
+  {
+    final SegmentAnalysis analysis1 = new SegmentAnalysis(
+        TEST_SEGMENT_ID1.toString(),
+        null,
+        new LinkedHashMap<>(),
+        0,
+        0,
+        null,
+        null,
+        null,
+        null
+    );
+    final SegmentAnalysis analysis2 = new SegmentAnalysis(
+        TEST_SEGMENT_ID2.toString(),
+        null,
+        new LinkedHashMap<>(),
+        0,
+        0,
+        null,
+        null,
+        null,
+        false
+    );
+
+    Assert.assertEquals(
+        analysis1,
+        SegmentMetadataQueryQueryToolChest
+            .mergeAnalyses(TEST_DATASOURCE.getTableNames(), analysis1, null, 
AggregatorMergeStrategy.STRICT)
+    );
+    Assert.assertEquals(
+        analysis2,
+        SegmentMetadataQueryQueryToolChest
+            .mergeAnalyses(TEST_DATASOURCE.getTableNames(), null, analysis2, 
AggregatorMergeStrategy.STRICT)
+    );
+    Assert.assertNull(
+        SegmentMetadataQueryQueryToolChest
+            .mergeAnalyses(TEST_DATASOURCE.getTableNames(), null, null, 
AggregatorMergeStrategy.STRICT)
+    );
+    Assert.assertNull(
+        SegmentMetadataQueryQueryToolChest
+            .mergeAnalyses(TEST_DATASOURCE.getTableNames(), null, null, 
AggregatorMergeStrategy.LENIENT)
+    );
+    Assert.assertNull(
+        SegmentMetadataQueryQueryToolChest
+            .mergeAnalyses(TEST_DATASOURCE.getTableNames(), null, null, 
AggregatorMergeStrategy.EARLIEST)
+    );
+    Assert.assertNull(
+        SegmentMetadataQueryQueryToolChest
+            .mergeAnalyses(TEST_DATASOURCE.getTableNames(), null, null, 
AggregatorMergeStrategy.LATEST)
+    );
+  }
+
   private static SegmentAnalysis mergeStrict(SegmentAnalysis analysis1, 
SegmentAnalysis analysis2)
   {
     return SegmentMetadataQueryQueryToolChest.finalizeAnalysis(
@@ -1066,6 +1430,18 @@ public class SegmentMetadataQueryQueryToolChestTest
     );
   }
 
+  private static SegmentAnalysis mergeEarliest(SegmentAnalysis analysis1, 
SegmentAnalysis analysis2)
+  {
+    return SegmentMetadataQueryQueryToolChest.finalizeAnalysis(
+        SegmentMetadataQueryQueryToolChest.mergeAnalyses(
+            TEST_DATASOURCE.getTableNames(),
+            analysis1,
+            analysis2,
+            AggregatorMergeStrategy.EARLIEST
+        )
+    );
+  }
+
   private static SegmentAnalysis mergeLatest(SegmentAnalysis analysis1, 
SegmentAnalysis analysis2)
   {
     return SegmentMetadataQueryQueryToolChest.finalizeAnalysis(
diff --git 
a/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryTest.java
 
b/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryTest.java
index 0e13b9f598..e919799deb 100644
--- 
a/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryTest.java
+++ 
b/processing/src/test/java/org/apache/druid/query/metadata/SegmentMetadataQueryTest.java
@@ -96,6 +96,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
       QueryRunnerTestHelper.NOOP_QUERYWATCHER
   );
   private static final ObjectMapper MAPPER = new DefaultObjectMapper();
+  private static final String DATASOURCE = "testDatasource";
 
   @SuppressWarnings("unchecked")
   public static QueryRunner makeMMappedQueryRunner(
@@ -176,8 +177,8 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
       boolean bitmaps
   )
   {
-    final SegmentId id1 = SegmentId.dummy(differentIds ? "testSegment1" : 
"testSegment");
-    final SegmentId id2 = SegmentId.dummy(differentIds ? "testSegment2" : 
"testSegment");
+    final SegmentId id1 = SegmentId.dummy(differentIds ? "testSegment1" : 
DATASOURCE);
+    final SegmentId id2 = SegmentId.dummy(differentIds ? "testSegment2" : 
DATASOURCE);
     this.runner1 = mmap1
                    ? makeMMappedQueryRunner(id1, rollup1, bitmaps, FACTORY)
                    : makeIncrementalIndexQueryRunner(id1, rollup1, bitmaps, 
FACTORY);
@@ -191,14 +192,15 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
     this.differentIds = differentIds;
     this.bitmaps = bitmaps;
     testQuery = Druids.newSegmentMetadataQueryBuilder()
-                      .dataSource("testing")
+                      .dataSource(DATASOURCE)
                       .intervals("2013/2014")
                       .toInclude(new 
ListColumnIncluderator(Arrays.asList("__time", "index", "placement")))
                       .analysisTypes(
                           SegmentMetadataQuery.AnalysisType.CARDINALITY,
                           SegmentMetadataQuery.AnalysisType.SIZE,
                           SegmentMetadataQuery.AnalysisType.INTERVAL,
-                          SegmentMetadataQuery.AnalysisType.MINMAX
+                          SegmentMetadataQuery.AnalysisType.MINMAX,
+                          SegmentMetadataQuery.AnalysisType.AGGREGATORS
                       )
                       .merge(true)
                       .build();
@@ -213,6 +215,12 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
       overallSize1 = mmap1 ? 201345 : 200831;
       overallSize2 = mmap2 ? 201345 : 200831;
     }
+
+    final Map<String, AggregatorFactory> expectedAggregators = new HashMap<>();
+    for (AggregatorFactory agg : TestIndex.METRIC_AGGS) {
+      expectedAggregators.put(agg.getName(), agg.getCombiningFactory());
+    }
+
     expectedSegmentAnalysis1 = new SegmentAnalysis(
         id1.toString(),
         
ImmutableList.of(Intervals.of("2011-01-12T00:00:00.000Z/2011-04-15T00:00:00.001Z")),
@@ -258,7 +266,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
         ),
         overallSize1,
         1209,
-        null,
+        expectedAggregators,
         null,
         null,
         null
@@ -309,7 +317,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
         // null_column will be included only for incremental index, which 
makes a little bigger result than expected
         overallSize2,
         1209,
-        null,
+        expectedAggregators,
         null,
         null,
         null
@@ -329,7 +337,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
   public void testSegmentMetadataQueryWithRollupMerge()
   {
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -385,7 +393,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
         .toInclude(new ListColumnIncluderator(Arrays.asList("placement", 
"placementish")))
         .analysisTypes(SegmentMetadataQuery.AnalysisType.ROLLUP)
@@ -403,7 +411,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
   public void testSegmentMetadataQueryWithHasMultipleValuesMerge()
   {
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -459,7 +467,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
         .toInclude(new ListColumnIncluderator(Arrays.asList("placement", 
"placementish")))
         .analysisTypes(SegmentMetadataQuery.AnalysisType.CARDINALITY)
@@ -477,7 +485,7 @@ public class SegmentMetadataQueryTest extends 
InitializedNullHandlingTest
   public void testSegmentMetadataQueryWithComplexColumnMerge()
   {
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -533,7 +541,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
        .toInclude(new ListColumnIncluderator(Arrays.asList("placement", "quality_uniques")))
        .analysisTypes(SegmentMetadataQuery.AnalysisType.CARDINALITY)
@@ -621,8 +629,13 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
       ColumnAnalysis analysis
   )
   {
+    final Map<String, AggregatorFactory> expectedAggregators = new HashMap<>();
+    for (AggregatorFactory agg : TestIndex.METRIC_AGGS) {
+      expectedAggregators.put(agg.getName(), agg.getCombiningFactory());
+    }
+
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         ImmutableList.of(expectedSegmentAnalysis1.getIntervals().get(0)),
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -656,7 +669,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
         ),
        expectedSegmentAnalysis1.getSize() + expectedSegmentAnalysis2.getSize(),
        expectedSegmentAnalysis1.getNumRows() + expectedSegmentAnalysis2.getNumRows(),
-        null,
+        expectedAggregators,
         null,
         null,
         null
@@ -692,7 +705,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
   public void testSegmentMetadataQueryWithNoAnalysisTypesMerge()
   {
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -736,7 +749,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
        .toInclude(new ListColumnIncluderator(Collections.singletonList("placement")))
        .analysisTypes()
@@ -758,7 +771,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
       expectedAggregators.put(agg.getName(), agg.getCombiningFactory());
     }
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -802,7 +815,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
        .toInclude(new ListColumnIncluderator(Collections.singletonList("placement")))
        .analysisTypes(SegmentMetadataQuery.AnalysisType.AGGREGATORS)
@@ -824,7 +837,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
       expectedAggregators.put(agg.getName(), agg.getCombiningFactory());
     }
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -868,7 +881,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing222")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
        .toInclude(new ListColumnIncluderator(Collections.singletonList("placement")))
        .analysisTypes(SegmentMetadataQuery.AnalysisType.AGGREGATORS)
@@ -887,7 +900,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
   public void testSegmentMetadataQueryWithTimestampSpecMerge()
   {
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -931,7 +944,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
        .toInclude(new ListColumnIncluderator(Collections.singletonList("placement")))
        .analysisTypes(SegmentMetadataQuery.AnalysisType.TIMESTAMPSPEC)
@@ -949,7 +962,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
   public void testSegmentMetadataQueryWithQueryGranularityMerge()
   {
     SegmentAnalysis mergedSegmentAnalysis = new SegmentAnalysis(
-        differentIds ? "merged" : SegmentId.dummy("testSegment").toString(),
+        differentIds ? "merged" : SegmentId.dummy(DATASOURCE).toString(),
         null,
         new LinkedHashMap<>(
             ImmutableMap.of(
@@ -993,7 +1006,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
 
     SegmentMetadataQuery query = Druids
         .newSegmentMetadataQueryBuilder()
-        .dataSource("testing")
+        .dataSource(DATASOURCE)
         .intervals("2013/2014")
        .toInclude(new ListColumnIncluderator(Collections.singletonList("placement")))
        .analysisTypes(SegmentMetadataQuery.AnalysisType.QUERYGRANULARITY)
@@ -1150,7 +1163,7 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
   public void testDefaultIntervalAndFiltering()
   {
     SegmentMetadataQuery testQuery = Druids.newSegmentMetadataQueryBuilder()
-                                           .dataSource("testing")
+                                           .dataSource(DATASOURCE)
                                           .toInclude(new ListColumnIncluderator(Collections.singletonList("placement")))
                                            .merge(true)
                                            .build();
@@ -1410,12 +1423,12 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
   public void testCacheKeyWithListColumnIncluderator()
   {
    SegmentMetadataQuery oneColumnQuery = Druids.newSegmentMetadataQueryBuilder()
-                                                .dataSource("testing")
+                                                .dataSource(DATASOURCE)
                                                .toInclude(new ListColumnIncluderator(Collections.singletonList("foo")))
                                                 .build();
 
    SegmentMetadataQuery twoColumnQuery = Druids.newSegmentMetadataQueryBuilder()
-                                                .dataSource("testing")
+                                                .dataSource(DATASOURCE)
                                                .toInclude(new ListColumnIncluderator(Arrays.asList("fo", "o")))
                                                 .build();
 
@@ -1436,12 +1449,12 @@ public class SegmentMetadataQueryTest extends InitializedNullHandlingTest
   public void testAnanlysisTypesBeingSet()
   {
     SegmentMetadataQuery query1 = Druids.newSegmentMetadataQueryBuilder()
-                                        .dataSource("testing")
+                                        .dataSource(DATASOURCE)
                                        .toInclude(new ListColumnIncluderator(Collections.singletonList("foo")))
                                         .build();
 
     SegmentMetadataQuery query2 = Druids.newSegmentMetadataQueryBuilder()
-                                        .dataSource("testing")
+                                        .dataSource(DATASOURCE)
                                        .toInclude(new ListColumnIncluderator(Collections.singletonList("foo")))
                                        .analysisTypes(SegmentMetadataQuery.AnalysisType.MINMAX)
                                         .build();
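For context, the tests above exercise the new `EARLIEST` strategy for merging aggregator metadata across segments: when two segments report conflicting definitions for the same aggregator name, the definition from the earliest segment wins (in contrast to the pre-existing latest-wins behavior). The following is a minimal, self-contained sketch of that merge rule over plain maps, not Druid's actual implementation; the class and method names are illustrative only:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EarliestMergeSketch {
    /**
     * Merge per-segment aggregator definitions (here simplified to name -> type strings),
     * keeping the first definition seen when iterating segments earliest-first.
     */
    static Map<String, String> mergeEarliest(List<Map<String, String>> segmentsEarliestFirst) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> segment : segmentsEarliestFirst) {
            for (Map.Entry<String, String> e : segment.entrySet()) {
                // putIfAbsent keeps the earliest segment's definition on conflict
                merged.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> earliestSegment = Map.of("count", "longSum");
        Map<String, String> laterSegment = Map.of("count", "doubleSum", "added", "longSum");
        Map<String, String> merged = mergeEarliest(List.of(earliestSegment, laterSegment));
        System.out.println(merged.get("count")); // longSum (from the earliest segment)
        System.out.println(merged.get("added")); // longSum (only defined in the later segment)
    }
}
```

A latest-wins merge would be the same loop with `put` in place of `putIfAbsent`, which is why the two strategies share most of their merge path in the tests above.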


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

