LakshSingla commented on a change in pull request #12046:
URL: https://github.com/apache/druid/pull/12046#discussion_r767555906



##########
File path: 
core/src/main/java/org/apache/druid/timeline/partition/DimensionRangeShardSpec.java
##########
@@ -145,29 +141,81 @@ private static ShardSpecLookup createLookup(List<? 
extends ShardSpec> shardSpecs
     return Collections.unmodifiableList(dimensions);
   }
 
-  private Range<String> getFirstDimRange()
+  /**
+   * Check if a given domain of Strings is a singleton set containing the 
given value
+   * @param rangeSet Domain of Strings
+   * @param val Value of String
+   * @return rangeSet == {val}
+   */
+  private boolean isRangeSetSingletonWithVal(RangeSet<String> rangeSet, String 
val)
   {
-    Range<String> range;
-    if (firstDimStart == null && firstDimEnd == null) {
-      range = Range.all();
-    } else if (firstDimStart == null) {
-      range = Range.atMost(firstDimEnd);
-    } else if (firstDimEnd == null) {
-      range = Range.atLeast(firstDimStart);
-    } else {
-      range = Range.closed(firstDimStart, firstDimEnd);
+    if (val == null) {
+      return false;
     }
-    return range;
+    return rangeSet.asRanges().equals(
+        Collections.singleton(Range.singleton(val))
+    );
   }
 
+  /**
+   * Pruning scenarios at dimension i: (When the Effective domain is empty)
+   *  0) query domain left-aligns with start and right-aligns with end and 
then deviates outside to align with neither
+   *  1) The query domain left-aligns with start until dimension i and then 
deviates to be strictly less than start[i]
+   *  2) The query domain right-aligns with end until dimension i and then 
deviates to be strictly greater than end[i]
+   *
+   * Immediate acceptance criteria at dimension i:
+   *  Effective domain is non-empty and no longer aligns with any boundary
+   *
+   * @param domain The domain inferred from the query. Assumed to be non-emtpy
+   * @return boolean indicating if the segment needs to be considered or pruned
+   */
   @Override
   public boolean possibleInDomain(Map<String, RangeSet<String>> domain)
   {
-    RangeSet<String> rangeSet = domain.get(dimensions.get(0));
-    if (rangeSet == null) {
-      return true;
+    final StringTuple segmentStart = start == null ? new StringTuple(new 
String[dimensions.size()]) : start;
+    final StringTuple segmentEnd = end == null ? new StringTuple(new 
String[dimensions.size()]) : end;
+
+    // Indicates if the effective domain is equivalent to start till the 
previous dimension
+    boolean effectiveDomainIsStart = true;
+    // Indicates if the effective domain is equivalent to end till the 
previous dimension
+    boolean effectiveDomainIsEnd = true;
+
+    for (int i = 0; i < dimensions.size(); i++) {
+      String dimension = dimensions.get(i);
+      RangeSet<String> queryDomainForDimension = domain.get(dimension);
+      if (queryDomainForDimension == null) {
+        queryDomainForDimension = TreeRangeSet.create();
+        queryDomainForDimension.add(Range.all());
+      }
+
+      // Compute the segment's range based on its metadata and (outward) 
boundary alignment
+      Range<String> rangeTillSegmentBoundary = Range.all();

Review comment:
       What does `outward boundary alignment` mean? 

##########
File path: 
core/src/main/java/org/apache/druid/timeline/partition/DimensionRangeShardSpec.java
##########
@@ -145,29 +141,81 @@ private static ShardSpecLookup createLookup(List<? 
extends ShardSpec> shardSpecs
     return Collections.unmodifiableList(dimensions);
   }
 
-  private Range<String> getFirstDimRange()
+  /**
+   * Check if a given domain of Strings is a singleton set containing the 
given value
+   * @param rangeSet Domain of Strings
+   * @param val Value of String
+   * @return rangeSet == {val}
+   */
+  private boolean isRangeSetSingletonWithVal(RangeSet<String> rangeSet, String 
val)
   {
-    Range<String> range;
-    if (firstDimStart == null && firstDimEnd == null) {
-      range = Range.all();
-    } else if (firstDimStart == null) {
-      range = Range.atMost(firstDimEnd);
-    } else if (firstDimEnd == null) {
-      range = Range.atLeast(firstDimStart);
-    } else {
-      range = Range.closed(firstDimStart, firstDimEnd);
+    if (val == null) {
+      return false;
     }
-    return range;
+    return rangeSet.asRanges().equals(
+        Collections.singleton(Range.singleton(val))
+    );
   }
 
+  /**
+   * Pruning scenarios at dimension i: (When the Effective domain is empty)
+   *  0) query domain left-aligns with start and right-aligns with end and 
then deviates outside to align with neither
+   *  1) The query domain left-aligns with start until dimension i and then 
deviates to be strictly less than start[i]
+   *  2) The query domain right-aligns with end until dimension i and then 
deviates to be strictly greater than end[i]
+   *
+   * Immediate acceptance criteria at dimension i:
+   *  Effective domain is non-empty and no longer aligns with any boundary

Review comment:
       Is there one more scenario more scenario:- Effective domain is non 
empty, aligns throughout all dimensions, but no more dimension left. (final 
`return true` case)? 

##########
File path: 
core/src/test/java/org/apache/druid/timeline/partition/DimensionRangeShardSpecTest.java
##########
@@ -297,6 +300,96 @@ public void testIsInChunk_withMultiValues()
     ));
   }
 
+  @Test
+  public void testPossibleInDomain()
+  {
+    setDimensions("planet", "country", "city");
+
+    final StringTuple start = StringTuple.create("Earth", "France", "Paris");
+    final StringTuple end = StringTuple.create("Earth", "USA", "New York");
+
+    final RangeSet<String> universalSet = TreeRangeSet.create();
+    universalSet.add(Range.all());
+
+    ShardSpec shard = new DimensionRangeShardSpec(dimensions, start, end, 0, 
null);
+    Map<String, RangeSet<String>> domain = new HashMap<>();
+    RangeSet<String> planetSet;
+    RangeSet<String> countrySet;
+    RangeSet<String> citySet;
+
+    // null * null * null === (-INF, INF) * (-INF, INF) * (-INF, INF)
+    assertTrue(shard.possibleInDomain(domain));
+
+    // (-INF, INF) * (-INF, INF) * (-INF, INF)
+    populateDomain(domain, universalSet, universalSet, universalSet);
+    assertTrue(shard.possibleInDomain(domain));
+
+    // {Earth} * (-INF, INF) * (-INF, INF)
+    planetSet = getUnion(Range.singleton("Earth"));
+    populateDomain(domain, planetSet, universalSet, universalSet);
+    assertTrue(shard.possibleInDomain(domain));
+
+    // (-INF, Earth) * (-INF, INF) * (-INF, INF)
+    planetSet = getUnion(Range.lessThan("Earth"));
+    populateDomain(domain, planetSet, universalSet, universalSet);
+    assertFalse(shard.possibleInDomain(domain));
+
+    // (-INF, INF) * (-INF, "France") * (-INF, INF)

Review comment:
       In the top level comments for each assert in the test case, can you also 
add what that assert passing/failing means.
   For example something like: `// Should prune the segment if it lies on the 
boundary for first 2 dimensions, and then a disjoint set`.
   This would help if in future the logic changes and some test cases pass 
while others donot.
   Another good place to add this might be in the `asserFalse/assertTrue` 
itself like: `assertFalse(message, expected, actual)

##########
File path: 
core/src/test/java/org/apache/druid/timeline/partition/DimensionRangeShardSpecTest.java
##########
@@ -297,6 +301,98 @@ public void testIsInChunk_withMultiValues()
     ));
   }
 
+  @Test
+  public void testPossibleInDomain()

Review comment:
       I second this. Currently `assertFalse` and `assertTrue`'s are mixed up. 
One does have to look closely to get the test case. Can it also help us in 
drawing parallels among the testcases? (for many True case there should be a 
counter False case as well)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to