clintropolis commented on code in PR #15058:
URL: https://github.com/apache/druid/pull/15058#discussion_r1348471975


##########
processing/src/main/java/org/apache/druid/query/filter/vector/VectorValueMatcher.java:
##########
@@ -33,13 +43,279 @@ public interface VectorValueMatcher extends 
VectorSizeInspector
 {
   /**
    * Examine the current vector and return a match indicating what is accepted.
-   *
+   * <p>
    * Does not modify "mask".
    *
-   * @param mask must not be null; use {@link VectorMatch#allTrue} if you 
don't need a mask.
-   *
+   * @param mask           must not be null; use {@link VectorMatch#allTrue} 
if you don't need a mask.
+   * @param includeUnknown mapping for Druid native two state logic system 
into SQL three-state logic system. If set
+   *                       to true, this method should also return true if the 
result is 'unknown' to be a match, such
+   *                       as from the input being null valued. Used primarily 
to allow
+   *                       {@link org.apache.druid.segment.filter.NotFilter} 
to invert a match in an SQL compliant
+   *                       manner
    * @return the subset of "mask" that this value matcher accepts. May be the 
same instance as {@param mask} if
    * every row in the mask matches the filter.
    */
-  ReadableVectorMatch match(ReadableVectorMatch mask);
+  ReadableVectorMatch match(ReadableVectorMatch mask, boolean includeUnknown);
+
+  /**
+   * Make a {@link VectorValueMatcher} that only selects input rows with null 
values
+   * @param selector
+   * @return
+   */
+  static VectorValueMatcher nullMatcher(VectorValueSelector selector)
+  {
+    return new BaseVectorValueMatcher(selector)
+    {
+      final VectorMatch match = VectorMatch.wrap(new 
int[selector.getMaxVectorSize()]);
+
+      @Override
+      public ReadableVectorMatch match(final ReadableVectorMatch mask, boolean 
includeUnknown)
+      {
+        return matchNulls(mask, match, selector.getNullVector());
+      }
+    };
+  }
+
+  /**
+   * Make an always false {@link VectorValueMatcher} for a {@link 
SingleValueDimensionVectorSelector}. When
+   * {@code includeUnknown} is specified to the {@link 
VectorValueMatcher#match(ReadableVectorMatch, boolean)} function,
+   * this matcher will add all rows of {@link 
SingleValueDimensionVectorSelector#getRowVector()} which are null to the
+   * {@link ReadableVectorMatch} as selections, to participate in Druid 
2-state logic system to SQL 3-state logic
+   * system conversion.
+   */
+  static VectorValueMatcher 
makeAllFalseMatcher(SingleValueDimensionVectorSelector selector)
+  {
+    final IdLookup idLookup = selector.idLookup();
+    final VectorMatch match = VectorMatch.wrap(new 
int[selector.getMaxVectorSize()]);
+
+    if (idLookup == null || !selector.nameLookupPossibleInAdvance()) {
+      // must call selector.lookupName on every id to check for nulls
+      return new BaseVectorValueMatcher(selector)
+      {
+        @Override
+        public ReadableVectorMatch match(ReadableVectorMatch mask, boolean 
includeUnknown)
+        {
+          if (includeUnknown) {
+            final int[] vector = selector.getRowVector();
+            final int[] inputSelection = mask.getSelection();
+            final int inputSelectionSize = mask.getSelectionSize();
+            final int[] outputSelection = match.getSelection();
+            int outputSelectionSize = 0;
+
+            for (int i = 0; i < inputSelectionSize; i++) {
+              final int rowNum = inputSelection[i];
+              if 
(NullHandling.isNullOrEquivalent(selector.lookupName(vector[rowNum]))) {
+                outputSelection[outputSelectionSize++] = rowNum;
+              }
+            }
+            match.setSelectionSize(outputSelectionSize);
+            return match;
+          }
+          return VectorMatch.allFalse();

Review Comment:
   yea, there will likely be some, but .. there isn't really anything we can do 
because the current behavior is incorrect. If the query results don't have to 
be correct then I'm sure we could make something even faster that doesn't read 
or check anything 😜



##########
processing/src/main/java/org/apache/druid/query/filter/vector/VectorValueMatcher.java:
##########
@@ -33,13 +43,279 @@ public interface VectorValueMatcher extends 
VectorSizeInspector
 {
   /**
    * Examine the current vector and return a match indicating what is accepted.
-   *
+   * <p>
    * Does not modify "mask".
    *
-   * @param mask must not be null; use {@link VectorMatch#allTrue} if you 
don't need a mask.
-   *
+   * @param mask           must not be null; use {@link VectorMatch#allTrue} 
if you don't need a mask.
+   * @param includeUnknown mapping for Druid native two state logic system 
into SQL three-state logic system. If set
+   *                       to true, this method should also return true if the 
result is 'unknown' to be a match, such
+   *                       as from the input being null valued. Used primarily 
to allow
+   *                       {@link org.apache.druid.segment.filter.NotFilter} 
to invert a match in an SQL compliant
+   *                       manner
    * @return the subset of "mask" that this value matcher accepts. May be the 
same instance as {@param mask} if
    * every row in the mask matches the filter.
    */
-  ReadableVectorMatch match(ReadableVectorMatch mask);
+  ReadableVectorMatch match(ReadableVectorMatch mask, boolean includeUnknown);
+
+  /**
+   * Make a {@link VectorValueMatcher} that only selects input rows with null 
values
+   * @param selector
+   * @return
+   */
+  static VectorValueMatcher nullMatcher(VectorValueSelector selector)
+  {
+    return new BaseVectorValueMatcher(selector)
+    {
+      final VectorMatch match = VectorMatch.wrap(new 
int[selector.getMaxVectorSize()]);
+
+      @Override
+      public ReadableVectorMatch match(final ReadableVectorMatch mask, boolean 
includeUnknown)
+      {
+        return matchNulls(mask, match, selector.getNullVector());
+      }
+    };
+  }
+
+  /**
+   * Make an always false {@link VectorValueMatcher} for a {@link 
SingleValueDimensionVectorSelector}. When
+   * {@code includeUnknown} is specified to the {@link 
VectorValueMatcher#match(ReadableVectorMatch, boolean)} function,
+   * this matcher will add all rows of {@link 
SingleValueDimensionVectorSelector#getRowVector()} which are null to the
+   * {@link ReadableVectorMatch} as selections, to participate in Druid 
2-state logic system to SQL 3-state logic
+   * system conversion.
+   */
+  static VectorValueMatcher 
makeAllFalseMatcher(SingleValueDimensionVectorSelector selector)
+  {
+    final IdLookup idLookup = selector.idLookup();
+    final VectorMatch match = VectorMatch.wrap(new 
int[selector.getMaxVectorSize()]);
+
+    if (idLookup == null || !selector.nameLookupPossibleInAdvance()) {
+      // must call selector.lookupName on every id to check for nulls
+      return new BaseVectorValueMatcher(selector)
+      {
+        @Override
+        public ReadableVectorMatch match(ReadableVectorMatch mask, boolean 
includeUnknown)
+        {
+          if (includeUnknown) {
+            final int[] vector = selector.getRowVector();
+            final int[] inputSelection = mask.getSelection();
+            final int inputSelectionSize = mask.getSelectionSize();
+            final int[] outputSelection = match.getSelection();
+            int outputSelectionSize = 0;
+
+            for (int i = 0; i < inputSelectionSize; i++) {
+              final int rowNum = inputSelection[i];
+              if 
(NullHandling.isNullOrEquivalent(selector.lookupName(vector[rowNum]))) {
+                outputSelection[outputSelectionSize++] = rowNum;
+              }
+            }
+            match.setSelectionSize(outputSelectionSize);
+            return match;
+          }
+          return VectorMatch.allFalse();

Review Comment:
   yea, there will likely be some, but .. there isn't really anything we can do 
because the current behavior is incorrect. If the query results don't have to 
be correct then I'm sure we could make something even faster that doesn't read 
or check anything 😜



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to