gianm commented on code in PR #15058:
URL: https://github.com/apache/druid/pull/15058#discussion_r1345182118
##########
processing/src/main/java/org/apache/druid/query/filter/DruidPredicateFactory.java:
##########
@@ -58,4 +58,9 @@ default Predicate<Object> makeObjectPredicate()
final Predicate<String> stringPredicate = makeStringPredicate();
return o -> stringPredicate.apply(null);
}
+
+ default boolean isNullInputUnknown()
Review Comment:
Javadoc would be helpful
##########
processing/src/main/java/org/apache/druid/query/filter/InDimFilter.java:
##########
@@ -627,6 +627,12 @@ public DruidDoublePredicate makeDoublePredicate()
}
}
+ @Override
+ public boolean isNullInputUnknown()
+ {
+ return !values.contains(null);
Review Comment:
Cache this? Looks like `isNullInputUnknown` is called row-by-row in certain
cases.
##########
processing/src/main/java/org/apache/druid/segment/DimensionSelector.java:
##########
@@ -186,6 +187,9 @@ static DimensionSelector multiConstant(@Nullable final
List<String> values)
// does not report value cardinality, but otherwise behaves identically
when used for grouping or selecting to a
// normal multi-value dimension selector (getObject on a row with a
single value returns the object instead of
// the list)
+ if (values.get(0) == null) {
+ return NullDimensionSelectorHolder.NULL_DIMENSION_SELECTOR;
+ }
Review Comment:
Why is this needed — doesn't `constant(null)` do the same thing?
##########
processing/src/main/java/org/apache/druid/segment/filter/ValueMatchers.java:
##########
@@ -315,6 +339,125 @@ public void inspectRuntimeShape(RuntimeShapeInspector
inspector)
};
}
+ public static ValueMatcher makeAlwaysFalseMatcher(final DimensionSelector
selector, boolean multiValue)
Review Comment:
Javadoc would be helpful here, since it's unclear when to use this one vs
`allFalse`
##########
processing/src/main/java/org/apache/druid/query/filter/ValueMatcher.java:
##########
@@ -31,6 +31,16 @@
*/
public interface ValueMatcher extends HotLoopCallee
{
+ /**
+ * Returns true if the current row matches the condition.
+ *
+ * @param includeUnknown mapping for Druid native two state logic system
into SQL three-state logic system. If set
+ * to true, this method should also return true if the
result is 'unknown' to be a match, such
+ * as from the input being null valued. Used primarily
to allow
+ * {@link org.apache.druid.segment.filter.NotFilter}
to invert a match in an SQL compliant
Review Comment:
tiniest of all nits: "a SQL" is better than "an SQL" because "SQL" is
pronounced "sequel" 😉
##########
processing/src/main/java/org/apache/druid/segment/filter/ColumnComparisonFilter.java:
##########
@@ -79,14 +79,16 @@ public ValueMatcher makeMatcher(ColumnSelectorFactory
factory)
public static ValueMatcher makeValueMatcher(final List<Supplier<String[]>>
valueGetters)
{
if (valueGetters.isEmpty()) {
- return BooleanValueMatcher.of(true);
+ return ValueMatchers.allTrue();
}
return new ValueMatcher()
{
@Override
- public boolean matches()
+ public boolean matches(boolean includeUnknown)
{
+ // todo (clint): what to do about includeUnknown
Review Comment:
We should avoid committing todos.
IMO, column comparison should ignore `includeUnknown`, i.e. if inputs are
`x` and `y` it should be treated as `x IS NOT DISTINCT FROM y`. Just to keep
things simple.
##########
processing/src/main/java/org/apache/druid/segment/StringDimensionIndexer.java:
##########
@@ -380,38 +384,70 @@ public void inspectRuntimeShape(RuntimeShapeInspector
inspector)
}
};
} else {
- return BooleanValueMatcher.of(false);
+ return new ValueMatcher()
+ {
+ @Override
+ public boolean matches(boolean includeUnknown)
+ {
+ if (includeUnknown) {
+ IndexedInts row = getRow();
+ if (row.size() == 0) {
+ return true;
+ }
+ //noinspection SSBasedInspection
Review Comment:
I think the `SSBasedInspection` is complaining that `row.size()` is called
for each iteration rather than being saved in a local variable. Might as well
save it?
##########
processing/src/main/java/org/apache/druid/segment/filter/ValueMatchers.java:
##########
@@ -315,6 +339,123 @@
};
}
+ public static ValueMatcher makeAlwaysFalseMatcher(final DimensionSelector
selector, boolean multiValue)
+ {
+ final IdLookup lookup = selector.idLookup();
+ // if the column doesn't have null
+ if (lookup == null || !selector.nameLookupPossibleInAdvance()) {
+ return new ValueMatcher()
+ {
+ @Override
+ public boolean matches(boolean includeUnknown)
+ {
+ if (includeUnknown) {
+ IndexedInts row = selector.getRow();
+ if (row.size() == 0) {
+ return true;
+ }
+ for (int i = 0; i < row.size(); i++) {
+ if
(NullHandling.isNullOrEquivalent(selector.lookupName(row.get(i)))) {
+ return true;
+ }
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public void inspectRuntimeShape(RuntimeShapeInspector inspector)
+ {
+ inspector.visit("selector", selector);
+ }
+ };
+ } else {
+ final int nullId = lookup.lookupId(null);
+ if (nullId < 0) {
+ // column doesn't have null value so no unknowns, can safely return
always false matcher
+ return ValueMatchers.allFalse();
+ }
+ if (multiValue) {
+ return new ValueMatcher()
+ {
+ @Override
+ public boolean matches(boolean includeUnknown)
+ {
+ if (includeUnknown) {
+ IndexedInts row = selector.getRow();
+ if (row.size() == 0) {
+ return true;
+ }
+ for (int i = 0; i < row.size(); i++) {
+ if (row.get(i) == nullId) {
+ return true;
+ }
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public void inspectRuntimeShape(RuntimeShapeInspector inspector)
+ {
+ inspector.visit("selector", selector);
+ }
+ };
+ } else {
+ return new ValueMatcher()
+ {
+ @Override
+ public boolean matches(boolean includeUnknown)
+ {
+ return includeUnknown && selector.getRow().get(0) == nullId;
+ }
+
+ @Override
+ public void inspectRuntimeShape(RuntimeShapeInspector inspector)
+ {
+ inspector.visit("selector", selector);
+ }
+ };
+ }
+ }
+ }
+
+ public static ValueMatcher
makeAlwaysFalseMatcher(BaseNullableColumnValueSelector selector)
Review Comment:
This notice looks like a legit concern: all `ColumnValueSelector` are both
`BaseNullableColumnValueSelector` and `BaseObjectColumnValueSelector`, so
dispatch is ambiguous. Use different names, perhaps?
##########
processing/src/main/java/org/apache/druid/query/filter/ValueMatcher.java:
##########
@@ -31,6 +31,16 @@
*/
public interface ValueMatcher extends HotLoopCallee
{
+ /**
+ * Returns true if the current row matches the condition.
+ *
+ * @param includeUnknown mapping for Druid native two state logic system
into SQL three-state logic system. If set
+ * to true, this method should also return true if the
result is 'unknown' to be a match, such
+ * as from the input being null valued. Used primarily
to allow
+ * {@link org.apache.druid.segment.filter.NotFilter}
to invert a match in an SQL compliant
Review Comment:
tiniest of all nits: "a SQL" is better than "an SQL" because "SQL" is
pronounced "sequel" 😉
##########
processing/src/main/java/org/apache/druid/segment/DimensionSelector.java:
##########
@@ -186,6 +187,9 @@ static DimensionSelector multiConstant(@Nullable final
List<String> values)
// does not report value cardinality, but otherwise behaves identically
when used for grouping or selecting to a
// normal multi-value dimension selector (getObject on a row with a
single value returns the object instead of
// the list)
+ if (values.get(0) == null) {
+ return NullDimensionSelectorHolder.NULL_DIMENSION_SELECTOR;
+ }
Review Comment:
Why is this needed — doesn't `constant(null)` do the same thing?
##########
processing/src/main/java/org/apache/druid/segment/filter/NotFilter.java:
##########
@@ -42,6 +44,18 @@
import java.util.Set;
/**
+ * Nice filter you have there... NOT!
Review Comment:
Good one
##########
processing/src/main/java/org/apache/druid/segment/vector/FilteredVectorOffset.java:
##########
@@ -129,7 +129,8 @@ private void advanceWhileVectorIsEmptyAndPopulateOffsets()
return;
}
- final ReadableVectorMatch match =
filterMatcher.match(VectorMatch.allTrue(baseOffset.getCurrentVectorSize()));
+ final ReadableVectorMatch match =
filterMatcher.match(VectorMatch.allTrue(baseOffset.getCurrentVectorSize()),
Review Comment:
Formatting is a little wonky
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]