clintropolis opened a new pull request #10613:
URL: https://github.com/apache/druid/pull/10613
### Description
This PR adds support for `ExpressionFilter` in vectorized query engines, and
expands vectorization support for conditional expressions to support string
typed columns.
The signature of the `Filter` interface method that checks whether a matcher
can be vectorized has been updated to accept a `ColumnInspector`, so that
`ExpressionVirtualColumn.canVectorize` can use it to determine whether the
underlying expression can be vectorized:
```java
default boolean canVectorizeMatcher(ColumnInspector inspector)
{
  return false;
}
```
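As a rough illustration of how a filter might override this default, here is a minimal, self-contained sketch. `SketchColumnInspector`, `SketchFilter`, and `SketchExpressionFilter` are hypothetical stand-ins and not Druid classes. The rule it applies (vectorize only when every input column has a known type) is an assumption made for the example, not the check the PR actually performs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CanVectorizeSketch
{
  /** Hypothetical stand-in for ColumnInspector: returns a type name, or null if the column is unknown. */
  public interface SketchColumnInspector
  {
    String getTypeName(String column);
  }

  /** Mirrors the shape of the new default method: a filter is not vectorizable unless it opts in. */
  public interface SketchFilter
  {
    default boolean canVectorizeMatcher(SketchColumnInspector inspector)
    {
      return false;
    }
  }

  /** Hypothetical filter; the "all inputs have a known type" rule is an assumption for illustration. */
  public static class SketchExpressionFilter implements SketchFilter
  {
    private final Set<String> requiredColumns;

    public SketchExpressionFilter(String... requiredColumns)
    {
      this.requiredColumns = new HashSet<>(Arrays.asList(requiredColumns));
    }

    @Override
    public boolean canVectorizeMatcher(SketchColumnInspector inspector)
    {
      // Consult the inspector for each input column before committing to the vectorized path.
      for (String column : requiredColumns) {
        if (inspector.getTypeName(column) == null) {
          return false;
        }
      }
      return true;
    }
  }
}
```

Passing the inspector in means the decision can depend on the columns of the segment being queried, rather than on the filter definition alone.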
Expression filters on string typed expressions use a new
`ObjectVectorValueMatcher` (since string expressions most naturally use a
`VectorObjectSelector` instead of the dictionary encoded vector dimension
selectors), and `DruidPredicateFactory` has been updated to include a method to
make an object matching predicate to work with that.
```java
default Predicate<Object> makeObjectPredicate()
{
  throw new IllegalStateException("Object predicate not implemented");
}
```
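To show the intent of the throwing default, the sketch below has one factory that opts in to object matching and one that does not. `SketchPredicateFactory`, `equalTo`, and `stringOnly` are hypothetical names invented for this example. It uses `java.util.function.Predicate` to stay self-contained, which may differ from the `Predicate` type Druid itself uses. Delegating the object predicate to the string predicate is an assumption, not necessarily how the PR implements it.

```java
import java.util.Objects;
import java.util.function.Predicate;

public class ObjectPredicateSketch
{
  /** Hypothetical, trimmed-down stand-in for DruidPredicateFactory. */
  public interface SketchPredicateFactory
  {
    Predicate<String> makeStringPredicate();

    // Same shape as the new default: factories must opt in to object matching.
    default Predicate<Object> makeObjectPredicate()
    {
      throw new IllegalStateException("Object predicate not implemented");
    }
  }

  /** A factory that supports object matching by delegating to its string predicate (an assumption). */
  public static SketchPredicateFactory equalTo(final String expected)
  {
    return new SketchPredicateFactory()
    {
      @Override
      public Predicate<String> makeStringPredicate()
      {
        return value -> Objects.equals(expected, value);
      }

      @Override
      public Predicate<Object> makeObjectPredicate()
      {
        final Predicate<String> delegate = makeStringPredicate();
        // Convert the object to a string (keeping null as null) before testing it.
        return value -> delegate.test(value == null ? null : value.toString());
      }
    };
  }

  /** A factory that only supplies a string predicate, so it inherits the throwing default. */
  public static SketchPredicateFactory stringOnly(final String expected)
  {
    return () -> value -> Objects.equals(expected, value);
  }
}
```

With a throwing default, existing factories keep compiling unchanged, and a factory that has not opted in fails loudly if it is ever used with an object matcher.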
Similarly, `VectorColumnProcessorFactory` has been expanded with a new
method to allow constructing these `VectorObjectSelector` based matchers:
```java
T makeObjectProcessor(ColumnCapabilities capabilities,
                      VectorObjectSelector selector);
```
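A factory method of this shape could produce a matcher that runs a predicate over each vector of objects. The sketch below is a simplified guess at that flow and not the PR's `ObjectVectorValueMatcher`. `SketchObjectSelector`, `SketchProcessorFactory`, `SketchMatcher`, `MatcherFactory`, and `selectorOf` are all hypothetical, and a plain type-name string stands in for `ColumnCapabilities`.

```java
import java.util.Arrays;
import java.util.function.Predicate;

public class ObjectProcessorSketch
{
  /** Hypothetical stand-in for VectorObjectSelector: exposes the current batch of values. */
  public interface SketchObjectSelector
  {
    Object[] getObjectVector();

    int getCurrentVectorSize();
  }

  /** Hypothetical stand-in for VectorColumnProcessorFactory, reduced to the one new method. */
  public interface SketchProcessorFactory<T>
  {
    T makeObjectProcessor(String columnTypeName, SketchObjectSelector selector);
  }

  /** Hypothetical matcher: returns the row offsets in the current vector that match. */
  public interface SketchMatcher
  {
    int[] match();
  }

  public static class MatcherFactory implements SketchProcessorFactory<SketchMatcher>
  {
    private final Predicate<Object> predicate;

    public MatcherFactory(Predicate<Object> predicate)
    {
      this.predicate = predicate;
    }

    @Override
    public SketchMatcher makeObjectProcessor(String columnTypeName, SketchObjectSelector selector)
    {
      return () -> {
        final Object[] vector = selector.getObjectVector();
        final int size = selector.getCurrentVectorSize();
        final int[] matches = new int[size];
        int numMatches = 0;
        // Test every row of the current vector and record the offsets that pass.
        for (int i = 0; i < size; i++) {
          if (predicate.test(vector[i])) {
            matches[numMatches++] = i;
          }
        }
        return Arrays.copyOf(matches, numMatches);
      };
    }
  }

  /** Convenience helper for the example: wraps fixed values as a single-vector selector. */
  public static SketchObjectSelector selectorOf(final Object... values)
  {
    return new SketchObjectSelector()
    {
      @Override
      public Object[] getObjectVector()
      {
        return values;
      }

      @Override
      public int getCurrentVectorSize()
      {
        return values.length;
      }
    };
  }
}
```

Because the matcher only sees `Object` values and a predicate, the same pattern would in principle extend to non-string inputs.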
I imagine that this potentially opens the door to someday having filters
that could support any of the complex typed columns we support as inputs as
well using the predicate factory pattern, though I haven't really spent time
thinking of any use cases for this yet.
<hr>
This PR has:
- [ ] been self-reviewed.
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [ ] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
<hr>
##### Key changed/added classes in this PR
* `Filter`
* `ExpressionFilter`
* `VectorComparisonProcessors`
* `BivariateFunctionVectorObjectProcessor`
* `VectorValueMatcherFactory`
* `ObjectVectorValueMatcher`