clintropolis opened a new pull request #10613:
URL: https://github.com/apache/druid/pull/10613


   ### Description
   This PR adds support for `ExpressionFilter` in vectorized query engines, and 
expands vectorization support for conditional expressions to support string 
typed columns.
   
   The signature of the `Filter` interface method that checks whether a matcher 
can be vectorized has been updated to accept a `ColumnInspector`, so that 
`ExpressionVirtualColumn.canVectorize` can use it to determine whether the 
underlying expression can be vectorized:
   
   ```java
     default boolean canVectorizeMatcher(ColumnInspector inspector)
     {
       return false;
     }
   ```
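
   As a rough illustration of how a filter might use the inspector, here is a 
   minimal sketch. The nested `ColumnInspector` and `Filter` types are 
   simplified stand-ins written for this example, not Druid's real interfaces, 
   and `SketchExpressionFilter`, `getColumnType`, and the "supported types" 
   rule are hypothetical:

   ```java
   import java.util.Map;
   import java.util.Set;

   // Stand-in types for illustration only; Druid's real interfaces differ.
   class CanVectorizeSketch
   {
     /** Simplified stand-in: returns a column's type name, or null if unknown. */
     interface ColumnInspector
     {
       String getColumnType(String column);
     }

     interface Filter
     {
       // Default from the PR description: filters must opt in explicitly.
       default boolean canVectorizeMatcher(ColumnInspector inspector)
       {
         return false;
       }
     }

     /** Hypothetical filter: vectorizes only if every input has a supported type. */
     static final class SketchExpressionFilter implements Filter
     {
       // Assumed rule for this sketch, not Druid's actual vectorization check.
       private static final Set<String> SUPPORTED = Set.of("LONG", "DOUBLE", "STRING");

       private final Set<String> requiredColumns;

       SketchExpressionFilter(Set<String> requiredColumns)
       {
         this.requiredColumns = requiredColumns;
       }

       @Override
       public boolean canVectorizeMatcher(ColumnInspector inspector)
       {
         for (String column : requiredColumns) {
           String type = inspector.getColumnType(column);
           // Unknown columns are conservatively treated as not vectorizable here.
           if (type == null || !SUPPORTED.contains(type)) {
             return false;
           }
         }
         return true;
       }
     }

     public static void main(String[] args)
     {
       ColumnInspector inspector = Map.of("x", "LONG", "c", "COMPLEX")::get;
       System.out.println(new SketchExpressionFilter(Set.of("x")).canVectorizeMatcher(inspector));
       System.out.println(new SketchExpressionFilter(Set.of("x", "c")).canVectorizeMatcher(inspector));
     }
   }
   ```

   The point of the sketch is only that the decision depends on column type 
   information, which the filter cannot see without the inspector argument.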
   
   Expression filters on string typed expressions use a new 
`ObjectVectorValueMatcher` (since string expressions most naturally use a 
`VectorObjectSelector` instead of the dictionary encoded vector dimension 
selectors), and `DruidPredicateFactory` has been updated to include a method 
that makes an object-matching predicate for use with this matcher:
   
   ```java
     default Predicate<Object> makeObjectPredicate()
     {
       throw new IllegalStateException("Object predicate not implemented");
     }
   ```
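
   To show how a factory might opt in, here is a minimal sketch. 
   `PredicateFactory` is a cut-down stand-in for `DruidPredicateFactory`, 
   `EqualsFactory` is hypothetical, and the sketch uses 
   `java.util.function.Predicate` so that it compiles on its own, which is an 
   assumption rather than a statement about the real interface:

   ```java
   import java.util.Objects;
   import java.util.function.Predicate;

   // Stand-in types for illustration only; not Druid's real interfaces.
   class ObjectPredicateSketch
   {
     interface PredicateFactory
     {
       Predicate<String> makeStringPredicate();

       // Same default as in the PR description: unsupported unless overridden.
       default Predicate<Object> makeObjectPredicate()
       {
         throw new IllegalStateException("Object predicate not implemented");
       }
     }

     /** Hypothetical factory that matches values equal to a fixed string. */
     static final class EqualsFactory implements PredicateFactory
     {
       private final String expected;

       EqualsFactory(String expected)
       {
         this.expected = expected;
       }

       @Override
       public Predicate<String> makeStringPredicate()
       {
         return value -> Objects.equals(expected, value);
       }

       @Override
       public Predicate<Object> makeObjectPredicate()
       {
         // Reuse the string predicate; nulls pass through as null.
         Predicate<String> stringPredicate = makeStringPredicate();
         return value -> stringPredicate.test(value == null ? null : value.toString());
       }
     }

     public static void main(String[] args)
     {
       Predicate<Object> predicate = new EqualsFactory("a").makeObjectPredicate();
       System.out.println(predicate.test("a"));
       System.out.println(predicate.test(null));
     }
   }
   ```

   Factories that do not override the method keep throwing, so existing 
   implementations are unaffected until they choose to support object matching.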
   
   Similarly, `VectorColumnProcessorFactory` has been expanded with a new 
method to allow constructing these `VectorObjectSelector` based matchers:
   
   ```java
     T makeObjectProcessor(ColumnCapabilities capabilities, VectorObjectSelector selector);
   ```
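
   A matcher built this way could look roughly like the sketch below. All 
   types here are simplified stand-ins: the `capabilities` argument is 
   omitted, `VectorMatcher.match()` returns matching row offsets as an `int[]` 
   instead of Druid's vector match types, and `MatcherFactory` and 
   `arraySelector` are hypothetical helpers:

   ```java
   import java.util.Arrays;
   import java.util.function.Predicate;

   // Stand-in types for illustration only; not Druid's real interfaces.
   class ObjectMatcherSketch
   {
     interface VectorObjectSelector
     {
       Object[] getObjectVector();

       int getCurrentVectorSize();
     }

     interface VectorMatcher
     {
       /** Returns the offsets of the rows in the current vector that match. */
       int[] match();
     }

     interface VectorProcessorFactory<T>
     {
       T makeObjectProcessor(VectorObjectSelector selector);
     }

     /** Hypothetical factory that builds a predicate-based object matcher. */
     static final class MatcherFactory implements VectorProcessorFactory<VectorMatcher>
     {
       private final Predicate<Object> predicate;

       MatcherFactory(Predicate<Object> predicate)
       {
         this.predicate = predicate;
       }

       @Override
       public VectorMatcher makeObjectProcessor(VectorObjectSelector selector)
       {
         return () -> {
           Object[] vector = selector.getObjectVector();
           int size = selector.getCurrentVectorSize();
           int[] selection = new int[size];
           int numMatches = 0;
           // Test every row of the current vector against the predicate.
           for (int row = 0; row < size; row++) {
             if (predicate.test(vector[row])) {
               selection[numMatches++] = row;
             }
           }
           return Arrays.copyOf(selection, numMatches);
         };
       }
     }

     /** Test helper: a selector backed by a fixed array of rows. */
     static VectorObjectSelector arraySelector(Object... rows)
     {
       return new VectorObjectSelector()
       {
         @Override
         public Object[] getObjectVector()
         {
           return rows;
         }

         @Override
         public int getCurrentVectorSize()
         {
           return rows.length;
         }
       };
     }

     public static void main(String[] args)
     {
       VectorMatcher matcher = new MatcherFactory("a"::equals)
           .makeObjectProcessor(arraySelector("a", null, "b", "a"));
       System.out.println(Arrays.toString(matcher.match()));
     }
   }
   ```

   Because the matcher works on raw objects, it does not need a dictionary 
   lookup, which is why it pairs naturally with an object predicate.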
   
   I imagine this could someday open the door to filters that accept any of 
the complex typed columns we support as inputs, using the same predicate 
factory pattern, though I haven't spent time thinking through use cases for 
this yet.
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `Filter`
    * `ExpressionFilter`
    * `VectorComparisonProcessors`
    * `BivariateFunctionVectorObjectProcessor`
    * `VectorValueMatcherFactory`
    * `ObjectVectorValueMatcher`
   

