clintropolis commented on code in PR #12914:
URL: https://github.com/apache/druid/pull/12914#discussion_r953261475
##########
core/src/main/java/org/apache/druid/segment/column/ValueType.java:
##########
@@ -63,23 +63,37 @@ public enum ValueType implements TypeDescriptor
* String object type. This type may be used as a grouping key, an input to
certain types of complex sketch
* aggregators, and as an input to expression virtual columns. String types
might potentially be 'multi-valued' when
* stored in segments, and contextually at various layers of query
processing, but this information is not available
- * through this enum alone, and must be accompany this type indicator to
properly handle.
+ * at this level.
+ *
+ * Strings are typically represented as {@link String}, but multi-value
strings might also appear as a
+ * {@link java.util.List<String>}.
*/
STRING,
/**
* Placeholder for arbitrary 'complex' types, which have a corresponding
serializer/deserializer implementation. Note
* that knowing a type is complex alone isn't enough information to work
with it directly, and additional information
- * in the form of a type name that is registered in the complex type
registry must be available to make this type
- * meaningful. This type is not currently supported as a grouping key for
aggregations, and may not be used as an
- * input to expression virtual columns, and might only be supported by the
specific aggregators crafted to handle
- * this complex type.
+ * in the form of a type name which must be registered in the complex type
registry. This type is not currently
+ * supported as a grouping key for aggregations, and might only be supported
by the specific aggregators crafted to
+ * handle this complex type. Filtering on this type with standard filters
will most likely have limited support, and
Review Comment:
So, there is
[`isFilterable`](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilities.java#L95)
on column capabilities, which is used exclusively to allow complex types to
provide indexes to use for filtering; see
https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/segment/ColumnSelectorColumnIndexSelector.java#L86.
This could allow complex columns to directly provide indexes that filters
understand, which I suppose could be used for filtering.
However, there are no examples of this in the current complex types, and more
importantly, I don't consider this support to be fully built out yet: without
value matcher integration it's not really complete, and likely not correct
except in certain situations. We did start to orient some things to make this
possible someday, like adding the object predicate
https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/filter/DruidPredicateFactory.java#L46,
but I think some more plumbing needs to be done to make use of it.
The big win from finishing this out someday would be to allow matching
constant forms of complex values with selector filters, and to make is null/is
not null actually work correctly (e.g. today is null matches everything, since
the column is treated as null), which is a special case of the same thing
but especially jarring.
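To illustrate the is null case, here is a minimal, self-contained sketch. It is
not Druid code: `ComplexNullFilterSketch`, `countMatches`, and the
`treatColumnAsNull` flag are hypothetical names I made up to model "the filter
sees the column as all nulls" versus "the filter sees the actual values".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;

// Hypothetical model only; none of these names exist in Druid.
final class ComplexNullFilterSketch
{
  // Counts rows accepted by the predicate. When treatColumnAsNull is true,
  // the predicate never sees the real value, only null, which models a
  // filter that has no value matcher integration for the complex type.
  public static int countMatches(
      List<Object> rows,
      Predicate<Object> predicate,
      boolean treatColumnAsNull
  )
  {
    int matches = 0;
    for (Object row : rows) {
      Object seen = treatColumnAsNull ? null : row;
      if (predicate.test(seen)) {
        matches++;
      }
    }
    return matches;
  }

  public static void main(String[] args)
  {
    // Two non-null "complex" values and one genuinely null row.
    List<Object> rows = Arrays.<Object>asList("sketch-a", null, "sketch-b");
    Predicate<Object> isNull = Objects::isNull;

    // Column treated as null: "is null" matches every row.
    System.out.println(countMatches(rows, isNull, true));
    // Real values visible: "is null" matches only the null row.
    System.out.println(countMatches(rows, isNull, false));
  }
}
```

With the three rows above, the first call counts all three rows and the second
counts only the one that is really null, which is the behavior I'd expect once
the plumbing is finished.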
Beyond equality checks, I'm not sure most of the standard filters make sense
with complex types in general. Tangentially, it would probably be nice in
the future to have a mode where filtering on them with filters that don't make
sense is an error condition, instead of silently treating the column as a
column of null values.
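A rough sketch of what such a mode could look like, purely as an assumption on
my part: `StrictComplexFilterSketch`, `FilterKind`, `canEvaluate`, and the
`strict` flag are all hypothetical, and the set of filters considered
meaningful on complex columns is just the equality-style ones discussed above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model only; none of these names exist in Druid.
final class StrictComplexFilterSketch
{
  enum FilterKind { SELECTOR, IS_NULL, BOUND, LIKE }

  // Assumption: only equality-style checks make sense on complex columns.
  static final Set<FilterKind> MEANINGFUL_ON_COMPLEX =
      EnumSet.of(FilterKind.SELECTOR, FilterKind.IS_NULL);

  // Returns true if the filter can be evaluated against real values.
  // Returns false if it falls back to the lenient "column of nulls" behavior.
  // Throws in strict mode when the filter makes no sense for a complex column.
  public static boolean canEvaluate(FilterKind kind, boolean columnIsComplex, boolean strict)
  {
    if (!columnIsComplex || MEANINGFUL_ON_COMPLEX.contains(kind)) {
      return true;
    }
    if (strict) {
      throw new UnsupportedOperationException(
          "Filter " + kind + " is not supported on complex columns"
      );
    }
    return false;
  }

  public static void main(String[] args)
  {
    // Lenient mode: silently falls back to treating the column as null.
    System.out.println(canEvaluate(FilterKind.BOUND, true, false));
    // Strict mode: the same filter is rejected with an error.
    try {
      canEvaluate(FilterKind.BOUND, true, true);
    }
    catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the flag is that existing queries keep today's lenient behavior
unless strict mode is opted into.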
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]