clintropolis commented on a change in pull request #10432:
URL: https://github.com/apache/druid/pull/10432#discussion_r494751677
##########
File path: docs/querying/query-context.md
##########
@@ -106,3 +106,4 @@ vectorization. These query types will ignore the
"vectorize" parameter even if i
|--------|-------|------------|
|vectorize|`true`|Enables or disables vectorized query execution. Possible
values are `false` (disabled), `true` (enabled if possible, disabled otherwise,
on a per-segment basis), and `force` (enabled, and groupBy or timeseries
queries that cannot be vectorized will fail). The `"force"` setting is meant to
aid in testing, and is not generally useful in production (since real-time
segments can never be processed with vectorized execution, any queries on
real-time data will fail). This will override
`druid.query.default.context.vectorize` if it's set.|
|vectorSize|`512`|Sets the row batching size for a particular query. This will
override `druid.query.default.context.vectorSize` if it's set.|
+|vectorizeVirtualColumns|`true`|Enables or disables vectorized query
processing of queries with virtual columns, layered on top of `vectorize`
(`vectorize` must also be set to true for a query to utilize vectorization).
Possible values are `false` (disabled), `true` (enabled if possible, disabled
otherwise, on a per-segment basis), and `force` (enabled, and groupBy or
timeseries queries with virtual columns that cannot be vectorized will fail).
The `"force"` setting is meant to aid in testing, and is not generally useful
in production. This will override
`druid.query.default.context.vectorizeVirtualColumns` if it's set.|
Review comment:
Given that #10429 has uncovered a bug with vectorized group by engine
when grouping on numeric columns with null values in sql compatible null
handling mode that isn't related to expressions/virtual columns, I guess this
seems prudent and I've come around to being in the 'false by default' camp.
##########
File path:
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/vector/VectorGroupByEngine.java
##########
@@ -76,10 +76,18 @@ public static boolean canVectorize(
Function<String, ColumnCapabilities> capabilitiesFunction = name ->
query.getVirtualColumns().getColumnCapabilitiesWithFallback(adapter,
name);
+ final boolean adapterCanVectorize = adapter.canVectorize(filter,
query.getVirtualColumns(), false);
+ final boolean virtualColumnsCanVectorize;
+ if (query.getVirtualColumns().getVirtualColumns().length > 0) {
Review comment:
I'll see if I can clean this up a bit
##########
File path: docs/querying/query-context.md
##########
@@ -92,7 +92,7 @@ include "selector", "bound", "in", "like", "regex", "search",
"and", "or", and "
- All aggregators must offer vectorized implementations. These include
"count", "doubleSum", "floatSum", "longSum", "longMin",
"longMax", "doubleMin", "doubleMax", "floatMin", "floatMax", "longAny",
"doubleAny", "floatAny", "stringAny",
"hyperUnique", "filtered", "approxHistogram", "approxHistogramFold", and
"fixedBucketsHistogram" (with numerical input).
-- No virtual columns.
+- All virtual columns must offer vectorized implementations.
Review comment:
Ah, I was planning on filling out a list of supported expressions but
figured the list was sort of volatile since I have a few PRs and branches in
flight and was originally going to do this later. I was planning to wait for
#10429 to go in before I fixup this PR (to make sure changing the default to
false doesn't cause any ill effects to tests) so I can go ahead and fill out
this list to include the stuff from #10401 and #10429 in this PR at least.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]