clintropolis commented on a change in pull request #10432:
URL: https://github.com/apache/druid/pull/10432#discussion_r494751677



##########
File path: docs/querying/query-context.md
##########
@@ -106,3 +106,4 @@ vectorization. These query types will ignore the 
"vectorize" parameter even if i
 |--------|-------|------------|
 |vectorize|`true`|Enables or disables vectorized query execution. Possible 
values are `false` (disabled), `true` (enabled if possible, disabled otherwise, 
on a per-segment basis), and `force` (enabled, and groupBy or timeseries 
queries that cannot be vectorized will fail). The `"force"` setting is meant to 
aid in testing, and is not generally useful in production (since real-time 
segments can never be processed with vectorized execution, any queries on 
real-time data will fail). This will override 
`druid.query.default.context.vectorize` if it's set.|
 |vectorSize|`512`|Sets the row batching size for a particular query. This will 
override `druid.query.default.context.vectorSize` if it's set.|
+|vectorizeVirtualColumns|`true`|Enables or disables vectorized query 
processing of queries with virtual columns, layered on top of `vectorize` 
(`vectorize` must also be set to true for a query to utilize vectorization). 
Possible values are `false` (disabled), `true` (enabled if possible, disabled 
otherwise, on a per-segment basis), and `force` (enabled, and groupBy or 
timeseries queries with virtual columns that cannot be vectorized will fail). 
The `"force"` setting is meant to aid in testing, and is not generally useful 
in production. This will override 
`druid.query.default.context.vectorizeVirtualColumns` if it's set.|

Review comment:
       Given that #10429 has uncovered a bug with vectorized group by engine 
when grouping on numeric columns with null values in sql compatible null 
handling mode that isn't related to expressions/virtual columns, I guess this 
seems prudent and I've come around to being in the 'false by default' camp.

##########
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/vector/VectorGroupByEngine.java
##########
@@ -76,10 +76,18 @@ public static boolean canVectorize(
     Function<String, ColumnCapabilities> capabilitiesFunction = name ->
         query.getVirtualColumns().getColumnCapabilitiesWithFallback(adapter, 
name);
 
+    final boolean adapterCanVectorize = adapter.canVectorize(filter, 
query.getVirtualColumns(), false);
+    final boolean virtualColumnsCanVectorize;
+    if (query.getVirtualColumns().getVirtualColumns().length > 0) {

Review comment:
       I'll see if I can clean this up a bit

##########
File path: docs/querying/query-context.md
##########
@@ -92,7 +92,7 @@ include "selector", "bound", "in", "like", "regex", "search", 
"and", "or", and "
 - All aggregators must offer vectorized implementations. These include 
"count", "doubleSum", "floatSum", "longSum", "longMin",
  "longMax", "doubleMin", "doubleMax", "floatMin", "floatMax", "longAny", 
"doubleAny", "floatAny", "stringAny",
  "hyperUnique", "filtered", "approxHistogram", "approxHistogramFold", and 
"fixedBucketsHistogram" (with numerical input). 
-- No virtual columns.
+- All virtual columns must offer vectorized implementations.

Review comment:
       Ah, I was planning on filling out a list of supported expressions but 
figured the list was sort of volatile since I have a few PRs and branches in 
flight and was originally going to do this later. I was planning to wait for 
#10429 to go in before I fixup this PR (to make sure changing the default to 
false doesn't cause any ill effects to tests) so I can go ahead and fill out 
this list to include the stuff from #10401 and #10429 in this PR at least.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to