siddharthteotia commented on a change in pull request #4535: Implement DISTINCT
clause
URL: https://github.com/apache/incubator-pinot/pull/4535#discussion_r315381026
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/operator/blocks/IntermediateResultsBlock.java
##########
@@ -231,41 +237,123 @@ private DataTable getSelectionResultDataTable()
SelectionOperatorUtils.getDataTableFromRows(_selectionResult,
_selectionDataSchema));
}
+
@Nonnull
private DataTable getAggregationResultDataTable()
throws Exception {
// Extract each aggregation column name and type from aggregation function
context.
- int numAggregationFunctions = _aggregationFunctionContexts.length;
- String[] columnNames = new String[numAggregationFunctions];
- DataSchema.ColumnDataType[] columnDataTypes = new
DataSchema.ColumnDataType[numAggregationFunctions];
- for (int i = 0; i < numAggregationFunctions; i++) {
- AggregationFunctionContext aggregationFunctionContext =
_aggregationFunctionContexts[i];
- columnNames[i] = aggregationFunctionContext.getAggregationColumnName();
- columnDataTypes[i] =
aggregationFunctionContext.getAggregationFunction().getIntermediateResultColumnType();
+ if (AggregationFunctionUtils.isDistinct(_aggregationFunctionContexts)) {
Review comment:
Other than the two special casing occurrences you pointed out earlier in
FunctionCallAstNode and SelectAstNode (which I have mentioned in my comments as
to why they are specifically needed), there are following places in the code
where we do the check for DISTINCT
(1) DefaultAggregationExecutor -- when operator initializes the executor
**Reason** -- earlier for each function we had to build column expression
tree (since each function can only have a single column). Now for DISTINCT
function, we will have to do this for each column
(2) IntermediateResultBlock -- when building the data table. Since now we
are sending proper matrix for a DISTINCT query (similar to DataTable for
Selection query). The above reason is also applicable here.
(3) AggregationPlanNode -- to set the projected column types in function
context to be used later during building the data table -- this is needed since
for aggregation functions, the type was fixed at Long, Double.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]