siddharthteotia commented on a change in pull request #4790: Support ORDER BY
for DISTINCT queries
URL: https://github.com/apache/incubator-pinot/pull/4790#discussion_r348258761
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctAggregationFunction.java
##########
@@ -38,16 +45,20 @@
* // TODO: Support group-by
*/
public class DistinctAggregationFunction implements AggregationFunction<DistinctTable, Comparable> {
- private final DistinctTable _distinctTable;
+ private DistinctTable _distinctTable;
private final String[] _columnNames;
private final int _limit;
+ private final List<SelectionSort> _orderBy;
private FieldSpec.DataType[] _dataTypes;
- DistinctAggregationFunction(String multiColumnExpression, int limit) {
- _distinctTable = new DistinctTable(limit);
+ DistinctAggregationFunction(String multiColumnExpression, int limit, List<SelectionSort> orderBy) {
_columnNames = multiColumnExpression.split(FunctionCallAstNode.DISTINCT_MULTI_COLUMN_SEPARATOR);
- _limit = limit;
+ _orderBy = orderBy;
+ // use a multiplier for trim size when DISTINCT queries have ORDER BY. This logic
+ // is similar to what we have in GROUP BY with ORDER BY
+ // this does not guarantee 100% accuracy but still takes closer to it
+ _limit = CollectionUtils.isNotEmpty(_orderBy) ? limit * 5 : limit;
Review comment:
For now I have kept this logic local to the DISTINCT code, since we now have
multiple concrete Table implementations anyway. If need be, I can rename
GroupByUtils to something like TableCapacityUtils and move this logic there as
a follow-up.
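
A minimal sketch of what such a shared helper might look like, for illustration only. The class name comes from the suggestion above; the method name `getTrimSize`, the constant name, and the overflow clamp are assumptions and are not part of this PR or of the existing Pinot code. The multiplier of 5 mirrors the `limit * 5` in the diff.

```java
import java.util.List;

/** Hypothetical shared helper; not an existing Pinot class. */
final class TableCapacityUtils {
  // Assumed constant, mirroring the `limit * 5` used in the diff above.
  static final int ORDER_BY_TRIM_MULTIPLIER = 5;

  private TableCapacityUtils() {
  }

  /**
   * Returns how many records a table should retain before trimming.
   * Without ORDER BY the limit is used as-is; with ORDER BY it is
   * multiplied so that fewer qualifying records are dropped early.
   */
  static int getTrimSize(int limit, List<?> orderBy) {
    if (orderBy == null || orderBy.isEmpty()) {
      return limit;
    }
    // Widen to long so a large limit cannot overflow int, then clamp.
    long widened = (long) limit * ORDER_BY_TRIM_MULTIPLIER;
    return (int) Math.min(widened, Integer.MAX_VALUE);
  }
}
```

With a helper along these lines, the constructor line in the diff could become `_limit = TableCapacityUtils.getTrimSize(limit, _orderBy);`, and the GROUP BY path could call the same method. As the code comment in the diff notes, the larger trim size improves accuracy but does not guarantee it.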