yashmayya commented on code in PR #13835:
URL: https://github.com/apache/pinot/pull/13835#discussion_r1720962052
##########
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java:
##########
@@ -344,6 +346,55 @@ public static AggregationFunctionType
getAggregationFunctionType(String function
}
}
+ public static int
compareFunctionParametersForStarTree(AggregationFunctionType functionType,
+ Map<String, Object> configuration1, Map<String, Object> configuration2) {
+ switch (functionType) {
+ case DISTINCTCOUNTHLL:
+ case DISTINCTCOUNTRAWHLL: {
+ // Compare the log2m value accounting for defaults
+ if ((MapUtils.isEmpty(configuration1) ||
!configuration1.containsKey(Constants.HLL_LOG2M_KEY))
+ && (MapUtils.isEmpty(configuration2) ||
!configuration2.containsKey(Constants.HLL_LOG2M_KEY))) {
+ return 0;
+ }
+ if ((MapUtils.isEmpty(configuration1) ||
!configuration1.containsKey(Constants.HLL_LOG2M_KEY))) {
+ return
Integer.compare(CommonConstants.Helix.DEFAULT_HYPERLOGLOG_LOG2M,
Integer.parseInt(String.valueOf(
+ configuration2.get(Constants.HLL_LOG2M_KEY))));
+ }
+ if ((MapUtils.isEmpty(configuration2)) ||
!configuration2.containsKey(Constants.HLL_LOG2M_KEY)) {
+ return
Integer.compare(Integer.parseInt(String.valueOf(configuration1.get(Constants.HLL_LOG2M_KEY))),
+ CommonConstants.Helix.DEFAULT_HYPERLOGLOG_LOG2M);
+ }
+ return
Integer.compare(Integer.parseInt(String.valueOf(configuration1.get(Constants.HLL_LOG2M_KEY))),
+
Integer.parseInt(String.valueOf(configuration2.get(Constants.HLL_LOG2M_KEY))));
+ }
+
+ case DISTINCTCOUNTHLLPLUS:
+ case DISTINCTCOUNTRAWHLLPLUS: {
+ // Only p value needs to be compared; HyperLogLogPlus with any sp
value can be merged as long as p value is the
+ // same
+ if ((MapUtils.isEmpty(configuration1) ||
!configuration1.containsKey(Constants.HLLPLUS_ULL_P_KEY))
+ && (MapUtils.isEmpty(configuration2) ||
!configuration2.containsKey(Constants.HLLPLUS_ULL_P_KEY))) {
+ return 0;
+ }
+ if ((MapUtils.isEmpty(configuration1) ||
!configuration1.containsKey(Constants.HLLPLUS_ULL_P_KEY))) {
+ return
Integer.compare(CommonConstants.Helix.DEFAULT_HYPERLOGLOG_PLUS_P,
+
Integer.parseInt(String.valueOf(configuration2.get(Constants.HLLPLUS_ULL_P_KEY))));
+ }
+ if ((MapUtils.isEmpty(configuration2) ||
!configuration2.containsKey(Constants.HLLPLUS_ULL_P_KEY))) {
+ return
Integer.compare(Integer.parseInt(String.valueOf(configuration1.get(Constants.HLLPLUS_ULL_P_KEY))),
+ CommonConstants.Helix.DEFAULT_HYPERLOGLOG_PLUS_P);
+ }
+ return
Integer.compare(Integer.parseInt(String.valueOf(configuration1.get(Constants.HLLPLUS_ULL_P_KEY))),
+
Integer.parseInt(String.valueOf(configuration2.get(Constants.HLLPLUS_ULL_P_KEY))));
+ }
+
+ // By default, assume that the rest of the aggregation functions do not
have relevant configurations to be checked
+ // for a star-tree index
+ default:
+ return 0;
Review Comment:
While this is definitely true for some aggregation functions like `COUNT` /
`MIN` / `MAX` / `SUM` / `AVG` / `MINMAXRANGE` / `DISTINCTCOUNTBITMAP` /
`PERCENTILEEST`, it's a little more nuanced for certain other aggregation
functions like `DISTINCTCOUNTULL` and `SUMPRECISION` when there is a mismatch
in the query time aggregation parameter value (different `p` value for
`DISTINCTCOUNTULL`, different precision value for `SUMPRECISION`) and the
star-tree index aggregation parameter value. In these cases, there will never
be a query error since intermediate results with any parameter value can be
merged but the result could have lower / higher precision than desired by the
user. On second thought, I'm wondering if it might just be better to do a
stricter parameter match for these aggregation functions too.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]