>From Wail Alkowaileet <wael....@gmail.com>: Attention is currently required from: Ian Maxon, Ali Alsuliman, Gaurav Vaghasiya. Wail Alkowaileet has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013 )
Change subject: [NO ISSUE] Added Optimize/Adaptive Groupby for row based records ...................................................................... Patch Set 23: (12 comments) Patchset: PS23: First pass after sometime. File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/config/OptimizationConfUtil.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/2b5b5b4a_c41f0a23 PS23, Line 102: OptimizeGroupBy optimizeGroupBy File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/OptimizeGroupByPOperator.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/861b71b3_2f90a1ab PS23, Line 73: throw new AlgebricksException( : "Optimize group-by currently works only for one nested plan with one root containing" : + "an aggregate and a nested-tuple-source."); This a bad pattern. You need to fallback to the regular group by (the non-optimized one) whenever we cannot use the optimized group by. You should not throw an exception. Besides, I don't think it is right to do those checks here. There should be a rewrite rule that does all those checks and pass all the necessary parameters in the logical GROUP-BY operator. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/54ad1f64_b5246b75 PS23, Line 78: AggregateOperator aggOp = (AggregateOperator) r0.getValue(); You need to check the operator tag before casting. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/182678b4_05a3656c PS23, Line 82: String aggOpType = aggOp.getExpressions().get(0).getValue().toString(); You need to check function identifiers, not the string. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/a469f9e0_d29ec477 PS23, Line 83: String aggType; : if (aggOpType.contains("sql-count")) : aggType = "COUNT"; : else if (aggOpType.contains("sql-sum")) : aggType = "SUM"; : else if (aggOpType.contains("sql-max")) : aggType = "MAX"; : else if (aggOpType.contains("sql-min")) : aggType = "MIN"; : else { : throw new AlgebricksException("Optimize group-by currently not supporting average"); : } Use the function identifier instead. File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/config/AlgebricksConfig.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/e5e6b8fb_a46c16d5 PS23, Line 51: OPTIMIZE_GROUPBY_DEFAULT This should change to false once all tests are passed File hyracks-fullstack/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/PushGroupByIntoSortRule.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/223ce94d_9cfa0c31 PS23, Line 137: localAggregateOp.getExpressions().get(0).getValue().toString().contains("sql-count") : || localAggregateOp.getExpressions().get(0).getValue().toString() : .contains("sql-sum") : || localAggregateOp.getExpressions().get(0).getValue().toString() : .contains("sql-max") : || localAggregateOp.getExpressions().get(0).getValue().toString() : .contains("sql-min") Use FunctionIdentifier https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/f2ce7f54_31427441 PS23, Line 116: AbstractLogicalOperator op3 = : (AbstractLogicalOperator) op2Ref.getValue().getInputs().get(0).getValue(); : if (context.getPhysicalOptimizationConfig().isOptimizationGroupBy() : && op3.getPhysicalOperator().getOperatorTag() == PhysicalOperatorTag.HASH_PARTITION_EXCHANGE : && groupByOperator.isGlobal()) { : AbstractLogicalOperator localOp = (AbstractLogicalOperator) op3.getInputs().get(0).getValue(); : if (localOp.getPhysicalOperator().getOperatorTag() == PhysicalOperatorTag.SORT_GROUP_BY) { : GroupByOperator localGroupbyOperator = (GroupByOperator) localOp; : ILogicalPlan localP0 = localGroupbyOperator.getNestedPlans().get(0); : Mutable<ILogicalOperator> localR0 = localP0.getRoots().get(0); : AggregateOperator localAggregateOp = (AggregateOperator) localR0.getValue(); : : if (localAggregateOp.getExpressions().size() != 1) : continue; : if (!localAggregateOp.getExpressions().get(0).getValue().isFunctional()) : continue; : AggregateFunctionCallExpression temp = (AggregateFunctionCallExpression) localAggregateOp : .getExpressions().get(0).getValue(); : if (temp.getArguments().size() != 1) : continue; : : if (localAggregateOp.getExpressions().get(0).getValue().toString().contains("sql-count") : || localAggregateOp.getExpressions().get(0).getValue().toString() : .contains("sql-sum") : || localAggregateOp.getExpressions().get(0).getValue().toString() : .contains("sql-max") : || localAggregateOp.getExpressions().get(0).getValue().toString() : .contains("sql-min")) { : if (!localGroupbyOperator.isGroupAll()) { : localOp.setPhysicalOperator( : new OptimizeGroupByPOperator(localGroupbyOperator.getGroupByVarList(), : localGroupbyOperator.isGroupAll())); : } : } : } : } Extract to a function File hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/util/string/UTF8StringUtil.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/08a052c7_7c9ac9d3 PS23, Line 333: rawByteEquals I don't know why I think we have a function like this before. Maybe it is something I never merged into master File hyracks-fullstack/hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/util/ArrayBackedValueStorage.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/624fef02_af24b566 PS23, Line 42: ArrayBackedValueStorage Hmm why you need this? File hyracks-fullstack/hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/comm/io/ArrayTupleBuilder.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013/comment/6e7d3972_d6c38151 PS23, Line 200: ByteBuffer byteBuffer Maybe you can extract the information without the need to wrap the byte array. See IntegerPointable -- To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18013 To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: I80ed8233130ee763bb2eee354c666400d4fd027c Gerrit-Change-Number: 18013 Gerrit-PatchSet: 23 Gerrit-Owner: Gaurav Vaghasiya <gvaghasiy...@gmail.com> Gerrit-Reviewer: Ali Alsuliman <ali.al.solai...@gmail.com> Gerrit-Reviewer: Ali Alsuliman <ali.alsuli...@couchbase.com> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Ian Maxon <ima...@apache.org> Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu> Gerrit-Reviewer: Wail Alkowaileet <wael....@gmail.com> Gerrit-CC: Murtadha Hubail <mhub...@apache.org> Gerrit-CC: Till Westmann <ti...@apache.org> Gerrit-Attention: Ian Maxon <ima...@apache.org> Gerrit-Attention: Ali Alsuliman <ali.al.solai...@gmail.com> Gerrit-Attention: Ali Alsuliman <ali.alsuli...@couchbase.com> Gerrit-Attention: Gaurav Vaghasiya <gvaghasiy...@gmail.com> Gerrit-Comment-Date: Sun, 13 Apr 2025 08:50:13 +0000 Gerrit-HasComments: Yes Gerrit-Has-Labels: No Gerrit-MessageType: comment