>From Ali Alsuliman <ali.al.solai...@gmail.com>: Attention is currently required from: Peeyush Gupta, Preetham Poluparthi. Ali Alsuliman has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763 )
Change subject: [ASTERIXDB-3632]: Add Index Advisor ...................................................................... Patch Set 31: (38 comments) File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/am/AbstractIntroduceAccessMethodRule.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/629d933c_5b974b9c PS31, Line 118: setIndexProvider Maybe we should have this done as part of setMetadataDeclarations()? File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/am/IntroduceJoinAccessMethodRule.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/eeca4d17_4c408cbf PS31, Line 129: getFakeIndexProvider() This can be null, right? When AdvisorPlanParser.getCBOPlanState(opRef) returns null and the fake index provider is not set. File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/EnumerateJoinsRule.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/3278e39c_8ffa6f6b PS31, Line 144: boolean adviseIndex = context.getIndexAdvisor().getAdvise(); Move it to next where it's used. File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/Stats.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/a0b2ce16_0dba17d8 PS31, Line 280: ILogicalOperator originalLeft, originalRight; : originalLeft = abjoin.getInputs().get(0).getValue(); : originalRight = abjoin.getInputs().get(1).getValue(); What happened and we needed this? File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/indexadvisor/AdviseIndexRule.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/03b48c50_9fc8eedd PS31, Line 41: public class AdviseIndexRule Can you add a documentation up that describes what this rule is about and the purpose? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/4e5c8082_7a7c3c2e PS31, Line 57: / if Remove https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/afb64775_58ebd668 PS31, Line 70: ((AbstractLogicalOperator) opRef.getValue()).disableJobGen(); Do you need this? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/d568ff7b_9b1cf3ae PS31, Line 88: LogicalOperatorTag.UNNEST_MAP How about the left-outer case? Also, this could be for a primary index look-up. Do you want to differentiate between a secondary index vs. primary index look-up? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/8d67e516_25fb29cb PS31, Line 94: UnnestingFunctionCallExpression unnestExpr = (UnnestingFunctionCallExpression) expr; Use AccessMethodJobGenParams.readFromFuncArgs()? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/087c7407_381f56cf PS31, Line 105: DataverseName.createSinglePartName Use createFromCanonicalForm? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/9af94d14_01c86c7f PS31, Line 108: (Index.ValueIndexDetails) Shouldn't we check if it's actually of ValueIndexDetails before casting? include array indexes in your tests just to make sure we don't fail in their presence. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/79c45052_4c0def6b PS31, Line 126: lookupIndex Do we need to consider field types? we are moving away from typed indexes, but there are still there for cases like secondary indexes with CAST (DEFAULT NULL) which define field types. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/3da112c2_d3495a80 PS31, Line 139: List<String> field = fields.get(i); Check how it is presented when you have nested fields. File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/indexadvisor/AdvisorConditionParser.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/2fc909ca_3812642a PS31, Line 47: AdvisorConditionParser Not sure, but it feels like a lot of this logic is already done when the optimizer actually tries to figure out if a secondary index is possible since it has to figure out the variables and their corresponding field names. I am referring to OptimizableFuncExpr. We should check if we can just go to that place when the optimizer finds these things out and pass them to the advisor. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/18e29d7a_8a778e97 PS31, Line 102: ILogicalExpression constantExpression = expr.getArguments().get(1).getValue(); How about if the 0th argument is the constant expression? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/e7b54ed4_92201a24 PS31, Line 125: BuiltinFunctions.FIELD_ACCESS_BY_NAME Check the other two FIELD_ACCESS_NESTED, FIELD_ACCESS_BY_INDEX https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/63d71c34_4aae3208 PS31, Line 128: e) ((ConstantExpression) functionCallExpr.getArguments() : .get(1).getValue()).getValue()).getObject()).getStringValue(); See if you can call one of the methods in ConstantExpressionUtil https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/e26d7792_7092ab75 PS31, Line 179: for (int i = 0; i < assignOp.getVariables().size(); i++) { : OperatorManipulationUtil.replaceVarWithExpr((AbstractFunctionCallExpression) filterExpr.getValue(), : assignOp.getVariables().get(i), assignOp.getExpressions().get(i).getValue()); : } Think about the case of a chain of assign operators. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/b34700fe_267b47ac PS31, Line 198: LogicalExpression lhs = expr.getArguments().get(0).getValue(); Return null if the num args != 2 File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/indexadvisor/AdvisorPlanParser.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/3429cf36_e057da85 PS31, Line 67: return null; Add missing { } in the if-blocks. There are few here. File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/indexadvisor/AdvisorScanPlanNode.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/a66bf699_c7b4b561 PS31, Line 54: List<AdvisorScanPlanNode> leafs = new ArrayList<>(); Same thing if callers are not modifying, return a singleton collection https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/7e019be0_92eccdf5 PS31, Line 61: new ArrayList<>(); If callers are not modifying the returned list, then return Collections.emptyList(). File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/indexadvisor/FakeIndex.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/0162d21f_740f40eb PS31, Line 38: Collections.nCopies(fieldNames.size(), 0) This is related to the meta record. 0 means the field is originating from the payload record (the main record), 1 means the field is originating from the meta record. We don't support secondary indexes on meta record fields today. File asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/cbo/indexadvisor/FakeIndexProvider.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/26f8dcee_17a8d960 PS31, Line 47: public FakeIndexProvider(CBOPlanStateTree planStateTree) Is it possible to refactor each logic into a separate method instead of having everything here? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/aa39fc98_8a07bd04 PS31, Line 48: Triple<String, DataverseName, String> Use DatasetFullyQualifiedName https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/5fb86ac0_95cc6d3e PS31, Line 55: DataSource dataSource = (DataSource) scanOperator.getDataSource(); Use DatasetDataSource and get the required information from there. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/a9117b72_e6ff20ef PS31, Line 68: singleDataSourceFieldNamesMap.get(key).getSecond() Can you reuse from above instead of looking up the value again? https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/45ed44bb_a5a3ae53 PS31, Line 91: LogicalVariable rhsVar = filterCondition.getRhsVar(); : DataSourceScanOperator rhsOp = planStateTree.getDataSourceScanVariableMap().get(rhsVar); Refactor the below since it's the same as above. File asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/cbo-join/CBOJoinQueries.xml: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/b446f04d_3fb6b052 PS31, Line 24: "cbo-join/index-advisor"> Remember to add tests for nested fields, typed indexes (array & CAST indexes), ... etc. Also include cases where you have: - WHERE pure_function(x) = 4 - WHERE non_pure_function(x) = 4 ... just to make sure nothing breaks. File asterixdb/asterix-app/src/test/resources/runtimets/results/cbo-join/index-advisor/join-advise/join-advise.4.adm: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/522c175d_1e2d963d PS31, Line 1: b_1 Make sure this builds nested fields correctly with proper backticks. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/e3066005_b9fff01f PS31, Line 1: Default We need to backtick those identifiers File asterixdb/asterix-lang-common/src/main/java/org/apache/asterix/lang/common/statement/Query.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/8ba1bdbe_d768b426 PS31, Line 34: advise Shouldn't we include "advise" in equals() and hashCode() similar to "explain"? File asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/439c773a_3d7437b5 PS31, Line 1017: ADVISE is not supported for this kind of statement ADVISE is not supported for statement kind + stmt.getKind() https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/ac04d53c_5ba140c8 PS31, Line 6014: <ADVISE : "advise"> Did we decide we are going to make this a keyword or a soft-keyword? File hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/base/IndexAdvisor.java: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/2e92fd6d_9fe053ff PS31, Line 32: boolean isAdvise; Make it final. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/7ea6a91a_d3d3a9c2 PS31, Line 41: mapper If "mapper" isn't going to be used, replace it with new ObjectMapper() so that we don't have to keep it as a member variable. https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/b172851b_07b09392 PS31, Line 55: Remove this space https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763/comment/adb441a1_7e4db23b PS31, Line 117: // Enforced to be unique within a dataverse. Remove -- To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19763 To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Iac773ebb49bdde18cfd340ef15d483dc324bdced Gerrit-Change-Number: 19763 Gerrit-PatchSet: 31 Gerrit-Owner: Preetham Poluparthi <preetha...@apache.org> Gerrit-Reviewer: Ali Alsuliman <ali.al.solai...@gmail.com> Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu> Gerrit-Reviewer: Peeyush Gupta <peeyush.gu...@couchbase.com> Gerrit-Reviewer: Preetham Poluparthi <preetha...@apache.org> Gerrit-Attention: Peeyush Gupta <peeyush.gu...@couchbase.com> Gerrit-Attention: Preetham Poluparthi <preetha...@apache.org> Gerrit-Comment-Date: Wed, 06 Aug 2025 20:21:37 +0000 Gerrit-HasComments: Yes Gerrit-Has-Labels: No Gerrit-MessageType: comment