Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/21109 )
Change subject: IMPALA-12872: Use Calcite for optimization - part 1: simple queries ...................................................................... Patch Set 10: (9 comments) http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java: http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@69 PS10, Line 69: nor or partitions Nit: "nor are partitions" http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/rel/node/ImpalaHdfsScanRel.java@97 PS10, Line 97: int totalCols = table.getColumns().size(); For my own understanding: Is it true that totalCols == numFields? It seems like it would be, because we modulus by totalCols into an array of size numFields. If that is true, then I would size the ArrayList and do the modulus with the same variable (maybe totalCols) and have a precondition that totalCols == getRowType().getFieldNames().size(). http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/ExecRequestCreator.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/service/ExecRequestCreator.java: http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/ExecRequestCreator.java@70 PS10, Line 70: // O BENEVOLENT REVIEWER AND CODE INSPECTOR... : // TODO: Please hold off on reviewing this file. I held off on cleaning this up until : // this gets past the experimental stage. Some of the code in SingleNodePlanner : // is duplicated here, so this will involve a general rewrite. After more Calcite : // code gets committed and the planner works for a good portion of the queries, this : // will get rewritten into its final form. Is this comment still true? Are there rewrites to come for this file? http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeConverter.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeConverter.java: http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeConverter.java@107 PS10, Line 107: case DECIMAL: : RelDataType decimalDefinedRetType = factory.createSqlType(SqlTypeName.DECIMAL, : scalarType.decimalPrecision(), scalarType.decimalScale()); : return factory.createTypeWithNullability(decimalDefinedRetType, true); : case VARCHAR: : return createCharType(factory, SqlTypeName.VARCHAR, scalarType.getLength()); : case CHAR: : return createCharType(factory, SqlTypeName.CHAR, scalarType.getLength()); If I understand this right, we could omit DECIMAL, VARCHAR, and CHAR from the impalaToCalciteMap? Is that right or do we use it for something else? http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeConverter.java@169 PS10, Line 169: Charset charSetName = Charset.forName(ConversionUtil.NATIVE_UTF16_CHARSET_NAME); Does this charset do anything for execution? http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java: http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeSystemImpl.java@40 PS10, Line 40: private static final int MAX_BINARY_PRECISION = Integer.MAX_VALUE; : private static final int MAX_TIMESTAMP_PRECISION = 15; : private static final int MAX_TIMESTAMP_WITH_LOCAL_TIME_ZONE_PRECISION = 15; // nanos : private static final int DEFAULT_TINYINT_PRECISION = 3; : private static final int DEFAULT_SMALLINT_PRECISION = 5; : private static final int DEFAULT_INTEGER_PRECISION = 10; : private static final int DEFAULT_BIGINT_PRECISION = 19; : private static final int DEFAULT_FLOAT_PRECISION = 7; : private static final int DEFAULT_DOUBLE_PRECISION = 15; Could you include a comment that provides a quick definition of precision and how Calcite uses this? http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/validate/ImpalaConformance.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/validate/ImpalaConformance.java: http://gerrit.cloudera.org:8080/#/c/21109/10/java/calcite-planner/src/main/java/org/apache/impala/calcite/validate/ImpalaConformance.java@23 PS10, Line 23: * ImpalaConformance describes supported Calcite features. It would be nice to point to some resource that gives more detail about these. i.e. "For more information about each of these choices, see X" X could be a Calcite web page or code location, etc. http://gerrit.cloudera.org:8080/#/c/21109/10/tests/custom_cluster/test_calcite_planner.py File tests/custom_cluster/test_calcite_planner.py: http://gerrit.cloudera.org:8080/#/c/21109/10/tests/custom_cluster/test_calcite_planner.py@39 PS10, Line 39: def test_calcite_frontend(self, vector): One thing we can do is add "unique_database" to this list of arguments and then pass it into the call to "run_test_case". This causes the test to create a new database, run the test case inside that temporary database, and then drop it when the test is over. That is useful for cases where we create tables, as it means the table is created in the temporary database and goes away at the end of the test. https://github.com/apache/impala/blob/master/tests/custom_cluster/test_kudu.py#L73-L78 http://gerrit.cloudera.org:8080/#/c/21109/10/tests/custom_cluster/test_calcite_planner.py@40 PS10, Line 40: self.create_impala_client() Nit: I'm pretty sure you don't need this. -- To view, visit http://gerrit.cloudera.org:8080/21109 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98 Gerrit-Change-Number: 21109 Gerrit-PatchSet: 10 Gerrit-Owner: Steve Carlin <scar...@cloudera.com> Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Steve Carlin <scar...@cloudera.com> Gerrit-Comment-Date: Thu, 21 Mar 2024 22:24:19 +0000 Gerrit-HasComments: Yes