Steve Carlin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21109 )
Change subject: IMPALA-12872: Use Calcite for optimization - part 1: simple queries ...................................................................... Patch Set 18: (7 comments) Ok, great points! The eventual goal is to make this work with all existing tests. So it probably makes more sense to add these fallback tests when changes are made to make things work for the test framework. At that point, all Decimal V1 tests will fail, so I'll add the failback test at that point. So I went ahead and removed most of the code. One thing that has to remain is the canStmtBePlannedThroughCalcite. I need to make sure that DDL statements go back to the old path. I was hoping to use the Calcite parser to do this, but alas, I cannot. I need a way to distinguish between queries that fail due to syntax issues and queries that won't work because they're DDL. An earlier code review comment mentioned that the current code doesn't handle /* */ comments. A Jira has been filed for that. I also added the RUNTIME_PROFILE changes you suggested. That is definitely needed and was a great suggestion. It should be able to detect both the original planner and the new planner. I did leave in query checks for statements that begin with "values" and "with". Those aren't tested here, but I'd prefer to leave them in at this point because there are an abundance of tests in the main framework that use this, and the RUNTIME_PROFILE changes I made (from your comments) will ensure we don't miss this in the future. http://gerrit.cloudera.org:8080/#/c/21109/18/bin/set-classpath.sh File bin/set-classpath.sh: http://gerrit.cloudera.org:8080/#/c/21109/18/bin/set-classpath.sh@41 PS18, Line 41: > Nit: stray line Done http://gerrit.cloudera.org:8080/#/c/21109/18/bin/set-classpath.sh@56 PS18, Line 56: echo "USE_CALCITE_PLANNER" : echo $USE_CALCITE_PLANNER > Can we skip this debug statement? Done http://gerrit.cloudera.org:8080/#/c/21109/18/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java: http://gerrit.cloudera.org:8080/#/c/21109/18/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@144 PS18, Line 144: catch (Exception e) { : LOG.info("Calcite planner failed."); : LOG.info("Exception: " + e); : if (e != null) { : LOG.info("Stack Trace:" + ExceptionUtils.getStackTrace(e)); : throw new InternalException(e.getMessage()); : } : throw new RuntimeException(e); : } > For complete fallback support, this case would need to fall back. In practi Answered in the main reply http://gerrit.cloudera.org:8080/#/c/21109/18/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteJniFrontend.java@159 PS18, Line 159: private boolean canStmtBePlannedThroughCalcite(QueryContext queryCtx) { : String stringWithFirstRealWord = queryCtx.getStmt(); : String[] lines = stringWithFirstRealWord.split("\n"); : // Get rid of comments and blank lines which start the query. We need to find : // the first real word. : // TODO: IMPALA-12976: need to make this more generic. Certain patterns aren't caught : // here like /* */ : for (String line : lines) { : if (line.trim().startsWith("--") || line.trim().equals("")) { : stringWithFirstRealWord = stringWithFirstRealWord.replaceFirst(line + "\n", ""); : } else { : break; : } : } : stringWithFirstRealWord = stringWithFirstRealWord.trim(); : String beforeStripString; : do { : beforeStripString = stringWithFirstRealWord; : stringWithFirstRealWord = StringUtils.stripStart(stringWithFirstRealWord, "("); : stringWithFirstRealWord = StringUtils.stripStart(stringWithFirstRealWord, null); : } while (!stringWithFirstRealWord.equals(beforeStripString)); : return StringUtils.startsWithIgnoreCase(stringWithFirstRealWord, "select") || : StringUtils.startsWithIgnoreCase(stringWithFirstRealWord, "values") || : StringUtils.startsWithIgnoreCase(stringWithFirstRealWord, "with"); : } : : private void checkUnsupportedFeatures(QueryContext queryCtx) : throws CalcitePlannerUnsupportedException { : if (SEMI_JOIN.matcher(queryCtx.getStmt()).find()) { : throw new CalcitePlannerUnsupportedException(NotSupported.SEMI_JOINS); : } : : if (ANTI_JOIN.matcher(queryCtx.getStmt()).find()) { : throw new CalcitePlannerUnsupportedException(NotSupported.ANTI_JOINS); : } : : TQueryOptions options = queryCtx.getTQueryCtx().client_request.getQuery_options(); : if (!options.isDecimal_v2()) { : throw new CalcitePlannerUnsupportedException(NotSupported.DECIMAL_V1); : } : : if (options.isAppx_count_distinct()) { : throw new CalcitePlannerUnsupportedException(NotSupported.APPX_COUNT_DISTINCT); : } : } > Nit: These functions do something similar via different methods. One is a b Removed http://gerrit.cloudera.org:8080/#/c/21109/18/java/calcite-planner/src/main/java/org/apache/impala/calcite/util/NotSupported.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/util/NotSupported.java: http://gerrit.cloudera.org:8080/#/c/21109/18/java/calcite-planner/src/main/java/org/apache/impala/calcite/util/NotSupported.java@26 PS18, Line 26: ysupported > Nit: supported Removed http://gerrit.cloudera.org:8080/#/c/21109/18/testdata/workloads/functional-query/queries/QueryTest/calcite.test File testdata/workloads/functional-query/queries/QueryTest/calcite.test: http://gerrit.cloudera.org:8080/#/c/21109/18/testdata/workloads/functional-query/queries/QueryTest/calcite.test@11 PS18, Line 11: ---- QUERY : select * from calcite_alltypes; : ---- RESULTS : 0,true,0,0,0,0,0,0,'01/01/09','0',2009-01-01 00:00:00,2009,1 : 1,false,1,1,1,10,1.100000023841858,10.1,'01/01/09','1',2009-01-01 00:01:00,2009,1 : 2,true,2,2,2,20,2.200000047683716,20.2,'01/01/09','2',2009-01-01 00:02:00.100000000,2009,1 : 3,false,3,3,3,30,3.299999952316284,30.3,'01/01/09','3',2009-01-01 00:03:00.300000000,2009,1 : 4,true,4,4,4,40,4.400000095367432,40.4,'01/01/09','4',2009-01-01 00:04:00.600000000,2009,1 : ---- TYPES : int,boolean,tinyint,smallint,int,bigint,float,double,string,string,timestamp,int,int : ==== > For tests like this, is there a way to verify that it is using Calcite (i.e Removed the fallback, but we still need the RUNTIME_PROFILE which has been added. http://gerrit.cloudera.org:8080/#/c/21109/18/testdata/workloads/functional-query/queries/QueryTest/calcite.test@32 PS18, Line 32: ==== : ---- QUERY : select d1,d2,d3,d4,d5,d6 from functional.decimal_tbl; : ---- RESULTS : 1234,2222,1.2345678900,0.12345678900000000000000000000000000000,12345.78900,1 : 12345,333,123.4567890000,0.12345678900000000000000000000000000000,11.22000,1 : 12345,333,1234.5678900000,0.12345678900000000000000000000000000000,0.10000,1 : 132842,333,12345.6789000000,0.12345678900000000000000000000000000000,0.77889,1 : 2345,111,12.3456789000,0.12345678900000000000000000000000000000,3.14100,1 : ---- TYPES : decimal,decimal,decimal,decimal,decimal,decimal : ==== > It would be nice to have tests for SQLs that bail out of Calcite planning e Removed the bail out support -- To view, visit http://gerrit.cloudera.org:8080/21109 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I453fd75b7b705f4d7de1ed73c3e24cafad0b8c98 Gerrit-Change-Number: 21109 Gerrit-PatchSet: 18 Gerrit-Owner: Steve Carlin <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Steve Carlin <[email protected]> Gerrit-Comment-Date: Wed, 10 Apr 2024 14:34:49 +0000 Gerrit-HasComments: Yes
