[
https://issues.apache.org/jira/browse/PHOENIX-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687171#comment-15687171
]
Gabriel Reid commented on PHOENIX-3471:
---------------------------------------
[~jamestaylor] [~maryannxue] [~julianhyde] as I mentioned last week, I've been
working on some Hamcrest matchers for query plans (based on the JSON query
plan representation) for use in test cases. The idea is that it should be easy
to match only the parts of a query plan that matter to a given test case,
while skipping over the rest. The main motivation is to get rid of potentially
brittle string matching on the full query plan.
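To make the partial-matching idea concrete, here is a minimal sketch in plain Java. It does not use Hamcrest and is not the code in the WIP branch; {{PlanNode}}, {{PlanMatching}}, {{containsNode}}, {{ofType}} and {{withInput}} are hypothetical names chosen for illustration, and the plan is assumed to be already parsed into a simple tree of typed nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified stand-in for one node of a parsed query plan.
final class PlanNode {
    final String type;
    final List<PlanNode> inputs;

    PlanNode(String type, PlanNode... inputs) {
        this.type = type;
        this.inputs = new ArrayList<>(Arrays.asList(inputs));
    }
}

// Hypothetical helpers: conditions describe only the parts a test cares about.
final class PlanMatching {
    private PlanMatching() {}

    // True if the node itself, or any node below it, satisfies the condition.
    static boolean containsNode(PlanNode root, Predicate<PlanNode> condition) {
        if (condition.test(root)) {
            return true;
        }
        for (PlanNode input : root.inputs) {
            if (containsNode(input, condition)) {
                return true;
            }
        }
        return false;
    }

    // Condition on the node type only; every other attribute is ignored.
    static Predicate<PlanNode> ofType(String type) {
        return node -> node.type.equals(type);
    }

    // Condition that some direct input of the node satisfies the given condition.
    static Predicate<PlanNode> withInput(Predicate<PlanNode> inputCondition) {
        return node -> node.inputs.stream().anyMatch(inputCondition);
    }
}
```

A test that only cares about a server-side aggregate sitting directly on a table scan could then check {{containsNode(plan, ofType("PhoenixServerAggregate").and(withInput(ofType("PhoenixTableScan"))))}} and ignore everything else in the plan.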
I've put up a WIP of what I've got so far here:
https://github.com/gabrielreid/phoenix/blob/PHOENIX-3471_explain_plan/phoenix-core/src/it/java/org/apache/phoenix/calcite/CalciteQueryPlanMatchers.java
A demo of its use is available here:
https://github.com/gabrielreid/phoenix/blob/PHOENIX-3471_explain_plan/phoenix-core/src/it/java/org/apache/phoenix/calcite/CalciteQueryPlanMatchersTest.java#L93
I've also updated a few of the tests in
[CalciteIT|https://github.com/gabrielreid/phoenix/blob/PHOENIX-3471_explain_plan/phoenix-core/src/it/java/org/apache/phoenix/calcite/CalciteIT.java]
to use these matchers.
Could you let me know if you've got any general thoughts on this approach?
Good, not good, too verbose, anything else? [~julianhyde] do you think that it
would be interesting to have this (the non-Phoenix parts) in Calcite proper? Or
would we rather still just stick to string matching (or another approach)? Any
comments more than welcome.
> Allow accessing full (legacy) Phoenix EXPLAIN information via Calcite
> ---------------------------------------------------------------------
>
> Key: PHOENIX-3471
> URL: https://issues.apache.org/jira/browse/PHOENIX-3471
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Gabriel Reid
> Assignee: Gabriel Reid
>
> The EXPLAIN syntax in Calcite-Phoenix (either "EXPLAIN <sql>" or "EXPLAIN
> PLAN FOR <sql>") currently returns the Calcite plan for a query. For example:
> {code}
> EXPLAIN SELECT MAX(I) FROM T1
> {code}
> results in the following Calcite explain plan:
> {code}
> PhoenixToEnumerableConverter
>   PhoenixServerAggregate(group=[{}], EXPR$0=[MAX($0)])
>     PhoenixTableScan(table=[[phoenix, T1]])
> {code}
> and the following (legacy) Phoenix explain plan:
> {code}
> CLIENT PARALLEL 1-WAY FULL SCAN OVER T1
>     SERVER FILTER BY FIRST KEY ONLY
> {code}
> There are currently a large number of integration tests which depend on the
> legacy Phoenix format of explain plan, and this format is no longer available
> when running via Calcite. PHOENIX-3105 added support for accessing the
> explain plan via the "EXPLAIN <sql>" syntax, but this update to the syntax
> still only provides the Calcite-specific explain plan.
> There are three main approaches which can be taken here:
> h4. Option 1: Custom EXPLAIN execution
> This approach extends the work done in PHOENIX-3105 to plug in a custom
> SqlPhoenixExplain node which returns the legacy Phoenix explain plan, with the
> "EXPLAIN PLAN FOR <sql>" syntax still returning the Calcite explain plan.
> h4. Option 2: Add the legacy Phoenix explain plan to the Calcite plan as a top-level attribute
> This approach results in an explain plan that looks as follows:
> {code}
> PhoenixToEnumerableConverter(PhoenixExecutionPlan=[CLIENT PARALLEL 1-WAY FULL SCAN OVER T1
>     SERVER FILTER BY FIRST KEY ONLY])
>   PhoenixServerAggregate(group=[{}], EXPR$0=[MAX($0)])
>     PhoenixTableScan(table=[[phoenix, T1]])
> {code}
> The disadvantage of this approach is that it's not really "correct" -- we're
> just tacking a different representation of the explain plan onto the Calcite
> explain plan.
> The advantage of this approach is that it's very quick and easy to implement
> (i.e. it can be done immediately), and it will require minimal changes to the
> many test cases which check results against hard-coded explain plans. All we
> need is a utility to extract the PhoenixExecutionPlan value from the full
> Calcite plan; other than that, all test cases stay the same.
> h4. Option 3: Add all relevant information to the correct parts of the Calcite explain plan
> This approach would result in an explain plan that looks as follows:
> {code}
> PhoenixToEnumerableConverter
>   PhoenixServerAggregate(group=[{}], EXPR$0=[MAX($0)])
>     PhoenixTableScan(table=[[phoenix, T1]], scanType=[CLIENT PARALLEL 1-WAY FULL])
> {code}
> This is undoubtedly the "right" way to do things. However, it has the major
> disadvantage that it will require a large amount of work to do the following:
> * add all relevant information into various implementations of {{AbstractRelNode.explainTerms}}
> * rework all test cases which verify things against an expected explain plan
> It is of course also an option to start with option 2 here, and eventually
> migrate to option 3.
> If we go for option 2 or option 3, we should probably remove the custom
> EXPLAIN parsing.
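The extraction utility mentioned under Option 2 could look roughly like the sketch below. It assumes the attribute is rendered exactly as in the Option 2 example ({{PhoenixExecutionPlan=[...]}}) and that any square brackets inside the legacy plan text are balanced; {{LegacyPlanExtractor}} and {{extract}} are hypothetical names, not existing Phoenix or Calcite API.

```java
import java.util.Optional;

// Hypothetical helper: pulls the legacy Phoenix plan out of a Calcite plan string.
final class LegacyPlanExtractor {
    // Attribute name and opening bracket as shown in the Option 2 example output.
    private static final String PREFIX = "PhoenixExecutionPlan=[";

    private LegacyPlanExtractor() {}

    // Returns the text between "PhoenixExecutionPlan=[" and its matching "]",
    // or an empty Optional if the attribute is absent or never closed.
    static Optional<String> extract(String calcitePlan) {
        int start = calcitePlan.indexOf(PREFIX);
        if (start < 0) {
            return Optional.empty();
        }
        start += PREFIX.length();
        int depth = 1; // we are inside the attribute's opening bracket
        for (int i = start; i < calcitePlan.length(); i++) {
            char c = calcitePlan.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']' && --depth == 0) {
                return Optional.of(calcitePlan.substring(start, i));
            }
        }
        return Optional.empty();
    }
}
```

Bracket counting is used here instead of a regular expression because the legacy plan text may itself contain square brackets, and a non-greedy pattern would then stop at the first closing bracket.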