[
https://issues.apache.org/jira/browse/HIVE-29654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18087883#comment-18087883
]
Stamatis Zampetakis commented on HIVE-29654:
--------------------------------------------
One point that needs some more clarification is the part about "getting" the
CBO plan:
1. Do we want to return this to the user?
2. Do we want to put this in the HS2 logs?
3. Do we want to put this in the application logs?
4. Do we want to store it the query history (HIVE-28095) ?
The answer to the above will also determine if we need a new command (e.g.,
EXPLAIN EXECUTE) or not.
> Provide an easy way to execute a query and get the CBO plan at the same time
> ----------------------------------------------------------------------------
>
> Key: HIVE-29654
> URL: https://issues.apache.org/jira/browse/HIVE-29654
> Project: Hive
> Issue Type: New Feature
> Reporter: Thomas Rebele
> Priority: Major
>
> It would be nice if Hive provided an easy way to execute a query and get the
> CBO and stage plans of that execution. Currently this is easy to achieve for
> the stage plans, but not for the CBO plan (see background section below).
> Some approaches:
> * output CBO plans if {{hive.log.explain.output}} is enabled
> * add another config option (e.g., {{{}hive.log.cbo.plan{}}}) to output the
> CBO plans when the query is executed
> * add a new command EXPLAIN EXECUTE that prints the query plans (CBO and
> stage plans) and then executes these plans
> ----
> Background:
> I've looked into the results of a TPC-DS benchmark. I wanted to understand
> why the performance of a query has changed. I had the execution time of the
> query, some logs, and some EXPLAIN plans. Unfortunately the EXPLAIN plans
> were not helpful, as the plans were not the same (different operator ids).
> I did an experiment with TPC-DS metadata and KLL histograms enabled. I
> executed EXPLAIN CBO JOINCOST several times in the same session. For 8 runs,
> I got 5 different plans.
> I had a chat with an AI tool, and it suggested some options to get the plan.
> After a while I got it working:
> * stage plans: {{SET hive.server2.logging.operation.level = VERBOSE; SET
> hive.log.explain.output=true;}}
> * CBO plan: enable DEBUG logging for
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}} in
> {{{}hive-log4j2.properties{}}}. Also executing {{SET
> hive.server2.logging.operation.level = VERBOSE;}} helps, as the plans will
> then show up in beeline.
> It would be nice to make this a bit simpler. It might be the case that the
> user cannot easily change the log4j2 properties. Also, setting the DEBUG
> level globally just to get the plan for one query looks like a lot of
> overhead, if other users are running queries at the same time.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)