[jira] [Created] (HIVE-17626) Query reoptimization using cached runtime statistics

Prasanth Jayachandran (JIRA) Wed, 27 Sep 2017 16:26:33 -0700

Prasanth Jayachandran created HIVE-17626:
--------------------------------------------


             Summary: Query reoptimization using cached runtime statistics
                 Key: HIVE-17626
                 URL: https://issues.apache.org/jira/browse/HIVE-17626
             Project: Hive
          Issue Type: New Feature
          Components: Logical Optimizer
    Affects Versions: 3.0.0
            Reporter: Prasanth Jayachandran


This feature would work similarly to "EXPLAIN ANALYZE", where the explain plan
is annotated with both actual and estimated statistics. The runtime stats can be
cached at the query level, so that subsequent executions of the same query can
use the cached statistics from the previous run for better optimization.
Some use cases:
1) Re-planning join queries (mapjoin failures can be converted to shuffle joins)
2) Better statistics for the table scan operator if dynamic partition pruning is
involved
3) Better estimates for bloom filter initialization (setting expected entries
during merge)
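
As a rough illustration of use case 1, the following is a minimal sketch, not
Hive code. It assumes a hypothetical in-memory cache keyed by query text and
operator id, and a hypothetical row-count limit for choosing a mapjoin. All
names (RuntimeStatsCache, choose_join, mapjoin_row_limit) are invented for
this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RuntimeStatsCache:
    """Hypothetical query-level cache: query text -> {operator id -> actual rows}."""
    _stats: Dict[str, Dict[str, int]] = field(default_factory=dict)

    def record(self, query: str, operator_id: str, actual_rows: int) -> None:
        # Store the row count observed at runtime for one operator of one query.
        self._stats.setdefault(query, {})[operator_id] = actual_rows

    def lookup(self, query: str, operator_id: str) -> Optional[int]:
        # Return the cached actual row count, or None if this query has not run yet.
        return self._stats.get(query, {}).get(operator_id)


def choose_join(query: str, operator_id: str, estimated_rows: int,
                cache: RuntimeStatsCache, mapjoin_row_limit: int) -> str:
    """Prefer the cached actual row count over the optimizer's estimate."""
    actual = cache.lookup(query, operator_id)
    rows = actual if actual is not None else estimated_rows
    return "mapjoin" if rows <= mapjoin_row_limit else "shuffle_join"


cache = RuntimeStatsCache()
query = "SELECT * FROM sales s JOIN dates d ON s.date_id = d.id"

# First run: only the (too low) estimate is available, so a mapjoin is planned.
first_plan = choose_join(query, "JOIN_1", 1_000, cache, 10_000)

# The run reveals the small side was much larger than estimated.
cache.record(query, "JOIN_1", 5_000_000)

# Second run of the same query: the cached actual count leads to a shuffle join.
second_plan = choose_join(query, "JOIN_1", 1_000, cache, 10_000)
print(first_plan, second_plan)
```

In this sketch the estimate is only a fallback: once a run has recorded an
actual row count, that value drives the join choice for later runs of the
same query.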

This can be extended to support a wider range of queries by caching fragments
of operator plans that scan the same table(s) or match certain operator
sequences.
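
One possible way to key such fragments, shown here only as a sketch under
assumptions, is to derive a signature from the scanned tables and the
operator sequence. The function name fragment_key and the operator labels
(TS, FIL, SEL, GBY) are illustrative and are not taken from Hive.

```python
import hashlib
from typing import Iterable


def fragment_key(tables: Iterable[str], operators: Iterable[str]) -> str:
    """Hypothetical cache key for a plan fragment.

    Table names are lower-cased and sorted so that their order does not
    matter; the operator sequence is kept in order because it is part of
    what identifies the fragment.
    """
    canonical_tables = ",".join(sorted(t.lower() for t in tables))
    canonical_ops = ">".join(operators)
    payload = f"{canonical_tables}|{canonical_ops}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two different queries that share a scan-filter-select fragment over the
# same tables map to the same key, so they could share cached statistics.
key_a = fragment_key(["sales", "dates"], ["TS", "FIL", "SEL"])
key_b = fragment_key(["Dates", "Sales"], ["TS", "FIL", "SEL"])

# A different operator sequence over the same tables gets a different key.
key_c = fragment_key(["sales", "dates"], ["TS", "GBY", "SEL"])
print(key_a == key_b, key_a == key_c)
```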



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
