[ 
https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602018#comment-14602018
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11107:
----------------------------------------------------------

[~xuefuz] This benchmark is intended to ensure that subsequent changes to 
the optimizer, or to any other Hive code, do not cause unexpected plan changes. 
That is, I don't intend to run the entire TPCDS query set, only "explain plan" 
for the TPCDS queries. 

As part of this jira, I will manually verify that the expected Hive 
optimizations kick in for the queries (for the given stats/dataset). If a 
future commit changes a plan within this test suite, the difference needs to 
be analyzed to make sure that it is not a regression.
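
To illustrate the idea of checking a plan against a stored baseline, here is a 
minimal sketch in Python. It is not how TestPerfCliDriver is implemented; the 
function names and the sample plan text are hypothetical and only show what 
"a difference in plan" could look like as a text diff.

```python
import difflib


def plan_diff(expected_plan, actual_plan):
    """Return unified-diff lines between a stored plan and a fresh one.

    Both arguments are plain strings holding explain output; an empty
    result means the plan text did not change.
    """
    return list(difflib.unified_diff(
        expected_plan.splitlines(),
        actual_plan.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    ))


def plan_unchanged(expected_plan, actual_plan):
    """True when the two plan texts are identical line by line."""
    return not plan_diff(expected_plan, actual_plan)


# Hypothetical, heavily shortened plan text, for illustration only.
baseline = "Stage-1\n  Map Join Operator"
after_commit = "Stage-1\n  Merge Join Operator"

for line in plan_diff(baseline, after_commit):
    print(line)
```

Any non-empty diff would then be reviewed by hand to decide whether the new 
plan is an improvement, a neutral change, or a regression.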

In subsequent patches, I am planning to import stats from the 1G scale (and 
possibly higher scales) of the TPCDS data set before running the explain 
queries, instead of adding the .dat files.

The test suite can be run on the master branch with:
mvn test -Dtest=TestPerfCliDriver -Phadoop-2

I believe we don't have dedicated unit tests to cover the scenario mentioned 
here, hence this jira. I will add some of the details in the jira description 
for better clarity.

Thanks
Hari

> Support for Performance regression test suite with TPCDS
> --------------------------------------------------------
>
>                 Key: HIVE-11107
>                 URL: https://issues.apache.org/jira/browse/HIVE-11107
>             Project: Hive
>          Issue Type: Task
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-11107.1.patch
>
>
> Support to add TPCDS queries to the performance regression test suite with 
> Hive CBO turned on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
