[ https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997797#comment-14997797 ]

Ashutosh Chauhan edited comment on HIVE-12017 at 11/10/15 1:34 AM:
-------------------------------------------------------------------

I went through the golden file plan changes and found the following categories
of plan diffs:
* 1) Extra select operator: many plans now contain an extra select operator,
e.g., auto_sortmerge_join_*.q
* 2) Agg expr lost: in some tests it seems we dropped the count(*) aggregation
altogether, e.g., auto_smb_mapjoin_14.q, auto_sortmerge_join_10.q
* 3) Shuffle join warning: some tests now generate a shuffle join warning,
e.g., multiMapJoin2.q, orc_llap.q, parquet_join.q, pcr.q, pointlookup2.q
* 4) Extra columns: seems like a column pruning issue, e.g.,
auto_join1.q, auto_join10.q, auto_join11.q
* 5) PTF op missing: it seems the PTF operator got dropped altogether, e.g.,
ptfgroupbyjoin.q
* 6) Non-skew-join plan: it seems the skew join optimization is broken and is
no longer applied, e.g., skewjoin_mapjoin*.q

Among these, 1) & 4) are not a big concern. However, 2) & 5) could be
correctness issues, and 3) & 6) could cause substantial perf losses.



> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-12017
>                 URL: https://issues.apache.org/jira/browse/HIVE-12017
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>    Affects Versions: 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, 
> HIVE-12017.03.patch, HIVE-12017.04.patch, HIVE-12017.05.patch, 
> HIVE-12017.06.patch, HIVE-12017.07.patch, HIVE-12017.08.patch
>
>
> Instead, we could disable the parts of CBO that are not relevant when the 
> query contains 1 or 0 joins. The implementation should make it easy to define 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do so in the future).
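
One way to picture the extensibility asked for in the description is a small
registry that maps query patterns to the optimizer phases they switch off.
The sketch below is only an illustration under stated assumptions: the class
CboPhaseSelector, the Phase enum, and QueryProfile are hypothetical names, not
Hive classes, and treating join reordering as the phase that is irrelevant for
queries with 1 or 0 joins is an assumption, not something the patch confirms.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class CboPhaseSelector {
    /** Hypothetical optimizer phases; these are not Hive's actual rule names. */
    public enum Phase { JOIN_REORDERING, PREDICATE_PUSHDOWN, COLUMN_PRUNING }

    /** Hypothetical summary of a query; only the join count is modeled here. */
    public static final class QueryProfile {
        public final int joinCount;
        public QueryProfile(int joinCount) { this.joinCount = joinCount; }
    }

    /** A query pattern paired with the phases to skip when the pattern matches. */
    private static final class Rule {
        final Predicate<QueryProfile> pattern;
        final Set<Phase> disabled;
        Rule(Predicate<QueryProfile> pattern, Set<Phase> disabled) {
            this.pattern = pattern;
            this.disabled = disabled;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Registers a new pattern; further patterns can be added the same way. */
    public CboPhaseSelector register(Predicate<QueryProfile> pattern, Set<Phase> disabled) {
        rules.add(new Rule(pattern, disabled));
        return this;
    }

    /** Starts with every phase enabled and removes those disabled by matching rules. */
    public Set<Phase> enabledPhases(QueryProfile query) {
        Set<Phase> enabled = EnumSet.allOf(Phase.class);
        for (Rule rule : rules) {
            if (rule.pattern.test(query)) {
                enabled.removeAll(rule.disabled);
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Assumption: with 1 or 0 joins there is nothing to reorder.
        CboPhaseSelector selector = new CboPhaseSelector()
            .register(q -> q.joinCount <= 1, EnumSet.of(Phase.JOIN_REORDERING));

        System.out.println(selector.enabledPhases(new QueryProfile(0)));
        System.out.println(selector.enabledPhases(new QueryProfile(3)));
    }
}
```

With this shape CBO as a whole stays on for every query, and supporting a new
query pattern is one more register(...) call rather than a change to the
optimizer's control flow.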



