[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

Yin Huai (Updated) (JIRA) Sun, 29 Jan 2012 09:57:35 -0800

     [ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Yin Huai updated HIVE-2206:
---------------------------

    Attachment: HIVE-2206.8-r1237253.patch.txt

@Kevin,
I wrongly assumed that all output names of the ReduceSinkOperator have the 
structure "KEY/VALUE.internalName". I have fixed this issue.

However, the current optimizer cannot handle the case where a table is directly 
connected to a post-computation operator (in this case, table b connects 
directly to the join operator). I am planning to solve this issue after this 
patch. To work around it, you can use:
SET hive.optimize.reducededuplication=false;
SET hive.optimize.correlation=true;
SELECT * FROM (SELECT * FROM src DISTRIBUTE BY key SORT BY key) a
JOIN (SELECT * FROM src DISTRIBUTE BY key SORT BY key) b
ON a.key = b.key;
This query will be optimized and executed in a single MapReduce job.
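
One way to check the single-job behavior (a sketch only; the exact plan output 
depends on the Hive build and on this patch being applied) is to prefix the 
same query with EXPLAIN and count the Map Reduce stages in the printed plan:
-- Same settings as the workaround; hive.optimize.correlation comes from this patch.
SET hive.optimize.reducededuplication=false;
SET hive.optimize.correlation=true;
-- EXPLAIN prints the stage plan without running the query.
EXPLAIN
SELECT * FROM (SELECT * FROM src DISTRIBUTE BY key SORT BY key) a
JOIN (SELECT * FROM src DISTRIBUTE BY key SORT BY key) b
ON a.key = b.key;
If the correlation optimizer applies, the plan should show one Map Reduce 
stage. With hive.optimize.correlation=false, the plan would be expected to 
show more than one.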

Also, I have updated the patch and it is compatible with revision 1237253.
                
> add a new optimizer for query correlation discovery and optimization
> --------------------------------------------------------------------
>
>                 Key: HIVE-2206
>                 URL: https://issues.apache.org/jira/browse/HIVE-2206
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: Yin Huai
>         Attachments: HIVE-2206.1.patch.txt, HIVE-2206.2.patch.txt, 
> HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, 
> HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, 
> HIVE-2206.8-r1237253.patch.txt, HIVE-2206.8.r1224646.patch.txt, 
> YSmartPatchForHive.patch, testQueries.2.q
>
>
> reference:
> http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
