[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized

Yin Huai (JIRA) Wed, 05 Sep 2012 08:15:08 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448814#comment-13448814
 ]


Yin Huai commented on HIVE-3430:
--------------------------------

Yes, YSmart (https://issues.apache.org/jira/browse/HIVE-2206) can optimize this 
pattern. 

For the query shown below, two jobs will be generated. The first one takes care 
of the join operation on "key", and the second one takes care of the group by 
and join operations on "value". 
{code:SQL}
select * from
(
  select c.value, count(1) as cnt from
  (
    select b.key, b.value from
    (
      select key, length(value) from T1 where ds = '1'
    ) a
    join
    T2 b on b.ds = '1' and a.key = b.key
  ) c
  group by c.value
) d
join
(
  select value, count(1) as cnt from T2 c where c.ds = '1' group by value
) e
on d.value = e.value;
{code}
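
As a rough illustration of that split, the same result can be written by hand as two statements, one per job. This is only a sketch of what each job computes, not the plan YSmart emits; the table name tmp_c is hypothetical, and the unused length(value) column is left out.
{code:SQL}
-- Job 1 (sketch): the join on "key". tmp_c is a hypothetical intermediate table.
create table tmp_c as
select b.key, b.value
from (select key from T1 where ds = '1') a
join T2 b on b.ds = '1' and a.key = b.key;

-- Job 2 (sketch): both aggregations and the final join use "value" as the key,
-- which is why they can share a single shuffle.
select d.value, d.cnt, e.value, e.cnt
from (select value, count(1) as cnt from tmp_c group by value) d
join (select value, count(1) as cnt from T2 where ds = '1' group by value) e
on d.value = e.value;
{code}
Prefixing the original query with EXPLAIN lists the stages Hive actually plans, so the number of jobs can be compared with and without the optimization.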
                
> group by followed by join with the same key should be optimized
> ---------------------------------------------------------------
>
>                 Key: HIVE-3430
>                 URL: https://issues.apache.org/jira/browse/HIVE-3430
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>


