[jira] [Commented] (HIVE-3086) Skewed Join Optimization

alex gemini (JIRA) Tue, 26 Jun 2012 02:05:49 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401262#comment-13401262
 ]


alex gemini commented on HIVE-3086:
-----------------------------------

The design is very complicated, IMO. What if we have a big table, logs, and a 
small table, users, where users has a column 'age'? If we issue a query that is 
skewed by age, we can't pre-partition the big table. This design doesn't handle 
that case, right? I guess what we want is custom partitioning at runtime. For 
the above example, we need a custom partitioner (or some hint) to tell the 
query plan that we want to partition the users table on the 'userid, age' 
columns and the logs table on the 'userid' column. The partition number for 
the same userid needs to be the same in both tables so that they can be joined 
afterwards.
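
To make the constraint concrete, here is a rough sketch in plain Python (not 
Hive code, and not what the design proposes). It only illustrates the 
co-partitioning requirement: both tables derive the partition number from 
userid alone, so matching rows always land in the same partition. The function 
names and the dict-based row layout are made up for illustration, and the 
sketch does not show how 'age' would be used to spread skewed keys.

```python
import zlib


def partition_for(userid, num_partitions):
    """Map a userid to a partition number with a stable hash."""
    # crc32 is deterministic across processes, unlike the built-in hash() on str.
    return zlib.crc32(str(userid).encode("utf-8")) % num_partitions


def partition_rows(rows, num_partitions):
    """Group rows (dicts that contain 'userid') by partition number."""
    parts = {p: [] for p in range(num_partitions)}
    for row in rows:
        parts[partition_for(row["userid"], num_partitions)].append(row)
    return parts


def partitioned_join(logs, users, num_partitions):
    """Join logs to users one partition at a time.

    Because both sides use the same partition function on userid,
    a log row only needs to be compared with users in its own partition.
    """
    log_parts = partition_rows(logs, num_partitions)
    user_parts = partition_rows(users, num_partitions)
    joined = []
    for p in range(num_partitions):
        users_by_id = {u["userid"]: u for u in user_parts[p]}
        for log in log_parts[p]:
            user = users_by_id.get(log["userid"])
            if user is not None:
                joined.append({**log, "age": user["age"]})
    return joined
```

If the two tables used different partition functions, or different partition 
counts, rows for the same userid could end up in different partitions and the 
per-partition join would silently drop them.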
                
> Skewed Join Optimization
> ------------------------
>
>                 Key: HIVE-3086
>                 URL: https://issues.apache.org/jira/browse/HIVE-3086
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Nadeem Moidu
>            Assignee: Nadeem Moidu
>
> During a join operation, if one of the columns has a skewed key, it can cause 
> that particular reducer to become the bottleneck. The following feature will 
> address it:
> https://cwiki.apache.org/confluence/display/Hive/Skewed+Join+Optimization

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
