[ 
https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913865#action_12913865
 ] 

John Sichi commented on HIVE-474:
---------------------------------

Some comments after a brief look:

* The patch will need to be rebased against trunk (after HIVE-537 is 
committed, I guess?).

* We should make sure that in the case of a single distinct agg, we leave the 
plan as it is today, and only use the new plan generation when multiple 
distincts are present.  This may already be the case; I couldn't quite tell 
from the example plans in the test cases (it would be nice to have some simpler 
queries for that).

* Regarding moving expression evaluation to the reduce side:  in general, this 
is something that needs cost-based optimization, due to factors like (a) data 
size before and after expression evaluation and (b) parallelization benefit of 
spreading out the computation over lots of mappers (assuming many more mappers 
than reducers).
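
To make the trade-off in the last point concrete, here is a toy cost sketch in 
Python. It is only an illustration under stated assumptions, not Hive's 
optimizer: the function names (shuffle_bytes, prefer_map_side, eval_wall_time) 
and the sample data are hypothetical. Factor (a) is modeled as the number of 
bytes sent from mappers to reducers, and factor (b) as evaluation time divided 
evenly across workers.

```python
# Toy cost model; every name here is hypothetical and not part of Hive.

def shuffle_bytes(rows, expr=None):
    """Bytes sent from mappers to reducers.

    With expr, the expression is evaluated map-side and only its result is
    shuffled. Without expr, the raw value is shuffled and the expression
    would run on the reduce side.
    """
    values = (expr(r) for r in rows) if expr else rows
    return sum(len(str(v).encode("utf-8")) for v in values)


def prefer_map_side(rows, expr):
    """Factor (a): choose map-side evaluation if it does not grow the shuffle."""
    return shuffle_bytes(rows, expr) <= shuffle_bytes(rows)


def eval_wall_time(n_rows, cost_per_row, workers):
    """Factor (b): idealized evaluation time, assuming perfect parallelism."""
    return n_rows * cost_per_row / workers


# Hypothetical input: long URLs with a short id at the end.
urls = ["http://example.com/a/very/long/path?id=%d" % i for i in range(1000)]


def extract_id(url):
    """Expression whose output is smaller than its input."""
    return url.rsplit("=", 1)[1]


def triple(url):
    """Expression whose output is larger than its input."""
    return url * 3
```

Under this model, extract_id is better evaluated on the map side because it 
shrinks the shuffled data, while triple is better deferred to the reducers. A 
real decision would also have to weigh eval_wall_time for the mapper count 
against the reducer count, which is why this needs cost-based optimization 
rather than a fixed rule.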


> Support for distinct selection on two or more columns
> -----------------------------------------------------
>
>                 Key: HIVE-474
>                 URL: https://issues.apache.org/jira/browse/HIVE-474
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Alexis Rondeau
>            Assignee: Mafish
>         Attachments: hive-474.0.4.2rc.patch
>
>
> The ability to select distinct values on several individual columns, for 
> example: 
> select count(distinct user), count(distinct session) from actions;   
> This currently fails with: 
> FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns 
> not Supported user
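
For reference, the result the query in the description is expected to produce 
can be sketched in Python as below. This only illustrates the intended 
semantics (one independent distinct count per column, ignoring NULLs as 
standard SQL does); it says nothing about how the patch builds the plan. The 
function name count_distincts and the sample rows are hypothetical.

```python
def count_distincts(rows, columns):
    """Return {column: number of distinct non-NULL values} in a single pass."""
    seen = {c: set() for c in columns}
    for row in rows:
        for c in columns:
            value = row[c]
            if value is not None:  # COUNT(DISTINCT x) skips NULLs
                seen[c].add(value)
    return {c: len(values) for c, values in seen.items()}


# Hypothetical contents of the "actions" table.
actions = [
    {"user": "alice", "session": "s1"},
    {"user": "alice", "session": "s2"},
    {"user": "bob", "session": "s3"},
    {"user": None, "session": "s3"},
]

result = count_distincts(actions, ["user", "session"])
```

Each column keeps its own set of seen values, so the two counts are 
independent of each other, unlike a single distinct over the (user, session) 
pair.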

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
