[ 
https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870860#action_12870860
 ] 

Arvind Prabhakar commented on HIVE-287:
---------------------------------------

Thanks for taking a look at this patch, Namit. I have some questions and 
clarifications regarding your feedback:

bq. 1. This should be independent of COUNT - so basically all aggregation 
functions should be supported with DISTINCT.
For example: select avg(distinct c1, c2) from T

I am not sure how this relates to the change I made. Even before this change, 
the DISTINCT qualifier was allowed for any function invocation. Can you 
elaborate on what you mean? Specifically, which part of the patch needs to 
be changed in order to accommodate this request?

bq. 2. It would be a good idea to maintain some compatibility for the existing 
interface - so, can we add another method to UDAFResolver, which has the new 
API - and a common class which invokes the default implementation, that would 
be better.

Here is how I understand your suggestion: Add a new method to the 
GenericUDAFResolver interface while keeping the old method. Create an abstract 
base class that implements the new interface method and invokes the old method, 
dropping the isDistinct/isAllColumn arguments. Extend the current resolvers to 
override this method. Will this address your concern? If not, can you provide a 
concrete example?
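
The arrangement described above could be sketched roughly as follows. This is 
only an illustration under assumptions: the type names (LegacyResolver, 
ExtendedResolver, AbstractResolver, SumResolver, CountResolver), the method 
signatures, and the String return values are hypothetical stand-ins, not the 
actual Hive interfaces or the contents of the patch.

```java
import java.util.List;

// Hypothetical stand-in for the existing resolver interface (the "old" method).
interface LegacyResolver {
    String getEvaluator(List<String> paramTypes);
}

// Hypothetical extended interface: adds a method that also receives the flags.
interface ExtendedResolver extends LegacyResolver {
    String getEvaluator(List<String> paramTypes, boolean isDistinct, boolean isAllColumns);
}

// Common base class: the default implementation of the new method drops the
// two flags and delegates to the old single-argument method.
abstract class AbstractResolver implements ExtendedResolver {
    @Override
    public String getEvaluator(List<String> paramTypes, boolean isDistinct, boolean isAllColumns) {
        return getEvaluator(paramTypes);
    }
}

// A resolver that does not care about the flags implements only the old method.
class SumResolver extends AbstractResolver {
    @Override
    public String getEvaluator(List<String> paramTypes) {
        return "sum" + paramTypes;
    }
}

// A resolver that needs the flags overrides the new method as well.
class CountResolver extends AbstractResolver {
    @Override
    public String getEvaluator(List<String> paramTypes) {
        return "count" + paramTypes;
    }

    @Override
    public String getEvaluator(List<String> paramTypes, boolean isDistinct, boolean isAllColumns) {
        if (isAllColumns) {
            return "count(*)";
        }
        return (isDistinct ? "count-distinct" : "count") + paramTypes;
    }
}
```

With this shape, callers can always use the three-argument method, and 
resolvers written against the old method keep working unchanged as long as 
they extend the base class.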

bq. 3. Follows from 1 - more tests are needed

Are you suggesting more tests for the array_contains UDF, or more tests for 
other UDFs? Please clarify with examples if possible.

> count distinct on multiple columns does not work
> ------------------------------------------------
>
>                 Key: HIVE-287
>                 URL: https://issues.apache.org/jira/browse/HIVE-287
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Arvind Prabhakar
>             Fix For: 0.6.0
>
>         Attachments: HIVE-287-1.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
