[
https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870860#action_12870860
]
Arvind Prabhakar commented on HIVE-287:
---------------------------------------
Thanks for taking a look at this patch, Namit. I have some questions and
clarifications regarding your feedback:
bq. 1. This should be independent of COUNT - so basically all aggregation
functions should be supported with DISTINCT.
For example: select avg(distinct c1,c2) from T
I am not sure how this relates to the change I made. Even before this change,
the DISTINCT qualifier was allowed for any function invocation. Can you
elaborate on what you mean by this? Specifically, which part of the patch needs
to be changed in order to accommodate this request?
bq. 2. It would be a good idea to maintain some compatibility for the existing
interface - so, can we add another method to UDAFResolver, which has the new
API - and a common class which invokes the default implementation, that would
be better.
Here is how I understand your suggestion: add a new method to the
GenericUDAFResolver interface while keeping the old method. Create an abstract
base class that implements the new interface method and invokes the old method
by dropping the isDistinct/isAllColumn arguments. Extend the current resolvers
to override this method. Will this address your concern? If not, can you
provide a concrete example?
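The arrangement described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: Evaluator, Resolver,
AbstractResolver, LegacySumResolver and CountResolver are hypothetical
stand-ins, not Hive's actual classes or the signatures in the patch, and
argument types are modeled as plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an aggregation evaluator (not Hive's real type).
interface Evaluator {
    String describe();
}

// Hypothetical resolver interface that keeps the old method and adds a new one.
interface Resolver {
    // Old method: sees only the argument types.
    Evaluator getEvaluator(List<String> argTypes);

    // New method: also receives the DISTINCT and "*" flags.
    Evaluator getEvaluator(List<String> argTypes, boolean isDistinct, boolean isAllColumns);
}

// Base class: implements the new method by dropping the flags and
// delegating to the old method, so existing resolvers keep working.
abstract class AbstractResolver implements Resolver {
    @Override
    public Evaluator getEvaluator(List<String> argTypes, boolean isDistinct, boolean isAllColumns) {
        return getEvaluator(argTypes);
    }
}

// An existing resolver that only knows the old method; the flags are ignored.
class LegacySumResolver extends AbstractResolver {
    @Override
    public Evaluator getEvaluator(List<String> argTypes) {
        String text = "sum(" + String.join(", ", argTypes) + ")";
        return () -> text;
    }
}

// A resolver that overrides the new method because it needs the flags.
class CountResolver extends AbstractResolver {
    @Override
    public Evaluator getEvaluator(List<String> argTypes) {
        return getEvaluator(argTypes, false, false);
    }

    @Override
    public Evaluator getEvaluator(List<String> argTypes, boolean isDistinct, boolean isAllColumns) {
        if (isAllColumns) {
            return () -> "count(*)";
        }
        if (argTypes.isEmpty()) {
            throw new IllegalArgumentException("count needs at least one argument or *");
        }
        String text = "count(" + (isDistinct ? "DISTINCT " : "") + String.join(", ", argTypes) + ")";
        return () -> text;
    }
}

class ResolverDemo {
    public static void main(String[] args) {
        Resolver count = new CountResolver();
        Resolver sum = new LegacySumResolver();
        System.out.println(count.getEvaluator(Arrays.asList("string", "int"), true, false).describe());
        System.out.println(count.getEvaluator(Collections.<String>emptyList(), false, true).describe());
        System.out.println(sum.getEvaluator(Arrays.asList("int"), true, false).describe());
    }
}
```

Note that in this sketch a resolver that does not override the new method
silently ignores DISTINCT, which is the consequence of dropping the arguments
in the base class.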
bq. 3. Follows from 1 - more tests are needed
Are you suggesting more tests for the array_contains UDF, or additional tests
for other UDFs? Please clarify with examples if possible.
> count distinct on multiple columns does not work
> ------------------------------------------------
>
> Key: HIVE-287
> URL: https://issues.apache.org/jira/browse/HIVE-287
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Arvind Prabhakar
> Fix For: 0.6.0
>
> Attachments: HIVE-287-1.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl