[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880000#action_12880000 ]
Arvind Prabhakar commented on HIVE-287:
---------------------------------------
@John: I agree with your assessment above. Regarding count(*), my earlier
comment was not meant to imply that such a UDAF exists today, only that one
might exist in the future. More importantly, using an empty parameter list as
an indicator for * would blur the distinction between UDAF(*) and UDAF()
invocations. This is perhaps one of many ways in which parameter overloading
could lead to confusion and hard-to-understand code.
I think introducing the {{GenericUDAFResolver2}} interface is a great idea. I
also like the idea of using a callback to decouple the invocation from the
parameter list, but I am concerned that this could lead to redundant method
calls and object creation. I am not sure whether that would add a significant
performance penalty in the long run.
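To make the UDAF(*) versus UDAF() distinction concrete, here is a minimal
sketch of what callback-based parameter discovery could look like. Every name
in it (UdafParameterInfo, SimpleUdafParameterInfo, CountResolverSketch) is
hypothetical and is not taken from any patch on this issue. The resolver
returns a descriptive string instead of a real evaluator so that the example
stays self-contained, and rejecting an empty count() is an assumption of the
sketch, not a statement about current behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical callback: the resolver asks it how the UDAF was invoked
// instead of inferring that from the shape of the parameter list.
interface UdafParameterInfo {
    List<String> getParameterTypes(); // type names of the explicit arguments
    boolean isDistinct();             // true for UDAF(DISTINCT ...)
    boolean isAllColumns();           // true only for UDAF(*)
}

// Hypothetical immutable implementation of the callback.
final class SimpleUdafParameterInfo implements UdafParameterInfo {
    private final List<String> parameterTypes;
    private final boolean distinct;
    private final boolean allColumns;

    SimpleUdafParameterInfo(List<String> parameterTypes, boolean distinct, boolean allColumns) {
        // Defensive copy so later changes to the caller's list are not visible here.
        this.parameterTypes = Collections.unmodifiableList(new ArrayList<>(parameterTypes));
        this.distinct = distinct;
        this.allColumns = allColumns;
    }

    @Override public List<String> getParameterTypes() { return parameterTypes; }
    @Override public boolean isDistinct() { return distinct; }
    @Override public boolean isAllColumns() { return allColumns; }
}

// Hypothetical stand-in for a count resolver that uses the callback.
final class CountResolverSketch {
    String resolve(UdafParameterInfo info) {
        List<String> types = info.getParameterTypes();
        if (info.isAllColumns()) {
            // UDAF(*): the flag, not the empty list, identifies this form.
            if (!types.isEmpty()) {
                throw new IllegalArgumentException("* cannot be combined with explicit arguments");
            }
            return "count(*)";
        }
        if (types.isEmpty()) {
            // UDAF(): an empty list without the * flag is a different invocation.
            throw new IllegalArgumentException("count() needs * or at least one argument");
        }
        String prefix = info.isDistinct() ? "count(distinct " : "count(";
        return prefix + String.join(", ", types) + ")";
    }
}
```

With an explicit isAllColumns() flag, an empty parameter list no longer has to
double as the marker for *. The cost is one extra object and a few extra
method calls per resolution; the sketch does not measure whether that matters.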
I would love to hear what others interested in this issue think of this
route. If all agree that adding a new interface with a callback for parameter
discovery is acceptable, I can start working on that patch.
> count distinct on multiple columns does not work
> ------------------------------------------------
>
> Key: HIVE-287
> URL: https://issues.apache.org/jira/browse/HIVE-287
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Arvind Prabhakar
> Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl