[jira] Commented: (HIVE-287) count distinct on multiple columns does not work

Zheng Shao (JIRA) Wed, 07 Jul 2010 18:44:22 -0700

    [ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886183#action_12886183 ]


Zheng Shao commented on HIVE-287:
---------------------------------

Hi Arvind, sorry for coming late to the party. I have two questions about the
new UDAF2 interface:

1. Why do we put DISTINCT in the information? DISTINCT is currently handled by
the framework rather than by each individual UDAF.
This is good because the logic for removing duplicates is common to all UDAFs.
We already support SUM(DISTINCT val).

2. Why do we special-case "*"? It seems to me that "*" is just a shortcut.
Hive already supports regex-based multi-column specification, so we can
write `abc.*` to select all columns whose names start with abc. The compiler
should just expand * and pass all the columns to the UDAF.

Since COUNT(*) is a special case in the SQL standard (COUNT(*) is different
from COUNT(col) even if the table has a single column col), I think we should
special-case only that and replace it with count(1) at some point.
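
To make the two points above concrete, here is a minimal Python sketch. It is
not Hive code: the names aggregate, count_expr and sql_sum are hypothetical,
and NULL is modeled as None. It assumes the usual SQL behavior that COUNT(expr)
skips NULL values while COUNT(*) counts every row, and it shows duplicate
removal living in one framework function instead of in each aggregate.

```python
from typing import Callable, Iterable, Optional, Sequence

Row = Sequence[Optional[int]]  # NULL is modeled as None (assumption)


def aggregate(values: Iterable, udaf: Callable, distinct: bool = False):
    """Hypothetical framework entry point: dedup once, for every aggregate."""
    if distinct:
        # dict.fromkeys drops duplicates while preserving first-seen order.
        values = list(dict.fromkeys(values))
    return udaf(values)


def sql_sum(values: Iterable[Optional[int]]) -> int:
    """SUM ignores NULLs; it never needs to know about DISTINCT."""
    return sum(v for v in values if v is not None)


def count_expr(rows: Iterable[Row], expr: Callable[[Row], object]) -> int:
    """COUNT(expr): count the rows where expr evaluates to a non-NULL value."""
    return sum(1 for row in rows if expr(row) is not None)


def count_star(rows: Iterable[Row]) -> int:
    """COUNT(*): count every row, whatever its column values are."""
    return sum(1 for _ in rows)


# A single-column table with a duplicate value and one NULL.
rows = [(1,), (1,), (None,), (2,)]
col = [row[0] for row in rows]

star = count_star(rows)                      # COUNT(*)
by_col = count_expr(rows, lambda r: r[0])    # COUNT(col)
by_one = count_expr(rows, lambda r: 1)       # COUNT(1)
plain_sum = aggregate(col, sql_sum)          # SUM(val)
distinct_sum = aggregate(col, sql_sum, distinct=True)  # SUM(DISTINCT val)
```

In this sketch COUNT(1) matches COUNT(*) because the constant 1 is never NULL,
while COUNT(col) differs as soon as col contains a NULL, which is why expanding
"*" into the column list would not be enough for COUNT(*).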

What do you think?

> count distinct on multiple columns does not work
> ------------------------------------------------
>
>                 Key: HIVE-287
>                 URL: https://issues.apache.org/jira/browse/HIVE-287
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Arvind Prabhakar
>         Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, 
> HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
