[jira] Commented: (HIVE-287) count distinct on multiple columns does not work

Zheng Shao (JIRA) Fri, 09 Jul 2010 14:14:49 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886882#action_12886882
 ]


Zheng Shao commented on HIVE-287:
---------------------------------

Talked with John offline also.

I agree that we can use the new interface going forward. Can you also do the
following in this patch:
1. Clarify the comments for the 2 new methods (quoted below). It's easy for
UDAF writers to assume that the UDAF itself needs to handle whether the call
is distinct or whether it covers all columns.
2. Deprecate the old interface, and move all existing GenericUDAFs to inherit
from the new one.

{code}
+  /**
+   * @return true if the UDAF invocation was qualified with <tt>DISTINCT</tt>
+   * keyword, false otherwise.
+   */
+  boolean isDistinct();
+
+  /**
+   * @return true if the UDAF invocation was done with a wildcard instead of
+   * explicit parameter list.
+   */
+  boolean isAllColumns();
{code}
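
To illustrate point 1, here is a minimal sketch of the division of labor as I
read it: the framework drops duplicate parameter tuples for a DISTINCT call,
and the UDAF only counts the rows it is handed. Only isDistinct() and
isAllColumns() come from the patch; ParameterInfoSketch, CountEvaluator,
aggregate() and info() are hypothetical stand-ins, not Hive classes, and NULL
handling is ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the new interface; only these two method
// signatures are taken from the patch excerpt above.
interface ParameterInfoSketch {
  boolean isDistinct();
  boolean isAllColumns();
}

class DistinctHandlingSketch {

  // UDAF side: counts every row it is handed and knows nothing about DISTINCT.
  static final class CountEvaluator {
    private long count;

    void iterate(List<Object> row) {
      count++;
    }

    long terminate() {
      return count;
    }
  }

  // Framework side (hypothetical): when the call was qualified with DISTINCT,
  // duplicate parameter tuples are dropped before they reach the evaluator.
  static long aggregate(ParameterInfoSketch info, List<List<Object>> rows) {
    CountEvaluator evaluator = new CountEvaluator();
    Set<List<Object>> seen = new HashSet<>();
    for (List<Object> row : rows) {
      if (info.isDistinct() && !seen.add(row)) {
        continue; // duplicate (col1, col2) tuple, skip it
      }
      evaluator.iterate(row);
    }
    return evaluator.terminate();
  }

  // Small helper to build a ParameterInfoSketch for the demo.
  static ParameterInfoSketch info(final boolean distinct, final boolean allColumns) {
    return new ParameterInfoSketch() {
      public boolean isDistinct() {
        return distinct;
      }

      public boolean isAllColumns() {
        return allColumns;
      }
    };
  }

  public static void main(String[] args) {
    List<List<Object>> rows = Arrays.asList(
        Arrays.<Object>asList("a", 1),
        Arrays.<Object>asList("a", 1),
        Arrays.<Object>asList("a", 2));
    System.out.println(aggregate(info(true, false), rows));  // count(distinct col1, col2)
    System.out.println(aggregate(info(false, false), rows)); // count over every row
  }
}
```

With the three sample rows, the DISTINCT call prints 2 and the plain call
prints 3, while CountEvaluator is identical in both cases.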

Once this patch is in, there are a few follow-ups. Can you open JIRA issues
for these:

1. Let UDAFs and UDFs support * and regex-based column specification.
2. Special-case COUNT(*), because it does not require reading any columns,
while MY_UDAF(*) needs all columns.
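
The difference in follow-up 2 could be sketched roughly as below. This is an
illustration only: columnsToRead() and its parameters are hypothetical names,
not part of Hive's planner, and the function-name check stands in for however
the real code would recognize COUNT.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class StarColumnsSketch {

  // Hypothetical helper: which input columns must be read to evaluate one
  // aggregate call.
  static List<String> columnsToRead(String functionName, boolean isAllColumns,
      List<String> explicitParams, List<String> tableColumns) {
    if (!isAllColumns) {
      return explicitParams; // e.g. count(col1) reads only col1
    }
    if ("count".equalsIgnoreCase(functionName)) {
      return Collections.emptyList(); // COUNT(*) only needs the row count
    }
    return tableColumns; // MY_UDAF(*) is handed every column
  }

  public static void main(String[] args) {
    List<String> table = Arrays.asList("col1", "col2");
    List<String> none = Collections.emptyList();
    System.out.println(columnsToRead("COUNT", true, none, table));
    System.out.println(columnsToRead("MY_UDAF", true, none, table));
  }
}
```

For a table with col1 and col2, the first call yields an empty list and the
second yields both columns.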


> count distinct on multiple columns does not work
> ------------------------------------------------
>
>                 Key: HIVE-287
>                 URL: https://issues.apache.org/jira/browse/HIVE-287
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Arvind Prabhakar
>         Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, 
> HIVE-287-4.patch, HIVE-287-5-branch-0.6.patch, HIVE-287-5-trunk.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
