[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvind Prabhakar updated HIVE-287:
----------------------------------

    Status: Patch Available  (was: Open)
    Fix Version/s: 0.6.0

*Summary*

This patch fixes the {{count()}} aggregate function to be consistent with SQL. Specifically:
* Adds support for {{SELECT count(*) FROM table}} queries, which return the total number of rows in the table.
* Extends {{count()}} to accept a list of multiple expressions. {{count(DISTINCT expr1, expr2, ...)}} returns the number of rows whose evaluated expressions are non-NULL and distinct.

*Details*
* Modified the HiveQL grammar to allow function invocation with a single {{*}} in place of the parameter list.
* Propagated the presence of {{*}} as a parameter, or of the {{DISTINCT}} keyword, through the UDF resolver framework so that UDFs that behave differently in these cases can detect them.
* Modified the {{count()}} UDAF to handle NULL values with the same semantics as regular SQL.
* Added a test case that specifically exercises the newly introduced semantics of the {{count()}} UDAF.

*Testing*

Ran all tests. Only two failed (input20.q, input33.q), and both also fail on the local trunk image.

If and when this patch is committed to trunk, I will update the Hive Wiki with details and examples of the various forms of the {{count()}} UDAF.

> count distinct on multiple columns does not work
> ------------------------------------------------
>
>                 Key: HIVE-287
>                 URL: https://issues.apache.org/jira/browse/HIVE-287
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Arvind Prabhakar
>             Fix For: 0.6.0
>
>         Attachments: HIVE-287-1.patch
>
>
> The following query does not work:
> select count(distinct col1, col2) from Tbl
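
For readers who want to check the intended semantics without a Hive installation, the following is a minimal Python sketch of the two forms of {{count()}} described in the summary. It is a model only, not the patch's implementation. The helper names ({{count_star}}, {{count_distinct}}) and the sample rows are hypothetical, and it assumes that "non-NULL" means a row is skipped when any of its listed expressions evaluates to NULL.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical sample data: each tuple is one row of (col1, col2).
# None stands in for SQL NULL.
Row = Tuple[Optional[object], ...]
ROWS: Sequence[Row] = [
    (1, "a"),
    (1, "a"),     # duplicate of the first row
    (1, "b"),
    (None, "a"),  # col1 is NULL
    (2, None),    # col2 is NULL
]


def count_star(rows: Sequence[Row]) -> int:
    """Model of SELECT count(*): every row counts, NULLs included."""
    return len(rows)


def count_distinct(rows: Sequence[Row]) -> int:
    """Model of SELECT count(DISTINCT col1, col2).

    Assumption: a row is ignored if any listed expression is NULL;
    the remaining rows are counted once per distinct combination.
    """
    non_null = (row for row in rows if all(v is not None for v in row))
    return len(set(non_null))


if __name__ == "__main__":
    print("count(*)                   =", count_star(ROWS))
    print("count(DISTINCT col1, col2) =", count_distinct(ROWS))
```

Under that assumption, the sample rows give 5 for {{count(*)}} and 2 for {{count(DISTINCT col1, col2)}}: the two rows containing a NULL are dropped, and the duplicate {{(1, "a")}} is counted once.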