I created HIVE-12223 to track this issue. Thanks, Jesús
From: Jesus Camachorodriguez Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" Date: Friday, October 16, 2015 at 8:00 AM To: "user@hive.apache.org<mailto:user@hive.apache.org>" Subject: Re: How to use grouping__id in a query Hi Michal, Sorry I didn't catch your message before. The change of behavior might be due to a bug; certainly we should filter or at least produce a proper error. Could you file a JIRA case and assign it to me? I'll check further. Thanks, Jesús From: Michal Krawczyk Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" Date: Friday, October 16, 2015 at 8:15 AM To: "user@hive.apache.org<mailto:user@hive.apache.org>" Subject: Re: How to use grouping__id in a query Hi all, Unfortunately I didn't get any answer on this one, perhaps I asked the question incorrectly. I'll try another one then ;). Should it be possible to use grouping__id function in having clause to filter our null values in the same query. It used to work in Hive 0.11 and 0.13, but doesn't work in Hive 1.0. Thanks, Michal On Fri, Sep 25, 2015 at 1:14 PM, Michal Krawczyk <michal.krawc...@u2i.com<mailto:michal.krawc...@u2i.com>> wrote: Hi all, During the migration from Hive 0.11 to 1.0 on Amazon EMR I run to an issue with grouping__id function. I'd like to use it to filter out NULL values that didn't come from grouping sets. Here's an example: We have a simple table with some data: hive> create table grouping_test (col1 string, col2 string); hive> insert into grouping_test values (1, 2), (1, 3), (1, null), (null, 2); hive> select * from grouping_test; OK 1 2 1 3 1 NULL NULL 2 hive> select col1, col2, GROUPING__ID, count(*) from grouping_test group by col1, col2 grouping sets ((), (col1)) having !(col1 IS NULL AND ((CAST(GROUPING__ID as int) & 1) > 0)) I expect the query above to filter out NULL col1 for the col1 grouping set, it used to work on Hive 0.11. But on Hive 1.0 it doesn't filter any values and still returns NULL col1: NULL NULL 0 4 NULL NULL 1 1 <=== this row is expected to be removed by the having clause 1 NULL 1 3 I tried also a few other conditions on grouping__id in having clause and none of them seem to work correctly: select col1, col2, GROUPING__ID, count(*) from grouping_test group by col1, col2 grouping sets ((), (col1)) having GROUPING__ID = '1' This query doesn't return any data. I also tried to embed it into a subquery, but still no luck. It finally worked when I saved the output of the main query to a temp table and filtered out the data using where clause, but this looks like an overkill. So my question is: How to filter out values using grouping__id in Hive 1.0? Thanks for your help, Michal -- Michal Krawczyk Project Manager / Tech Lead Union Square Internet Development http://www.u2i.com/ -- Michal Krawczyk Project Manager / Tech Lead Union Square Internet Development http://www.u2i.com/