Rafit Izhak Ratzin created ABDERA-426: -----------------------------------------
Summary: Group by a divided value (e.g., time/10) returns NULL. Key: ABDERA-426 URL: https://issues.apache.org/jira/browse/ABDERA-426 Project: Abdera Issue Type: Bug Reporter: Rafit Izhak Ratzin Running with an environment that includes: phoenix version 4.5.0, HBase version 0.98, Cloudera 5.2 When I group by a divided value (e.g., time/10) the divided value will return as NULL. Please see the simple examples below, Thanks in advance, --Rafit create table test(time integer not null, hostname varchar not null,usage float constraint pk PRIMARY KEY(time, hostname)); upsert into test values(1439853441,'qa8',3.28); upsert into test values(1439853449,'qa8',3.28); upsert into test values(1439853459,'qa8',3.28); upsert into test values(1439853458,'qa8',3.27); upsert into test values(1439853457,'qa8',6.27); upsert into test values(1439853462,'qa8',8.27); upsert into test values(1439853462,'qa9',8.27); upsert into test values(1439853457,'qa9',6.27); 0: jdbc:phoenix:localhost> select * from test; +------------------------------------------+------------------------------------------+----------------------------------------+ | TIME | HOSTNAME | USAGE | +------------------------------------------+------------------------------------------+----------------------------------------+ | 1439853441 | qa8 | 3.28 | | 1439853449 | qa8 | 3.28 | | 1439853457 | qa8 | 6.27 | | 1439853457 | qa9 | 6.27 | | 1439853458 | qa8 | 3.27 | | 1439853459 | qa8 | 3.28 | | 1439853462 | qa8 | 8.27 | | 1439853462 | qa9 | 8.27 | +------------------------------------------+------------------------------------------+----------------------------------------+ 0: jdbc:phoenix:localhost> select time/10, hostname, usage from test; +----------------------------------------+------------------------------------------+------------------------------------------+ | (TIME / 10) | HOSTNAME | USAGE | +----------------------------------------+------------------------------------------+------------------------------------------+ | 143985344 | qa8 | 3.28 | | 143985344 | qa8 | 3.28 | | 143985345 | qa8 | 6.27 | | 143985345 | qa9 | 6.27 | | 143985345 | qa8 | 3.27 | | 143985345 | qa8 | 3.28 | | 143985346 | qa8 | 8.27 | | 143985346 | qa9 | 8.27 | +----------------------------------------+------------------------------------------+------------------------------------------+ 0: jdbc:phoenix:localhost> select time/10 as tm, hostname, avg(usage) from test group by hostname, tm; +----------------------------------------+------------------------------------------+------------------------------------------+ | TM | HOSTNAME | AVG(USAGE) | +----------------------------------------+------------------------------------------+------------------------------------------+ | null | | 3.2799 | | null | | 4.2733 | | null | | 6.2699 | | null | | 8.27 | | null | | 8.27 | +----------------------------------------+------------------------------------------+------------------------------------------+ hostname is empty, time/10 is null I thought it might be related to the fact that the time is a primary key so I ran the next test case: 0: jdbc:phoenix:localhost> create table test1(time integer not null, hostname varchar not null,usage float,period integer constraint pk PRIMARY KEY(time, hostname)); 0: jdbc:phoenix:localhost> upsert into test1 values(1439853462,'qa9',8.27,1439853462); 0: jdbc:phoenix:localhost> upsert into test1 values(1439853461,'qa9',8.27,1439853362); 0: jdbc:phoenix:localhost> upsert into test1 values(1439853461,'qa9',5.27,1439853461); 0: jdbc:phoenix:localhost> upsert into test1 values(1439853451,'qa9',4.27,1439853451); 0: jdbc:phoenix:localhost> select * from test1; +-------------------------------------+-------------------------------------+--------------------------------------+---------------------------------------+ | TIME | HOSTNAME | USAGE | PERIOD | +-------------------------------------+-------------------------------------+--------------------------------------+---------------------------------------+ | 1439853451 | qa9 | 4.27 | 1439853451 | | 1439853461 | qa9 | 5.27 | 1439853461 | | 1439853462 | qa9 | 8.27 | 1439853462 | +-------------------------------------+------------------------------------+--------------------------------------+----------------------------------------+ 0: jdbc:phoenix:localhost> select period/10 as tm, hostname, usage from test1 ; +----------------------------------------+------------------------------------------+------------------------------------------+ | TM | HOSTNAME | USAGE | +----------------------------------------+------------------------------------------+------------------------------------------+ | 143985345 | qa9 | 4.27 | | 143985346 | qa9 | 5.27 | | 143985346 | qa9 | 8.27 | +----------------------------------------+------------------------------------------+------------------------------------------+ +----------------------------------------+------------------------------------------+------------------------------------------+ | TM | HOSTNAME | AVG(USAGE) | +----------------------------------------+------------------------------------------+------------------------------------------+ | null | qa8 | 4.2699 | | null | qa9 | 4.2699 | | null | qa9 | 6.77 | +----------------------------------------+------------------------------------------+------------------------------------------+ Now we can see the hostname (it is not an empty field as before) but tm is still null. Please see below the response from James for a work arround: Hi Rafit, Looks like a bug. Please file a JIRA. The following seems to work as a workaround: select cast(time/10.0 as integer) as tm, hostname, avg(usage) from test group by hostname, tm; You might also consider using a date[1] type instead of an integer and then using the TRUNC function[2] which work with both numeric and time types. Thanks, James [1] https://phoenix.apache.org/language/datatypes.html#date_type [2] https://phoenix.apache.org/language/functions.html#trunc -- This message was sent by Atlassian JIRA (v6.3.4#6332)