[jira] [Updated] (PHOENIX-2789) Bug in GROUP BY with timestamp column

Alok Singh (JIRA) Tue, 22 Mar 2016 08:09:13 -0700

     [ 
https://issues.apache.org/jira/browse/PHOENIX-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Alok Singh updated PHOENIX-2789:
--------------------------------
    Description: 
Table:
{code}
COST (
     ACCOUNT_ID VARCHAR NOT NULL,
     PRECISION TINYINT NOT NULL,
     START_DATE TIMESTAMP NOT NULL,
     SECONDARY_ACCOUNT_ID VARCHAR NOT NULL,
     TAG VARCHAR NOT NULL,
     VENDOR_ID VARCHAR NOT NULL,
     SERVICE VARCHAR NOT NULL,
....

CONSTRAINT PK PRIMARY KEY (
               ACCOUNT_ID,
               PRECISION,
               START_DATE,
               SECONDARY_ACCOUNT_ID,
               TAG,
               VENDOR_ID,
               SERVICE
) COMPRESSION='SNAPPY', SALT_BUCKETS=16;
{code}

When querying this table with a GROUP BY clause that contains the 'START_DATE" 
column, the results returned are incorrectly aggregated. I find multiple rows 
with the same "START_DATE" which should have been aggregated to a single row. 
The workaround is to do a TO_CHAR(START_DATE) which causes the query to return 
the correct results.

e.g:
{code}
----------------
select start_date, account_id,
       sum(quantity) as total,
       sum(discounted_quantity) as disc_total
from COST
where 
    start_date >= (to_date('2016-03-01'))
    and precision = 1
    and account_id in ('1234', '5678')
group start_date, account_id

--RESULT--
START_DATE                                ACCOUNT_ID                            
    TOTAL                                     DISC_TOTAL                        
        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2016-03-01 16:00:00.0                     1234                                  
    312                                       0                                 
        
....
2016-03-07 16:00:00.0                     1234                                  
    312                                       0                                 
        
2016-03-08 16:00:00.0                     1234                                  
    312                                       0                                 
        
2016-03-09 16:00:00.0                     1234                                  
    216                                       0                                 
        
2016-02-29 16:00:00.0                     5678                                  
    194                                       24                                
        
2016-03-01 16:00:00.0                     5678                                  
    262                                       0                                 
        
....
2016-03-07 16:00:00.0                     5678                                  
    237                                       48                                
        
2016-03-08 16:00:00.0                     5678                                  
    178                                       0                                 
        
2016-03-09 16:00:00.0                     1234                                  
    96                                        0                                 
        
2016-03-09 16:00:00.0                     5678                                  
    173                                       50                                
        
2016-03-10 16:00:00.0                     1234                                  
    324                                       0                                 
        
....
2016-03-07 16:00:00.0                     5678                                  
    178                                       24                                
        
2016-03-08 16:00:00.0                     5678                                  
    218                                       0                                 
        
2016-03-09 16:00:00.0                     5678                                  
    218                                       48                                
        
^^^^^^^^^^^^^^^^^^^^^^
Repeated rows

--------------------
{code}

Workaround:
{code}
select to_char(start_date), account_id,
       sum(quantity) as total,
       sum(discounted_quantity) as disc_total
from COST
where
   start_date >= (to_date('2016-03-01'))
   and precision = 1
  and account_id in ('1234', '5678')
group to_char(start_date), account_id

--RESULT--
START_DATE                                ACCOUNT_ID                            
    TOTAL                                     DISC_TOTAL                        
        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2016-03-01 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-01 00:00:00.000                   5678                                  
    3,465                                     384                               
        
2016-03-02 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-02 00:00:00.000                   5678                                  
    3,238                                     384                               
        
2016-03-03 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-03 00:00:00.000                   5678                                  
    3,243                                     384                               
        
2016-03-04 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-04 00:00:00.000                   5678                                  
    3,212                                     384                               
        
2016-03-05 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-05 00:00:00.000                   5678                                  
    2,907                                     384                               
        
2016-03-06 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-06 00:00:00.000                   5678                                  
    2,907                                     384                               
        
2016-03-07 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-07 00:00:00.000                   5678                                  
    2,919                                     384                               
        
2016-03-08 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-08 00:00:00.000                   5678                                  
    2,955                                     384                               
        
2016-03-09 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-09 00:00:00.000                   5678                                  
    2,955                                     384                               
        
2016-03-10 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-10 00:00:00.000                   5678                                  
    2,978                                     384                               
        
2016-03-11 00:00:00.000                   1234                                  
    324                                       0                                 
        
2016-03-11 00:00:00.000                   5678                                  
    3,125                                     384                               
        
2016-03-12 00:00:00.000                   1234                                  
    336                                       0                                 
        
2016-03-12 00:00:00.000                   5678                                  
    3,102                                     384                               
        
2016-03-13 00:00:00.000                   1234                                  
    336                                       0                                 
        
2016-03-13 00:00:00.000                   5678                                  
    3,117                                     384                               
        
2016-03-14 00:00:00.000                   1234                                  
    153                                       0                                 
        
2016-03-14 00:00:00.000                   5678                                  
    2,187                                     278                               
        

-----------------
{code}

  was:
Table:
COST (
     ACCOUNT_ID VARCHAR NOT NULL,
     PRECISION TINYINT NOT NULL,
     START_DATE TIMESTAMP NOT NULL,
     SECONDARY_ACCOUNT_ID VARCHAR NOT NULL,
     TAG VARCHAR NOT NULL,
     VENDOR_ID VARCHAR NOT NULL,
     SERVICE VARCHAR NOT NULL,
....

CONSTRAINT PK PRIMARY KEY (
               ACCOUNT_ID,
               PRECISION,
               START_DATE,
               SECONDARY_ACCOUNT_ID,
               TAG,
               VENDOR_ID,
               SERVICE
) COMPRESSION='SNAPPY', SALT_BUCKETS=16;

When querying this table with a GROUP BY clause that contains the 'START_DATE" 
column, the results returned are incorrectly aggregated. I find multiple rows 
with the same "START_DATE" which should have been aggregated to a single row. 
The workaround is to do a TO_CHAR(START_DATE) which causes the query to return 
the correct results.

e.g:
----------------
select start_date, account_id,
       sum(quantity) as total,
       sum(discounted_quantity) as disc_total
from COST
where 
    start_date >= (to_date('2016-03-01'))
    and precision = 1
    and account_id in ('1234', '5678')
group start_date, account_id

--RESULT--
START_DATE                                ACCOUNT_ID                            
    TOTAL                                     DISC_TOTAL                        
        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2016-03-01 16:00:00.0                     1234                                  
    312                                       0                                 
        
....
2016-03-07 16:00:00.0                     1234                                  
    312                                       0                                 
        
2016-03-08 16:00:00.0                     1234                                  
    312                                       0                                 
        
2016-03-09 16:00:00.0                     1234                                  
    216                                       0                                 
        
2016-02-29 16:00:00.0                     5678                                  
    194                                       24                                
        
2016-03-01 16:00:00.0                     5678                                  
    262                                       0                                 
        
....
2016-03-07 16:00:00.0                     5678                                  
    237                                       48                                
        
2016-03-08 16:00:00.0                     5678                                  
    178                                       0                                 
        
2016-03-09 16:00:00.0                     1234                                  
    96                                        0                                 
        
2016-03-09 16:00:00.0                     5678                                  
    173                                       50                                
        
2016-03-10 16:00:00.0                     1234                                  
    324                                       0                                 
        
....
2016-03-07 16:00:00.0                     5678                                  
    178                                       24                                
        
2016-03-08 16:00:00.0                     5678                                  
    218                                       0                                 
        
2016-03-09 16:00:00.0                     5678                                  
    218                                       48                                
        
^^^^^^^^^^^^^^^^^^^^^^
Repeated rows

--------------------
Workaround:
select to_char(start_date), account_id,
       sum(quantity) as total,
       sum(discounted_quantity) as disc_total
from COST
where
   start_date >= (to_date('2016-03-01'))
   and precision = 1
  and account_id in ('1234', '5678')
group to_char(start_date), account_id

--RESULT--
START_DATE                                ACCOUNT_ID                            
    TOTAL                                     DISC_TOTAL                        
        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2016-03-01 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-01 00:00:00.000                   5678                                  
    3,465                                     384                               
        
2016-03-02 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-02 00:00:00.000                   5678                                  
    3,238                                     384                               
        
2016-03-03 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-03 00:00:00.000                   5678                                  
    3,243                                     384                               
        
2016-03-04 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-04 00:00:00.000                   5678                                  
    3,212                                     384                               
        
2016-03-05 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-05 00:00:00.000                   5678                                  
    2,907                                     384                               
        
2016-03-06 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-06 00:00:00.000                   5678                                  
    2,907                                     384                               
        
2016-03-07 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-07 00:00:00.000                   5678                                  
    2,919                                     384                               
        
2016-03-08 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-08 00:00:00.000                   5678                                  
    2,955                                     384                               
        
2016-03-09 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-09 00:00:00.000                   5678                                  
    2,955                                     384                               
        
2016-03-10 00:00:00.000                   1234                                  
    312                                       0                                 
        
2016-03-10 00:00:00.000                   5678                                  
    2,978                                     384                               
        
2016-03-11 00:00:00.000                   1234                                  
    324                                       0                                 
        
2016-03-11 00:00:00.000                   5678                                  
    3,125                                     384                               
        
2016-03-12 00:00:00.000                   1234                                  
    336                                       0                                 
        
2016-03-12 00:00:00.000                   5678                                  
    3,102                                     384                               
        
2016-03-13 00:00:00.000                   1234                                  
    336                                       0                                 
        
2016-03-13 00:00:00.000                   5678                                  
    3,117                                     384                               
        
2016-03-14 00:00:00.000                   1234                                  
    153                                       0                                 
        
2016-03-14 00:00:00.000                   5678                                  
    2,187                                     278                               
        

-----------------


> Bug in GROUP BY with timestamp column
> -------------------------------------
>
>                 Key: PHOENIX-2789
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2789
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.5.1
>         Environment: * Phoenix 4.5.1
> * Hbase 1.1.2
> * Hadoop 2.7.1
> * JDK 1.7
>            Reporter: Alok Singh
>
> Table:
> {code}
> COST (
>      ACCOUNT_ID VARCHAR NOT NULL,
>      PRECISION TINYINT NOT NULL,
>      START_DATE TIMESTAMP NOT NULL,
>      SECONDARY_ACCOUNT_ID VARCHAR NOT NULL,
>      TAG VARCHAR NOT NULL,
>      VENDOR_ID VARCHAR NOT NULL,
>      SERVICE VARCHAR NOT NULL,
> ....
> CONSTRAINT PK PRIMARY KEY (
>                ACCOUNT_ID,
>                PRECISION,
>                START_DATE,
>                SECONDARY_ACCOUNT_ID,
>                TAG,
>                VENDOR_ID,
>                SERVICE
> ) COMPRESSION='SNAPPY', SALT_BUCKETS=16;
> {code}
> When querying this table with a GROUP BY clause that contains the 
> 'START_DATE" column, the results returned are incorrectly aggregated. I find 
> multiple rows with the same "START_DATE" which should have been aggregated to 
> a single row. The workaround is to do a TO_CHAR(START_DATE) which causes the 
> query to return the correct results.
> e.g:
> {code}
> ----------------
> select start_date, account_id,
>        sum(quantity) as total,
>        sum(discounted_quantity) as disc_total
> from COST
> where 
>     start_date >= (to_date('2016-03-01'))
>     and precision = 1
>     and account_id in ('1234', '5678')
> group start_date, account_id
> --RESULT--
> START_DATE                                ACCOUNT_ID                          
>       TOTAL                                     DISC_TOTAL                    
>             
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 2016-03-01 16:00:00.0                     1234                                
>       312                                       0                             
>             
> ....
> 2016-03-07 16:00:00.0                     1234                                
>       312                                       0                             
>             
> 2016-03-08 16:00:00.0                     1234                                
>       312                                       0                             
>             
> 2016-03-09 16:00:00.0                     1234                                
>       216                                       0                             
>             
> 2016-02-29 16:00:00.0                     5678                                
>       194                                       24                            
>             
> 2016-03-01 16:00:00.0                     5678                                
>       262                                       0                             
>             
> ....
> 2016-03-07 16:00:00.0                     5678                                
>       237                                       48                            
>             
> 2016-03-08 16:00:00.0                     5678                                
>       178                                       0                             
>             
> 2016-03-09 16:00:00.0                     1234                                
>       96                                        0                             
>             
> 2016-03-09 16:00:00.0                     5678                                
>       173                                       50                            
>             
> 2016-03-10 16:00:00.0                     1234                                
>       324                                       0                             
>             
> ....
> 2016-03-07 16:00:00.0                     5678                                
>       178                                       24                            
>             
> 2016-03-08 16:00:00.0                     5678                                
>       218                                       0                             
>             
> 2016-03-09 16:00:00.0                     5678                                
>       218                                       48                            
>             
> ^^^^^^^^^^^^^^^^^^^^^^
> Repeated rows
> --------------------
> {code}
> Workaround:
> {code}
> select to_char(start_date), account_id,
>        sum(quantity) as total,
>        sum(discounted_quantity) as disc_total
> from COST
> where
>    start_date >= (to_date('2016-03-01'))
>    and precision = 1
>   and account_id in ('1234', '5678')
> group to_char(start_date), account_id
> --RESULT--
> START_DATE                                ACCOUNT_ID                          
>       TOTAL                                     DISC_TOTAL                    
>             
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 2016-03-01 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-01 00:00:00.000                   5678                                
>       3,465                                     384                           
>             
> 2016-03-02 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-02 00:00:00.000                   5678                                
>       3,238                                     384                           
>             
> 2016-03-03 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-03 00:00:00.000                   5678                                
>       3,243                                     384                           
>             
> 2016-03-04 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-04 00:00:00.000                   5678                                
>       3,212                                     384                           
>             
> 2016-03-05 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-05 00:00:00.000                   5678                                
>       2,907                                     384                           
>             
> 2016-03-06 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-06 00:00:00.000                   5678                                
>       2,907                                     384                           
>             
> 2016-03-07 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-07 00:00:00.000                   5678                                
>       2,919                                     384                           
>             
> 2016-03-08 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-08 00:00:00.000                   5678                                
>       2,955                                     384                           
>             
> 2016-03-09 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-09 00:00:00.000                   5678                                
>       2,955                                     384                           
>             
> 2016-03-10 00:00:00.000                   1234                                
>       312                                       0                             
>             
> 2016-03-10 00:00:00.000                   5678                                
>       2,978                                     384                           
>             
> 2016-03-11 00:00:00.000                   1234                                
>       324                                       0                             
>             
> 2016-03-11 00:00:00.000                   5678                                
>       3,125                                     384                           
>             
> 2016-03-12 00:00:00.000                   1234                                
>       336                                       0                             
>             
> 2016-03-12 00:00:00.000                   5678                                
>       3,102                                     384                           
>             
> 2016-03-13 00:00:00.000                   1234                                
>       336                                       0                             
>             
> 2016-03-13 00:00:00.000                   5678                                
>       3,117                                     384                           
>             
> 2016-03-14 00:00:00.000                   1234                                
>       153                                       0                             
>             
> 2016-03-14 00:00:00.000                   5678                                
>       2,187                                     278                           
>             
> -----------------
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (PHOENIX-2789) Bug in GROUP BY with timestamp column

Reply via email to