[ 
https://issues.apache.org/jira/browse/PHOENIX-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236468#comment-16236468
 ] 

James Taylor commented on PHOENIX-4287:
---------------------------------------

One more question, [~mujtabachohan]. Prior versions of this patch mistakenly 
wrote the USE_STATS_FOR_PARALLELIZATION value into the table metadata even 
when it wasn't set. Are you testing with newly created tables, so that this 
doesn't affect you? You can query SYSTEM.CATALOG directly for the table and 
index to see whether there's a value for USE_STATS_FOR_PARALLELIZATION. If 
there is, this prior issue may be affecting you. If you create a new table and 
index and still see a value, there's definitely still an issue.
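
For reference, a minimal sketch of such a check is below. It assumes the 
table header row in SYSTEM.CATALOG is the one where COLUMN_NAME and 
COLUMN_FAMILY are both null, and TABLE_T_IDX is a hypothetical placeholder 
for the local index name, which isn't given in this issue:
{noformat}
-- Sketch only: TABLE_T_IDX is a made-up index name, substitute your own.
-- Assumes header rows have null COLUMN_NAME and COLUMN_FAMILY.
SELECT TABLE_SCHEM, TABLE_NAME, USE_STATS_FOR_PARALLELIZATION
FROM SYSTEM.CATALOG
WHERE TABLE_NAME IN ('TABLE_T', 'TABLE_T_IDX')
  AND COLUMN_NAME IS NULL
  AND COLUMN_FAMILY IS NULL;
{noformat}
A non-null USE_STATS_FOR_PARALLELIZATION in a row for a freshly created table 
or index would suggest the value is still being written when it shouldn't be.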

> Incorrect aggregate query results when stats are disable for parallelization
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-4287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4287
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: HBase 1.3.1
>            Reporter: Mujtaba Chohan
>            Assignee: Samarth Jain
>            Priority: Major
>              Labels: localIndex
>             Fix For: 4.13.0, 4.12.1
>
>         Attachments: PHOENIX-4287.patch, PHOENIX-4287_addendum.patch, 
> PHOENIX-4287_addendum2.patch, PHOENIX-4287_addendum3.patch, 
> PHOENIX-4287_addendum4.patch, PHOENIX-4287_v2.patch, PHOENIX-4287_v3.patch, 
> PHOENIX-4287_v3_wip.patch, PHOENIX-4287_v4.patch
>
>
> With {{phoenix.use.stats.parallelization}} set to {{false}}, aggregate 
> queries return incorrect results when stats are available.
> With local index and stats disabled for parallelization:
> {noformat}
> explain select count(*) from TABLE_T;
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> |                                         PLAN                                          | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> | CLIENT 0-CHUNK 332170 ROWS 625043899 BYTES PARALLEL 0-WAY RANGE SCAN OVER TABLE_T [1] | 625043899       | 332170         | 150792825 |
> |     SERVER FILTER BY FIRST KEY ONLY                                                   | 625043899       | 332170         | 150792825 |
> |     SERVER AGGREGATE INTO SINGLE ROW                                                  | 625043899       | 332170         | 150792825 |
> +---------------------------------------------------------------------------------------+-----------------+----------------+-----------+
> select count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 0         |
> +-----------+
> {noformat}
> Using the data table:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> |                                       PLAN                                       | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> | CLIENT 2-CHUNK 332151 ROWS 438492470 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE_T | 438492470       | 332151         | 1507928257617  |
> |     SERVER FILTER BY FIRST KEY ONLY                                              | 438492470       | 332151         | 1507928257617  |
> |     SERVER AGGREGATE INTO SINGLE ROW                                             | 438492470       | 332151         | 1507928257617  |
> +----------------------------------------------------------------------------------+-----------------+----------------+----------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 14        |
> +-----------+
> {noformat}
> Without stats available, results are correct:
> {noformat}
> explain select /*+NO_INDEX*/ count(*) from TABLE_T;
> +------------------------------------------------------+-----------------+----------------+--------------+
> |                         PLAN                         | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
> +------------------------------------------------------+-----------------+----------------+--------------+
> | CLIENT 2-CHUNK PARALLEL 1-WAY FULL SCAN OVER TABLE_T | null            | null           | null         |
> |     SERVER FILTER BY FIRST KEY ONLY                  | null            | null           | null         |
> |     SERVER AGGREGATE INTO SINGLE ROW                 | null            | null           | null         |
> +------------------------------------------------------+-----------------+----------------+--------------+
> select /*+NO_INDEX*/ count(*) from TABLE_T;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 333327    |
> +-----------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
