[ 
https://issues.apache.org/jira/browse/TRAFODION-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hans Zeller updated TRAFODION-2965:
-----------------------------------
    Fix Version/s:     (was: 2.4)
                   2.3

> Hash partial groupby does not report a row count in operator statistics
> -----------------------------------------------------------------------
>
>                 Key: TRAFODION-2965
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2965
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>    Affects Versions: 2.0-incubating
>         Environment: any
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>            Priority: Major
>             Fix For: 2.3
>
>
> Here is a test case that demonstrates this:
> {noformat}
> update statistics for table hive.hive.time_dim on every column;
> control query shape groupby(exchange(groupby(exchange(groupby(scan)))));
> prepare s from
> select count(distinct t_time_id) from hive.hive.time_dim; 
> explain options 'f' s;
> execute s;
> get statistics for qid current default;
> {noformat}
> The actual statistics show a "0" as the row count (ActRowsUsed) for the 
> lowest EX_HASH_GRBY with id 2:
> {noformat}
>    LC   RC   Id PaId ExId Frag TDB Name                   DOP   Dispatches    
>     OperCpuTime        EstRowsUsed        ActRowsUsed        ActDataUsed    
> Details
>    13    .   14    .    8    0 EX_ROOT                      1            2    
>              37                  0                  1                  8 3658
>    12    .   13   14    7    0 EX_SORT_GRBY                 1            5    
>              61                  1                  1                  8
>    11    .   12   13    6    0 EX_SPLIT_TOP                 1            8    
>             171                  1                  4                 32
>    10    .   11   12    6    0 EX_SEND_TOP                  4           16    
>           3,389                  1                  4                 64
>     9    .   10   11    6    2 EX_SEND_BOTTOM               4           12    
>             683                  1                  4                 64
>     8    .    9   10    6    2 EX_SPLIT_BOTTOM              4           19    
>             597                  1                  4                 32 
> 833874
>     7    .    8    9    5    2 EX_SORT_GRBY                 4        2,259    
>         289,200                  1                  4                 32
>     6    .    7    8    4    2 EX_HASH_GRBY                 4        2,259    
>         394,235               1451             86,400      2,765,059,200 0|0|0
>     5    .    6    7    3    2 EX_SPLIT_TOP                 4        2,211    
>          51,034             110007             86,400      2,765,059,200
>     4    .    5    6    3    2 EX_SEND_TOP                  8        2,436    
>          98,125             110007             86,400      2,766,528,000
>     3    .    4    5    3    3 EX_SEND_BOTTOM               8       10,658    
>         196,714             110007             86,400      2,766,528,000
>     2    .    3    4    3    3 EX_SPLIT_BOTTOM              2        2,663    
>          95,246             110007             86,400      2,765,059,200 
> 1547521
>     1    .    2    3    2    3 EX_HASH_GRBY                 2        4,521    
>         303,319            55003.5                  0                  0
>     .    .    1    2    1    3 EX_HDFS_SCAN                 2        2,650    
>         952,242             116085             86,400      2,765,059,200 
> HIVE.HIVE.TIME_DIM|86400|5288324
> {noformat}
> The reason is that the hash groupby reports its row count in the BMO stats. 
> However, a partial hash groupby is not considered a BMO (Big Memory 
> Operator), so no rowcount gets reported. The fix is to increment the rowcount 
> in the generic stats entry that is present in both partial and full groupby 
> operators.
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to