[jira] [Closed] (TRAFODION-2965) Hash partial groupby does not report a row count in operator statistics

2018-02-22 Thread Hans Zeller (JIRA)

 [ https://issues.apache.org/jira/browse/TRAFODION-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hans Zeller closed TRAFODION-2965.
--
Resolution: Fixed

> Hash partial groupby does not report a row count in operator statistics
> ---
>
> Key: TRAFODION-2965
> URL: https://issues.apache.org/jira/browse/TRAFODION-2965
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-exe
> Affects Versions: 2.0-incubating
> Environment: any
> Reporter: Hans Zeller
> Assignee: Hans Zeller
> Priority: Major
> Fix For: 2.3
>
>
> Here is a test case that demonstrates this:
> {noformat}
> update statistics for table hive.hive.time_dim on every column;
> control query shape groupby(exchange(groupby(exchange(groupby(scan)))));
> prepare s from
> select count(distinct t_time_id) from hive.hive.time_dim; 
> explain options 'f' s;
> execute s;
> get statistics for qid current default;
> {noformat}
> The actual statistics show a "0" as the row count (ActRowsUsed) for the 
> lowest EX_HASH_GRBY with id 2:
> {noformat}
>LC   RC   Id PaId ExId Frag TDB Name   DOP   Dispatches
> OperCpuTime EstRowsUsed ActRowsUsed ActDataUsed
> Details
>13.   14.80 EX_ROOT  12
>  37  0  1  8 3658
>12.   13   1470 EX_SORT_GRBY 15
>  61  1  1  8
>11.   12   1360 EX_SPLIT_TOP 18
> 171  1  4 32
>10.   11   1260 EX_SEND_TOP  4   16
>   3,389  1  4 64
> 9.   10   1162 EX_SEND_BOTTOM   4   12
> 683  1  4 64
> 8.9   1062 EX_SPLIT_BOTTOM  4   19
> 597  1  4 32 
> 833874
> 7.8952 EX_SORT_GRBY 42,259
> 289,200  1  4 32
> 6.7842 EX_HASH_GRBY 42,259
> 394,235   1451 86,400  2,765,059,200 0|0|0
> 5.6732 EX_SPLIT_TOP 42,211
>  51,034 110007 86,400  2,765,059,200
> 4.5632 EX_SEND_TOP  82,436
>  98,125 110007 86,400  2,766,528,000
> 3.4533 EX_SEND_BOTTOM   8   10,658
> 196,714 110007 86,400  2,766,528,000
> 2.3433 EX_SPLIT_BOTTOM  22,663
>  95,246 110007 86,400  2,765,059,200 
> 1547521
> 1.2323 EX_HASH_GRBY 24,521
> 303,31955003.5  0  0
> ..1213 EX_HDFS_SCAN 22,650
> 952,242 116085 86,400  2,765,059,200 
> HIVE.HIVE.TIME_DIM|86400|5288324
> {noformat}
> The reason is that the hash groupby reports its row count in the BMO (Big 
> Memory Operator) stats. However, a partial hash groupby is not considered a 
> BMO, so no row count gets reported. The fix is to increment the row count in 
> the generic stats entry, which is present in both partial and full groupby 
> operators.
>   
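
The cause and fix described above can be illustrated with a minimal, self-contained sketch. This is not Trafodion source code: `GenericStats`, `BmoStats`, `HashGroupby`, `CountingMode`, and `reportedRowsAfter` are hypothetical names made up for illustration. The sketch assumes that every operator has a generic stats entry, that only a full (BMO) hash groupby has an additional BMO stats entry, and that the statistics report reads the count from whichever entry the operator increments.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-ins for the executor's statistics entries.
struct GenericStats { std::int64_t rowCount = 0; };  // assumed present for every operator
struct BmoStats     { std::int64_t rowCount = 0; };  // assumed present only for BMOs

// BmoOnly models the reported behavior, Generic models the described fix.
enum class CountingMode { BmoOnly, Generic };

class HashGroupby {
public:
    HashGroupby(bool isPartial, CountingMode mode)
        : mode_(mode),
          // A partial groupby is not a BMO, so it gets no BMO stats entry.
          bmo_(isPartial ? std::unique_ptr<BmoStats>()
                         : std::make_unique<BmoStats>()) {}

    // Called once for each row the operator counts.
    void countRow() {
        if (mode_ == CountingMode::BmoOnly) {
            if (bmo_) ++bmo_->rowCount;   // partial groupby: nothing to increment
        } else {
            ++generic_.rowCount;          // works for partial and full groupby
        }
    }

    // Row count as a statistics report would read it (assumption).
    std::int64_t reportedRows() const {
        if (mode_ == CountingMode::BmoOnly) return bmo_ ? bmo_->rowCount : 0;
        return generic_.rowCount;
    }

private:
    CountingMode mode_;
    GenericStats generic_;
    std::unique_ptr<BmoStats> bmo_;
};

// Feed `rows` rows through a groupby and return the reported count.
std::int64_t reportedRowsAfter(bool isPartial, CountingMode mode, std::int64_t rows) {
    assert(rows >= 0);
    HashGroupby op(isPartial, mode);
    for (std::int64_t i = 0; i < rows; ++i) op.countRow();
    return op.reportedRows();
}
```

Under these assumptions, a partial groupby in `BmoOnly` mode reports 0 no matter how many rows it counts, which matches the symptom seen for the operator with id 2. In `Generic` mode, both the partial and the full groupby report the number of rows counted.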



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (TRAFODION-2965) Hash partial groupby does not report a row count in operator statistics

2018-02-21 Thread Hans Zeller (JIRA)

 [ https://issues.apache.org/jira/browse/TRAFODION-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hans Zeller closed TRAFODION-2965.
--

> Hash partial groupby does not report a row count in operator statistics
> ---
>
> Key: TRAFODION-2965
> URL: https://issues.apache.org/jira/browse/TRAFODION-2965
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-exe
> Affects Versions: 2.0-incubating
> Environment: any
> Reporter: Hans Zeller
> Assignee: Hans Zeller
> Priority: Major
> Fix For: 2.4


