[ 
https://issues.apache.org/jira/browse/TRAFODION-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wayne Birdsall resolved TRAFODION-823.
--------------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1-incubating

> LP Bug: 1402031 - Update stats gets wrong row counts with the first run after 
> loading data
> ------------------------------------------------------------------------------------------
>
>                 Key: TRAFODION-823
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-823
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Weishiun Tsai
>            Assignee: David Wayne Birdsall
>            Priority: Critical
>             Fix For: 2.1-incubating
>
>
> This problem has been seen several times in the past few builds, the latest 
> one being the v1209_0830 build.  When QA populates the g_tpch2x tables on QA 
> clusters, the scripts load the data, do select count(*) on each table, and 
> then run update stats on each table.  The select count(*) prior to the update 
> stats show the correct row counts.  But showstats after update stats show 
> that the row count in the stats for each table is way off.  Some tables get 
> more row counts than the actual row counts and some less.
> Rerunning the same set of update stats statements often correct this 
> situation.  But this is causing a huge problem for testing.  When the stats 
> are this bad, some of the larger queries using these tables would lapse back 
> to nested join and would then hang for 20 hours without finishing.  
> This problem can’t be reliably reproduced on all clusters, but it does show 
> up frequently enough to cause problems.  This case is created to document 
> this problem.  Investigation needs to be done in the implementation of update 
> stats to see why select count(*) gets the correct counts while update stats 
> afterwards does not.
> =======================================================
> Here is the execution output of the select count(*) statements and the update 
> stats statements, in that order, after the data loading.
> SQL>select count(*) from region;
> (EXPR)
> --------------------
>                    5
> --- 1 row(s) selected.
> SQL>select count(*) from nation;
> (EXPR)
> --------------------
>                   25
> --- 1 row(s) selected.
> SQL>select count(*) from supplier;
> (EXPR)
> --------------------
>                20000
> --- 1 row(s) selected.
> SQL>select count(*) from customer;
> (EXPR)
> --------------------
>               300000
> --- 1 row(s) selected.
> SQL>select count(*) from part;
> (EXPR)
> --------------------
>               400000
> --- 1 row(s) selected.
> SQL>select count(*) from partsupp;
> (EXPR)
> --------------------
>              1600000
> --- 1 row(s) selected.
> SQL>select count(*) from orders;
> (EXPR)
> --------------------
>              3000000
> --- 1 row(s) selected.
> SQL>select count(*) from lineitem;
> (EXPR)
> --------------------
>             11997996
> --- 1 row(s) selected.
> -------------------------------------------------------------------------------
> == TEST: tcase.test003
> -------------------------------------------------------------------------------
> SQL>update statistics for table region on every column;
> --- SQL operation complete.
> SQL>update statistics for table nation on every column;
> --- SQL operation complete.
> SQL>update statistics for table supplier on every column;
> --- SQL operation complete.
> SQL>update statistics for table customer on every column;
> --- SQL operation complete.
> SQL>update statistics for table part on every column;
> --- SQL operation complete.
> SQL>update statistics for table partsupp on every column;
> --- SQL operation complete.
> SQL>update statistics for table orders on every column sample random 10 
> percent;
> --- SQL operation complete.
> SQL>update statistics for table lineitem on every column sample random 10 
> percent;
> --- SQL operation complete.
> =======================================================
> Here is the showstats output for each table after update stats statements 
> were first run.  The row counts are completely off.
> >>set schema g_tpch2x;
> --- SQL operation complete.
> >>showstats for table CUSTOMER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.CUSTOMER
> Table ID: 314741800727389741
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1392898339     48     1100767     1100767 C_CUSTKEY
> 1392898334     62     1100767     1100767 C_NAME
> 1392898329     62     1100767     1100767 C_ADDRESS
> 1392898324     25     1100767           0 C_NATIONKEY
> 1392898319     62     1100767     1100767 C_PHONE
> 1392898314     36     1100767           6 C_ACCTBAL
> 1392898309      5     1100767           0 C_MKTSEGMENT
> 1392898304     62     1100767           0 C_COMMENT
> 1392898299      8     1100767           0 "_SALT_"
> 1392898344      1     1100767     1100767 "_SALT_", C_CUSTKEY
> --- SQL operation complete.
> >>showstats for table LINEITEM on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
> Table ID: 314741800727390211
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1537555901     28    12720858     2555961 L_ORDERKEY
> 1537555896     52    12720858      399694 L_PARTKEY
> 1537555891     48    12720858       20000 L_SUPPKEY
> 1537555886      7    12720858           7 L_LINENUMBER
> 1537555881     50    12720858          50 L_QUANTITY
> 1537555876     39    12720858      939869 L_EXTENDEDPRICE
> 1537555871     11    12720858          11 L_DISCOUNT
> 1537555866      9    12720858           9 L_TAX
> 1537555861      3    12720858           3 L_RETURNFLAG
> 1537555856      2    12720858           2 L_LINESTATUS
> 1537555851     50    12720858        2526 L_SHIPDATE
> 1537555846     50    12720858        2466 L_COMMITDATE
> 1537555841     50    12720858        2550 L_RECEIPTDATE
> 1537555836      4    12720858           4 L_SHIPINSTRUCT
> 1537555831      7    12720858           7 L_SHIPMODE
> 1537555826     62    12720858     7447670 L_COMMENT
> 1537555821      8    12720858           8 "_SALT_"
> 1537555906      1    12720858    12720858 "_SALT_", L_SHIPDATE, L_ORDERKEY, 
> L_LINENUMBER
> --- SQL operation complete.
> >>showstats for table NATION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.NATION
> Table ID: 314741800727389515
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1374739566     25          25          25 N_NATIONKEY
> 1374739561     25          25          25 N_NAME
> 1374739556      5          25           5 N_REGIONKEY
> 1374739551     25          25          25 N_COMMENT
> 1374739546      8          25           8 "_SALT_"
> 1374739571      1          25          25 "_SALT_", N_NATIONKEY
> --- SQL operation complete.
> >>showstats for table ORDERS on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.ORDERS
> Table ID: 314741800727390091
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1464778717     42     4585155     4585155 O_ORDERKEY
> 1464778712     56     4585155      193098 O_CUSTKEY
> 1464778707      3     4585155           3 O_ORDERSTATUS
> 1464778702     57     4585155     4045135 O_TOTALPRICE
> 1464778697     50     4585155        2406 O_ORDERDATE
> 1464778692      5     4585155           5 O_ORDERPRIORITY
> 1464778687     62     4585155        2000 O_CLERK
> 1464778682      1     4585155           1 O_SHIPPRIORITY
> 1464778677     62     4585155     4259632 O_COMMENT
> 1464778672      8     4585155           8 "_SALT_"
> 1464778722      1     4585155     4585155 "_SALT_", O_ORDERKEY
> --- SQL operation complete.
> >>showstats for table PART on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PART
> Table ID: 314741800727389857
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1403671365     48     1725243     1725243 P_PARTKEY
> 1403671360     62     1725243     1279978 P_NAME
> 1403671355      5     1725243           0 P_MFGR
> 1403671350     25     1725243           0 P_BRAND
> 1403671345    150     1725243           0 P_TYPE
> 1403671340     50     1725243           0 P_SIZE
> 1403671335     40     1725243           0 P_CONTAINER
> 1403671330     55     1725243           0 P_RETAILPRICE
> 1403671325     76     1725243           0 P_COMMENT
> 1403671320      8     1725243           0 "_SALT_"
> 1403671370      1     1725243     1725243 "_SALT_", P_PARTKEY
> --- SQL operation complete.
> >>showstats for table PARTSUPP on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PARTSUPP
> Table ID: 314741800727389978
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1433360836     48     2972264           0 PS_PARTKEY
> 1433360831     48     2972264           0 PS_SUPPKEY
> 1433360826     48     2972264           0 PS_AVAILQTY
> 1433360821     48     2972264           0 PS_SUPPLYCOST
> 1433360816     62     2972264     2205231 PS_COMMENT
> 1433360811      8     2972264           0 "_SALT_"
> 1433360841      1     2972264     2972264 "_SALT_", PS_PARTKEY, PS_SUPPKEY
> --- SQL operation complete.
> >>showstats for table REGION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.REGION
> Table ID: 314741800727389399
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1369726849      5           5           5 R_REGIONKEY
> 1369726844      5           5           5 R_NAME
> 1369726839      5           5           5 R_COMMENT
> 1369726834      3           5           3 "_SALT_"
> 1369726854      1           5           5 "_SALT_", R_REGIONKEY
> --- SQL operation complete.
> >>showstats for table SUPPLIER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.SUPPLIER
> Table ID: 314741800727389626
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1380401507     48       64035       64035 S_SUPPKEY
> 1380401502     62       64035       64035 S_NAME
> 1380401497     62       64035       64035 S_ADDRESS
> 1380401492     25       64035           0 S_NATIONKEY
> 1380401487     62       64035       64035 S_PHONE
> 1380401482     56       64035           0 S_ACCTBAL
> 1380401477     62       64035       47479 S_COMMENT
> 1380401472      8       64035           0 "_SALT_"
> 1380401512      1       64035       64035 "_SALT_", S_SUPPKEY
> --- SQL operation complete.
> =======================================================
> Here is the showstats output after rerunning the same set of update stats 
> statements.  The row counts are now correct.
> >>set schema g_tpch2x;
> --- SQL operation complete.
> >>showstats for table CUSTOMER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.CUSTOMER
> Table ID: 314741800727389741
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1392898338     48      300000      300000 C_CUSTKEY
> 1392898335     62      300000      300000 C_NAME
> 1392898328     62      300000      300000 C_ADDRESS
> 1392898325     25      300000          25 C_NATIONKEY
> 1392898318     62      300000      300000 C_PHONE
> 1392898315     36      300000      262499 C_ACCTBAL
> 1392898308      5      300000           5 C_MKTSEGMENT
> 1392898305     62      300000      299263 C_COMMENT
> 1392898298      8      300000           8 "_SALT_"
> 1392898345      1      300000      300000 "_SALT_", C_CUSTKEY
> --- SQL operation complete.
> >>showstats for table LINEITEM on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
> Table ID: 314741800727390211
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1537555900     27    11997996     2536382 L_ORDERKEY
> 1537555897     60    11997996      399669 L_PARTKEY
> 1537555890     48    11997996       20000 L_SUPPKEY
> 1537555887      7    11997996           7 L_LINENUMBER
> 1537555880     50    11997996          50 L_QUANTITY
> 1537555877     40    11997996      940199 L_EXTENDEDPRICE
> 1537555870     11    11997996          11 L_DISCOUNT
> 1537555867      9    11997996           9 L_TAX
> 1537555860      3    11997996           3 L_RETURNFLAG
> 1537555857      2    11997996           2 L_LINESTATUS
> 1537555850     50    11997996        2526 L_SHIPDATE
> 1537555847     50    11997996        2466 L_COMMITDATE
> 1537555840     50    11997996        2549 L_RECEIPTDATE
> 1537555837      4    11997996           4 L_SHIPINSTRUCT
> 1537555830      7    11997996           7 L_SHIPMODE
> 1537555827     62    11997996     7101582 L_COMMENT
> 1537555820      8    11997996           8 "_SALT_"
> 1537555907      1    11997996    11997996 "_SALT_", L_SHIPDATE, L_ORDERKEY, 
> L_LINENUMBER
> --- SQL operation complete.
> >>showstats for table NATION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.NATION
> Table ID: 314741800727389515
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1374739567     25          25          25 N_NATIONKEY
> 1374739560     25          25          25 N_NAME
> 1374739557      5          25           5 N_REGIONKEY
> 1374739550     25          25          25 N_COMMENT
> 1374739547      8          25           8 "_SALT_"
> 1374739570      1          25          25 "_SALT_", N_NATIONKEY
> --- SQL operation complete.
> >>showstats for table ORDERS on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.ORDERS
> Table ID: 314741800727390091
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1464778716     30     3000000     3000000 O_ORDERKEY
> 1464778713     59     3000000      191492 O_CUSTKEY
> 1464778706      3     3000000           3 O_ORDERSTATUS
> 1464778703     53     3000000     2759999 O_TOTALPRICE
> 1464778696     50     3000000        2406 O_ORDERDATE
> 1464778693      5     3000000           5 O_ORDERPRIORITY
> 1464778686     62     3000000        2000 O_CLERK
> 1464778683      1     3000000           1 O_SHIPPRIORITY
> 1464778676     62     3000000     2898572 O_COMMENT
> 1464778673      8     3000000           8 "_SALT_"
> 1464778723      1     3000000     3000000 "_SALT_", O_ORDERKEY
> --- SQL operation complete.
> >>showstats for table PART on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PART
> Table ID: 314741800727389857
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1403671364     48      400000      400000 P_PARTKEY
> 1403671361     62      400000      399982 P_NAME
> 1403671354      5      400000           5 P_MFGR
> 1403671351     25      400000          25 P_BRAND
> 1403671344    150      400000         150 P_TYPE
> 1403671341     50      400000          50 P_SIZE
> 1403671334     40      400000          40 P_CONTAINER
> 1403671331     55      400000       22097 P_RETAILPRICE
> 1403671324     76      400000       69074 P_COMMENT
> 1403671321      8      400000           8 "_SALT_"
> 1403671371      1      400000      400000 "_SALT_", P_PARTKEY
> --- SQL operation complete.
> >>showstats for table PARTSUPP on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PARTSUPP
> Table ID: 314741800727389978
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1433360837     48     1600000      400000 PS_PARTKEY
> 1433360830     48     1600000       20000 PS_SUPPKEY
> 1433360827     48     1600000        9999 PS_AVAILQTY
> 1433360820     48     1600000       99901 PS_SUPPLYCOST
> 1433360817     62     1600000     1599985 PS_COMMENT
> 1433360810      8     1600000           8 "_SALT_"
> 1433360840      1     1600000     1600000 "_SALT_", PS_PARTKEY, PS_SUPPKEY
> --- SQL operation complete.
> >>showstats for table REGION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.REGION
> Table ID: 314741800727389399
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1369726848      5           5           5 R_REGIONKEY
> 1369726845      5           5           5 R_NAME
> 1369726838      5           5           5 R_COMMENT
> 1369726835      3           5           3 "_SALT_"
> 1369726855      1           5           5 "_SALT_", R_REGIONKEY
> --- SQL operation complete.
> >>showstats for table SUPPLIER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.SUPPLIER
> Table ID: 314741800727389626
>    Hist ID # Ints    Rowcount         UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1380401506     48       20000       20000 S_SUPPKEY
> 1380401503     62       20000       20000 S_NAME
> 1380401496     62       20000       20000 S_ADDRESS
> 1380401493     25       20000          25 S_NATIONKEY
> 1380401486     62       20000       20000 S_PHONE
> 1380401483     56       20000       19803 S_ACCTBAL
> 1380401476     62       20000       19972 S_COMMENT
> 1380401473      8       20000           8 "_SALT_"
> 1380401513      1       20000       20000 "_SALT_", S_SUPPKEY
> --- SQL operation complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to