[
https://issues.apache.org/jira/browse/TRAFODION-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Wayne Birdsall resolved TRAFODION-823.
--------------------------------------------
Resolution: Fixed
Fix Version/s: 2.1-incubating
> LP Bug: 1402031 - Update stats gets wrong row counts with the first run after
> loading data
> ------------------------------------------------------------------------------------------
>
> Key: TRAFODION-823
> URL: https://issues.apache.org/jira/browse/TRAFODION-823
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-cmp
> Reporter: Weishiun Tsai
> Assignee: David Wayne Birdsall
> Priority: Critical
> Fix For: 2.1-incubating
>
>
> This problem has been seen several times in the past few builds, the latest
> one being the v1209_0830 build. When QA populates the g_tpch2x tables on QA
> clusters, the scripts load the data, do select count(*) on each table, and
> then run update stats on each table. The select count(*) prior to the update
> stats show the correct row counts. But showstats after update stats show
> that the row count in the stats for each table is way off. Some tables get
> more row counts than the actual row counts and some less.
> Rerunning the same set of update stats statements often correct this
> situation. But this is causing a huge problem for testing. When the stats
> are this bad, some of the larger queries using these tables would lapse back
> to nested join and would then hang for 20 hours without finishing.
> This problem can’t be reliably reproduced on all clusters, but it does show
> up frequently enough to cause problems. This case is created to document
> this problem. Investigation needs to be done in the implementation of update
> stats to see why select count(*) gets the correct counts while update stats
> afterwards does not.
> =======================================================
> Here is the execution output of the select count(*) statements and the update
> stats statements, in that order, after the data loading.
> SQL>select count(*) from region;
> (EXPR)
> --------------------
> 5
> --- 1 row(s) selected.
> SQL>select count(*) from nation;
> (EXPR)
> --------------------
> 25
> --- 1 row(s) selected.
> SQL>select count(*) from supplier;
> (EXPR)
> --------------------
> 20000
> --- 1 row(s) selected.
> SQL>select count(*) from customer;
> (EXPR)
> --------------------
> 300000
> --- 1 row(s) selected.
> SQL>select count(*) from part;
> (EXPR)
> --------------------
> 400000
> --- 1 row(s) selected.
> SQL>select count(*) from partsupp;
> (EXPR)
> --------------------
> 1600000
> --- 1 row(s) selected.
> SQL>select count(*) from orders;
> (EXPR)
> --------------------
> 3000000
> --- 1 row(s) selected.
> SQL>select count(*) from lineitem;
> (EXPR)
> --------------------
> 11997996
> --- 1 row(s) selected.
> -------------------------------------------------------------------------------
> == TEST: tcase.test003
> -------------------------------------------------------------------------------
> SQL>update statistics for table region on every column;
> --- SQL operation complete.
> SQL>update statistics for table nation on every column;
> --- SQL operation complete.
> SQL>update statistics for table supplier on every column;
> --- SQL operation complete.
> SQL>update statistics for table customer on every column;
> --- SQL operation complete.
> SQL>update statistics for table part on every column;
> --- SQL operation complete.
> SQL>update statistics for table partsupp on every column;
> --- SQL operation complete.
> SQL>update statistics for table orders on every column sample random 10
> percent;
> --- SQL operation complete.
> SQL>update statistics for table lineitem on every column sample random 10
> percent;
> --- SQL operation complete.
> =======================================================
> Here is the showstats output for each table after update stats statements
> were first run. The row counts are completely off.
> >>set schema g_tpch2x;
> --- SQL operation complete.
> >>showstats for table CUSTOMER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.CUSTOMER
> Table ID: 314741800727389741
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1392898339 48 1100767 1100767 C_CUSTKEY
> 1392898334 62 1100767 1100767 C_NAME
> 1392898329 62 1100767 1100767 C_ADDRESS
> 1392898324 25 1100767 0 C_NATIONKEY
> 1392898319 62 1100767 1100767 C_PHONE
> 1392898314 36 1100767 6 C_ACCTBAL
> 1392898309 5 1100767 0 C_MKTSEGMENT
> 1392898304 62 1100767 0 C_COMMENT
> 1392898299 8 1100767 0 "_SALT_"
> 1392898344 1 1100767 1100767 "_SALT_", C_CUSTKEY
> --- SQL operation complete.
> >>showstats for table LINEITEM on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
> Table ID: 314741800727390211
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1537555901 28 12720858 2555961 L_ORDERKEY
> 1537555896 52 12720858 399694 L_PARTKEY
> 1537555891 48 12720858 20000 L_SUPPKEY
> 1537555886 7 12720858 7 L_LINENUMBER
> 1537555881 50 12720858 50 L_QUANTITY
> 1537555876 39 12720858 939869 L_EXTENDEDPRICE
> 1537555871 11 12720858 11 L_DISCOUNT
> 1537555866 9 12720858 9 L_TAX
> 1537555861 3 12720858 3 L_RETURNFLAG
> 1537555856 2 12720858 2 L_LINESTATUS
> 1537555851 50 12720858 2526 L_SHIPDATE
> 1537555846 50 12720858 2466 L_COMMITDATE
> 1537555841 50 12720858 2550 L_RECEIPTDATE
> 1537555836 4 12720858 4 L_SHIPINSTRUCT
> 1537555831 7 12720858 7 L_SHIPMODE
> 1537555826 62 12720858 7447670 L_COMMENT
> 1537555821 8 12720858 8 "_SALT_"
> 1537555906 1 12720858 12720858 "_SALT_", L_SHIPDATE, L_ORDERKEY,
> L_LINENUMBER
> --- SQL operation complete.
> >>showstats for table NATION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.NATION
> Table ID: 314741800727389515
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1374739566 25 25 25 N_NATIONKEY
> 1374739561 25 25 25 N_NAME
> 1374739556 5 25 5 N_REGIONKEY
> 1374739551 25 25 25 N_COMMENT
> 1374739546 8 25 8 "_SALT_"
> 1374739571 1 25 25 "_SALT_", N_NATIONKEY
> --- SQL operation complete.
> >>showstats for table ORDERS on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.ORDERS
> Table ID: 314741800727390091
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1464778717 42 4585155 4585155 O_ORDERKEY
> 1464778712 56 4585155 193098 O_CUSTKEY
> 1464778707 3 4585155 3 O_ORDERSTATUS
> 1464778702 57 4585155 4045135 O_TOTALPRICE
> 1464778697 50 4585155 2406 O_ORDERDATE
> 1464778692 5 4585155 5 O_ORDERPRIORITY
> 1464778687 62 4585155 2000 O_CLERK
> 1464778682 1 4585155 1 O_SHIPPRIORITY
> 1464778677 62 4585155 4259632 O_COMMENT
> 1464778672 8 4585155 8 "_SALT_"
> 1464778722 1 4585155 4585155 "_SALT_", O_ORDERKEY
> --- SQL operation complete.
> >>showstats for table PART on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PART
> Table ID: 314741800727389857
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1403671365 48 1725243 1725243 P_PARTKEY
> 1403671360 62 1725243 1279978 P_NAME
> 1403671355 5 1725243 0 P_MFGR
> 1403671350 25 1725243 0 P_BRAND
> 1403671345 150 1725243 0 P_TYPE
> 1403671340 50 1725243 0 P_SIZE
> 1403671335 40 1725243 0 P_CONTAINER
> 1403671330 55 1725243 0 P_RETAILPRICE
> 1403671325 76 1725243 0 P_COMMENT
> 1403671320 8 1725243 0 "_SALT_"
> 1403671370 1 1725243 1725243 "_SALT_", P_PARTKEY
> --- SQL operation complete.
> >>showstats for table PARTSUPP on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PARTSUPP
> Table ID: 314741800727389978
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1433360836 48 2972264 0 PS_PARTKEY
> 1433360831 48 2972264 0 PS_SUPPKEY
> 1433360826 48 2972264 0 PS_AVAILQTY
> 1433360821 48 2972264 0 PS_SUPPLYCOST
> 1433360816 62 2972264 2205231 PS_COMMENT
> 1433360811 8 2972264 0 "_SALT_"
> 1433360841 1 2972264 2972264 "_SALT_", PS_PARTKEY, PS_SUPPKEY
> --- SQL operation complete.
> >>showstats for table REGION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.REGION
> Table ID: 314741800727389399
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1369726849 5 5 5 R_REGIONKEY
> 1369726844 5 5 5 R_NAME
> 1369726839 5 5 5 R_COMMENT
> 1369726834 3 5 3 "_SALT_"
> 1369726854 1 5 5 "_SALT_", R_REGIONKEY
> --- SQL operation complete.
> >>showstats for table SUPPLIER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.SUPPLIER
> Table ID: 314741800727389626
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1380401507 48 64035 64035 S_SUPPKEY
> 1380401502 62 64035 64035 S_NAME
> 1380401497 62 64035 64035 S_ADDRESS
> 1380401492 25 64035 0 S_NATIONKEY
> 1380401487 62 64035 64035 S_PHONE
> 1380401482 56 64035 0 S_ACCTBAL
> 1380401477 62 64035 47479 S_COMMENT
> 1380401472 8 64035 0 "_SALT_"
> 1380401512 1 64035 64035 "_SALT_", S_SUPPKEY
> --- SQL operation complete.
> =======================================================
> Here is the showstats output after rerunning the same set of update stats
> statements. The row counts are now correct.
> >>set schema g_tpch2x;
> --- SQL operation complete.
> >>showstats for table CUSTOMER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.CUSTOMER
> Table ID: 314741800727389741
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1392898338 48 300000 300000 C_CUSTKEY
> 1392898335 62 300000 300000 C_NAME
> 1392898328 62 300000 300000 C_ADDRESS
> 1392898325 25 300000 25 C_NATIONKEY
> 1392898318 62 300000 300000 C_PHONE
> 1392898315 36 300000 262499 C_ACCTBAL
> 1392898308 5 300000 5 C_MKTSEGMENT
> 1392898305 62 300000 299263 C_COMMENT
> 1392898298 8 300000 8 "_SALT_"
> 1392898345 1 300000 300000 "_SALT_", C_CUSTKEY
> --- SQL operation complete.
> >>showstats for table LINEITEM on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.LINEITEM
> Table ID: 314741800727390211
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1537555900 27 11997996 2536382 L_ORDERKEY
> 1537555897 60 11997996 399669 L_PARTKEY
> 1537555890 48 11997996 20000 L_SUPPKEY
> 1537555887 7 11997996 7 L_LINENUMBER
> 1537555880 50 11997996 50 L_QUANTITY
> 1537555877 40 11997996 940199 L_EXTENDEDPRICE
> 1537555870 11 11997996 11 L_DISCOUNT
> 1537555867 9 11997996 9 L_TAX
> 1537555860 3 11997996 3 L_RETURNFLAG
> 1537555857 2 11997996 2 L_LINESTATUS
> 1537555850 50 11997996 2526 L_SHIPDATE
> 1537555847 50 11997996 2466 L_COMMITDATE
> 1537555840 50 11997996 2549 L_RECEIPTDATE
> 1537555837 4 11997996 4 L_SHIPINSTRUCT
> 1537555830 7 11997996 7 L_SHIPMODE
> 1537555827 62 11997996 7101582 L_COMMENT
> 1537555820 8 11997996 8 "_SALT_"
> 1537555907 1 11997996 11997996 "_SALT_", L_SHIPDATE, L_ORDERKEY,
> L_LINENUMBER
> --- SQL operation complete.
> >>showstats for table NATION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.NATION
> Table ID: 314741800727389515
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1374739567 25 25 25 N_NATIONKEY
> 1374739560 25 25 25 N_NAME
> 1374739557 5 25 5 N_REGIONKEY
> 1374739550 25 25 25 N_COMMENT
> 1374739547 8 25 8 "_SALT_"
> 1374739570 1 25 25 "_SALT_", N_NATIONKEY
> --- SQL operation complete.
> >>showstats for table ORDERS on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.ORDERS
> Table ID: 314741800727390091
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1464778716 30 3000000 3000000 O_ORDERKEY
> 1464778713 59 3000000 191492 O_CUSTKEY
> 1464778706 3 3000000 3 O_ORDERSTATUS
> 1464778703 53 3000000 2759999 O_TOTALPRICE
> 1464778696 50 3000000 2406 O_ORDERDATE
> 1464778693 5 3000000 5 O_ORDERPRIORITY
> 1464778686 62 3000000 2000 O_CLERK
> 1464778683 1 3000000 1 O_SHIPPRIORITY
> 1464778676 62 3000000 2898572 O_COMMENT
> 1464778673 8 3000000 8 "_SALT_"
> 1464778723 1 3000000 3000000 "_SALT_", O_ORDERKEY
> --- SQL operation complete.
> >>showstats for table PART on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PART
> Table ID: 314741800727389857
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1403671364 48 400000 400000 P_PARTKEY
> 1403671361 62 400000 399982 P_NAME
> 1403671354 5 400000 5 P_MFGR
> 1403671351 25 400000 25 P_BRAND
> 1403671344 150 400000 150 P_TYPE
> 1403671341 50 400000 50 P_SIZE
> 1403671334 40 400000 40 P_CONTAINER
> 1403671331 55 400000 22097 P_RETAILPRICE
> 1403671324 76 400000 69074 P_COMMENT
> 1403671321 8 400000 8 "_SALT_"
> 1403671371 1 400000 400000 "_SALT_", P_PARTKEY
> --- SQL operation complete.
> >>showstats for table PARTSUPP on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.PARTSUPP
> Table ID: 314741800727389978
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1433360837 48 1600000 400000 PS_PARTKEY
> 1433360830 48 1600000 20000 PS_SUPPKEY
> 1433360827 48 1600000 9999 PS_AVAILQTY
> 1433360820 48 1600000 99901 PS_SUPPLYCOST
> 1433360817 62 1600000 1599985 PS_COMMENT
> 1433360810 8 1600000 8 "_SALT_"
> 1433360840 1 1600000 1600000 "_SALT_", PS_PARTKEY, PS_SUPPKEY
> --- SQL operation complete.
> >>showstats for table REGION on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.REGION
> Table ID: 314741800727389399
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1369726848 5 5 5 R_REGIONKEY
> 1369726845 5 5 5 R_NAME
> 1369726838 5 5 5 R_COMMENT
> 1369726835 3 5 3 "_SALT_"
> 1369726855 1 5 5 "_SALT_", R_REGIONKEY
> --- SQL operation complete.
> >>showstats for table SUPPLIER on every column;
> Histogram data for Table TRAFODION.G_TPCH2X.SUPPLIER
> Table ID: 314741800727389626
> Hist ID # Ints Rowcount UEC Colname(s)
> ========== ====== =========== =========== ===========================
> 1380401506 48 20000 20000 S_SUPPKEY
> 1380401503 62 20000 20000 S_NAME
> 1380401496 62 20000 20000 S_ADDRESS
> 1380401493 25 20000 25 S_NATIONKEY
> 1380401486 62 20000 20000 S_PHONE
> 1380401483 56 20000 19803 S_ACCTBAL
> 1380401476 62 20000 19972 S_COMMENT
> 1380401473 8 20000 8 "_SALT_"
> 1380401513 1 20000 20000 "_SALT_", S_SUPPKEY
> --- SQL operation complete.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)