IMPALA-6781: expand ORDER BY in some TPCH queries Fix determinism in TPCH Q3, Q10, Q18 by adding another column to the queries' ORDER BY to guarantee deterministic results. With TPCH 10000 these queries were producing differing results across stress test runs. They were all valid, but the LIMIT without the more specific ORDER BY meant that several different result sets were possible. By adding these columns, we sort by a column that has uniqueness across all rows returned.
Testing: Repeated runs of these specific TPCH queries via: impala-py.test -k Q18 tests/query_test/test_tpch_queries.py Stress test on a 140-node cluster with TPCH 10000 loaded. Previously when using these queries, the stress test would incorrectly report incorrect results. Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae Reviewed-on: http://gerrit.cloudera.org:8080/10351 Reviewed-by: Michael Brown <[email protected]> Tested-by: Impala Public Jenkins <[email protected]> Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/ca0b8eb1 Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/ca0b8eb1 Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/ca0b8eb1 Branch: refs/heads/2.x Commit: ca0b8eb11cc34657ae894ed6285057cdda7bfe4f Parents: 6ba41b6 Author: Michael Brown <[email protected]> Authored: Fri May 4 09:42:09 2018 -0700 Committer: Impala Public Jenkins <[email protected]> Committed: Wed May 9 23:10:16 2018 +0000 ---------------------------------------------------------------------- testdata/workloads/tpch/queries/tpch-q10.test | 6 ++++-- testdata/workloads/tpch/queries/tpch-q18.test | 6 ++++-- testdata/workloads/tpch/queries/tpch-q3.test | 6 ++++-- 3 files changed, 12 insertions(+), 6 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/impala/blob/ca0b8eb1/testdata/workloads/tpch/queries/tpch-q10.test ---------------------------------------------------------------------- diff --git a/testdata/workloads/tpch/queries/tpch-q10.test b/testdata/workloads/tpch/queries/tpch-q10.test index 45038d2..b877487 100644 --- a/testdata/workloads/tpch/queries/tpch-q10.test +++ b/testdata/workloads/tpch/queries/tpch-q10.test @@ -2,6 +2,7 @@ ---- QUERY: TPCH-Q10 # Q10 - Returned Item Reporting Query # Converted select from multiple tables to joins +# IMPALA-6781: add c_custkey to ORDER BY list for determinstic results select c_custkey, c_name, @@ -32,7 +33,8 @@ group by c_address, c_comment order by - revenue desc + revenue desc, + c_custkey limit 20 ---- RESULTS 57040,'Customer#000057040',734235.2455,632.87,'JAPAN','Eioyzjf4pp','22-895-641-3466','sits. slyly regular requests sleep alongside of the regular inst' @@ -57,4 +59,4 @@ limit 20 23431,'Customer#000023431',554269.5360,3381.86,'ROMANIA','HgiV0phqhaIa9aydNoIlb','29-915-458-2654','nusual, even instructions: furiously stealthy n' ---- TYPES bigint, string, decimal, decimal, string, string, string, string -==== \ No newline at end of file +==== http://git-wip-us.apache.org/repos/asf/impala/blob/ca0b8eb1/testdata/workloads/tpch/queries/tpch-q18.test ---------------------------------------------------------------------- diff --git a/testdata/workloads/tpch/queries/tpch-q18.test b/testdata/workloads/tpch/queries/tpch-q18.test index db39222..bdf74da 100644 --- a/testdata/workloads/tpch/queries/tpch-q18.test +++ b/testdata/workloads/tpch/queries/tpch-q18.test @@ -1,6 +1,7 @@ ==== ---- QUERY: TPCH-Q18 # Q18 - Large Value Customer Query +# IMPALA-6781: add o_orderkey to ORDER BY list for determinstic results select c_name, c_custkey, @@ -33,7 +34,8 @@ group by o_totalprice order by o_totalprice desc, - o_orderdate + o_orderdate, + o_orderkey limit 100 ---- RESULTS 'Customer#000128120',128120,4722021,'1994-04-07',544089.09,323.00 @@ -95,4 +97,4 @@ limit 100 'Customer#000088703',88703,2995076,'1994-01-30',363812.12,302.00 ---- TYPES STRING, BIGINT, BIGINT, STRING, DECIMAL, DECIMAL -==== \ No newline at end of file +==== http://git-wip-us.apache.org/repos/asf/impala/blob/ca0b8eb1/testdata/workloads/tpch/queries/tpch-q3.test ---------------------------------------------------------------------- diff --git a/testdata/workloads/tpch/queries/tpch-q3.test b/testdata/workloads/tpch/queries/tpch-q3.test index 3decb17..9b57607 100644 --- a/testdata/workloads/tpch/queries/tpch-q3.test +++ b/testdata/workloads/tpch/queries/tpch-q3.test @@ -1,6 +1,7 @@ ==== ---- QUERY: TPCH-Q3 # Q3 - Shipping Priority Query +# IMPALA-6781: add order by l_orderkey for determinstic results select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, @@ -22,7 +23,8 @@ group by o_shippriority order by revenue desc, - o_orderdate + o_orderdate, + l_orderkey limit 10 ---- RESULTS 2456423,406181.0111,'1995-03-05',0 @@ -37,4 +39,4 @@ limit 10 2300070,367371.1452,'1995-03-13',0 ---- TYPES BIGINT, DECIMAL, STRING, INT -==== \ No newline at end of file +====
