Hi! I'm about to do some benchmarking on -HEAD on some hardware I have available, and it seems I'm hitting a rather weird issue that causes the OSDL DBT-3 benchmark to run very slowly, eating CPU time for hours ...
It seems that the issue is caused by the following query (in case it gets line-wrapped: http://www.kaltenbrunner.cc/files/dbt3_with_hashjoin.txt):

select n_name,
       sum(l_extendedprice * (1 - l_discount)) as revenue
from customer, orders, lineitem, supplier, nation, region
where c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and l_suppkey = s_suppkey
  and c_nationkey = s_nationkey
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'AFRICA'
  and o_orderdate >= date '1993-01-01'
  and o_orderdate < date '1993-01-01' + interval '1 year'
group by n_name
order by revenue desc;

That results in the following plan on my box:

Sort  (cost=2543391.75..2543391.81 rows=25 width=37)
  Sort Key: sum((lineitem.l_extendedprice * (1::double precision - lineitem.l_discount)))
  ->  HashAggregate  (cost=2543390.73..2543391.17 rows=25 width=37)
        ->  Hash Join  (cost=440864.81..2543027.40 rows=72666 width=37)
              Hash Cond: ((orders.o_custkey = customer.c_custkey) AND (supplier.s_nationkey = customer.c_nationkey))
              ->  Hash Join  (cost=377714.59..2415568.01 rows=1816643 width=49)
                    Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                    ->  Nested Loop  (cost=13.65..1719683.85 rows=12000672 width=49)
                          ->  Merge Join  (cost=0.00..10248.66 rows=20000 width=41)
                                Merge Cond: (nation.n_nationkey = supplier.s_nationkey)
                                ->  Nested Loop  (cost=0.00..19.19 rows=5 width=33)
                                      ->  Index Scan using pk_nation on nation  (cost=0.00..9.38 rows=25 width=37)
                                      ->  Index Scan using pk_region on region  (cost=0.00..0.38 rows=1 width=4)
                                            Index Cond: (nation.n_regionkey = region.r_regionkey)
                                            Filter: (r_name = 'AFRICA'::bpchar)
                                ->  Index Scan using i_s_nationkey on supplier  (cost=0.00..9779.46 rows=100000 width=8)
                          ->  Bitmap Heap Scan on lineitem  (cost=13.65..77.16 rows=665 width=16)
                                Recheck Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                                ->  Bitmap Index Scan on i_l_suppkey  (cost=0.00..13.65 rows=665 width=0)
                                      Index Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                    ->  Hash  (cost=372023.51..372023.51 rows=2270971 width=8)
                          ->  Bitmap Heap Scan on orders  (cost=41391.94..372023.51 rows=2270971 width=8)
                                Recheck Cond: ((o_orderdate >= '1993-01-01'::date) AND (o_orderdate < '1994-01-01 00:00:00'::timestamp without time zone))
                                ->  Bitmap Index Scan on i_o_orderdate  (cost=0.00..41391.94 rows=2270971 width=0)
                                      Index Cond: ((o_orderdate >= '1993-01-01'::date) AND (o_orderdate < '1994-01-01 00:00:00'::timestamp without time zone))
              ->  Hash  (cost=55647.15..55647.15 rows=1500615 width=8)
                    ->  Seq Scan on customer  (cost=0.00..55647.15 rows=1500615 width=8)
(27 rows)

So the planner really thinks that hashing gigantic amounts of data is generally a good idea. This seems to be independent of work_mem: the plan looks the same with 1MB and with 126MB (which I had during the run). The profile of the backend eating the CPU looks similar to:

26351 27.9047 ExecScanHashBucket
 8239  8.7248 hash_seq_search
 6984  7.3958 hash_search_with_hash_value

Setting enable_hashjoin to off results in a runtime of about 2.5 minutes (http://www.kaltenbrunner.cc/files/dbt3_without_hashjoin.txt):

Sort  (cost=3700257.38..3700257.45 rows=25 width=37) (actual time=286820.962..286820.968 rows=5 loops=1)
  Sort Key: sum((lineitem.l_extendedprice * (1::double precision - lineitem.l_discount)))
  ->  HashAggregate  (cost=3700256.37..3700256.80 rows=25 width=37) (actual time=286820.932..286820.941 rows=5 loops=1)
        ->  Nested Loop  (cost=730956.43..3699893.04 rows=72666 width=37) (actual time=43551.767..286488.555 rows=72441 loops=1)
              Join Filter: (customer.c_nationkey = supplier.s_nationkey)
              ->  Merge Join  (cost=730956.43..3624153.73 rows=1816643 width=49) (actual time=43281.710..257082.739 rows=1822547 loops=1)
                    Merge Cond: (lineitem.l_orderkey = orders.o_orderkey)
                    ->  Index Scan using i_l_orderkey on lineitem  (cost=0.00..2715943.34 rows=60003360 width=16) (actual time=32.868..123668.380 rows=59991868 loops=1)
                    ->  Sort  (cost=730956.43..732091.92 rows=454194 width=41) (actual time=43248.797..45754.223 rows=1822547 loops=1)
                          Sort Key: orders.o_orderkey
                          ->  Merge Join  (cost=670885.68..688278.21 rows=454194 width=41) (actual time=34469.359..42050.059 rows=455262 loops=1)
                                Merge Cond: (customer.c_custkey = orders.o_custkey)
                                ->  Sort  (cost=59105.79..59856.10 rows=300123 width=41) (actual time=8113.826..8491.532 rows=299493 loops=1)
                                      Sort Key: customer.c_custkey
                                      ->  Nested Loop  (cost=781.13..31801.81 rows=300123 width=41) (actual time=107.537..7461.355 rows=299493 loops=1)
                                            ->  Nested Loop  (cost=1.06..11.00 rows=5 width=33) (actual time=0.030..0.296 rows=5 loops=1)
                                                  Join Filter: (nation.n_regionkey = region.r_regionkey)
                                                  ->  Index Scan using pk_nation on nation  (cost=0.00..9.38 rows=25 width=37) (actual time=0.007..0.063 rows=25 loops=1)
                                                  ->  Materialize  (cost=1.06..1.07 rows=1 width=4) (actual time=0.002..0.004 rows=1 loops=25)
                                                        ->  Seq Scan on region  (cost=0.00..1.06 rows=1 width=4) (actual time=0.009..0.018 rows=1 loops=1)
                                                              Filter: (r_name = 'AFRICA'::bpchar)
                                            ->  Bitmap Heap Scan on customer  (cost=780.07..5607.85 rows=60025 width=8) (actual time=61.150..1331.466 rows=59899 loops=5)
                                                  Recheck Cond: (nation.n_nationkey = customer.c_nationkey)
                                                  ->  Bitmap Index Scan on i_c_nationkey  (cost=0.00..780.07 rows=60025 width=0) (actual time=44.637..44.637 rows=59899 loops=5)
                                                        Index Cond: (nation.n_nationkey = customer.c_nationkey)
                                ->  Sort  (cost=611779.89..617457.31 rows=2270971 width=8) (actual time=26355.515..29471.963 rows=2276859 loops=1)
                                      Sort Key: orders.o_custkey
                                      ->  Bitmap Heap Scan on orders  (cost=41391.94..372023.51 rows=2270971 width=8) (actual time=1630.604..16266.102 rows=2276859 loops=1)
                                            Recheck Cond: ((o_orderdate >= '1993-01-01'::date) AND (o_orderdate < '1994-01-01 00:00:00'::timestamp without time zone))
                                            ->  Bitmap Index Scan on i_o_orderdate  (cost=0.00..41391.94 rows=2270971 width=0) (actual time=1352.037..1352.037 rows=2276859 loops=1)
                                                  Index Cond: ((o_orderdate >= '1993-01-01'::date) AND (o_orderdate < '1994-01-01 00:00:00'::timestamp without time zone))
              ->  Index Scan using pk_supplier on supplier  (cost=0.00..0.03 rows=1 width=8) (actual time=0.010..0.012 rows=1 loops=1822547)
                    Index Cond: (lineitem.l_suppkey = supplier.s_suppkey)
Total runtime: 286984.386 ms
(34 rows)

(About 120s of that seem to be EXPLAIN ANALYZE overhead.)

FWIW, the box in question is a dual 2.6GHz Opteron with 8GB of RAM. WAL is on the onboard SmartArray with BBWC (RAID 10 on 4 disks) and the data is on a 14-disk Linux software RAID 10, running Debian Sarge/AMD64 with kernel 2.6.17.7. The dbt3 database was initialized with a scale factor of 10 (running with a scale factor of just 1 works fine).

Stefan
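
PS: in case somebody wants to reproduce the comparison, the toggle I used is just the standard per-session planner GUC; a minimal sketch (the elided WHERE clause stands for the full query text above):

```sql
-- disable hash joins for this session only (not a server-wide change)
set enable_hashjoin = off;

explain analyze
select n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
from customer, orders, lineitem, supplier, nation, region
where ...;  -- same join and filter conditions as the query above

-- restore the default planner behaviour
reset enable_hashjoin;
```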