Re: [PERFORM] Effects of cascading references in foreign keys
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Tom Lane wrote: > Michael Fuhr <[EMAIL PROTECTED]> writes: > > On Sat, Oct 29, 2005 at 09:49:47AM -0500, Bruno Wolff III wrote: > >> It looks like this feature was added last May, so I think it only applies > >> to 8.1. > > > Earlier versions appear to have at least some kind of optimization. > > Yeah. IIRC, for quite some time we've had tests inside the FK update > triggers to not bother to search the other table if the key value hasn't > changed. What we did in 8.1 was to push that test further upstream, so > that the trigger event isn't even queued if the key value hasn't > changed. (This is why you don't see the trigger shown as being called > even once.) > > Looking at this, I wonder if there isn't a bug or at least an > inefficiency in 8.1. The KeysEqual short circuit tests are still there > in ri_triggers.c; aren't they now redundant with the test in triggers.c? > And don't they need to account for the special case mentioned in the > comment in triggers.c, that the RI check must still be done if we are > looking at a row updated by the same transaction that created it? > > regards, tom lane > > ---(end of broadcast)--- > TIP 6: explain analyze is your friend -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Incorrect row estimates in plan?
Hi, I am having some trouble understanding a plan and was wondering if anyone could guide me. The query in question here seems to be showing some incorrect row counts. I have vacuumed and analyzed the table, but the estimate versus the actual total seems to be way out (est 2870 vs actual 85k). Perhaps I am reading the plan incorrectly though. (hopefully the plan below is readable) db=# select version(); version - PostgreSQL 8.2.4 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21) db=# show shared_buffers ; shared_buffers 300MB #4GB ram, 2 SATA striped, XFS db=# show default_statistics_target; default_statistics_target --- 100 # stats have been raised to 1000 on both the destip and srcip columns # create index slog_gri_idx on slog (gid,rule,(case when rule in (8,9) then destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10)); # vacuum analyze verbose slog; db=# show random_page_cost ; random_page_cost -- 3 db=# select count(*) from slog count - 1,019,121 db=#select count(*) as total from slog where gid=1::INTEGER and rule in (1,2,8,9,10) and (case when rule in (8,9) then destip else srcip end)='192.168.10.23'::INET; total --- 83,538 # problematic query explain analyze select coalesce(uri,host((case when rule in (8,9) then srcip else destip end))) as destip, case when rule in (8,9) then 'ext' else 'int' end as tp, count(*) as total, coalesce(sum(destbytes),0)+coalesce(sum(srcbytes),0) as bytes from slog where gid=1::INTEGER and rule in (1,2,8,9,10) and (case when rule in (8,9) then destip else srcip end)='192.168.10.23'::INET group by destip,tp order by bytes desc,total desc,destip limit 20 Limit (cost=6490.18..6490.23 rows=20 width=61) (actual time=2036.968..2037.220 rows=20 loops=1) -> Sort (cost=6490.18..6490.90 rows=288 width=61) (actual time=2036.960..2037.027 rows=20 loops=1) Sort Key: (COALESCE(sum(destbytes), 0::numeric) + COALESCE(sum(srcbytes), 0::numeric)), count(*), COALESCE(uri, host(CASE WHEN (rule = ANY ('{8,9}'::integer[])) THEN srcip ELSE destip END)) -> HashAggregate (cost=6470.50..6478.42 rows=288 width=61) (actual time=2008.478..2022.125 rows=2057 loops=1) -> Bitmap Heap Scan on slog (cost=82.98..6434.62 rows=2870 width=61) (actual time=50.235..1237.948 rows=83538 loops=1) Recheck Cond: ((gid = 1) AND (rule = ANY ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY ('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet)) -> Bitmap Index Scan on slog_gri_idx (cost=0.00..82.26 rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1) Index Cond: ((gid = 1) AND (rule = ANY ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY ('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet)) Total runtime: 2037.585 ms Does anyone have any suggestions? Thanks! -- View this message in context: http://www.nabble.com/Incorrect-row-estimates-in-plan--tf4522692.html#a12902068 Sent from the PostgreSQL - performance mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Incorrect row estimates in plan?
pgdba <[EMAIL PROTECTED]> writes: > -> Bitmap Heap Scan on slog (cost=82.98..6434.62 rows=2870 > width=61) (actual time=50.235..1237.948 rows=83538 loops=1) > Recheck Cond: ((gid = 1) AND (rule = ANY > ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY > ('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet)) > -> Bitmap Index Scan on slog_gri_idx (cost=0.00..82.26 > rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1) > Index Cond: ((gid = 1) AND (rule = ANY > ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY > ('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet)) [ blink... ] Pray tell, what is the definition of this index? With such a bizarre scan condition, it's unlikely you'll get any really accurate row estimate. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Searching for the cause of a bad plan
> Csaba, please can you copy that data into fresh tables, re-ANALYZE and > then re-post the EXPLAINs, with stats data. Here you go, fresh experiment attached. Cheers, Csaba. db=# \d temp_table_a Table "public.temp_table_a" Column | Type | Modifiers ++--- a | bigint | not null b | bigint | not null Indexes: "temp_pk_table_a" PRIMARY KEY, btree (a, b) db=# \d temp_table_b1 Table "public.temp_table_b1" Column | Type | Modifiers ++--- b | bigint | not null Indexes: "temp_pk_table_b1" PRIMARY KEY, btree (b) db=# \d temp_table_b2 Table "public.temp_table_b2" Column | Type | Modifiers ++--- b | bigint | not null Indexes: "temp_pk_table_b2" PRIMARY KEY, btree (b) Foreign-key constraints: "temp_fk_table_b2_b1" FOREIGN KEY (b) REFERENCES temp_table_b1(b) db=# analyze verbose temp_table_a; INFO: analyzing "public.temp_table_a" INFO: "temp_table_a": scanned 3000 of 655299 pages, containing 1887000 live rows and 0 dead rows; 3000 rows in sample, 412183071 estimated total rows db=# analyze verbose temp_table_b1; INFO: analyzing "public.temp_table_b1" INFO: "temp_table_b1": scanned 3000 of 57285 pages, containing 2232000 live rows and 0 dead rows; 3000 rows in sample, 42620040 estimated total rows db=# analyze verbose temp_table_b2; INFO: analyzing "public.temp_table_b2" INFO: "temp_table_b2": scanned 57 of 57 pages, containing 41967 live rows and 0 dead rows; 3000 rows in sample, 41967 estimated total rows db=# select * from pg_stats where tablename = 'temp_table_a'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals |most_common_freqs | histogram_bounds| correlation +--+-+---+---++---+-++- public | temp_table_a | a | 0 | 8 | 1261 | {74117,700022128,72317,79411,700023682,76025,72843,700014833,76505,78694} | {0.015,0.0116667,0.0097,0.0097,0.0097,0.009,0.0087,0.008,0.0077,0.0077} | {70010872,70035,73086,75843,78974,700011369,700013305,700015988,700019048,700022257,83151} |0.850525 public | temp_table_a | b | 0 | 8 | -1 | | | {41708986,700707712,803042997,7004741432,7007455842,7009719495,7013869874,7016501748,7019139288,7025078292,7037930133} |0.646759 db=# select * from pg_stats where tablename = 'temp_table_b1'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation +---+-+---+---++--+---+-+- public | temp_table_b1 | b | 0 | 8 | -1 | | | {41719236,801608645,7003211583,7007403678,7011591097,7016707278,7021089839,7025573684,7029316772,7033888226,8002470137} |0.343186 db=# select * from pg_stats where tablename = 'temp_table_b2'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation +---+-+---+---++--+---+---+- public | temp_table_b2 | b | 0 | 8 | -1 | | | {70054,7000352893,7000357745,7000362304,7000367025,7000371629,7000376587,7000381229,7009567724,7023749432,7034300740} | -0.216073 db=# prepare test_001(bigint) as db-# SELECT tb.* db-# FROM
Re: [PERFORM] Searching for the cause of a bad plan
Csaba Nagy <[EMAIL PROTECTED]> writes: > db=# analyze verbose temp_table_a; > INFO: analyzing "public.temp_table_a" > INFO: "temp_table_a": scanned 3000 of 655299 pages, containing 1887000 live > rows and 0 dead rows; 3000 rows in sample, 412183071 estimated total rows Hmm. So the bottom line here is probably that that's not a big enough sample to derive a realistic n_distinct value. Or maybe it is ... how many values of "a" are there really, and what's the true distribution of counts? Do the plan estimates get closer to reality if you set a higher statistics target? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Incorrect row estimates in plan?
Hi Tom, Tom Lane-2 wrote: > > pgdba <[EMAIL PROTECTED]> writes: >> -> Bitmap Heap Scan on slog (cost=82.98..6434.62 rows=2870 >> width=61) (actual time=50.235..1237.948 rows=83538 loops=1) >> Recheck Cond: ((gid = 1) AND (rule = ANY >> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY >> ('{8,9}'::integer[])) THEN destip ELSE srcip END = >> '192.168.10.23'::inet)) >> -> Bitmap Index Scan on slog_gri_idx >> (cost=0.00..82.26 >> rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1) >> Index Cond: ((gid = 1) AND (rule = ANY >> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY >> ('{8,9}'::integer[])) THEN destip ELSE srcip END = >> '192.168.10.23'::inet)) > > [ blink... ] Pray tell, what is the definition of this index? > > With such a bizarre scan condition, it's unlikely you'll get any really > accurate row estimate. > > regards, tom lane > > Original index: "create index slog_gri_idx on slog (gid,rule,(case when rule in (8,9) then destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10))" The purpose of that index is to match a specific query (one that gets run frequently and needs to be fast). It is using the destip when rule 8/9, and srcip when other, but only for a subset of the rules (1,2,8,9,10). There are about 18 rules in total, but I'm only interested in those 5. I have tried a couple of indices like: create index test_destip_idx on slog (gid,destip) where rule in (8,9); create index test_srcip_idx on slog (gid,srcip) where rule in (1,2,10); But the original slog_gri_idx index was used instead. Is there a way that I can rewrite that index then? Not that I'm a fan of a CASE statement in a functional index, but I'm at a loss as to how else I can create this. Or what else I can look into to make this faster? -- View this message in context: http://www.nabble.com/Incorrect-row-estimates-in-plan--tf4522692.html#a12903194 Sent from the PostgreSQL - performance mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Incorrect row estimates in plan?
pgdba <[EMAIL PROTECTED]> writes: > Tom Lane-2 wrote: > -> Bitmap Index Scan on slog_gri_idx > (cost=0.00..82.26 > rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1) > Index Cond: ((gid = 1) AND (rule = ANY > ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY > ('{8,9}'::integer[])) THEN destip ELSE srcip END = > '192.168.10.23'::inet)) >> >> [ blink... ] Pray tell, what is the definition of this index? > Original index: "create index slog_gri_idx on slog (gid,rule,(case when rule > in (8,9) then > destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10))" > The purpose of that index is to match a specific query (one that gets run > frequently and needs to be fast). Ah. I didn't think you would've put such a specific thing into an index definition, but if you're stuck supporting such badly written queries, maybe there's no other way. I rather doubt that you're going to be able to make this query any faster than it is, short of buying enough RAM to keep the whole table RAM-resident. Pulling 8 random rows in 1200 msec doesn't sound all that slow to me. The ultimate solution might be to rethink your table designs ... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Incorrect row estimates in plan?
Tom Lane-2 wrote: > > pgdba <[EMAIL PROTECTED]> writes: >> Tom Lane-2 wrote: >> -> Bitmap Index Scan on slog_gri_idx >> (cost=0.00..82.26 >> rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1) >> Index Cond: ((gid = 1) AND (rule = ANY >> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY >> ('{8,9}'::integer[])) THEN destip ELSE srcip END = >> '192.168.10.23'::inet)) >>> >>> [ blink... ] Pray tell, what is the definition of this index? > >> Original index: "create index slog_gri_idx on slog (gid,rule,(case when >> rule >> in (8,9) then >> destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10))" > >> The purpose of that index is to match a specific query (one that gets run >> frequently and needs to be fast). > > Ah. I didn't think you would've put such a specific thing into an index > definition, but if you're stuck supporting such badly written queries, > maybe there's no other way. > > I rather doubt that you're going to be able to make this query any > faster than it is, short of buying enough RAM to keep the whole table > RAM-resident. Pulling 8 random rows in 1200 msec doesn't sound > all that slow to me. > > The ultimate solution might be to rethink your table designs ... > > regards, tom lane > Badly written the query may be, but I do have the opportunity to change it. Part of the problem is that I cannot come up with a better way of writing it. What about the discrepancy between the estimated row count and the actual row count for that index access? "Bitmap Index Scan on slog_gri_idx (cost=0.00..82.26 rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1)" Is there anything I can do to influence that (not that it is likely to change the plan, but...). I vacuumed and analyzed after I created the index, so the stats should be at least be close (with stats target set to 1000 there). -- View this message in context: http://www.nabble.com/Incorrect-row-estimates-in-plan--tf4522692.html#a12905186 Sent from the PostgreSQL - performance mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Difference in query plan when using = or > in where clause
Hi, I am curious as to why this occurs. Why does an = change the query plan so drastically? When my query is: Select count(*) from View_A WHERE tradedate = '20070801'; The query plan is as below: I see that the scan on the alloctbl is being indexed on k_alloctbl_blockid_status -> Bitmap Index Scan on idx_tradeblocktbl_tradeate (cost=0.00..50.47rows=1444 width=0) (actual time= 0.040..0.040 rows=106 loops=1) Index Cond: ((tradedate >= '2007-08-01'::date) AND (tradedate <= '2007-09-24'::date)) -> Bitmap Heap Scan on alloctbl a (cost=4.59..270.73rows=70 width=16) (actual time= 0.010..0.011 rows=1 loops=7) Recheck Cond: (tr.recid = a.blockid) -> Bitmap Index Scan on k_alloctbl_blockid_status (cost= 0.00..4.59 rows=70 width=0) (actual time=0.007..0.007 rows=1 loops=7) Index Cond: (tr.recid = a.blockid) Total runtime: 1.453 ms But when my query is: Select count(*) from View_A WHERE tradedate BETWEEN '20070801' and '20070901'; The query plan is: - -> Bitmap Heap Scan on tradeblocktbl tr (cost= 50.47..2849.67 rows=1444 width=80) (actual time=0.095..0.218 rows=104 loops=1) Recheck Cond: ((tradedate >= '2007-08-01'::date) AND (tradedate <= '2007-09-24'::date)) -> Bitmap Index Scan on idx_tradeblocktbl_tradeate (cost=0.00..50.47 rows=1444 width=0) (actual time=0.050..0.050 rows=106 loops=1) Index Cond: ((tradedate >= '2007-08-01'::date) AND (tradedate <= '2007-09-24'::date)) -> Sort (cost=99007.79..100479.68 rows=588755 width=16) (actual time=2660.009..3150.887 rows=588755 loops=1) Sort Key: a.blockid -> Seq Scan on alloctbl a (cost=0.00..20442.55rows=588755 width=16) (actual time= 0.026..764.833 rows=588755 loops=1) Total runtime: 3590.715 ms Thank you. Radhika -- It is all a matter of perspective. You choose your view by choosing where to stand. --Larry Wall