Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

2007-04-14 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes: Instead of sorting, I suggest the quickselect() algorithm, which is O(n). What for? Common cases have less than half a dozen entries. That is not the place we need to be spending engineering effort --- what we need to worry about is what's the choice

Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

2007-04-14 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: I think the concern about condition redundancy should be attacked separately. How about just comparing whether they have common prefixes of conditions? I admit I don't understand what would happen with indexes defined like (lower(A), B, C) versus (A,

Re: [HACKERS] choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

2007-04-14 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane wrote: Has anyone got any thoughts about the best way to do this? How about doing both: sort the index by index scan cost; then pick the first index on the list and start adding indexes when they lower the cost. When adding each index,

choose_bitmap_and again (was Re: [PERFORM] Strangely Variable Query Performance)

2007-04-13 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: [ strange planner misbehavior in 8.2.3 ] After some off-list investigation (thanks, Steve, for letting me poke at your machine), the short answer is that the heuristics used by choose_bitmap_and() suck. The problem query is like select ... from ds where

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: On a Postgres 8.2.3 server, I've got a query that is running very slow in some cases. Could we see the exact definition of that table and its indexes? It looks like the planner is missing the bitmap scan for some reason, but I've not seen a case like that

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Could we see the exact definition of that table and its indexes? It looks like the planner is missing the bitmap scan for some reason, but I've not seen a case like that before. Also, I assume the restriction on receipt date is very nonselective? It doesn't seem to have changed the estimated

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Scott Marlowe
On Thu, 2007-04-12 at 16:03, Steve wrote: Hey there; On a Postgres 8.2.3 server, I've got a query that is running very slow in some cases. With some work, I've determined the 'slow part' of the query. :) This is a query on a table with like 10 million rows or something like that.

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Oy vey ... I hope this is a read-mostly table, because having that many indexes has got to be killing your insert/update performance. Hahaha yeah these are read-only tables. Nightly inserts/updates. Takes a few hours, depending on how many records (between 4 and 10 usually). But during

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4) Filter: ((receipt = '1998-12-30'::date) AND (encounter_id = ANY

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Scott Marlowe
On Thu, 2007-04-12 at 17:04, Steve wrote: Seq Scan on detail_summary ds (cost=0.00..1902749.83 rows=9962 width=4) Filter: ((receipt = '1998-12-30'::date) AND (encounter_id = ANY

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: Datum ddd_text_revcmp(PG_FUNCTION_ARGS){ char* arg1=(char*)VARDATA(PG_GETARG_TEXT_P(0)); char* arg2=(char*)VARDATA(PG_GETARG_TEXT_P(1)); if((*arg1) != (*arg2)){ PG_RETURN_INT32(*arg2 - *arg1);

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
[ itch... ] That code is just completely wrong, because the contents of a TEXT datum aren't guaranteed null-terminated. It'd be better to invoke bttextcmp and negate its result. That's not relevant to your immediate problem, but if you've noticed any strange behavior with your text_revop

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
So there's a misjudgment of the number of rows returned by a factor of about 88. That's pretty big. Since you had the same number without the receipt date (I think...) then it's the encounter_id that's not being counted right. Try upping the stats target on that column and running analyze

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Scott Marlowe [EMAIL PROTECTED] writes: So there's a misjudgment of the number of rows returned by a factor of about 88. That's pretty big. Since you had the same number without the receipt date (I think...) then it's the encounter_id that's not being counted right. I don't think that's

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Here's my planner parameters: seq_page_cost = 1.0 # measured on an arbitrary scale random_page_cost = 1.5 # same scale as above cpu_tuple_cost = 0.001 # same scale as above cpu_index_tuple_cost = 0.0005 # same scale as above

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: Here's my planner parameters: I copied all these, and my 8.2.x still likes the bitmap scan a lot better than the seqscan. Furthermore, I double-checked the CVS history and there definitely haven't been any changes in that area in REL8_2 branch since 8.2.3. So

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: ... even if I force it to use the indexes (enable_seqscan=off) it doesn't make it any faster really :/ Does that change the plan, or do you still get a seqscan? BTW, how big is this table really (how many rows)? regards, tom lane

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
It's a redhat enterprise machine running AMD x64 processors. Linux ers3.dddcorp.com 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:13:42 EST 2007 x86_64 x86_64 x86_64 GNU/Linux It was compiled by me, straight up, nothing weird at all, no odd compiler options or wahtever :) So yeah :/ I'm quite

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Table size: 16,037,728 rows With enable_seqscan=off I get: Bitmap Heap Scan on detail_summary ds (cost=4211395.20..4213045.32 rows=1099 width=4) Recheck Cond: ((receipt = '1998-12-30'::date) AND (encounter_id = ANY

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Here's the explain analyze with seqscan = off: Bitmap Heap Scan on detail_summary ds (cost=4211395.20..4213045.32 rows=1099 width=4) (actual time=121288.825..121305.908 rows=112 loops=1) Recheck Cond: ((receipt = '1998-12-30'::date) AND (encounter_id = ANY

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: With enable_seqscan=off I get: - Bitmap Index Scan on detail_summary_receipt_encounter_idx (cost=0.00..4211395.17 rows=1099 width=0) Index Cond: ((receipt = '1998-12-30'::date) AND (encounter_id = ANY ... The explain analyze is pending,

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
If the other indexes are removed, with enable_seqscan=on: Bitmap Heap Scan on detail_summary ds (cost=154.10..1804.22 rows=1099 width=4) Recheck Cond: (encounter_id = ANY

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: Either way, it runs perfectly fast. So it looks like the indexes are confusing this query like you suspected. Any advise? Wow --- sometimes grasping at straws pays off. I was testing here with just a subset of the indexes to save build time, but I bet that

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Okay -- I started leaving indexes on one by one. The explain broke when the detail_summary_receipt_encounter_idx index was left on (receipt, encounter_id). Just dropping that index had no effect, but there's a LOT of indexes that refer to receipt. So on a hunch I tried dropping all indexes

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: Okay -- I started leaving indexes on one by one. ... So does this mean I should experiment with dropping those indexes? No, I think this means there's a planner bug to fix. I haven't quite scoped out what it is yet, though.

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Tom Lane
Steve [EMAIL PROTECTED] writes: Just dropping that index had no effect, but there's a LOT of indexes that refer to receipt. So on a hunch I tried dropping all indexes that refer to receipt date and that worked -- so it's the indexes that contain receipt date that are teh problem. I'm

Re: [PERFORM] Strangely Variable Query Performance

2007-04-12 Thread Steve
Here you go: detail_summary_b_record_status_idx detail_summary_batch_id_idx detail_summary_batchnum_idx detail_summary_carrier_id_idx detail_summary_duplicate_id_idx detail_summary_e_record_status_idx detail_summary_encounter_id_idx detail_summary_encounternum_idx