Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-11 Thread Mitch Skinner
On Thu, 2005-11-10 at 12:23 -0500, Tom Lane wrote:
> Apparently, you are using a platform and/or locale in which strcoll() is
> spectacularly, god-awfully slow --- on the order of 10 msec per comparison.

The version with the condition is definitely doing more I/O.  The
version without the condition doesn't read at all.  I strace'd an
explain analyze for each separately, and this is what I ended up with
(the first is with the condition, the second is without):

bash-2.05b$ cut '-d(' -f1 subsourcestrace | sort | uniq -c
   7127 gettimeofday
  75213 _llseek
      1 Process 30227 attached - interrupt to quit
      1 Process 30227 detached
 148671 read
      2 recv
      4 semop
      4 send
bash-2.05b$ cut '-d(' -f1 subsourcestrace-nocond | sort | uniq -c
   9103 gettimeofday
      7 _llseek
      1 Process 30227 attached - interrupt to quit
      1 Process 30227 detached
      2 recv
      4 send

For the moment, all of the rows in the view I'm selecting from satisfy
the condition, so the output of both queries is the same.  The relevant
rows of the underlying tables are probably pretty contiguous (all of the
rows satisfying the condition and the join were inserted at the same
time).  Could it just be the result of a weird physical distribution of
data in the table/index files?  For the fast query, the actual number of
rows is a lot less than the planner expects.

> This is a bit hard to believe but I can't make sense of those numbers
> any other way.  What is the platform exactly, and what database locale
> and encoding are you using?

It's RHEL 3 on x86:
[EMAIL PROTECTED] root]# uname -a
Linux rehoboam 2.4.21-32.0.1.ELsmp #1 SMP Tue May 17 17:52:23 EDT 2005
i686 i686 i386 GNU/Linux

The glibc version is 2.3.2.

statgen=# select current_setting('lc_collate');
 current_setting
-----------------
 en_US.UTF-8
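
For completeness, the other locale/encoding settings you asked about can be
read the same way; I haven't pasted the output, but the queries are just:

    select current_setting('lc_ctype');
    select current_setting('server_encoding');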

Not sure what's relevant, but here's some more info:
The machine has 4.5 GiB of RAM and a 5-disk RAID 5 array.  It's a dual
3.2 GHz Xeon.

           relname           | relpages |  reltuples
-----------------------------+----------+-------------
 external_id_map             |   126883 | 1.55625e+07
 external_id_map_primary_key |    64607 | 1.55625e+07
 subject                     |       31 |        1186
 subject_pkey                |       19 |        1186
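
For reference, that table is straight out of pg_class; a query along these
lines produces it:

    select relname, relpages, reltuples
      from pg_class
     where relname in ('external_id_map', 'external_id_map_primary_key',
                       'subject', 'subject_pkey');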

I've attached the output of select name, setting from pg_settings.

And, in case my original message isn't handy, the explain analyze output
and table/view info is below.

Thanks for taking a look,
Mitch

statgen=# explain analyze select * from subject_source;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..330.72 rows=1186 width=46) (actual time=0.051..8.890 rows=1186 loops=1)
   Merge Cond: ("outer".id = "inner".target_id)
   ->  Index Scan using subject_pkey on subject norm  (cost=0.00..63.36 rows=1186 width=28) (actual time=0.022..1.441 rows=1186 loops=1)
   ->  Index Scan using external_id_map_primary_key on external_id_map eim  (cost=0.00..2485226.70 rows=15562513 width=26) (actual time=0.016..2.532 rows=2175 loops=1)
 Total runtime: 9.592 ms
(5 rows)

statgen=# explain analyze select * from subject_source where source='SCH';
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..1147.33 rows=1 width=46) (actual time=0.054..20258.161 rows=1186 loops=1)
   Merge Cond: ("outer".id = "inner".target_id)
   ->  Index Scan using subject_pkey on subject norm  (cost=0.00..63.36 rows=1186 width=28) (actual time=0.022..1.478 rows=1186 loops=1)
   ->  Index Scan using external_id_map_primary_key on external_id_map eim  (cost=0.00..2524132.99 rows=2335 width=26) (actual time=0.022..20252.326 rows=1186 loops=1)
         Filter: (source = 'SCH'::bpchar)
 Total runtime: 20258.922 ms
(6 rows)

statgen=# \d subject_source
       View "public.subject_source"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 id        | bigint                |
 sex       | integer               |
 parent1   | bigint                |
 parent2   | bigint                |
 source    | character(3)          |
 source_id | character varying(32) |
View definition:
 SELECT norm.id, norm.sex, norm.parent1, norm.parent2, eim.source, eim.source_id
   FROM subject norm
   JOIN util.external_id_map eim ON norm.id = eim.target_id;

statgen=# \d subject
     Table "public.subject"
 Column  |  Type   | Modifiers
---------+---------+-----------
 id      | bigint  | not null
 sex     | integer |
 parent1 | bigint  |
 parent2 | bigint  |
Indexes:
    "subject_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "subject_parent1" FOREIGN KEY (parent1) REFERENCES subject(id) DEFERRABLE INITIALLY DEFERRED
    "subject_parent2" FOREIGN KEY (parent2) REFERENCES subject(id)

Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-11 Thread Richard Huxton

Mitch Skinner wrote:


> The version with the condition is definitely doing more I/O.  The
> version without the condition doesn't read at all.

[snip]

>            relname           | relpages |  reltuples
> -----------------------------+----------+-------------
>  external_id_map             |   126883 | 1.55625e+07
>  external_id_map_primary_key |    64607 | 1.55625e+07
>  subject                     |       31 |        1186
>  subject_pkey                |       19 |        1186


Does external_id_map really have 15 million rows? If not, try a VACUUM 
FULL on it. Be prepared to give it some time to complete.
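
A quick way to check is to compare the planner's estimate against a real
count (the count(*) will take a while on a table that size); something
along these lines:

    select relname, reltuples::bigint from pg_class
     where relname = 'external_id_map';

    select count(*) from util.external_id_map;

    -- only if the two are wildly different:
    vacuum full verbose util.external_id_map;
    analyze util.external_id_map;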


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-11 Thread Tom Lane
Mitch Skinner <[EMAIL PROTECTED]> writes:
> On Thu, 2005-11-10 at 12:23 -0500, Tom Lane wrote:
>> Apparently, you are using a platform and/or locale in which strcoll() is
>> spectacularly, god-awfully slow --- on the order of 10 msec per comparison.

> The version with the condition is definitely doing more I/O.  The
> version without the condition doesn't read at all.

That's pretty interesting, but what file(s) is it reading exactly?

It could still be strcoll's fault.  The only plausible explanation
I can think of for strcoll being so slow is if for some reason it were
re-reading the locale definition file every time, instead of setting up
just once.

If it is hitting Postgres files, it'd be interesting to look at exactly
which files and what the distribution of seek offsets is.
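
If the reads are hitting numeric files under the database directory, the
file names can be mapped back to relations with a catalog lookup (the
relfilenode value here is just a placeholder):

    select relname, relkind from pg_class where relfilenode = 12345;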

regards, tom lane



Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-11 Thread Tom Lane
Mitch Skinner <[EMAIL PROTECTED]> writes:
> On Fri, 2005-11-11 at 09:17 -0500, Tom Lane wrote:
>> Please don't, actually, until we understand what's going on.

> Ack, I was in the middle of the vacuum full already when I got this.

Given what you said about no deletions or updates, the vacuum should
have no effect anyway, so don't panic.

> I put up some files at: http://arctur.us/pgsql/

Great, I'll take a look ...

> Pardon my ignorance, but do the visibility check and the check of the
> condition happen at different stages of execution?  Would it end up
> checking the condition for all 15M rows, but only checking visibility
> for the 1200 rows that come back from the join?

No, the visibility check happens first.  The timing does seem consistent
with the idea that the comparison is being done at all 15M rows, but
your other EXPLAIN shows that only 2K rows are actually retrieved, which
presumably is because the merge doesn't need the rest.  (Merge will stop
scanning either input when it runs out of rows on the other side; so
this sort of plan is very fast if the range of keys on one side is
smaller than the range on the other.  The numbers from the no-comparison
EXPLAIN ANALYZE indicate that that is happening for your case.)  So the
comparison should happen for at most 2K rows too.
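
A toy way to see that early-stop behaviour (made-up tables, nothing to do
with your schema; the enable_* settings just force a merge join for the
demonstration):

    create table small (id bigint primary key);
    create table big   (id bigint primary key, payload text);
    insert into small select generate_series(1, 1000);
    insert into big   select g, 'x' from generate_series(1, 1000000) g;
    analyze small; analyze big;

    set enable_hashjoin = off;
    set enable_nestloop = off;
    -- the index scan on "big" should report only ~1000 actual rows,
    -- not 1000000, because the merge join stops early
    explain analyze select * from small join big using (id);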

regards, tom lane



Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-11 Thread Tom Lane
I wrote:
> No, the visibility check happens first.  The timing does seem consistent
> with the idea that the comparison is being done at all 15M rows, but
> your other EXPLAIN shows that only 2K rows are actually retrieved, which
> presumably is because the merge doesn't need the rest.  (Merge will stop
> scanning either input when it runs out of rows on the other side; so
> this sort of plan is very fast if the range of keys on one side is
> smaller than the range on the other.  The numbers from the no-comparison
> EXPLAIN ANALYZE indicate that that is happening for your case.)  So the
> comparison should happen for at most 2K rows too.

After re-reading your explanation of what you're doing with the data,
I thought of a possible explanation.  Is the source value exactly
correlated with the external_id_map primary key?  What could be
happening is this:

1. We can see from the EXPLAIN ANALYZE for the no-comparison case that
the merge join stops after fetching only 2175 rows from external_id_map.
This implies that the subject table joins to the first couple thousand
entries in external_id_map and nothing beyond that.  In particular, the
merge join must have observed that the join key in the 2175'th row (in
index order) of external_id_map was larger than the last (largest) join
key in subject.

2. Let's suppose that source = 'SCH' is false for the 2175'th row of
external_id_map and every one after that.  Then what will happen is that
the index scan will vainly seek through the entire external_id_map,
looking for a row that its filter allows it to return, not knowing that
the merge join has no use for any of those rows.

If this is the story, and you need to make this sort of query fast,
then what you need to do is incorporate the source value into the
external_id_map index key somehow.  Then the index scan would be able to
realize that there is no possibility of finding another row with source
= 'SCH'.  The simplest way is just to make a 2-column index, but I
wonder whether the source isn't actually redundant with the
external_id_map primary key already ...
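
As a sketch, that would be something like the following (the index name is
made up, and see the follow-up messages about whether either column order
actually lets the scan stop early):

    create index external_id_map_target_source
        on util.external_id_map (target_id, source);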

regards, tom lane



Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-11 Thread Mitchell Skinner
On Fri, 2005-11-11 at 10:53 -0500, Tom Lane wrote:
> After re-reading your explanation of what you're doing with the data,
> I thought of a possible explanation.  Is the source value exactly
> correlated with the external_id_map primary key?

Sort of.  In this case, at the beginning of external_id_map, yes, though
further down the table they're not.  For example, if we got new subjects
from 'SCH' at this point, they'd get assigned external_id_map.target_id
(the primary key) values that are totally unrelated to what the current
set are (the values in the external_id_map primary key just come off of
a sequence that we use for everything).

Right now though, since the 'SCH' data came in a contiguous chunk right
at the beginning and hasn't changed or grown since then, the correlation
is pretty exact, I think.  It's true that there are no 'SCH' rows in the
table after the first contiguous set (when I get back to work I'll check
exactly what row that is).  It's interesting that there are these
correlations in the data that didn't exist at all in my mental
model.

> what you need to do is incorporate the source value into the
> external_id_map index key somehow.  Then the index scan would be able to
> realize that there is no possibility of finding another row with source
> = 'SCH'.  The simplest way is just to make a 2-column index

I thought that's what I had done with the
external_id_map_source_target_id index:

statgen=# \d util.external_id_map
      Table "util.external_id_map"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 source_id | character varying(32) | not null
 source    | character(3)          | not null
 target_id | bigint                | not null
Indexes:
    "external_id_map_primary_key" PRIMARY KEY, btree (target_id)
    "external_id_map_source_source_id_unique" UNIQUE, btree (source, source_id)
    "external_id_map_source" btree (source)
    "external_id_map_source_target_id" btree (source, target_id)
Foreign-key constraints:
    "external_id_map_source" FOREIGN KEY (source) REFERENCES util.source(id)

So if I understand your suggestion correctly, we're back to the "why
isn't this query using index foo" FAQ.  For the external_id_map table,
the statistics target for source is 200; the other two columns are at
the default level because I didn't think of them as being very
interesting statistics-wise.  I suppose I should probably go ahead and
raise the targets for every column of that table; I expect the planning
time is negligible, and our queries tend to be large data-wise.  Beyond
that, I'm not sure how else to encourage the use of that index.  If I
changed that index to be (target_id, source) would it make a difference?
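
For reference, raising the targets would be along these lines, followed by a
fresh ANALYZE so they take effect:

    alter table util.external_id_map alter column source_id set statistics 200;
    alter table util.external_id_map alter column target_id set statistics 200;
    analyze util.external_id_map;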

Thanks for your help,
Mitch 



Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-11 Thread Tom Lane
Mitchell Skinner <[EMAIL PROTECTED]> writes:
> On Fri, 2005-11-11 at 10:53 -0500, Tom Lane wrote:
>> what you need to do is incorporate the source value into the
>> external_id_map index key somehow.  Then the index scan would be able to
>> realize that there is no possibility of finding another row with source
>> = 'SCH'.  The simplest way is just to make a 2-column index

> I thought that's what I had done with the
> external_id_map_source_target_id index:
>     "external_id_map_source_target_id" btree (source, target_id)

> If I changed that index to be (target_id, source) would it make a difference?

[ fools around with a test case ... ]  Seems like not :-(.  PG is not
bright enough to realize that an index on (source, target_id) can be
used with a mergejoin on target_id, because the index sort order isn't
compatible.  (Given the equality constraint on source, there is an
effective compatibility.  I had thought that 8.1 might be able to
detect this, but it seems not to in a simple test case --- there may be
a bug involved there.  In any case 8.0 definitely won't see it.)  An
index on (target_id, source) would be recognized as mergejoinable, but
that doesn't solve the problem because an index condition on the second
column doesn't provide enough information to know that the scan can stop
early.

Given your comment that the correlation is accidental, it may be that
there's not too much point in worrying.  The planner is picking this
plan only because it notices the asymmetry in key ranges, and as soon
as some more rows get added with higher-numbered target_ids it will
shift to something else (probably a hash join).

regards, tom lane



Re: [PERFORM] same plan, add 1 condition, 1900x slower

2005-11-10 Thread Tom Lane
Mitch Skinner <[EMAIL PROTECTED]> writes:
> This is with Postgres 8.0.3.  Any advice is appreciated.

These are exactly the same plan, except for the addition of the extra
filter condition ...

>   ->  Index Scan using external_id_map_primary_key on external_id_map eim  (cost=0.00..2345747.01 rows=15560708 width=26) (actual time=0.061..2.944 rows=2175 loops=1)

>   ->  Index Scan using external_id_map_primary_key on external_id_map eim  (cost=0.00..2384648.78 rows=4150 width=26) (actual time=0.020..21068.508 rows=1186 loops=1)
>         Filter: (source = 'SCH'::bpchar)

Apparently, you are using a platform and/or locale in which strcoll() is
spectacularly, god-awfully slow --- on the order of 10 msec per comparison.
This is a bit hard to believe but I can't make sense of those numbers
any other way.  What is the platform exactly, and what database locale
and encoding are you using?

regards, tom lane
