SpaceBallOne wrote:

Thanks for the reply John,

There are approximately 800 rows total in our job table (which stays
roughly constant because 'completed' jobs get moved to a 'job_archive'
table). The other jobs not shown by the specific query could be on
backorder status, temporarily deleted status, etc.

You are correct in assuming the _id and _no (stands for 'number')
fields are unique - this was one of the first pages I built when I
started learning Postgres, so not knowing how to set up primary and
foreign keys at the time, I did it that way ... it is normalised to a
point (probably rather sloppy, but it's a juggling act between learning
on the fly, what I'd like to have, and the time constraints of being
the only I.T. guy in the company!)...

I think I will definitely focus on converting my database and PHP
pages to use proper primary keys in Postgres - especially if they
automatically index themselves. I didn't do a VACUUM ANALYZE on them,
so that may explain why they didn't seem to do much.


You can probably add them now if you don't want to do a lot of redesign.
ALTER TABLE job ADD PRIMARY KEY (id);

If they are not unique this will cause problems, but as they should be
unique, I think it will work.
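
If you want to check first, a query like this will show any duplicates
that would block the primary key (assuming the column is called id, as
in the ALTER above - adjust to your actual column name):

-- any rows returned here are duplicates that must be fixed first
SELECT id, count(*)
  FROM job
 GROUP BY id
HAVING count(*) > 1;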

I'm not sure how much help indexes will be if you only have 800 rows,
and your queries use 300+ of them.

You might need to re-think the query/table design.

You might try doing nested queries, or explicit joins, rather than one
big query with a WHERE clause.

Meaning do stuff like:

SELECT ...
  FROM (job JOIN customer ON job.customer_no = customer.customer_no) AS jc
  JOIN location ON jc.location_no = location.location_no
...
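
The nested-query version of the same idea would look something like
this (I'm guessing that location_no lives on customer - use whichever
table actually carries it):

-- subselect in FROM does the first join, then the result joins location
SELECT *
  FROM (SELECT job.*, customer.location_no
          FROM job
          JOIN customer ON job.customer_no = customer.customer_no) AS jc
  JOIN location ON jc.location_no = location.location_no;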

I also see that the planner seems to mis-estimate the number of rows in
some cases. Like here:

              ->  Hash  (cost=14.53..14.53 rows=853 width=19) (actual
time=9.79..9.79 rows=0 loops=1)
                    ->  Seq Scan on street  (cost=0.00..14.53 rows=853
width=19) (actual time=0.01..5.12 rows=853 loops=1)
        ->  Hash  (cost=9.91..9.91 rows=491 width=20) (actual
time=5.73..5.73 rows=0 loops=1)
              ->  Seq Scan on ubd  (cost=0.00..9.91 rows=491 width=20)
(actual time=0.02..2.98 rows=491

Where it thinks the hash will return all of the rows from the sequential scan, when in reality it returns none.
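
If you want to see what statistics the planner actually has for those
columns, the standard pg_stats system view shows them (table name taken
from your plan output above):

-- per-column statistics the planner uses for row estimates
SELECT attname, null_frac, n_distinct, most_common_vals
  FROM pg_stats
 WHERE tablename = 'street';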

I think problems with the planner fall into 3 categories.

  1. You didn't VACUUM ANALYZE.
  2. You did, but the planner doesn't keep sufficient statistics (ALTER
     TABLE job ALTER COLUMN no SET STATISTICS <a number> - see the
     example below).
  3. Your join needs cross-column statistics, which postgres doesn't
     support (yet).
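
For case 2, the sequence would be something like this (customer_no is
just an example column - raise the target on whichever column is being
mis-estimated):

-- raise the per-column statistics target, then re-gather statistics
ALTER TABLE job ALTER COLUMN customer_no SET STATISTICS 100;
ANALYZE job;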

If you only have 800 rows, I don't think you have to worry about
statistics, so that leaves things at 1 or 3. If you did do 1, then I
don't know what to tell you.

John
=:->

PS> I'm not a guru at this stuff, so some of what I say may be wrong.
But hopefully I point you in the right direction.





