Re: [pgsql-hackers-win32] [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-07 Thread Magnus Hagander
>>> What platform is this on?  It seems very strange/fishy that all the
>>> actual-time values are exact integral milliseconds.
>>
>> My machine is WinXP professional, Athlon XP 2100, but I get similar
>> results on my Intel P4 3.0GHz as well (which is also running WinXP).
>> Why do you ask?
>
> Well, what it suggests is that gettimeofday() is only returning a
> result good to the nearest millisecond.  (Win32 hackers, does that
> sound right?)

Yes. That's all the gettimeofday() implementation (in
src/backend/port/gettimeofday.c) gives you.
Actually, in reality you don't even get millisecond resolution, it seems
(after some reading up). It's more along the lines of
10-millisecond resolution.

See for example
http://msdn.microsoft.com/msdnmag/issues/04/03/HighResolutionTimer/.



> Most modern machines seem to have clocks that can count elapsed time
> down to near the microsecond level.  Anyone know if it's possible to
> get such numbers out of Windows, or are we stuck with milliseconds?

There are, see the link above. But it's definitely not easy. I don't think we
can just take the complete code from their example (due to licensing).
We could go with the middle way, but it has a couple of pitfalls.

Do we need actual high precision time, or do we just need to be able to
get high precision differences? Getting the differences is fairly easy,
but if you need to sync up any drift then it becomes a bit more
difficult.


//Magnus

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [pgsql-hackers-win32] [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-07 Thread Magnus Hagander
>> Do we need actual high precision time, or do we just need to be able
>> to get high precision differences? Getting the differences is fairly
>> easy, but if you need to sync up any drift then it becomes a bit more
>> difficult.
>
> You're right, we only care about differences not absolute time.  If
> there's something like a microseconds-since-bootup counter, it would
> be fine.

There is. I believe QueryPerformanceCounter has sub-microsecond
resolution.

Can we just replace gettimeofday() with a version that's basically:
if (never_run_before)
   GetSystemTime() and get current timer for baseline.
now = baseline + current timer - baseline timer;
return now;


Or do we need to make changes at the points where the function is
actually called?


//Magnus

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-hackers-win32] [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-07 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
> There is. I believe QueryPerformanceCounter has sub-microsecond
> resolution.

> Can we just replace gettimeofday() with a version that's basically:

No, because it's also used for actual time-of-day calls.  It'd be
necessary to hack executor/instrument.c in particular.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-hackers-win32] [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-07 Thread Tom Lane
John A Meinel [EMAIL PROTECTED] writes:
>>> Can we just replace gettimeofday() with a version that's basically:
>>
>> No, because it's also used for actual time-of-day calls.  It'd be
>> necessary to hack executor/instrument.c in particular.

> Or we modify the win32 gettimeofday call to something like:

That's what Magnus was talking about, but it's really no good because
it would cause Postgres' now() function to fail to track post-boot-time
changes in the system date setting.  Which I think would rightly be
considered a bug.

The EXPLAIN ANALYZE instrumentation code will really be happier with a
straight time-since-bootup counter; by using gettimeofday, it is
vulnerable to giving wrong answers if someone changes the date setting
while the EXPLAIN is running.  But there is (AFAIK) no such call among
the portable Unix syscalls.  It seems reasonable to me to #ifdef that
code to make use of QueryPerformanceCounter on Windows.  This does not
mean we want to alter the behavior of gettimeofday() where it's being
used to find out the time of day.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-04 Thread Ken
Richard,
What do you mean by summary table?  Basically a cache of the query into a 
table with replicated column names of all the joins?  I'd probably have to 
wipe out the table every minute and re-insert the data for each carrier in 
the system.  I'm not sure how expensive this operation would be, but I'm 
guessing it would be fairly heavy-weight.  And maintenance would be a lot 
harder because of the duplicated columns, making refactorings on the 
database more error-prone.  Am I understanding your suggestion correctly? 
Please correct me if I am.

Can you turn the problem around? Calculate what you want for all users 
(once every 60 seconds) and stuff those results into a summary table. Then 
let the users query the summary table as often as they like (with the 
understanding that the figures aren't going to update any faster than once 
a minute)

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-04 Thread John Arbash Meinel
Ken wrote:
Richard,
What do you mean by summary table?  Basically a cache of the query
into a table with replicated column names of all the joins?  I'd
probably have to wipe out the table every minute and re-insert the
data for each carrier in the system.  I'm not sure how expensive this
operation would be, but I'm guessing it would be fairly heavy-weight.
And maintenance would be a lot harder because of the duplicated
columns, making refactorings on the database more error-prone.  Am I
understanding your suggestion correctly? Please correct me if I am.
Can you turn the problem around? Calculate what you want for all
users (once every 60 seconds) and stuff those results into a summary
table. Then let the users query the summary table as often as they
like (with the understanding that the figures aren't going to update
any faster than once a minute)

It's the same idea as a materialized view, or possibly just a lazy cache.
Just try this query:
CREATE TABLE cachedview AS
select p.id as person_id, s.*, ss.*
from shipment s
inner join shipment_status ss on s.current_status_id=ss.id
inner join release_code rc on ss.release_code_id=rc.id
left outer join driver d on s.driver_id=d.id
left outer join carrier_code cc on s.carrier_code_id=cc.id
where s.carrier_code_id in (
select cc.id
from person p
 inner join carrier_to_person ctp on p.id=ctp.person_id
 inner join carrier c on ctp.carrier_id=c.id
 inner join carrier_code cc on cc.carrier_id = c.id
)
and s.current_status_id is not null
and s.is_purged=false
and (rc.number='9')
and (ss.date >= current_date-31)
order by ss.date desc ;
Notice that I took out the internal p.id = blah.
Then you can do:
CREATE INDEX cachedview_person_id_idx ON cachedview(person_id);
Then from the client side, you can just run:
SELECT * from cachedview WHERE person_id = id;
Now, this assumes that rc.number='9' is what you always want. If that
isn't the case, you could refactor a little bit.
This unrolls all of the work into a table which should be really fast to
query. If this query takes less than 10s to generate, then just have a
service run it every 60s. I think for refreshing, it is actually faster
to drop the table and recreate it, rather than deleting the entries.
Dropping also has the advantage that if you ever add more rows to s or
ss, then the table automatically gets the new entries.
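As a sketch, that 60-second refresh could be a single transaction that
rebuilds the table and its index. The join path from person down to shipment
is my reading of the schema implied by the queries in this thread (explicit
joins are used here so that person_id is available in the outer select list),
it assumes cachedview was created once as above, and readers block briefly
while the swap commits:

begin;
drop table cachedview;
create table cachedview as
select p.id as person_id, s.*, ss.*
from person p
inner join carrier_to_person ctp on p.id = ctp.person_id
inner join carrier c on ctp.carrier_id = c.id
inner join carrier_code cc on cc.carrier_id = c.id
inner join shipment s on s.carrier_code_id = cc.id
inner join shipment_status ss on s.current_status_id = ss.id
inner join release_code rc on ss.release_code_id = rc.id
where s.is_purged = false
  and rc.number = '9'
  and ss.date >= current_date - 31;
create index cachedview_person_id_idx on cachedview (person_id);
commit;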
Another possibility is to have the cachedview not use s.*, ss.*,
but instead just include whatever the primary keys are for those tables.
Then your final query becomes:
SELECT s.*, ss.* FROM cachedview cv, s, ss WHERE cv.person_id = id
AND cv.s_id = s.pkey AND cv.ss_id = ss.pkey;
Again, this should be really fast, because you should have an index on
cv.person_id and only have say 300 rows there, and then you are just
fetching a few rows from s and ss. You can also use this time to do some
of your left joins against other tables.
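For example, if the cached table carried just (person_id, s_id, ss_id), the
per-user lookup could look like this (a sketch; s_id and ss_id are
hypothetical column names, and 355 is the person id used elsewhere in this
thread):

select s.*, ss.*
from cachedview cv
inner join shipment s on s.id = cv.s_id
inner join shipment_status ss on ss.id = cv.ss_id
where cv.person_id = 355
order by ss.date desc;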
Does this make sense? The biggest advantage you have is that 60-second
refresh requirement. With that in hand, I think you can do a lot of caching
optimizations.
John
=:-


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-04 Thread John Arbash Meinel
Ken Egervari wrote:
Josh,
...
I thought about this, but it's very important since shipment and
shipment_status are both updated in real time 24/7/365.  I think I
might be able to cache it within the application for 60 seconds at
most, but it would make little difference since people tend to refresh
within that time anyway. It's very important that real-time
information exists though.
Is 60s real-time enough for you? That's what it sounds like. It would be
nice if you could have 1hr, but there's still a lot of extra work you
can do in 60s.
You could also always throw more hardware at it. :) If the
shipment_status is one of the bottlenecks, create a 4-disk raid10 and
move the table over.
I don't remember what your hardware is, but I don't remember it being a
quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction
log on a solid state disk. :)

That sounds like an awesome system.  I would love to have something like
that. Unfortunately, the production server is just a single processor
machine with 1 GB ram.  I think throwing more disks at it is probably
the best bet, moving the shipment and shipment_status tables over as
you suggested. That's great advice.
Well, disk I/O is one side, but probably sticking another 1GB (2GB
total) also would be a fairly economical upgrade for performance.
You are looking for query performance, not really update performance,
right? So buy a 4-port SATA controller, and some WD Raptor 10k SATA
disks. With this you can create a RAID10 for under $2k (probably like $1k).
30ms is a good target, although I guess I was naive for setting that
goal perhaps.  I've just taken queries that ran at 600ms and with 1 or
2 indexes, they went down to 15ms.
It all depends on your query. If you have a giant table (1M rows), and
you are doing a seqscan for only 5 rows, then adding an index will give
you enormous productivity gains. But you are getting 30k rows, and
combining them with 6k rows, plus a bunch of other stuff. I think we've
tuned the query about as far as we can.
Let's say we have 200 users signed into the application at the same
time. The application refreshes their shipment information
automatically to make sure it's up to date on the user's screen.  The
application will execute the query we are trying to tune every 60
seconds for most of these users.  Users can set the refresh time to be
higher, but 60 is the lowest amount so I'm just assuming everyone has
it at 60.
Anyway, if you have 200 users logged in, that's 200 queries in the 60
second period, which is about 3-4 queries every second.  As you can
see, it's getting maxed out, and because of bad luck, the queries are
bunched together and are being called at the same time, making 8-9
queries in the same second and that's where the performance is
starting to degrade.  I just know that if I could get this down to 30
ms, or even 100, we'd be okay for a few months without throwing
hardware at the problem.   Also keep in mind that other application
logic and Hibernate mapping is occurring too, so 3-4 queries a second is
already no good when everything is running on a single machine.
The other query I just sent, where you do the query for all users at
once, and then cache the result, *might* be cheaper than doing a bunch
of different queries.
However, you may find that doing the query for *all* users takes too
long. So you could keep another table indicating who the most recent
people logged in are, and then only cache the info for those people.
This does start getting a little more involved, so see if you can do all
users before heading down this road.
This isn't the best setup, but it's the best we can afford.  We are
just a new startup company.  Cheaper servers and open source keep our
costs low. But money is starting to come in after 10 months of hard
work, so we'll be able to replace our server within the next 2
months.  It'll be a necessity because we are signing on some big
clients now and they'll have 40 or 50 users for a single company.  If
they are all logged in at the same time, that's a lot of queries.
Sure. Just realize you can't really support 200 concurrent connections
with a single P4 and 1GB of ram.
John
=:-


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-04 Thread Josh Berkus
Ken,

 I did everything you said and my query does perform a bit better.  I've
 been getting speeds from 203 to 219 to 234 milliseconds now.   I tried
 increasing the work mem and the effective cache size from the values you
 provided, but I didn't see any more improvement.  I've tried looking 
 into setting the shared buffers for Windows XP, but I'm not sure how to do
 it.  I'm looking in the manual at:

Now that you know how to change the shared_buffers, want to go ahead and run 
the query again?

I'm pretty concerned about your case, because based on your description I 
would expect < 100ms on a Linux machine.  So I'm wondering if this is a 
problem with WindowsXP performance, or if it's something we can fix through 
tuning.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-04 Thread John A Meinel
John Arbash Meinel wrote:
Ken wrote:
Richard,
What do you mean by summary table?  Basically a cache of the query
into a table with replicated column names of all the joins?  I'd
probably have to wipe out the table every minute and re-insert the
data for each carrier in the system.  I'm not sure how expensive this
operation would be, but I'm guessing it would be fairly heavy-weight.
And maintenance would be a lot harder because of the duplicated
columns, making refactorings on the database more error-prone.  Am I
understanding your suggestion correctly? Please correct me if I am.
Can you turn the problem around? Calculate what you want for all
users (once every 60 seconds) and stuff those results into a summary
table. Then let the users query the summary table as often as they
like (with the understanding that the figures aren't going to update
any faster than once a minute)

It's the same idea as a materialized view, or possibly just a lazy cache.
...
This unrolls all of the work into a table which should be really fast to 
query. If this query takes less than 10s to generate, then just have a 
service run it every 60s. I think for refreshing, it is actually faster 
to drop the table and recreate it, rather than deleting the entries. 
Dropping also has the advantage that if you ever add more rows to s or
ss, then the table automatically gets the new entries.
Just as a small update. If completely regenerating the cache takes too 
long, the other way to do it is to create insert and update triggers on 
s and ss, such that as they change, they also update the cachedview table.

Something like
CREATE TRIGGER on_ss_ins AFTER INSERT ON ss FOR EACH ROW EXECUTE
   INSERT INTO cached_view SELECT p.id as person_id, s.*, ss.* FROM 
the big stuff WHERE s.id = NEW.id;

This runs the same query, but notice that the WHERE means it only allows 
the new row. So this query should run fast. It is a little bit of 
overhead on each of your inserts, but it should keep the cache 
up-to-date. With something like this, I would have the final client 
query still include the date restriction, since you accumulate older 
rows into the cached view. But you can run a daily process that prunes 
out everything older than 31 days, which keeps the cachedview from 
getting really large.
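For completeness, a rough sketch of what that insert trigger could look like
in PL/pgSQL (assumptions: plpgsql is installed in the database, the join path
matches the cached-view query earlier in the thread, and
shipment.current_status_id already points at the new status row by the time
it is inserted; if the application writes the rows in the other order, the
trigger would have to go on shipment instead):

create or replace function cachedview_ss_insert() returns trigger as $$
begin
    -- re-run the cached-view select, restricted to the newly inserted status row
    insert into cachedview
    select p.id as person_id, s.*, ss.*
    from person p
    inner join carrier_to_person ctp on p.id = ctp.person_id
    inner join carrier c on ctp.carrier_id = c.id
    inner join carrier_code cc on cc.carrier_id = c.id
    inner join shipment s on s.carrier_code_id = cc.id
    inner join shipment_status ss on s.current_status_id = ss.id
    inner join release_code rc on ss.release_code_id = rc.id
    where ss.id = NEW.id
      and s.is_purged = false
      and rc.number = '9';
    return NEW;
end;
$$ language plpgsql;

create trigger on_ss_ins after insert on shipment_status
    for each row execute procedure cachedview_ss_insert();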

John
=:-


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-03 Thread Ken Egervari
2) Force PG to drop the merge join via SET ENABLE_MERGEJOIN = FALSE;
Actually, it was 312 milliseconds, so it got worse.
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-03 Thread Josh Berkus
Ken,

Well, I'm a bit stumped on troubleshooting the actual query since Windows' 
poor time resolution makes it impossible to trust the actual execution times.  
Obviously this is something we need to look into for the Win32 port for 
8.1 ..

 shared_buffers = 1000

This may be slowing up that merge join.  Try resetting it to 6000.  I'm not 
sure what system settings you might have to do on Windows to get it to 
support higher shared buffers; see the docs.

 #work_mem = 1024

Up this to 4096 for testing purposes; your production value will vary 
depending on several factors; see link below.

 #effective_cache_size = 1000

Increase this to the actual amount of RAM you have available, about 750MB (you 
do the math)

 #random_page_cost = 4

Leave this for now.   

See www.powerpostgresql.com/PerfList for more information.
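
As a rough illustration of those numbers only (in 8.0 shared_buffers and
effective_cache_size are counted in 8kB pages and work_mem in kB;
shared_buffers can only be changed in postgresql.conf followed by a restart,
while the other two can be experimented with per session):

-- per-session experiments:
SET work_mem = 4096;                -- roughly 4MB
SET effective_cache_size = 96000;   -- roughly 750MB expressed in 8kB pages

-- in postgresql.conf (needs a restart):
--   shared_buffers = 6000          -- roughly 48MB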

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-03 Thread Ken Egervari
Josh,
I did everything you said and my query does perform a bit better.  I've been 
getting speeds from 203 to 219 to 234 milliseconds now.   I tried increasing 
the work mem and the effective cache size from the values you provided, but 
I didn't see any more improvement.  I've tried looking into setting the 
shared buffers for Windows XP, but I'm not sure how to do it.  I'm looking 
in the manual at:
http://www.postgresql.org/docs/8.0/interactive/kernel-resources.html#SYSVIPC-PARAMETERS

It doesn't mention windows at all.  Does anyone have any ideas on how to 
fix this?

Here is the new explain analyze.
Sort  (cost=1996.21..1996.26 rows=17 width=165) (actual 
time=297.000..297.000 rows=39 loops=1)
 Sort Key: ss.date
 -  Merge Right Join  (cost=1951.26..1995.87 rows=17 width=165) (actual 
time=297.000..297.000 rows=39 loops=1)
   Merge Cond: (outer.id = inner.driver_id)
   -  Index Scan using driver_pkey on driver d  (cost=0.00..42.16 
rows=922 width=43) (actual time=0.000..0.000 rows=922 loops=1)
   -  Sort  (cost=1951.26..1951.30 rows=17 width=122) (actual 
time=297.000..297.000 rows=39 loops=1)
 Sort Key: s.driver_id
 -  Hash Join  (cost=586.48..1950.91 rows=17 width=122) 
(actual time=219.000..297.000 rows=39 loops=1)
   Hash Cond: (outer.carrier_code_id = inner.id)
   -  Merge Join  (cost=571.97..1931.95 rows=830 width=87) 
(actual time=219.000..297.000 rows=310 loops=1)
 Merge Cond: (outer.current_status_id = 
inner.id)
 -  Index Scan using 
shipment_current_status_id_idx on shipment s  (cost=0.00..2701.26 rows=60307 
width=66) (actual time=0.000..62.000 rows=27711 loops=1)
   Filter: (is_purged = false)
 -  Sort  (cost=571.97..576.38 rows=1766 width=21) 
(actual time=125.000..156.000 rows=6902 loops=1)
   Sort Key: ss.id
   -  Hash Join  (cost=1.11..476.72 rows=1766 
width=21) (actual time=0.000..93.000 rows=6902 loops=1)
 Hash Cond: (outer.release_code_id = 
inner.id)
 -  Index Scan Backward using 
current_status_date_idx on shipment_status ss  (cost=0.00..387.35 rows=14122 
width=21) (actual time=0.000..16.000 rows=14925 loops=1)
   Index Cond: (date >= (('now'::text)::date - 31))
 -  Hash  (cost=1.10..1.10 rows=1 
width=4) (actual time=0.000..0.000 rows=0 loops=1)
   -  Seq Scan on release_code rc 
(cost=0.00..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
 Filter: ((number)::text = 
'9'::text)
   -  Hash  (cost=14.51..14.51 rows=2 width=35) (actual 
time=0.000..0.000 rows=0 loops=1)
 -  Nested Loop  (cost=4.92..14.51 rows=2 
width=35) (actual time=0.000..0.000 rows=2 loops=1)
   -  Index Scan using person_pkey on person p 
(cost=0.00..5.73 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
 Index Cond: (id = 355)
   -  Hash Join  (cost=4.92..8.75 rows=2 
width=39) (actual time=0.000..0.000 rows=2 loops=1)
 Hash Cond: (outer.id = 
inner.carrier_id)
 -  Seq Scan on carrier c 
(cost=0.00..3.54 rows=54 width=4) (actual time=0.000..0.000 rows=54 loops=1)
 -  Hash  (cost=4.92..4.92 rows=2 
width=43) (actual time=0.000..0.000 rows=0 loops=1)
   -  Hash Join  (cost=3.04..4.92 
rows=2 width=43) (actual time=0.000..0.000 rows=2 loops=1)
 Hash Cond: 
(outer.carrier_id = inner.carrier_id)
 -  Seq Scan on 
carrier_code cc  (cost=0.00..1.57 rows=57 width=35) (actual 
time=0.000..0.000 rows=57 loops=1)
 -  Hash  (cost=3.04..3.04 
rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1)
   -  Index Scan using 
carrier_to_person_person_id_idx on carrier_to_person ctp  (cost=0.00..3.04 
rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=1)
 Index Cond: 
(355 = person_id)
Total runtime: 297.000 ms 

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-03 Thread John Arbash Meinel
Ken Egervari wrote:
Josh,
I did everything you said and my query does perform a bit better.
I've been getting speeds from 203 to 219 to 234 milliseconds now.   I
tried increasing the work mem and the effective cache size from the
values you provided, but I didn't see any more improvement.  I've
tried looking into setting the shared buffers for Windows XP, but
I'm not sure how to do it.  I'm looking in the manual at:
http://www.postgresql.org/docs/8.0/interactive/kernel-resources.html#SYSVIPC-PARAMETERS

You probably don't need to change anything for Windows. If you set
shared_buffers too high, then postgres won't start. If it is starting,
then you don't need to modify the OS to get more shared buffers. (For
instance, on my Mac, I can't get shared_buffers > 500 without changing
things, but on windows I run with 3000 and no modification).
It doesn't mention windows at all.  Does anyone have any ideas on how
to fix this?
Do you need the interior sort? It's taking ~93ms to get 7k rows from
shipment_status, and then another 30ms to sort them. This isn't a lot,
so it might be fine.
Also, did you ever try CLUSTER current_status_date_idx ON shipment_status.
This groups the rows in shipment_status by their status date, which
helps put items with the same date next to each other. This may affect
other portions of the query, or other queries. Also, if you are
inserting sequentially, it would seem that the items would already be
naturally near each other based on date.
The next big cost is having to merge the 28k rows with the fast hash
plan, which takes about 80ms.
I guess the biggest issue is that you are doing a lot of work, and it
takes time to do it. Also, I've noticed that this query is being run
with exactly the same data, which is good for comparing two methods. But
remember to test on multiple potential values. You might be better off
one way with this query, but much worse for a different dataset. I
noticed that this seems to have fewer rows than what postgres thinks the
*average* number would be. (It predicts 60k and you only get 28k rows).
If this query is performed a lot, and you can be okay with a slight
delay in updating, you could always switch to some sort of lazy
materialized view.
You could also always throw more hardware at it. :) If the
shipment_status is one of the bottlenecks, create a 4-disk raid10 and
move the table over.
I don't remember what your hardware is, but I don't remember it being a
quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction
log on a solid state disk. :)
Why do you need the query to be 30ms? ~250ms is still pretty fast. If
you are needing updates faster than that, you might look more into *why*
and then handle it from a higher level.
And naturally, the most important thing is to test it under load. 250ms
is pretty good, but if under load it goes back to 6s, then we probably
should look for different alternatives. Also, what is the load that is
causing the problem? Is it that you have some other big seqscans which
are causing all of your tables to go out of cache?
Also, I believe I remember you saying that your production server is a
P4, is that a single P4? Because I know postgres prefers Opterons to
Pentium Xeons when in a multiprocessor machine. Look through the
archives about spinlocks and the context switch bug. (context storm,
etc). Plus, since opterons are 64-bit, you can throw a lot more RAM at
them. I believe opterons outperform xeons for the same cost, *and* you
can scale them up with extra ram.
But remember, the biggest bottleneck is almost *always* the I/O. So put
more and faster disks into the system first.
John
=:-
Here is the new explain analyze.
Sort  (cost=1996.21..1996.26 rows=17 width=165) (actual
time=297.000..297.000 rows=39 loops=1)
 Sort Key: ss.date
 -  Merge Right Join  (cost=1951.26..1995.87 rows=17 width=165)
(actual time=297.000..297.000 rows=39 loops=1)
   Merge Cond: (outer.id = inner.driver_id)
   -  Index Scan using driver_pkey on driver d  (cost=0.00..42.16
rows=922 width=43) (actual time=0.000..0.000 rows=922 loops=1)
   -  Sort  (cost=1951.26..1951.30 rows=17 width=122) (actual
time=297.000..297.000 rows=39 loops=1)
 Sort Key: s.driver_id
 -  Hash Join  (cost=586.48..1950.91 rows=17 width=122)
(actual time=219.000..297.000 rows=39 loops=1)
   Hash Cond: (outer.carrier_code_id = inner.id)
   -  Merge Join  (cost=571.97..1931.95 rows=830
width=87) (actual time=219.000..297.000 rows=310 loops=1)
 Merge Cond: (outer.current_status_id =
inner.id)
 -  Index Scan using
shipment_current_status_id_idx on shipment s  (cost=0.00..2701.26
rows=60307 width=66) (actual time=0.000..62.000 rows=27711 loops=1)
   Filter: (is_purged = false)
 -  Sort  (cost=571.97..576.38 rows=1766
width=21) (actual time=125.000..156.000 rows=6902 loops=1)
   Sort Key: 

Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-03 Thread Ken Egervari
Josh,
Thanks so much for your comments.  They are incredibly insightful and you 
clearly know your stuff.  It's so great that I'm able to learn so much from 
you.  I really appreciate it.

Do you need the interior sort? It's taking ~93ms to get 7k rows from
shipment_status, and then another 30ms to sort them. This isn't a lot,
so it might be fine.
Running the query without the sort doesn't actually improve performance 
unfortunately, which I find strange.  I think the analyze is giving bad 
feedback because taking all sorts out completely makes no difference in 
performance.  Dan Tow's book actually said the same thing... how sorting 
rarely takes up the bulk of the work.  I know you didn't like his book 
much, but I have observed that in my experience too.

Also, did you ever try CLUSTER current_status_date_idx ON shipment_status.
This groups the rows in shipment_status by their status date, which
helps put items with the same date next to each other. This may affect
other portions of the query, or other queries. Also, if you are
inserting sequentially, it would seem that the items would already be
naturally near each other based on date.
Yes, this was one of the first things I tried actually and it is currently 
clustered.  Since shipment status comes into our system in real time, the 
dates are more or less in order as well.

The next big cost is having to merge the 28k rows with the fast hash
plan, which takes about 80ms.
I guess the biggest issue is that you are doing a lot of work, and it
takes time to do it. Also, I've noticed that this query is being run
with exactly the same data, which is good for comparing two methods. But
remember to test on multiple potential values. You might be better off
one way with this query, but much worse for a different dataset. I
noticed that this seems to have fewer rows than what postgres thinks the
*average* number would be. (It predicts 60k and you only get 28k rows).
Well, the example where p.id = 355 is an above normal case where performance 
is typically bad.  If a user's company has very few shipments and 
shipment_status rows, performance isn't going to matter much and those 
queries usually perform much faster.  I really needed to tune this for the 
larger customers who do have thousands of rows for their entire company and 
will probably reach 6 digits by the end of next year.  For the person 355, 
they've only been on the system for 3 months and they already have 27700 
rows.  Even if this makes the smaller customers a bit slower, I think it's 
worth it if I can speed up cases like this, who all have very similar data 
distribution.

If this query is performed a lot, and you can be okay with a slight
delay in updating, you could always switch to some sort of lazy
materialized view.
I thought about this, but up-to-date data is very important since shipment and 
shipment_status are both updated in real time 24/7/365.  I think I might be 
able to cache it within the application for 60 seconds at most, but it would 
make little difference since people tend to refresh within that time anyway. 
It's very important that real-time information exists though.

You could also always throw more hardware at it. :) If the
shipment_status is one of the bottlenecks, create a 4-disk raid10 and
move the table over.
I don't remember what your hardware is, but I don't remember it being a
quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction
log on a solid state disk. :)
That sounds like an awesome system.  I would love to have something like that. 
Unfortunately, the production server is just a single processor machine with 
1 GB ram.  I think throwing more disks at it is probably the best bet, 
moving the shipment and shipment_status tables over as you suggested. 
That's great advice.
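
A side note on moving tables: 8.0 can do this with tablespaces. A minimal
sketch, assuming the new array is mounted at /mnt/raid10 and that directory
is owned by the postgres user (indexes would have to be moved separately):

CREATE TABLESPACE fastdisk LOCATION '/mnt/raid10/pgdata';
ALTER TABLE shipment SET TABLESPACE fastdisk;
ALTER TABLE shipment_status SET TABLESPACE fastdisk;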

Why do you need the query to be 30ms? ~250ms is still pretty fast. If
you are needing updates faster than that, you might look more into *why*
and then handle it from a higher level.
30ms is a good target, although I guess I was naive for setting that goal 
perhaps.  I've just taken queries that ran at 600ms and with 1 or 2 indexes, 
they went down to 15ms.

Let's say we have 200 users signed into the application at the same time. 
The application refreshes their shipment information automatically to make 
sure it's up to date on the user's screen.  The application will execute the 
query we are trying to tune every 60 seconds for most of these users.  Users 
can set the refresh time to be higher, but 60 is the lowest amount so I'm 
just assuming everyone has it at 60.

Anyway, if you have 200 users logged in, that's 200 queries in the 60 second 
period, which is about 3-4 queries every second.  As you can see, it's 
getting maxed out, and because of bad luck, the queries are bunched together 
and are being called at the same time, making 8-9 queries in the same second 
and that's where the performance is starting to degrade.  I just know that 
if I could get this down to 30 ms, or even 

Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-02 Thread Ken Egervari
it might help the planner estimate better the number of cs rows
affected. whether this improves performance depends on whether
the best plans are sensitive to this.
I managed to try this and see if it did anything.  Unfortunately, it made no 
difference.  It's still 250 milliseconds.  It was a good suggestion though. 
I believed it would work too.

an EXPLAIN ANALYSE might reduce the guessing.
Okay, here is the explain analyze I managed to get from work.  It came out 
to 312ms here, but without the analyze it actually runs at ~250ms.  It is 
using indexes, so my guess is that there are too many joins or it's not 
driving on person fast enough.  Release code is such a small table that I 
don't think that sequential scan matters.  Thanks for taking the time to 
analyze this.

Sort  (cost=1902.27..1902.31 rows=17 width=91) (actual time=312.000..312.000 
rows=39 loops=1)
 Sort Key: ss.date
 -  Hash Join  (cost=617.07..1901.92 rows=17 width=91) (actual 
time=234.000..312.000 rows=39 loops=1)
   Hash Cond: (outer.carrier_code_id = inner.id)
   -  Merge Join  (cost=602.54..1882.73 rows=870 width=91) (actual 
time=234.000..312.000 rows=310 loops=1)
 Merge Cond: (outer.current_status_id = inner.id)
 -  Index Scan using shipment_current_status_id_idx on 
shipment s  (cost=0.00..2552.13 rows=60327 width=66) (actual 
time=0.000..61.000 rows=27711 loops=1)
   Filter: (is_purged = false)
 -  Sort  (cost=602.54..607.21 rows=1866 width=25) (actual 
time=125.000..125.000 rows=6934 loops=1)
   Sort Key: ss.id
   -  Hash Join  (cost=1.11..501.17 rows=1866 width=25) 
(actual time=0.000..78.000 rows=6934 loops=1)
 Hash Cond: (outer.release_code_id = inner.id)
 -  Index Scan using current_status_date_idx on 
shipment_status ss  (cost=0.00..406.78 rows=14924 width=25) (actual 
time=0.000..47.000 rows=15053 loops=1)
   Index Cond: (date >= (('now'::text)::date - 31))
   Filter: (id IS NOT NULL)
 -  Hash  (cost=1.10..1.10 rows=1 width=4) (actual 
time=0.000..0.000 rows=0 loops=1)
   -  Seq Scan on release_code rc 
(cost=0.00..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
 Filter: ((number)::text = '9'::text)
   -  Hash  (cost=14.53..14.53 rows=2 width=4) (actual 
time=0.000..0.000 rows=0 loops=1)
 -  Nested Loop  (cost=4.92..14.53 rows=2 width=4) (actual 
time=0.000..0.000 rows=2 loops=1)
   -  Index Scan using person_pkey on person p 
(cost=0.00..5.75 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
 Index Cond: (id = 355)
   -  Hash Join  (cost=4.92..8.75 rows=2 width=8) (actual 
time=0.000..0.000 rows=2 loops=1)
 Hash Cond: (outer.id = inner.carrier_id)
 -  Seq Scan on carrier c  (cost=0.00..3.54 
rows=54 width=4) (actual time=0.000..0.000 rows=54 loops=1)
 -  Hash  (cost=4.92..4.92 rows=2 width=16) 
(actual time=0.000..0.000 rows=0 loops=1)
   -  Hash Join  (cost=3.04..4.92 rows=2 
width=16) (actual time=0.000..0.000 rows=2 loops=1)
 Hash Cond: (outer.carrier_id = 
inner.carrier_id)
 -  Seq Scan on carrier_code cc 
(cost=0.00..1.57 rows=57 width=8) (actual time=0.000..0.000 rows=57 loops=1)
 -  Hash  (cost=3.04..3.04 rows=1 
width=8) (actual time=0.000..0.000 rows=0 loops=1)
   -  Index Scan using 
carrier_to_person_person_id_idx on carrier_to_person ctp  (cost=0.00..3.04 
rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=1)
 Index Cond: (355 = 
person_id)
Total runtime: 312.000 ms

Ken 

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-02 Thread Ken Egervari
> Ken Egervari [EMAIL PROTECTED] writes:
>> Okay, here is the explain analyze I managed to get from work.
>
> What platform is this on?  It seems very strange/fishy that all the
> actual-time values are exact integral milliseconds.
>
> regards, tom lane

My machine is WinXP professional, Athlon XP 2100, but I get similar results 
on my Intel P4 3.0GHz as well (which is also running WinXP).  Why do you 
ask? 

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-02 Thread Josh Berkus
Ken,

         -  Merge Join  (cost=602.54..1882.73 rows=870 width=91) (actual
 time=234.000..312.000 rows=310 loops=1)
               Merge Cond: (outer.current_status_id = inner.id)

Hmmm ... this merge join appears to be the majority of your execution 
time, at least within the resolution that PGWin allows us.  Please try 
two things, and give us Explain Analyzes:

1) To determine your query order ala Dan Tow and drive off of person, please 
SET JOIN_COLLAPSE_LIMIT = 1 and then run Mark Kirkwood's version of the 
query.  (Not that I believe in Dan Tow ... see previous message ... but it 
would be interesting to see the results.)

2) Force PG to drop the merge join via SET ENABLE_MERGEJOIN = FALSE;

Also, please let us know a bit about the server you're using and your 
configuration parameters, particularly:
shared_buffers
work_mem
effective_cache_size
random_page_cost

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-02 Thread Ken Egervari
Josh,
1) To determine your query order ala Dan Tow and drive off of person, 
please
SET JOIN_COLLAPSE_LIMIT = 1 and then run Mark Kirkwood's version of the
query.  (Not that I believe in Dan Tow ... see previous message ... but it
would be interesting to see the results.)
Unfortunately, the query still takes 250 milliseconds.  I tried it with 
other queries and the results are the same as before.  Here is the explain 
analyze anyway:

Sort  (cost=2036.83..2036.87 rows=16 width=103) (actual 
time=328.000..328.000 rows=39 loops=1)
 Sort Key: cs.date
 -  Nested Loop Left Join  (cost=620.61..2036.51 rows=16 width=103) 
(actual time=250.000..328.000 rows=39 loops=1)
   -  Hash Join  (cost=620.61..1984.90 rows=16 width=78) (actual 
time=250.000..328.000 rows=39 loops=1)
 Hash Cond: (outer.carrier_code_id = inner.id)
 -  Merge Join  (cost=606.11..1965.99 rows=825 width=74) 
(actual time=250.000..328.000 rows=310 loops=1)
   Merge Cond: (outer.current_status_id = inner.id)
   -  Index Scan using shipment_current_status_id_idx on 
shipment s  (cost=0.00..2701.26 rows=60307 width=66) (actual 
time=0.000..77.000 rows=27711 loops=1)
 Filter: (is_purged = false)
   -  Sort  (cost=606.11..610.50 rows=1756 width=12) 
(actual time=141.000..141.000 rows=6902 loops=1)
 Sort Key: cs.id
 -  Hash Join  (cost=1.11..511.48 rows=1756 
width=12) (actual time=0.000..109.000 rows=6902 loops=1)
   Hash Cond: (outer.release_code_id = 
inner.id)
   -  Index Scan Backward using 
current_status_date_idx on shipment_status cs  (cost=0.01..422.58 rows=14047 
width=16) (actual time=0.000..78.000 rows=14925 loops=1)
   Index Cond: ((date >= (('now'::text)::date - 31)) AND (date <= ('now'::text)::date))
 Filter: (cs.* IS NOT NULL)
   -  Hash  (cost=1.10..1.10 rows=1 width=4) 
(actual time=0.000..0.000 rows=0 loops=1)
 -  Seq Scan on release_code rc 
(cost=0.00..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
   Filter: ((number)::text = 
'9'::text)
 -  Hash  (cost=14.49..14.49 rows=2 width=8) (actual 
time=0.000..0.000 rows=0 loops=1)
   -  Nested Loop  (cost=6.87..14.49 rows=2 width=8) 
(actual time=0.000..0.000 rows=2 loops=1)
 -  Index Scan using person_pkey on person p 
(cost=0.00..5.73 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
   Index Cond: (id = 355)
 -  Hash Join  (cost=6.87..8.74 rows=2 width=8) 
(actual time=0.000..0.000 rows=2 loops=1)
   Hash Cond: (outer.carrier_id = 
inner.carrier_id)
   -  Seq Scan on carrier_code cc 
(cost=0.00..1.57 rows=57 width=8) (actual time=0.000..0.000 rows=57 loops=1)
   -  Hash  (cost=6.86..6.86 rows=1 width=12) 
(actual time=0.000..0.000 rows=0 loops=1)
 -  Hash Join  (cost=3.04..6.86 rows=1 
width=12) (actual time=0.000..0.000 rows=1 loops=1)
   Hash Cond: (outer.id = 
inner.carrier_id)
   -  Seq Scan on carrier c 
(cost=0.00..3.54 rows=54 width=4) (actual time=0.000..0.000 rows=54 loops=1)
   -  Hash  (cost=3.04..3.04 
rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1)
 -  Index Scan using 
carrier_to_person_person_id_idx on carrier_to_person ctp  (cost=0.00..3.04 
rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=1)
   Index Cond: (355 = 
person_id)
   -  Index Scan using shipment_status_shipment_id_idx on 
shipment_status ss  (cost=0.00..3.20 rows=2 width=25) (actual 
time=0.000..0.000 rows=1 loops=39)
 Index Cond: (ss.shipment_id = outer.id)
Total runtime: 328.000 ms

2) Force PG to drop the merge join via SET ENABLE_MERGEJOIN = FALSE;
Setting this option had no effect either.  In fact, the query is a bit slower 
(266 milliseconds, though 250 came up once in 20 executions).

Also, please let us know some about the server you're using and your
configuration parameters, particularly:
shared_buffers
work_mem
effective_cache_size
random_page_cost
Well, I'm on a test machine so the settings haven't changed one bit from the 
defaults.  This may sound embarrassing, but I bet the production server is 
not custom configured either.  The computer I'm running these queries on is 
just a simple Athlon XP 2100+ on WinXP with 1GB of RAM.  The production 
server is a faster P4, but the rest is the same.  Here are the 4 values in 
my configuration, but