Re: [PERFORM] query optimization question

2004-01-30 Thread Jack Coates
On Thu, 2004-01-29 at 23:23, Dennis Bjorklund wrote:
 On Thu, 29 Jan 2004, Jack Coates wrote:
 
   Probably better to repost it as a gzip'd attachment.  That should
  
  complete with a picture of the GUI version. 26k zipped, let's see if
  this makes it through.
 
 Are you sure you attached it?
 
 At least when it got here there was no attachment.

argh; I attached the 40K version, which was in color, then removed it to
make a new greyscale version and forgot to attach that. Here it is again:
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan


pg-perf-sql-plan.tgz
Description: application/compressed-tar



Re: [PERFORM] query optimization question

2004-01-29 Thread Jack Coates
On Wed, 2004-01-28 at 18:04, Tom Lane wrote:
 Jack Coates <[EMAIL PROTECTED]> writes:
  I've got a query that needs some help, please. Is there a way to avoid
  all the looping? I've got freedom to work with the double-indented
  sections below ) AND (, but the initial select distinct wrapper is much
  more difficult to change. This is auto-generated code.
 
 Well, you're not going to get any serious improvement without a
 wholesale rewrite of the query --- I'd think that something driven by
 a GROUP BY memberid_ HAVING count(*) = whatever at the outer level would
 be a better way to approach it.  As you have it, the system has no
 choice but to fully evaluate two very expensive subselects, from scratch,
 for each outer row.
 

I hear you. There's definitely an understanding that this tool can
generate some gnarly queries, and we want to redesign in a way that will
allow some more intelligence to be applied to the problem. In the
meantime, I'll be happy if PG grinds at the same level as other
databases. MS-SQL completed that query in 25 minutes on a database with
31 times the data in it. Since I'm one of the bigger *nix fans around
here, that doesn't make me happy.

 However...
 
  ( select count(*) from lyrActiveRecips, members_ a, outmail_  
  where lyrActiveRecips.UserName = a.UserNameLC_  
  and lyrActiveRecips.Domain = a.Domain_  
  and a.MemberID_ = members_.MemberID_  
  and outmail_.MessageID_ = lyrActiveRecips.MailingID  
 
 Is memberid_ a unique identifier for members_, as one would think from
 the name?  If so, can't you drop the join of members_ a in this
 subselect, and just use the corresponding fields from the outer table?
 
  ( select count(*) from lyrCompletedRecips, members_ a, outmail_
  where a.MemberID_ = lyrCompletedRecips.MemberID  
  and a.UserNameLC_ = members_.UserNameLC_  
  and a.Domain_ = members_.Domain_  
  and outmail_.MessageID_ = lyrCompletedRecips.MailingID  
 
 Why are the join conditions different here from the other subselect?
 Can't you rephrase them the same as above, and then again remove the
 inner appearance of members_ ?
 
   regards, tom lane

unfortunately, the column names are different between lyrcompletedrecips
and lyractiverecips. However, one thing we were able to do is to reduce
the number of queries by not trying to match across multiple lists.

SELECT DISTINCT members_.emailaddr_, members_.memberid_ FROM members_ 
WHERE ( members_.List_='list1' 
AND members_.MemberType_='normal' 
AND members_.SubType_='mail' 
AND members_.emailaddr_ IS NOT NULL ) 
AND ( 
( select count(*) from lyrActiveRecips, outmail_ 
where outmail_.MessageID_ = lyrActiveRecips.MailingID 
and outmail_.Type_ = 'list' 
and members_.MemberID_ = lyrActiveRecips.MemberID 
and lyrActiveRecips.NextAttempt > '2004-01-20 00:00:00' )
 + 
( select count(*) from lyrCompletedRecips, outmail_ 
where members_.MemberID_ = lyrCompletedRecips.MemberID 
and outmail_.MessageID_ = lyrCompletedRecips.MailingID 
and outmail_.Type_ = 'list' 
and lyrCompletedRecips.FinalAttempt > '2004-01-20 00:00:00' 
and lyrCompletedRecips.CompletionStatusID = 300 )
 = 3
);

That completed in 3.5 minutes on MS-SQL. I killed the query this morning
after 15 hours on PostgreSQL 7.4. I tried a GROUP BY memberid_ HAVING
variation, which completed in 59 seconds on MS-SQL. I killed it after 35
minutes on PostgreSQL.

On a more positive note, if you remember the benchmarking I was doing
last month, PostgreSQL got some pretty good relative numbers. It
requires a lot of hand-holding and tuning relative to MS-SQL, but it
certainly beat the pants off of Oracle 8 and 9 for speed and ease of
management. Oracle 8 was in fact unable to complete the uglier stress
tests. I'll be working on a tuning recommendations white paper today.

thanks for all the help,
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] query optimization question

2004-01-29 Thread Jack Coates
On Thu, 2004-01-29 at 11:31, Tom Lane wrote:
 Jack Coates <[EMAIL PROTECTED]> writes:
  jackdb=# explain SELECT DISTINCT members_.memberid_ 
  jackdb-# FROM members_ 
  jackdb-# WHERE ( members_.List_='list1' 
  jackdb(# AND members_.MemberType_='normal' 
  jackdb(# AND members_.SubType_='mail' 
  jackdb(# AND members_.emailaddr_ IS NOT NULL ) 
  jackdb-# GROUP BY memberid_ HAVING ( 
 
 Um, that's not what I had in mind at all.  Does GROUP BY actually do
 anything at all here?  (You didn't answer me as to whether memberid_
 is a unique identifier or not, but if it is, this GROUP BY is just an
 expensive no-op.)
 

Sorry for the misunderstanding. It should be unique, yes.

 What I was envisioning was pulling the sub-selects up to the top level
 and using grouping to calculate the count(*) values for all memberids
 in parallel.  Roughly speaking it would look like (again assuming
 memberid_ is unique)
 
 SELECT memberid_ FROM
 (
   SELECT memberid_ FROM lyrActiveRecips, members_, outmail
   WHERE (all the conditions for this case)
   UNION ALL
   SELECT memberid_ FROM lyrCompletedRecips, members_, outmail
   WHERE (all the conditions for this case)
 )
 GROUP BY memberid_ HAVING count(*) = 3;
 
 However, if you can't change the boilerplate part of your query then
 this is all blue-sky speculation anyway.  

Got it now -- I'm running into some subquery errors trying to implement
this, anyway.
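
For the archives: the subquery errors are probably just the missing
derived-table alias -- PostgreSQL rejects a bare subselect in FROM with
"subquery in FROM must have an alias". A filled-in sketch of Tom's rewrite,
reusing the conditions from the query upthread (untested):

SELECT memberid_ FROM
(
  SELECT members_.memberid_ FROM lyrActiveRecips, members_, outmail_
  WHERE outmail_.MessageID_ = lyrActiveRecips.MailingID
  AND outmail_.Type_ = 'list'
  AND members_.MemberID_ = lyrActiveRecips.MemberID
  AND lyrActiveRecips.NextAttempt > '2004-01-20 00:00:00'
  UNION ALL
  SELECT members_.memberid_ FROM lyrCompletedRecips, members_, outmail_
  WHERE members_.MemberID_ = lyrCompletedRecips.MemberID
  AND outmail_.MessageID_ = lyrCompletedRecips.MailingID
  AND outmail_.Type_ = 'list'
  AND lyrCompletedRecips.FinalAttempt > '2004-01-20 00:00:00'
  AND lyrCompletedRecips.CompletionStatusID = 300
) AS ss
GROUP BY memberid_
HAVING count(*) = 3;

The List_/MemberType_/SubType_/emailaddr_ filters from the original outer
query would still need to be applied to members_ in each arm for identical
semantics.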

 What I'm actually more
 interested in is your statement that MSSQL can do the original query
 quickly.  I find that a bit hard to believe because I don't see any
 relevant optimization techniques.  Do they have any equivalent to
 EXPLAIN that would give some hint how they're doing it?

yup -- here it is. It will probably be a nasty mess after linewrap gets
done with it, so let me know if you'd like me to post a copy on ftp.
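
For reference, that output looks like MS-SQL's SET SHOWPLAN_ALL -- their
nearest equivalent to EXPLAIN. Something along these lines produces it
(with a cut-down stand-in for the real query):

SET SHOWPLAN_ALL ON
GO
SELECT DISTINCT memberid_ FROM members_ WHERE List_ = 'list1'
GO
SET SHOWPLAN_ALL OFF
GO

With SHOWPLAN_ALL on, the server returns the estimated plan as a rowset
instead of executing the statement.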

SELECT DISTINCT members_.memberid_ FROM members_ WHERE (
members_.List_='list1' AND members_.MemberType_='normal' AND
members_.SubType_='mail' ) GROUP BY memberid_ HAVING ( ( select
count(*) from lyrActiveRecips, outmail_ where outmail
    11  1  0  NULL  NULL  1  NULL  102274.5  NULL  NULL  NULL  104.10356  NULL  NULL  SELECT  0  NULL
|--Parallelism(Gather Streams)
    11  2  1  Parallelism  Gather Streams  NULL  NULL  102274.5  0.0  0.22011127  23  104.10356  [members_].[MemberID_]  NULL  PLAN_ROW  -1  1.0
 |--Filter(WHERE:(If ([Expr1006] IS NULL) then 0 else [Expr1006]+If ([Expr1012] IS NULL) then 0 else [Expr1012]=3))
    11  3  2  Filter  Filter  WHERE:(If ([Expr1006] IS NULL) then 0 else [Expr1006]+If ([Expr1012] IS NULL) then 0 else [Expr1012]=3)  NULL  102274.5  0.0  3.5393338  23  103.88345  [members_].[MemberID_]  NULL  PLAN_ROW  -1  1.0
  |--Hash Match(Right Outer Join, HASH:([lyrCompletedRecips].[MemberID])=([members_].[MemberID_]), RESIDUAL:([members_].[MemberID_]=[lyrCompletedRecips].[MemberID]))
    11  4  3  Hash Match  Right Outer Join  HASH:([lyrCompletedRecips].[MemberID])=([members_].[MemberID_]), RESIDUAL:([members_].[MemberID_]=[lyrCompletedRecips].[MemberID])  NULL  4782883.5  0.0  21.874712  23  100.34412  [members_].[MemberID_], [Expr1006], [Expr1012]  NULL  PLAN_ROW  -1  1.0
   |--Compute Scalar(DEFINE:([Expr1012]=Convert([Expr1020])))
    11  5  4  Compute Scalar  Compute Scalar  DEFINE:([Expr1012]=Convert([Expr1020]))  [Expr1012]=Convert([Expr1020])  119575.35  0.0  1.3723248  15  4.3749919  [lyrCompletedRecips].[MemberID], [Expr1012]  NULL  PLAN_ROW  -1  1.0
   ||--Hash Match(Aggregate, HASH:([lyrCompletedRecips].[MemberID]), RESIDUAL:([lyrCompletedRecips].[MemberID]=[lyrCompletedRecips].[MemberID]) DEFINE:([Expr1020]=COUNT(*)))
    11  6  5  Hash Match  Aggregate  HASH:([lyrCompletedRecips].[MemberID]), RESIDUAL:([lyrCompletedRecips].[MemberID]=[lyrCompletedRecips].[MemberID])  [Expr1020]=COUNT(*)  119575.35  0.0  1.3723248  15  4.3749919  [lyrCompletedRecips].[MemberID], [Expr1020]  NULL  PLAN_ROW  -1  1.0
   | |--Parallelism(Repartition Streams, PARTITION COLUMNS:([lyrCompletedRecips].[MemberID]))
    11  7  6  Parallelism  Repartition Streams  PARTITION COLUMNS:([lyrCompletedRecips].[MemberID])  NULL  119640.6  0.0  0.32407209  173  3.002667  [lyrCompletedRecips].[MemberID]  NULL  PLAN_ROW  -1  1.0
   |  |--Nested Loops(Inner Join, OUTER REFERENCES:([outmail_].[MessageID_]))
    11  8  7  Nested Loops  Inner Join  OUTER REFERENCES:([outmail_].[MessageID_])  NULL  119640.6  0.0  0.75014657  173  2.6785948

Re: [PERFORM] query optimization question

2004-01-29 Thread Jack Coates
On Thu, 2004-01-29 at 14:01, Tom Lane wrote:

 Probably better to repost it as a gzip'd attachment.  That should
 protect the formatting and get it into the list archives.
 
   regards, tom lane

complete with a picture of the GUI version. 26k zipped, let's see if
this makes it through.
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





[PERFORM] query optimization question

2004-01-28 Thread Jack Coates
 without time zone)
                     ->  Hash  (cost=22.50..22.50 rows=6 width=4) (actual time=0.003..0.003 rows=0 loops=1)
                           ->  Seq Scan on outmail_  (cost=0.00..22.50 rows=6 width=4) (actual time=0.002..0.002 rows=0 loops=1)
                                 Filter: ((type_)::text = 'list'::text)
         ->  Hash  (cost=4.82..4.82 rows=2 width=211) (actual time=0.017..0.017 rows=0 loops=818122)
               ->  Index Scan using pk_members_ on members_ a  (cost=0.00..4.82 rows=2 width=211) (actual time=0.011..0.013 rows=1 loops=818122)
                     Index Cond: (memberid_ = $0)
 Total runtime: 114474.407 ms
(34 rows)

that's with no data in lyractiverecips or lyrcompletedrecips. With data
in those tables, the query still hasn't completed after several hours on
two different machines.

thanks,
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] tuning questions

2003-12-09 Thread Jack Coates
On Mon, 2003-12-08 at 11:19, Tom Lane wrote:
 Jack Coates <[EMAIL PROTECTED]> writes:
  Theories at this point, in no particular order:
 
  a) major differences between my 7.3.4 from source (compiled with no
  options) and dev's 7.3.2-1PGDG RPMs. Looking at the spec file doesn't
  reveal anything glaring to me, but is there something I'm missing?
 
 There are quite a few performance-related patches between 7.3.2 and
 7.3.4.  Most of them should be in 7.3.4's favor but there are some
 places where we had to take a performance hit in order to have a
 suitably low-risk fix for a bug.  You haven't told us enough about
 the problem to know if any of those cases apply, though.  AFAIR
 you have not actually showed either the slow query or EXPLAIN ANALYZE
 results for it on the two boxes ...
 
   regards, tom lane

Right, because re-architecture of a cross-platform query makes sense if
performance is bad on all systems, but is questionable activity when
performance is fine on some systems and lousy on others. Hence my
statement that while SQL optimization is certainly something we want to
do for across-the-board performance increase, I wanted to focus on other
issues for troubleshooting this problem. I will be back to ask about
data access models later :-)

I ended up going back to a default postgresql.conf and reapplying the
various tunings one-by-one. Turns out that while setting fsync = false
had little effect on the slow IDE box, it had a drastic effect on this
faster SCSI box and performance is quite acceptable now (aside from the
expected falloff of about 30% after the first twenty minutes, which I
believe comes from growing and shrinking tables without vacuumdb
--analyzing).
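
For anyone copying that change from the archives: it is a single line in
postgresql.conf, and it trades crash safety for speed -- with fsync off, a
power failure or OS crash can corrupt the database, so it's only sane for
disposable benchmark data.

# postgresql.conf -- test/benchmark boxes only
fsync = false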

-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] tuning questions

2003-12-08 Thread Jack Coates
On Fri, 2003-12-05 at 17:22, Jack Coates wrote:
...
 That's it, I'm throwing out this whole test series and starting over
 with different hardware. Database server is now a dual 2GHz Xeon with
 2GB RAM & 2940UW SCSI, OS and PG's logs on 36G drive, PG data on 9GB
 drive. Data is importing now and I'll restart the tests tonight.

Sorry to reply to myself, but thought I'd note that the performance is
practically unchanged by moving to better hardware and separating logs
and data onto different spindles. Although the disks are twice as fast
by hdparm -Tt, their behavior as shown by iostat and vmstat is little
different between dual and dev (single P4-2GHz/512MB/(2)IDE drives).
Dual is moderately faster than my first, IDE-based testbed (about 8%),
but still only 30% as fast as the low-powered dev.

I've been running vacuumdb --analyze and/or vacuumdb --full between
each config change, and I also let the job run all weekend. Saturday it
got --analyze every three hours or so, Sunday it got --analyze once in
the morning. None of these vacuumdb's are making any difference.
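
Spelled out as commands, the between-changes ritual is roughly this (the
database name lmdb is taken from the ps listing later in this digest):

$ vacuumdb --analyze lmdb          # refresh planner statistics
$ vacuumdb --full --analyze lmdb   # also compact dead space; takes exclusive locks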

Theories at this point, in no particular order:

a) major differences between my 7.3.4 from source (compiled with no
options) and dev's 7.3.2-1PGDG RPMs. Looking at the spec file doesn't
reveal anything glaring to me, but is there something I'm missing?

b) major differences between my kernel 2.4.18-14smp (RH8) and dev's
kernel 2.4.18-3 (RH7.3).

c) phase of the moon.

While SQL optimization is likely to improve performance across the
board, it doesn't explain the differences between these two systems and
I'd like to avoid it as a theory until the fast box can perform as well
as the slow box.

Any ideas? Thanks in advance,
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] tuning questions

2003-12-05 Thread Jack Coates
On Fri, 2003-12-05 at 09:26, Josh Berkus wrote:
 Jack,
 
  The frustrating thing is, we also have a UP P3-500 with 512M RAM and two
  IDE drives with the same PG install which is doing okay with this load
  -- still half the speed of MS-SQL2K, but usable. I'm at a loss.
 
 Overall, I'm really getting the feeling that this procedure was optimized for 
 Oracle and/or MSSQL and is hitting some things that aren't such a good idea 
 for PostgreSQL.   I highly suggest that you try using log_duration and 
 log_statement (and in 7.4 log_min_duration_statement) to try to locate which 
 particular statements are taking the longest.

I'll definitely buy that as round two of optimization, but round one is
still "it's faster on the slower server."

hdparm -I is identical between the boxes, filesystem structure layout is
identical, disk organization isn't identical, but far worse: the UP low
ram box has PG on /dev/hdb, ew. Predictably, vmstat shows low numbers...
but steady numbers.

dev is the box which goes fast, and I was wrong, it's actually a 2GHz
P4. rufus is the box which goes slow. During the big fetch:
dev bi sits around 2000 blocks for twenty seconds while bo is around 50
blocks, then bo jumps to 800 or so while the data is returned, then
we're done.

rufus bi starts at 16000 blocks, then drops steadily while bo climbs.
After a minute or so, bi stabilizes at 4096 blocks, then bo bursts to
return the data. Then the next fetch starts, and it's bi of 500, bo of
300 for several minutes.

These observations certainly all point to Eric and Thierry's
recommendations to better organize the filesystem and get faster disks...
except that the dev box gets acceptable performance.

So, I've dug into postgresql.conf on dev and rufus, and here's what I
found:

RUFUS

how much ram do you have?
75% of that, converted to 8K pages, for effective_cache
15% of that or 512M, whichever is larger, converted to 8K pages for shared_buffers
15% of that, converted to 8K pages, for vacuum_mem
how many messages will you send between vacuums?
divide that by 2, then divide by 6, for max_fsm_pages

DEV

how much ram do you have?
48% of that, converted to 8K pages, for effective_cache
6.5% of that or 512M, whichever is larger, converted to 8K pages for shared_buffers
52% of that, converted to 8K pages, for vacuum_mem
max_fsm_pages untouched on this box.

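Worked through for a 2GB box like rufus (my arithmetic, not from the
thread):

# effective_cache: 75% of 2048MB, in 8KB pages = 196608
effective_cache_size = 196608
# shared_buffers: 15% of 2048MB, in 8KB pages = 39321 (close to the 35642
# this box actually ran with, so the 15% branch is evidently what applied)
shared_buffers = 39321
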
I adjusted rufus's configuration to match those percentages, but left
max_fsm_pages dialed up to 50. Now Rufus's vmstat shows much better
behavior: bi 12000 blocks gradually sloping down to 3000 during the big
select, bo steady until it's ready to return. As more jobs come in, we
see overlap areas where bi is 600-ish and bo is 200-ish, but they only
last a few tens of seconds.

The big selects are still a lot slower than they are on the smaller
database and overall performance is still unacceptable. Next I dialed
max_fsm_pages back down to 1 -- no change. Hm, maybe it's been too
long since the last vacuumdb --analyze, let's give it another.

hdparm -Tt shows that disk performance is crappo on rufus, half what it
is on dev -- and freaking dev is using 16 bit IO! This is a motherboard
IDE controller issue.

South Bridge:   VIA vt8233
Revision:   ISA 0x0 IDE 0x6

That's it, I'm throwing out this whole test series and starting over
with different hardware. Database server is now a dual 2GHz Xeon with
2GB RAM & 2940UW SCSI, OS and PG's logs on 36G drive, PG data on 9GB
drive. Data is importing now and I'll restart the tests tonight.
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





[PERFORM] tuning questions

2003-12-04 Thread Jack Coates
Hi,

sorry for duplication, I asked this on pgsql-admin first before
realizing it wasn't the appropriate list.

I'm having trouble optimizing PostgreSQL for an admittedly heinous
worst-case scenario load.

testbed:
dual P3 1.3 GHz box with 2GB RAM
two IDE 120G drives on separate channels (DMA on), OS on one, DB on the
other, some swap on each (totalling 2.8G).
RH Linux 8.

I've installed PG 7.3.4 from source (./configure && make && make
install) and from PGDG RPMs and can switch back and forth. I also have
the 7.4 source but haven't done any testing with it yet aside from
starting it and importing some data.

The application is on another server, and does this torture test: it
builds a large table (~6 million rows in one test, ~18 million in
another). Rows are then pulled in chunks of 4 to 6 thousand, acted on,
and inserted back into another table (which will of course eventually
grow to the full size of the first).

The problem is that pulling the 4 to 6 thousand rows puts PostgreSQL
into a tail spin: postmaster hammers on CPU anywhere from 90 seconds to
five minutes before returning the data. During this time vmstat shows
that disk activity is up, of course, but it doesn't appear to be page
swapping (according to free, top, and vmstat).

Another problem is that performance of the 6 million row job is decent
if I stop the job and run a vacuumdb --analyze before letting it
continue; is this something that 7.4 will help with? vacuumdb --analyze
doesn't seem to have much effect on the 18 million row job.

I've tweaked shared buffers to 8192, pushed sort memory to 2048, vacuum
memory to 8192, and effective cache size to 1.
/proc/sys/kernel/shmmax is set to 16 and /proc/sys/fs/file-max
is set to 65536. Ulimit -n 3192.
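
As postgresql.conf lines, the settings that survived the archive intact
are (a sketch; the effective_cache_size and shmmax values were mangled):

shared_buffers = 8192   # 8192 x 8KB pages = 64MB
sort_mem = 2048         # KB per sort operation
vacuum_mem = 8192       # KB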

I've read several sites and postings on tuning PG and have tried a
number of different theories, but I'm still not getting the architecture
of how things work.

thanks,
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 11:20, Josh Berkus wrote:
 Jack,
 
  Following this, I've done:
  2gb ram
  =
   2,000,000,000
  bytes
 
 This calculation is fun, but I really don't know where you got it from.   It 
 seems quite baroque.  What are you trying to set, exactly?
(the calculation came from this earlier message:)
Message-ID: [EMAIL PROTECTED]
Date: Thu, 04 Dec 2003 17:12:11 +0000
From: Rob Fielding <[EMAIL PROTECTED]>

I'm trying to set Postgres's shared memory usage in a fashion that
allows it to return requested results quickly. Unfortunately, none of
these changes allow PG to use more than a little under 300M RAM.
vacuumdb --analyze is now taking an inordinate amount of time as well
(40 minutes and counting), so that change needs to be rolled back.

 
  getting the SQL query better optimized for PG is on my todo list, but
  not something I can do right now -- this application is designed to be
  cross-platform with MS-SQL, PG, and Oracle so tweaking SQL is a touchy
  subject.
 
 Well, if your queries are screwed up, no amount of .conf optimization is 
 going to help you much. You could criticize that PG is less adept than 
 some other systems at re-writing bad queries, and you would be correct.  
 However, there's not much to do about that on existing systems.
 
 How about posting some sample code?

Tracking that down in CVS and translating from C++ is going to take a
while -- is there a way to get PG to log the queries it's receiving?
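
For the archives: yes -- the 7.3 settings Josh mentioned earlier do
exactly this, no source-diving required:

# postgresql.conf (7.3)
log_statement = true    # write every incoming query to the server log
log_duration = true     # log how long each statement took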

 
  The pgavd conversation is intriguing, but I don't really understand the
  role of vacuuming. Would this be a correct statement: PG needs to
  regularly re-evaluate the database in order to adjust itself? I'm
  imagining that it continues to treat the table as a small one until
  vacuum informs it that the table is now large?
 
 Not Vacuum, Analyze.  Otherwise correct.  Mind you, in regular use where 
 only a small % of the table changes per hour, periodic ANALYZE is fine.  
 However, in batch data transform analyze statements need to be keyed to the 
 updates and/or imports.
 
 BTW, I send a couple of e-mails to the Lyris documentation maintainer about 
 updating out-of-date information about setting up PostgreSQL.   I never got a 
 response, and I don't think my changes were made.

She sits on the other side of the cube wall from me, and if I find a
decent config it's going into the manual -- consider this a golden
opportunity :-)
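
To make the analyze-keyed-to-the-batch advice concrete, a sketch -- with
the recipient tables from the query-optimization thread standing in for
whatever the batch step actually touches:

-- run immediately after each bulk insert/delete step, not on a timer
ANALYZE lyrActiveRecips;
ANALYZE lyrCompletedRecips;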

-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 12:27, Richard Huxton wrote:
 On Thursday 04 December 2003 19:50, Jack Coates wrote:
 
  I'm trying to set Postgres's shared memory usage in a fashion that
  allows it to return requested results quickly. Unfortunately, none of
  these changes allow PG to use more than a little under 300M RAM.
  vacuumdb --analyze is now taking an inordinate amount of time as well
  (40 minutes and counting), so that change needs to be rolled back.
 
 You don't want PG to use all your RAM, it's designed to let the underlying OS 
 do a lot of caching for it. Probably worth having a look at vmstat/iostat and 
 see if it's saturating on I/O.

latest changes:
shared_buffers = 35642
max_fsm_relations = 1000
max_fsm_pages = 1
wal_buffers = 64
sort_mem = 32768
vacuum_mem = 32768
effective_cache_size = 1

/proc/sys/kernel/shmmax = 5

IO is active, but hardly saturated. CPU load is hefty though, load
average is at 4 now.

   procs                      memory      swap          io      system          cpu
 r  b  w   swpd   free   buff   cache   si  so   bi   bo    in    cs  us  sy  id
 0  2  1   2808  11436  39616 1902988    0   0  240  896   765   469   2  11  87
 0  2  1   2808  11432  39616 1902988    0   0  244  848   768   540   4   3  93
 0  2  1   2808  11432  39616 1902984    0   0  204  876   788   507   3   4  93
 0  2  1   2808  11432  39616 1902984    0   0  360  416   715   495   4   1  96
 0  2  1   2808  11432  39616 1902984    0   0  376  328   689   441   2   1  97
 0  2  0   2808  11428  39616 1902976    0   0  464  360   705   479   2   1  97
 0  2  1   2808  11428  39616 1902976    0   0  432  380   718   547   3   1  97
 0  2  1   2808  11428  39616 1902972    0   0  440  372   742   512   1   3  96
 0  2  1   2808  11428  39616 1902972    0   0  416  364   711   504   3   1  96
 0  2  1   2808  11424  39616 1902972    0   0  456  492   743   592   2   1  97
 0  2  1   2808  11424  39616 1902972    0   0  440  352   707   494   2   1  97
 0  2  1   2808  11424  39616 1902972    0   0  456  360   709   494   2   2  97
 0  2  1   2808  11436  39616 1902968    0   0  536  516   807   708   3   2  94

-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 13:24, Josh Berkus wrote:
 Jack,
 
  latest changes:
  shared_buffers = 35642
 
 This is fine, it's about 14% of available RAM.  Though the way you calculated 
 it still confuses me.   It's not complicated; it should be between 6% and 15% 
 of available RAM; since you're doing a data-transformation DB, yours should 
 be toward the high end. 
 
  max_fsm_relations = 1000
  max_fsm_pages = 1
 
 You want to raise this a whole lot if your data transformations involve large 
 delete or update batches.I'd suggest running vacuum analyze verbose 
 between steps to see how many dead pages you're accumulating.

This looks really difficult to tune, and based on the load I'm giving
it, it looks really important. I've tried the verbose analyze and I've
looked at the rules of thumb, neither approach seems good for the
pattern of hammer the system for a day or two, then leave it alone for
a week. I'm setting it to 50 (half of the biggest table size
divided by a 6k page size), but I'll keep tweaking this.
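
Josh's verbose suggestion, spelled out in 7.3 syntax (table name assumed
from the query-optimization thread):

VACUUM VERBOSE ANALYZE lyrCompletedRecips;
-- the INFO output ("Removed N tuples in M pages", "UnUsed N") shows how
-- much dead space each batch leaves, which is what max_fsm_pages must cover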

 
  wal_buffers = 64
  sort_mem = 32768
  vacuum_mem = 32768
  effective_cache_size = 1
 
 This is way the heck too low.  it's supposed to be the size of all available 
 RAM; I'd set it to 2GB*65% as a start.

This makes a little bit of difference. I set it to 65% (15869 pages).
Now we have some real disk IO:
   procs                      memory      swap          io      system          cpu
 r  b  w   swpd   free   buff   cache   si  so     bi   bo    in    cs  us  sy  id
 0  3  1   2804  10740  40808 1899856    0   0  26624    0   941  4144  13  24  63
 1  2  1   2804  10808  40808 1899848    0   0  21748   60  1143  3655   9  22  69

still high cpu (3-ish load) though, and there's no noticeable
improvement in query speed.

 
  IO is active, but hardly saturated. CPU load is hefty though, load
  average is at 4 now.
 
 Unless you're doing huge statistical aggregates (like radar charts), or heavy 
 numerical calculations-by-query, high CPU and idle I/O usually indicates a 
 really bad query, like badly mismatched data types on a join or unconstrained 
 joins or  overblown formatting-by-query.

Ran that by the programmer responsible for this area and watched the
statements go by with tcpdump -X. Looks like really simple stuff to me:
select a handful of values, then insert into one table and delete from
another.
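
A hypothetical reconstruction of that traffic (column names borrowed from
the query-optimization thread; the chunk key is invented):

BEGIN;
-- after the client has acted on one chunk of rows:
INSERT INTO lyrCompletedRecips (MemberID, MailingID, FinalAttempt, CompletionStatusID)
SELECT MemberID, MailingID, now(), 300
FROM lyrActiveRecips
WHERE MailingID = 42;   -- placeholder chunk key
DELETE FROM lyrActiveRecips WHERE MailingID = 42;
COMMIT;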
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan





Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 14:59, Eric Soroos wrote:
 
  IO is active, but hardly saturated. CPU load is hefty though, load
  average is at 4 now.
 
     procs                      memory      swap          io     system         cpu
   r  b  w   swpd   free   buff   cache   si  so   bi   bo   in   cs  us  sy  id
   0  2  1   2808  11432  39616 1902984    0   0  204  876  788  507   3   4  93
 
 You're getting a load average of 4 with 93% idle?
down a bit since my last set of tweaks, but yeah:
  3:18pm  up 2 days,  3:37,  3 users,  load average: 3.42, 3.31, 2.81
66 processes: 65 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:  2.0% user,  3.4% system,  0.0% nice, 93.4% idle
CPU1 states:  1.3% user,  2.3% system,  0.0% nice, 95.2% idle
Mem:  2064656K av, 2053896K used,   10760K free,       0K shrd,   40388K buff
Swap: 2899716K av,    2800K used, 2896916K free,               1896232K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
23103 root      15   0  1072 1072   840 R     1.3  0.0   0:01 top
23046 postgres  15   0 33364  32M 32220 S     0.5  1.6   0:12 postmaster
 
 That's a reasonable number of context switches, and if the blocks 
 you're reading/writing are discontinuous, I could see IO saturation 
 rearing its head.
 
 This looks to me like you're starting and killing a lot of processes.

isn't that by design though? I've been looking at other postgres servers
around the company and they seem to act pretty similar under load (none
is being pounded to this level, though).

 
 Is this thrashing psql connections, or is it one big query? What are 
 your active processes?

[EMAIL PROTECTED] root]# ps auxw | grep postgres
postgres 23042  0.0  0.4 308808   8628 pts/0 S 14:46 0:00 /usr/bin/postmaster -p 5432
postgres 23043  0.0  0.4 309788   8596 pts/0 S 14:46 0:00 postgres: stats buffer process
postgres 23044  0.0  0.4 308828   8620 pts/0 S 14:46 0:00 postgres: stats collector process
postgres 23046  0.6  1.4 309952  29872 pts/0 R 14:46 0:09 postgres: lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23047  1.4 14.7 310424 304240 pts/0 S 14:46 0:21 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23048  0.4 14.7 310044 304368 pts/0 S 14:46 0:07 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23049  0.0  0.5 309820  10352 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23050  0.0  0.6 310424  13352 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23051  0.0  0.6 309940  12992 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23052  0.0  0.5 309880  11916 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23053  0.0  0.6 309924  12872 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23054  0.0  0.6 310012  13460 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23055  0.0  0.5 309932  12284 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23056  2.0 14.7 309964 304072 pts/0 S 14:46 0:30 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23057  2.4 14.7 309916 304104 pts/0 S 14:46 0:37 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23058  0.0  0.6 310392  13168 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23059  0.5 14.7 310424 304072 pts/0 S 14:46 0:09 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23060  0.0  0.6 309896  13212 pts/0 S 14:46 0:00 postgres: lmuser lmdb 10.0.0.2 idle
postgres 23061  0.5  1.4 309944  29832 pts/0 R 14:46 0:09 postgres: lmuser lmdb 10.0.0.2 INSERT
postgres 23062  0.6  1.4 309936  29832 pts/0 S 14:46 0:09 postgres: lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23063  0.6  1.4 309944  30028 pts/0 S 14:46 0:09 postgres: lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23064  0.6  1.4 309944  29976 pts/0 S 14:46 0:09 postgres: lmuser lmdb 10.0.0.2 INSERT waiting
postgres 23065  1.4 14.7 310412 304112 pts/0 S 14:46 0:21 postgres: lmuser lmdb 216.91.56.200 idle
postgres 23066  0.5  1.4 309944  29496 pts/0 S 14:46 0:08 postgres: lmuser lmdb 216.91.56.200 INSERT waiting
postgres 23067  0.5  1.4 310472  30040 pts/0 D 14:46 0:09 postgres: lmuser lmdb 216.91.56.200 idle
postgres 23068  0.6  1.4 309936  30104 pts/0 R 14:46 0:09 postgres: lmuser lmdb 216.91.56.200 INSERT waiting
postgres 23069  0.5  1.4 309936  29716 pts/0 S 14:46 0:09 postgres: lmuser lmdb 216.91.56.200 INSERT waiting
postgres 23070  0.6  1.4 309944  29744 pts/0 S 14:46 0:09 postgres: lmuser lmdb 10.0.0.2 INSERT waiting

Ten or so stay idle all the time; the INSERTs flip to UPDATEs when the big
select is done and rows get moved from the active to the completed
table.

 Your effective cache size looks to be about 1900 megs (+- binary), 
 assuming all of it is pg.
 
 eric
   
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan

Re: [PERFORM] tuning questions

2003-12-04 Thread Jack Coates
On Thu, 2003-12-04 at 15:47, Richard Huxton wrote:
 On Thursday 04 December 2003 23:16, Jack Coates wrote:
 
effective_cache_size = 1
  
   This is way the heck too low.  it's supposed to be the size of all
   available RAM; I'd set it to 2GB*65% as a start.
 
  This makes a little bit of difference. I set it to 65% (15869 pages).
 
 That's still only about 127MB (15869 * 8KB).

yeah, missed the final digit when I copied it into the postgresql.conf
:-( Just reloaded with 158691 pages.
 
  Now we have some real disk IO:
     procs                      memory      swap          io      system          cpu
   r  b  w   swpd   free   buff   cache   si  so     bi   bo   in    cs  us  sy  id
   0  3  1   2804  10740  40808 1899856    0   0  26624    0  941  4144  13  24  63
 
 According to this your cache is currently 1,899,856 KB which in 8KB blocks is 
 237,482 - be frugal and say effective_cache_size = 20 (or even 15 if 
 the trace above isn't typical).

d'oh, just realized what you're telling me here. /me smacks forehead.
Let's try effective_cache of 183105... (75%). Starting both servers,
waiting for big fetch to start, and...

   procs                      memory      swap          io      system          cpu
 r  b  w   swpd   free   buff   cache   si  so     bi   bo    in    cs  us  sy  id
 0  0  0   2800  11920  40532 1906516    0   0      0    0   521     8   0   0 100
 0  1  0   2800  11920  40532 1906440    0   0   3565    2   611   113   1   3  97
 0  1  0   2800  11920  40532 1906424    0   0  20604    0   897   808   1  18  81
 0  1  0   2800  11920  40532 1906400    0   0  26112    0   927   820   1  13  87
 0  1  0   2800  11920  40532 1906384    0   0  26112    0   923   812   1  12  87
 0  1  0   2800  11920  40532 1906372    0   0  24592    0   921   805   1  13  87
 0  1  0   2800  11920  40532 1906368    0   0   3248   48   961  1209   0   4  96
 0  1  0   2800  11920  40532 1906368    0   0   2600    0   845  1631   0   2  98
 0  1  0   2800  11920  40532 1906364    0   0   2728    0   871  1714   0   2  98

better in vmstat... but the query doesn't work any better unfortunately.

The frustrating thing is, we also have a UP P3-500 with 512M RAM and two
IDE drives with the same PG install which is doing okay with this load
-- still half the speed of MS-SQL2K, but usable. I'm at a loss.
-- 
Jack Coates, Lyris Technologies Applications Engineer
510-549-4350 x148, [EMAIL PROTECTED]
Interoperability is the keyword, uniformity is a dead end.
--Olivier Fourdan


