Re: [PERFORM] High CPU with 7.4.1 after running for about 2 weeks

2004-03-13 Thread Tom Lane
Mike Bridge <[EMAIL PROTECTED]> writes:
> I've been running Postgresql 7.4.1 for a couple weeks after upgrading
> from 7.2.  I noticed today that the postmaster had been using 99% of
> the dual CPUs (on a PowerEdge 2650) non-stop for the last couple days.
> I stopped all the clients, and it didn't abate---even with no
> connections---so I restarted the postmaster.  Now everything is
> running smoothly again.

Since the postmaster is a single unthreaded process, it's quite
impossible for it to take up 100% of two CPUs.  Could you be more
precise about which processes were eating CPU, and what they were
doing according to the available state data?  (ps auxww and
pg_stat_activity can be helpful tools.)
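A minimal sketch of the kind of check meant here, in Python rather than at the shell (the helper name, threshold, and sample lines are illustrative, assuming the usual `ps aux` column layout of USER PID %CPU ... COMMAND):

```python
def cpu_hogs(ps_lines, threshold=90.0):
    """Return (pid, %cpu, command) for `ps aux`-style lines above threshold."""
    hogs = []
    for line in ps_lines:
        fields = line.split(None, 10)  # at most 11 fields; COMMAND keeps its spaces
        if len(fields) < 11:
            continue
        try:
            cpu = float(fields[2])  # %CPU column
        except ValueError:          # skips the header line
            continue
        if cpu >= threshold:
            hogs.append((fields[1], cpu, fields[10]))
    return hogs

sample = [
    "USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND",
    "postgres  4242 99.0  1.2  34816 12456 ?        R    Mar11 2880:00 postgres: app app [local] SELECT",
    "postgres  4243  0.1  1.0  34816 10240 ?        S    Mar11   0:05 postgres: app app [local] idle",
]
print(cpu_hogs(sample))
```

Cross-checking the busy PIDs against the query text in pg_stat_activity then tells you what those backends were doing.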

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] [ADMIN] syslog slowing the database?

2004-03-13 Thread Magnus Naeslund(t)
Tom Lane wrote:
Greg Spiegelberg <[EMAIL PROTECTED]> writes:

I turned syslog back on and the restore slowed down again.  Turned
it off and it sped right back up.


We have heard reports before of syslog being quite slow.  What platform
are you on exactly?  Does Richard's suggestion of turning off syslog's
fsync help?
Another tip is to use a better (well, at least more optimized) syslog 
implementation, like metalog. It optimizes log writes to a blocksize 
that is better for disk throughput.
You can also use "per line" mode with those if you want, I think.

I use another logger called multilog (see http://cr.yp.to); it's a pipe 
logger, one per postmaster.
It also gives very exact timestamps on every line, has built-in log 
rotation and works nicely with all the programs I use it for.

One thing is for sure: if you log a lot, standard syslog (at least on 
Linux) sucks big time.
I gained back approx 30% CPU on a mailserver over here by changing to 
another logger.

Cheers
Magnus




[PERFORM] High CPU with 7.4.1 after running for about 2 weeks

2004-03-13 Thread Mike Bridge
I've been running Postgresql 7.4.1 for a couple weeks after upgrading
from 7.2.  I noticed today that the postmaster had been using 99% of
the dual CPUs (on a PowerEdge 2650) non-stop for the last couple days.
I stopped all the clients, and it didn't abate---even with no
connections---so I restarted the postmaster.  Now everything is
running smoothly again.

Is there anything that might accumulate after two weeks that might
cause postgresql to thrash?  I'm running pg_autovacuum, so the
database itself should be nice and clean.  It isn't connections,
because I restarted the clients a few times without success.  I've
been running a long time on 7.2 with essentially the same
configuration (minus pg_autovacuum) without any problems.

Thanks for any help,

-Mike



[PERFORM] Drop Tables Very Slow in Postgresql 7.2.1

2004-03-13 Thread Maneesha Nunes
Hello there!

I am using PostgreSQL 7.2.1 as the backend for an E.R.P. system running
on Linux Red Hat 7.2 (Enigma).
The database size is around 20-25 GB.
Dropping an individual table of around 200 MB takes more
than 7 minutes, and also increases the load on our system.
The database is vacuumed/reindexed on a daily basis.

We have recreated the same database on a Linux Red Hat release 9 OS with
PostgreSQL 7.3.2, and the drop there is really fast.

Any suggestions as to how I could improve the performance of DROP on
PostgreSQL 7.2.1?


Thanks
maneesha.





Re: [PERFORM] Scaling further up

2004-03-13 Thread Marty Scholes
I have some suggestions based on my anecdotal experience.

1. This is a relatively small DB -- the working set will likely be in 
RAM at any moment in time, making read I/O time mostly irrelevant.

2. The killer will be write times -- specifically log writes.  Small and 
heavily synchronized writes, log and data writes, will drag down an 
impressive hardware RAID setup.  We run mirrored hardware RAID 5 arrays 
with write back cache and are constantly seeking ways to improve write 
performance.  We do a lot of batch processing, though, so we do a lot of 
write I/Os.

3. Be very careful with "battery backed write cache."  It usually works 
as advertised.  More than once in the past decade I have seen 
spontaneous cache corruption after power loss.  The corruption usually 
happens when some admin, including me, has assumed that the cache will 
ALWAYS survive a power failure unblemished and has no "plan B."  Make 
sure you have a contingency plan for corruption, or don't enable the cache.

4. RAID 10 will likely have bigger stripe sizes on the RAID 0 portion of 
the setup, and might hinder, not help, small write I/O performance.

5. Most (almost all) of the I/O time will be due to the access time 
(head seek + head settle + rotational latency) and very little of the 
I/O time will due to data transfer time.  In other words, getting drives 
that provide faster transfer rates will barely improve performance.  The 
secret is lowering the access time.

6. A relatively cheap way to drastically drop the access time is to get 
large drive(s) and only use a portion of them for storage.  The less 
space used on the drive, the less area the heads need to cover for 
seeks.  At one extreme, you could make the partition the size of a 
single cylinder.  This would make access time (ignoring OS and 
controller overhead) identical to rotational latency, which is as low as 
4.2 ms for a cheap 7200 RPM drive.

7. A drive with a 5 ms average service time, servicing 8 KB blocks, will 
yield as much as 1.6 MB/s sustained write throughput.  Not bad for a 
cheap uncached solution.  Any OS aggregation of writes during the 
fsync() call will further improve this number -- it is basically a lower 
bound for throughput.
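The bound in point 7 is easy to sanity-check (a sketch; the function name and units are just illustrative):

```python
def sustained_write_bps(block_bytes, service_time_s):
    """Lower-bound throughput: one synchronous block written per service time."""
    return block_bytes / service_time_s

# 8 KB blocks at a 5 ms average service time
bps = sustained_write_bps(8 * 1024, 0.005)
print(bps / 1e6)  # ~1.6 MB/s
```

Any aggregation of writes at fsync() time only raises the number from there.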

8. Many people, especially managers, cannot stomach buying disk space 
and only using a portion of it.  In many cases, it seems more palatable 
to purchase a much more expensive solution to get to the same speeds.

Good luck.

scott.marlowe wrote:
On Wed, 3 Mar 2004, Paul Thomas wrote:

 >
 > On 02/03/2004 23:25 johnn wrote:
 > > [snip]
 > > random_page_cost should be set with the following things taken into
 > > account:
 > >   - seek speed
 >
 > Which is not exactly the same thing as spindle speed as it's a combination
 > of spindle speed and track-to-track speed. I think you'll find that a 15K
 > rpm disk, whilst it will probably have a lower seek time than a 10K rpm
 > disk, won't have a proportionately (i.e., 2/3rds) lower seek time.

There are three factors that affect how fast you can get to the next
sector:
seek time
settle time
rotational latency
Most drives only list the first, and don't bother to mention the other
two.
On many modern drives, the seek times are around 5 to 10 milliseconds.
The settle time varies as well.  The longer the seek, the longer the
settle, generally.  This is the time it takes for the head to stop shaking
and rest quietly over a particular track.
Rotational Latency is the amount of time you have to wait, on average, for
the sector you want to come under the heads.
Assuming an 8 ms seek and 2 ms settle (typical numbers), and that the
rotational latency on average is 1/2 of a rotation:  at 10k rpm, a
rotation takes 1/166.667 of a second, or 6 ms.  So, half a rotation is
approximately 3 ms.  By going to a 15k rpm drive, the latency drops to 2
ms.  So, if we add them up, on the same basic drive, one being 10k and one
being 15k, we get:
10k rpm: 8+2+3 = 13 ms
15k rpm: 8+2+2 = 12 ms
So, based on the decrease in rotational latency being the only advantage
the 15k rpm drive has over the 10k rpm drive, we get a decrease in access
time of only 1 ms, or only about an 8% decrease in actual access time.
So, if your random page cost on 10k rpm drives was 1.7, you'd need to
drop it to 1.57 or so to reflect the speed increase from 15k rpm drives.
I.e. it's much more likely that going from 1 gig to 2 gigs of RAM will
make a noticeable difference than going from 10k to 15k drives.
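The arithmetic above can be redone in a few lines (a sketch using the same example numbers; the helper name is illustrative):

```python
def access_time_ms(seek_ms, settle_ms, rpm):
    # average rotational latency = time for half a rotation
    half_rotation_ms = 0.5 * 60000.0 / rpm
    return seek_ms + settle_ms + half_rotation_ms

t10k = access_time_ms(8, 2, 10000)      # 8 + 2 + 3 = 13 ms
t15k = access_time_ms(8, 2, 15000)      # 8 + 2 + 2 = 12 ms
print((t10k - t15k) / t10k)             # ~8% faster access
print(1.7 * t15k / t10k)                # random_page_cost scaled by the same ratio, ~1.57
```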






Re: [PERFORM] Postgresql on SAN

2004-03-13 Thread Joseph Shraibman
Josh Berkus wrote:
> See above.  Also keep in mind that PostgreSQL's use of I/O should improve 
> 100% in version 7.5.

Really?  What happened?



Re: [PERFORM] rapid degradation after postmaster restart

2004-03-13 Thread Matthew T. O'Connor
Joe Conway wrote:
> Tom Lane wrote:
>> Just to be clear on this: you have to restart the postmaster to bring
>> the time back down?  Simply starting a fresh backend session doesn't do
>> it?
>
> IIRC, shared buffers was reasonable, maybe 128MB. One thing that is 
> worthy of note is that they are using pg_autovacuum and a very low 
> vacuum_mem setting (1024). But I also believe that max_fsm_relations 
> and max_fsm_pages have been bumped up from default (something like 
> 1 & 20).

pg_autovacuum could be a problem if it's vacuuming too often.  Have you 
looked to see if a vacuum or analyze is running while the server is 
slow?  If so, have you played with the pg_autovacuum default vacuum and 
analyze thresholds?  If it appears that it is related to pg_autovacuum, 
please send me the command options used to run it and a logfile of its 
output running at a debug level of -d2.

Matthew



Re: [PERFORM] rapid degradation after postmaster restart

2004-03-13 Thread Josh Berkus
Joe,

> IIRC, shared buffers was reasonable, maybe 128MB. One thing that is 
> worthy of note is that they are using pg_autovacuum and a very low 
> vacuum_mem setting (1024). But I also believe that max_fsm_relations and 
> max_fsm_pages have been bumped up from default (something like 1 & 
> 20).

pg_autovacuum may be your problem.   Imagine this:

1) The chain of updates and inserts called by the procedures makes enough 
changes, on its own, to trigger pg_autovacuum.
2) Because they have a big database, and a low vacuum_mem, a vacuum of the 
largest table takes noticeable time, like several minutes.
3) This means that the vacuum is still running during the second and 
succeeding events.

Something to check by watching the process list.

FWIW, I don't use pg_autovacuum for databases which have frequent large batch 
updates; I find it results in uneven performance.

Feel free to phone me if you're still stuck!

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] rapid degradation after postmaster restart

2004-03-13 Thread Joe Conway
Marty Scholes wrote:
> I have seen similar results to what you are describing.
>
> I found that running a full vacuum:
>
> vacuumdb -fza
>
> followed by a checkpoint makes it run fast again.
>
> Try timing the update with and without a full vacuum.

Will do. I'll let you know how it goes.

Thanks for the reply.

Joe



Re: [PERFORM] rapid degradation after postmaster restart

2004-03-13 Thread Tom Lane
Joe Conway <[EMAIL PROTECTED]> writes:
> ... Immediately 
> after a postmaster restart, the first insert or two take about 1.5 
> minutes (undoubtedly this could be improved, but it isn't the main 
> issue). However by the second or third insert, the time increases to 7 - 
> 9 minutes. Restarting the postmaster causes the cycle to repeat, i.e. 
> the first one or two inserts are back to the 1.5 minute range.

Just to be clear on this: you have to restart the postmaster to bring
the time back down?  Simply starting a fresh backend session doesn't do
it?

Are you using particularly large values for shared_buffers or any of the
other resource parameters?

regards, tom lane
