Re: [SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-16 Thread Josh Berkus
Jeff,

  4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog
  you're better off.

 Like Mr. Stone said earlier, this is pure dogma.  In my experience,
 xlogs on the same volume with data is much faster if both are on
 battery-backed write-back RAID controller memory.  Moving from this
 situation to xlogs on a single normal disk is going to be much slower in
 most cases.

The advice on separate drives for xlog (as is all advice on that web page) is 
based on numerous, repeatable tests at OSDL.

However, you are absolutely correct in that it's *relative* advice, not 
absolute advice.   If, for example, you're using a $100,000 EMC SAN as your 
storage you'll probably be better off giving it everything and letting its 
controller and cache handle disk allocation etc.   On the other hand, if 
you're dealing with the 5 drives in a single Dell 6650, I've yet to encounter 
a case where a separate xlog disk did not benefit an OLTP application.

For Solaris, the advantage of using a separate disk or partition is that the 
mount options you want for the xlog (including forcedirectio) are 
considerably different from what you'd use with the main database.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-16 Thread Alvaro Herrera
On Tue, Aug 16, 2005 at 09:12:31AM -0700, Josh Berkus wrote:

 However, you are absolutely correct in that it's *relative* advice, not 
 absolute advice.   If, for example, you're using a $100,000 EMC SAN as your 
 storage you'll probably be better off giving it everything and letting its 
 controller and cache handle disk allocation etc.   On the other hand, if 
 you're dealing with the 5 drives in a single Dell 6650, I've yet to encounter 
 a case where a separate xlog disk did not benefit an OLTP application.

I've been asked this a couple of times and I don't know the answer: what
happens if you give XLog a single drive (unmirrored single spindle), and
that drive dies?  So the question really is, should you be giving two
disks to XLog?

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
[PostgreSQL] is a great group; in my opinion it is THE best open source
development communities in existence anywhere.(Lamar Owen)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-16 Thread Joshua D. Drake



I've been asked this a couple of times and I don't know the answer: what
happens if you give XLog a single drive (unmirrored single spindle), and
that drive dies?  So the question really is, should you be giving two
disks to XLog?


If that drive dies your restoring from backup. You would need to run at 
least RAID 1, preferrably RAID 10.


Sincerely,

Joshua D. Drkae







--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-16 Thread Michael Stone

On Tue, Aug 16, 2005 at 09:12:31AM -0700, Josh Berkus wrote:
However, you are absolutely correct in that it's *relative* advice, not 
absolute advice.   If, for example, you're using a $100,000 EMC SAN as your 
storage you'll probably be better off giving it everything and letting its 
controller and cache handle disk allocation etc.   


Well, you don't have to spend *quite* that much to get a decent storage
array. :) 


On the other hand, if you're dealing with the 5 drives in a single Dell
6650, I've yet to encounter a case where a separate xlog disk did not
benefit an OLTP application.


IIRC, that's an older raid controller that tops out at 128MB write
cache, and 5 spindles ain't a lot--so it makes sense that it would
benefit from a seperate spindle for xlog. Also note that I said the
write cache advice goes out the window if you have a workload that
involves constant writing (or if your xlog writes come in faster than
your write cache can drain) because at that point you essentially drop
back to raw disk speed; I assume the OLTP apps you mention are fairly
write-intensive.  OTOH, in a reasonably safe configuration I suppose
you'd end up with a 3 disk raid5 / 2 disk raid1 or 2 raid 1 pairs on
that dell 6650; is that how you test? Once you're down to that small a
data set I'd expect the system's ram cache to be a much larger
percentage of the working set, which would tend to make the xlog just
about the *only* latency-critical i/o. That's a different creature from
a data mining app that might really benefit from having additional
spindles to accelerate read performance from indices much larger than
RAM. At any rate, this just underscores the need for testing a
particular workload on particular hardware. Things like the disk speed,
raid configuration, write cache size, transaction size, data set size,
working set size, concurrent transactions, read vs write emphasis, etc.,
are each going to have a fairly large impact on performance. 

For Solaris, the advantage of using a separate disk or partition is that the 
mount options you want for the xlog (including forcedirectio) are 
considerably different from what you'd use with the main database.


Yeah, having a seperate partition is often good even if you do have
everything on the same disks.

Mike Stone

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-12 Thread Jeff Trout


On Aug 11, 2005, at 12:58 PM, Jeffrey W. Baker wrote:


Like Mr. Stone said earlier, this is pure dogma.  In my experience,
xlogs on the same volume with data is much faster if both are on
battery-backed write-back RAID controller memory.  Moving from this
situation to xlogs on a single normal disk is going to be much  
slower in

most cases.



This does also point one important point about performance.  Which is  
a touch unfortunate (and expensive to test):  Your milage may vary on  
any of these improvements.   Some people have 0 problems and  
incredible performance with say, 1000 shared_bufs and the WAL on the  
same disk..  Others need 10k shared bufs and wal split over a 900  
spindle raid with data spread across 18 SAN's...

Unfortunately there is no one true way :(

The best bet (which is great if you can): Try out various settings..  
if you still run into problems look into some more hardware.. see if  
you can borrow any or fabricate a poor man's equivalent for testing.


--
Jeff Trout [EMAIL PROTECTED]
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-11 Thread Steve Poe
Paul,

Before I say anything else, one online document which may be of
assistance to you is:
http://www.powerpostgresql.com/PerfList/

Some thoughts I have:

3) You're shared RAM setting seems overkill to me. Part of the challenge
is you're going from 1000 to 262K with no assessment in between. Each
situation can be different, but try in the range of 10 - 50K.

4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog
you're better off. If it is sharing with any other OS/DB resource, the
performance will be impacted.

From what I have learned from others on this list, RAID5 is not the best
choice for the database. RAID10 would be a better solution (using 8 of
your disks) then take the remaining disk and do mirror with your pg_xlog
if possible.

Best of luck,

Steve Poe

On Thu, 2005-08-11 at 13:23 +0100, Paul Johnson wrote:
 Hi all, we're running PG8 on a Sun V250 with 8GB RAM and 2*1.3GHz SPARC
 CPUs running Solaris 10. The DB cluster is on an external fibre-attached
 Sun T3 array that has 9*36GB drives configured as a single RAID5 LUN.
 
 The system is for the sole use of a couple of data warehouse developers,
 hence we are keen to use 'aggressive' tuning options to maximise
 performance.
 
 So far we have made the following changes and measured the impact on our
 test suite:
 
 1) Increase checkpoint_segments from 3 to 64. This made a 10x improvement
 in some cases.
 
 2) Increase work_mem from 1,024 to 524,288.
 
 3) Increase shared_buffers from 1,000 to 262,143 (2 GB). This required
 setting SHMMAX=4294967295 (4 GB) in /etc/system and re-booting the box.
 
 Question - can Postgres only use 2GB RAM, given that shared_buffers can
 only be set as high as 262,143 (8K pages)?
 
 So far so good...
 
 4) Move /pg_xlog to an internal disk within the V250. This has had a
 severe *negative* impact on performance. Copy job has gone from 2 mins to
 12 mins, simple SQL job gone from 1 min to 7 mins. Not even run long SQL
 jobs.
 
 I'm guessing that this is because pg_xlog has gone from a 9 spindle LUN to
 a single spindle disk?
 
 In cases such as this, where an external storage array with a hardware
 RAID controller is used, the normal advice to separate the data from the
 pg_xlog  seems to come unstuck, or are we missing something?
 
 Cheers,
 
 Paul Johnson.
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-11 Thread Jeffrey W. Baker
On Fri, 2005-08-12 at 08:47 +, Steve Poe wrote:
 Paul,
 
 Before I say anything else, one online document which may be of
 assistance to you is:
 http://www.powerpostgresql.com/PerfList/
 
 Some thoughts I have:
 
 3) You're shared RAM setting seems overkill to me. Part of the challenge
 is you're going from 1000 to 262K with no assessment in between. Each
 situation can be different, but try in the range of 10 - 50K.
 
 4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog
 you're better off.

Like Mr. Stone said earlier, this is pure dogma.  In my experience,
xlogs on the same volume with data is much faster if both are on
battery-backed write-back RAID controller memory.  Moving from this
situation to xlogs on a single normal disk is going to be much slower in
most cases.

-jwb

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SPAM?] Re: [PERFORM] PG8 Tuning

2005-08-11 Thread Tom Arthurs
I think the T-3 RAID at least breaks some of these rules -- I've got 2 
T-3's, 1 configured as RAID-10 and the other as RAID5, and they both 
seem to perform about the same.  I use RAID5 with a hot spare, so it's 
using 8 spindles.


I got a lot of performance improvement out of mount the fs noatime and 
turning journaling off.  Of course it takes a *long* time to recover 
from a crash.


Steve Poe wrote:

Paul,

Before I say anything else, one online document which may be of
assistance to you is:
http://www.powerpostgresql.com/PerfList/

Some thoughts I have:

3) You're shared RAM setting seems overkill to me. Part of the challenge
is you're going from 1000 to 262K with no assessment in between. Each
situation can be different, but try in the range of 10 - 50K.

4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog
you're better off. If it is sharing with any other OS/DB resource, the
performance will be impacted.


From what I have learned from others on this list, RAID5 is not the best

choice for the database. RAID10 would be a better solution (using 8 of
your disks) then take the remaining disk and do mirror with your pg_xlog
if possible.

Best of luck,

Steve Poe

On Thu, 2005-08-11 at 13:23 +0100, Paul Johnson wrote:


Hi all, we're running PG8 on a Sun V250 with 8GB RAM and 2*1.3GHz SPARC
CPUs running Solaris 10. The DB cluster is on an external fibre-attached
Sun T3 array that has 9*36GB drives configured as a single RAID5 LUN.

The system is for the sole use of a couple of data warehouse developers,
hence we are keen to use 'aggressive' tuning options to maximise
performance.

So far we have made the following changes and measured the impact on our
test suite:

1) Increase checkpoint_segments from 3 to 64. This made a 10x improvement
in some cases.

2) Increase work_mem from 1,024 to 524,288.

3) Increase shared_buffers from 1,000 to 262,143 (2 GB). This required
setting SHMMAX=4294967295 (4 GB) in /etc/system and re-booting the box.

Question - can Postgres only use 2GB RAM, given that shared_buffers can
only be set as high as 262,143 (8K pages)?

So far so good...

4) Move /pg_xlog to an internal disk within the V250. This has had a
severe *negative* impact on performance. Copy job has gone from 2 mins to
12 mins, simple SQL job gone from 1 min to 7 mins. Not even run long SQL
jobs.

I'm guessing that this is because pg_xlog has gone from a 9 spindle LUN to
a single spindle disk?

In cases such as this, where an external storage array with a hardware
RAID controller is used, the normal advice to separate the data from the
pg_xlog  seems to come unstuck, or are we missing something?

Cheers,

Paul Johnson.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq





---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq