Re: [PERFORM] performance impact of non-C locale

2008-09-11 Thread Peter Eisentraut

Axel Rau wrote:

Some ERP software requires a change of my pgsql cluster from
locale C, encoding UTF-8
to
locale de_DE.UTF-8, encoding UTF-8

Most of my databases have only ASCII text data (the single-byte UTF-8
code range) in the text columns.

Does the above change influence index performance on such columns?


Yes.

Does postmaster keep track of any multibyte characters being inserted in 
such columns, so that the planner can adapt?


No.


What other performance impacts can be expected?


The performance impact is mainly with string comparisons and sorts.  I 
suggest you run your own tests to find out what is acceptable in your 
scenario.
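
A rough way to run such a test (everything below is illustrative -- the ports,
database and column names are assumptions, not from this thread) is to time the
same sort against two clusters, one initdb'd with each locale:

# e.g. initdb --locale=C -E UTF8 ... on port 5432,
#      initdb --locale=de_DE.UTF-8 -E UTF8 ... on port 5433
time psql -p 5432 -d erpdb -c "SELECT count(*) FROM (SELECT name FROM customers ORDER BY name) AS s;"
time psql -p 5433 -d erpdb -c "SELECT count(*) FROM (SELECT name FROM customers ORDER BY name) AS s;"

The count(*) wrapper keeps the client from fetching the whole result set, so the
timing is dominated by the locale-aware sort rather than data transfer.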




Re: [PERFORM] performance impact of non-C locale

2008-09-11 Thread Peter Eisentraut

Axel Rau wrote:
I'm not yet convinced to switch to a non-C locale. Is the following the 
intended behavior:

With lc_ctype C:          select lower('ÄÖÜ');  =>  ÄÖÜ
With lc_ctype en_US.utf8: select lower('ÆÅË');  =>  æåë
? (Both have server encoding UTF8.)


I would expect exactly that.
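
For anyone who wants to reproduce this quickly, a sketch (the database names are
made up; each database sits in a cluster initialized with the respective lc_ctype):

psql -d c_locale_db  -c "SELECT lower('ÄÖÜ'), lower('ABC');"   # -> ÄÖÜ | abc
psql -d de_locale_db -c "SELECT lower('ÄÖÜ'), lower('ABC');"   # -> äöü | abc

Under lc_ctype = C, lower() only folds ASCII letters; with a UTF-8 ctype it folds
the non-ASCII letters as well.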



[PERFORM] Choosing a filesystem

2008-09-11 Thread Laszlo Nagy
I'm about to buy a new server. It will be a Xeon system with two 
processors (4 cores per processor) and  16GB RAM. Two RAID extenders 
will be attached to an Intel s5000 series motherboard, providing 12 
SAS/Serial ATA connectors.


The server will run FreeBSD 7.0, PostgreSQL 8, apache, PHP, mail server, 
dovecot IMAP server and background programs for database maintenance. On 
our current system, I/O performance for PostgreSQL is the biggest 
problem, but sometimes all CPUs are at 100%. Number of users using this 
system:


PostgreSQL:  30 connections
Apache: 30 connections
IMAP server: 15 connections

The databases are mostly OLTP, but the background programs are creating 
historical data and statistic data continuously, and sometimes web site 
visitors/search engine robots run searches in bigger tables (with 
3million+ records).


There is an expert at the company who sells the server, and he 
recommended that I use SAS disks for the base system at least. I would 
like to use many SAS disks, but they are just too expensive. So the 
basic system will reside on a RAID 1 array, created from two SAS disks 
spinning at 15 000 rpm. I will buy 10 pieces of Seagate Barracuda 320GB 
SATA (ES 7200) disks for the rest.


The expert told me to use RAID 5 but I'm hesitating. I think that RAID 
1+0 would be much faster, and I/O performance is what I really need.


I would like to put the WAL file on the SAS disks to improve 
performance, and create one big RAID 1+0 disk for the data directory. 
But maybe I'm completely wrong. Can you please advise how to create 
logical partitions? The hardware is capable of handling different types 
of RAID volumes on the same set of disks. For example, a smaller RAID 0 
for indexes and a bigger RAID 5 etc.


If you need more information about the database, please ask. :-)

Thank you very much,

  Laszlo




Re: [PERFORM] Choosing a filesystem

2008-09-11 Thread Andrew Sullivan
On Thu, Sep 11, 2008 at 06:29:36PM +0200, Laszlo Nagy wrote:

 The expert told me to use RAID 5 but I'm hesitating. I think that RAID 1+0 
 would be much faster, and I/O performance is what I really need.

I think you're right.  I think it's a big mistake to use RAID 5 in a
database server where you're hoping for reasonable write performance.
In theory RAID 5 ought to be fast for reads, but I've never seen it
work that way.

 I would like to put the WAL file on the SAS disks to improve performance, 
 and create one big RAID 1+0 disk for the data directory. But maybe I'm 
 completely wrong. Can you please advise how to create logical partitions? 

I would listen to yourself before you listen to the expert.  You sound
right to me :)

A


-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/



Re: [PERFORM] Choosing a filesystem

2008-09-11 Thread Matthew Wakeling

On Thu, 11 Sep 2008, Laszlo Nagy wrote:
So the basic system will reside on a RAID 1 array, created from two SAS 
disks spinning at 15 000 rpm. I will buy 10 pieces of Seagate Barracuda 
320GB SATA (ES 7200) disks for the rest.


That sounds good. Put RAID 1 on the pair, and RAID 1+0 on the rest. It'll 
be pretty good. Put the OS and the WAL on the pair, and the database on 
the large array.


However, one of the biggest things that will improve your performance 
(especially in OLTP) is to use a proper RAID controller with a 
battery-backed-up cache.
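
For the WAL-on-the-pair part, the usual approach on an 8.x cluster is to relocate 
pg_xlog and leave a symlink behind; a minimal sketch, with made-up paths and the 
server stopped:

pg_ctl -D /usr/local/pgsql/data stop
mv /usr/local/pgsql/data/pg_xlog /sas-mirror/pg_xlog      # /sas-mirror = the 15k SAS RAID 1
ln -s /sas-mirror/pg_xlog /usr/local/pgsql/data/pg_xlog
pg_ctl -D /usr/local/pgsql/data start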


Matthew

--
X's book explains this very well, but, poor bloke, he did the Cambridge Maths 
Tripos...   -- Computer Science Lecturer




Re: [PERFORM] Choosing a filesystem

2008-09-11 Thread Kevin Grittner
Kenneth Marshall [EMAIL PROTECTED] wrote:
 On Thu, Sep 11, 2008 at 06:18:37PM +0100, Matthew Wakeling wrote:
  On Thu, 11 Sep 2008, Laszlo Nagy wrote:
   So the basic system will reside on a RAID 1 array, created from two SAS
   disks spinning at 15 000 rpm. I will buy 10 pieces of Seagate Barracuda
   320GB SATA (ES 7200) disks for the rest.

  That sounds good. Put RAID 1 on the pair, and RAID 1+0 on the rest. It'll
  be pretty good. Put the OS and the WAL on the pair, and the database on
  the large array.

  However, one of the biggest things that will improve your performance
  (especially in OLTP) is to use a proper RAID controller with a
  battery-backed-up cache.

 But remember that putting the WAL on a separate drive(set) will only
 help if you do not have competing I/O, such as system logging or paging,
 going to the same drives. This turns your fast sequential I/O into
 random I/O with the accompanying 10x or more performance decrease.
 
Unless you have a good RAID controller with battery-backed-up cache.
 
-Kevin



Re: [PERFORM] Choosing a filesystem

2008-09-11 Thread Laszlo Nagy



... going to the same drives. This turns your fast sequential I/O into
random I/O with the accompanying 10x or more performance decrease.

Unless you have a good RAID controller with battery-backed-up cache.
  

All right. :-) This is what I'll have:

Boxed Intel Server Board S5000PSLROMB with 8-port SAS ROMB card 
(supports 45nm processors: Harpertown and Wolfdale-DP)
Intel® RAID Activation key AXXRAK18E enables full intelligent SAS RAID 
on S5000PAL, S5000PSL, SR4850HW4/M, SR6850HW4/M. RoHS Compliant.
512 MB 400MHz DDR2 ECC Registered CL3 DIMM Single Rank, x8(for 
s5000pslromb)
6-drive SAS/SATA backplane with expander (requires 2 SAS ports) for 
SC5400 and SC5299 (two pieces)

5410 Xeon 2.33 GHz / 1333 FSB / 12MB, boxed, passive cooling / 80W (2 pieces)
2048 MB 667MHz DDR2 ECC Fully Buffered CL5 DIMM Dual Rank, x8 (8 pieces)

SAS disks will be:  146.8 GB, SAS 3G, 15000 RPM, 16 MB cache (2 pieces)
SATA disks will be: HDD Server SEAGATE Barracuda ES 7200.1 
(320GB, 16MB, SATA II-300) (10 pieces)


I cannot spend more money on this computer, but since you are all 
talking about battery back up, I'll try to get money from the management 
and buy this:


Intel® RAID Smart Battery AXXRSBBU3, optional battery back up for use 
with AXXRAK18E and SRCSAS144E.  RoHS Compliant.



This server will also be an IMAP server, web server etc. so I'm 100% 
sure that the SAS disks will be used for logging. I have two spare 200GB 
SATA disks here in the office but they are cheap ones designed for 
desktop computers. Is it okay to dedicate these disks for the WAL file 
in RAID1? Will it improve performance? How much trouble would it cause 
if the WAL file goes wrong? Should I just put the WAL file on the RAID 
1+0 array?


Thanks,

 Laszlo




Re: [PERFORM] Choosing a filesystem

2008-09-11 Thread Scott Marlowe
On Thu, Sep 11, 2008 at 10:29 AM, Laszlo Nagy [EMAIL PROTECTED] wrote:
 I'm about to buy a new server. It will be a Xeon system with two processors
 (4 cores per processor) and  16GB RAM. Two RAID extenders will be attached
 to an Intel s5000 series motherboard, providing 12 SAS/Serial ATA
 connectors.

 The server will run FreeBSD 7.0, PostgreSQL 8, apache, PHP, mail server,
 dovecot IMAP server and background programs for database maintenance. On our
 current system, I/O performance for PostgreSQL is the biggest problem, but
 sometimes all CPUs are at 100%. Number of users using this system:

100% of what? sys? user? iowait?  If it's still iowait, then the newer,
bigger, faster RAID should really help.

 PostgreSQL:  30 connections
 Apache: 30 connections
 IMAP server: 15 connections

 The databases are mostly OLTP, but the background programs are creating
 historical data and statistic data continuously, and sometimes web site
 visitors/search engine robots run searches in bigger tables (with 3million+
 records).

This might be a good application to set up with Slony replication to
another server, then run your I/O-intensive processes against the
slave.

 There is an expert at the company who sells the server, and he recommended
 that I use SAS disks for the base system at least. I would like to use many
 SAS disks, but they are just too expensive. So the basic system will reside
 on a RAID 1 array, created from two SAS disks spinning at 15 000 rpm. I will
 buy 10 pieces of Seagate Barracuda 320GB SATA (ES 7200) disks for the rest.

SAS = a bit faster, and better at parallel work.  However, short
stroking 7200 RPM SATA drives on the fastest parts of the platters can
get you close to SAS territory for a fraction of the cost, plus you
can then store backups etc on the rest of the drives at night.

So, you're gonna put the OS on RAID1, and pgsql on the rest...  Makes
sense.  Consider setting up another RAID1 for the pg_clog directory.

 The expert told me to use RAID 5 but I'm hesitating. I think that RAID 1+0
 would be much faster, and I/O performance is what I really need.

The expert is most certainly wrong for an OLTP database.  If your RAID
controller can't run RAID-10 quickly compared to RAID-5 then it's a
crap card, and you need a better one.  Or put it into JBOD and let the
OS handle the RAID-10 work.  Or split it into RAID-1 sets on the
controller and do the RAID-0 in the OS.
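
For that last variant, a hedged illustration with Linux mdadm (device names are
assumptions; on the original poster's FreeBSD box, gmirror/gstripe would play the
same role): export the ten SATA drives from the controller as five RAID-1 pairs,
then stripe them in the OS:

# sdc..sdg assumed to be the five hardware RAID-1 pairs
mdadm --create /dev/md0 --level=0 --raid-devices=5 /dev/sd[c-g]
mkfs -t ext3 /dev/md0
mount /dev/md0 /var/lib/pgsql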

 I would like to put the WAL file on the SAS disks to improve performance,

Actually, the WAL doesn't really need SAS for good performance.
Except for the 15K.6 Seagate Cheetahs, most decent SATA drives are
within a few percent of SAS drives for sequential write / read
speed, which is basically what the WAL does.
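
If you want to sanity-check a candidate WAL drive, a crude sequential-write test
with GNU dd (mount point assumed) is enough to see whether it is in the same
ballpark as the SAS pair:

dd if=/dev/zero of=/mnt/wal-test/junk bs=8k count=262144 conv=fdatasync   # ~2GB, flushed to disk at the end
rm /mnt/wal-test/junk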

 and create one big RAID 1+0 disk for the data directory. But maybe I'm
 completely wrong. Can you please advise how to create logical partitions?
 The hardware is capable of handling different types of RAID volumes on the
 same set of disks. For example, a smaller RAID 0 for indexes and a bigger
 RAID 5 etc.

Avoid RAID-5 on OLTP.

Now, if you have a slony slave for the aggregate work stuff, and
you're doing big reads and writes, RAID-5 on a large SATA set may be a
good and cost effective solution.



Re: [PERFORM] Choosing a filesystem

2008-09-11 Thread Scott Marlowe
On Thu, Sep 11, 2008 at 11:47 AM, Laszlo Nagy [EMAIL PROTECTED] wrote:
 I cannot spend more money on this computer, but since you are all talking
 about battery back up, I'll try to get money from the management and buy
 this:

 Intel(R) RAID Smart Battery AXXRSBBU3, optional battery back up for use with
 AXXRAK18E and SRCSAS144E.  RoHS Compliant.

Sacrifice a couple of SAS drives to get that.

I'd rather have all SATA drives and a BBU than SAS without one.



Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread Scott Carey
Hmm, I would expect this tunable to potentially be rather file system
dependent, and potentially RAID controller dependent.  The test was using
ext2; perhaps the others automatically prefetch or read ahead?  Does it
vary by RAID controller?

Well I went and found out, using ext3 and xfs.  I have about 120+ data
points but here are a few interesting ones before I compile the rest and
answer a few other questions of my own.

1:  readahead does not affect pure random I/O -- there seems to be a
heuristic trigger -- a single process or file probably has to request a
sequence of linear I/O of some size to trigger it.  I set it to over 64MB of
read-ahead and random iops remained the same to prove this.
2:  File system matters more than you would expect.  XFS sequential
transfers when readahead was tuned had TWICE the sequential throughput of
ext3, both for a single reader and 8 concurrent readers on 8 different
files.
3:  The RAID controller and its configuration make a pretty significant
difference as well.

Hardware:
12 7200RPM SATA (Seagate) in raid 10 on 3Ware 9650 (only ext3)
12 7200RPM SATA ('nearline SAS' : Seagate ES.2) on PERC 6 in raid 10 (ext3,
xfs)
I also have some results with PERC raid 10 with 4x 15K SAS, not reporting in
this message though


Testing process:
All tests begin with
#sync; echo 3 > /proc/sys/vm/drop_caches;
followed by
#blockdev --setra XXX /dev/sdb
Even though FIO claims that it issues reads that don't go to cache, the
read-ahead DOES go to the file system cache, and so one must drop them to
get consistent results unless you disable the read-ahead.  Even if you are
reading more than 2x the physical RAM, that first half of the test is
distorted.  By flushing the cache first my results became consistent within
about +-2%.

Tests
-- fio, read 8 files concurrently, sequential read profile, one process per
file:
[seq-read8]
rw=read
; this will be total of all individual files per process
size=8g
directory=/data/test
fadvise_hint=0
blocksize=8k
direct=0
ioengine=sync
iodepth=1
numjobs=8
; this is number of files total per process
nrfiles=1
runtime=1m

-- fio, read one large file sequentially with one process
[seq-read]
rw=read
; this will be total of all individual files per process
size=64g
directory=/data/test
fadvise_hint=0
blocksize=8k
direct=0
ioengine=sync
iodepth=1
numjobs=1
; this is number of files total per process
nrfiles=1
runtime=1m

-- 'dd' in a few ways:
Measure direct to partition / disk read rate at the start of the disk:
'dd if=/dev/sdb of=/dev/null ibs=24M obs=64K'
Measure direct to partition / disk read rate near the end of the disk:
'dd if=/dev/sdb1 of=/dev/null ibs=24M obs=64K skip=160K'
Measure direct read of the large file used by the FIO one sequential file
test:
'dd if=/data/test/seq-read.1.0 of=/dev/null ibs=32K obs=32K'

The dd parameters for block sizes were chosen with much experimentation to
get the best result.


Results:
I've got a lot of results, I'm only going to put a few of them here for now
while I investigate a few other things (see the end of this message)
Preliminary summary:

PERC 6, ext3, full partition.
dd beginning of disk :  642MB/sec
dd end of disk: 432MB/sec
dd large file (readahead 49152): 312MB/sec
-- maximum expected sequential capabilities above?

fio: 8 concurrent readers and 1 concurrent reader results
readahead is in 512 byte blocks, sequential transfer rate in MiB/sec as
reported by fio.

readahead  |  8 conc read rate  |  1 conc read rate
    49152  |        311         |        314
    16384  |        312         |        312
    12288  |        304         |        309
     8192  |        292         |
     4096  |        264         |
     2048  |        211         |
     1024  |        162         |        302
      512  |        108         |
      256  |         81         |        300
        8  |         38         |

Conclusion: on this array, going up to 12288 (6MB) of readahead makes a huge
impact on concurrent sequential reads -- that is 1MB per RAID slice (6
slices, 12 disks in RAID 10).  It has almost no impact at all on a single
sequential read; the OS or the RAID controller is dealing with that case
just fine.
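
To keep a setting like that outside the benchmark runs (device name assumed), it
is just the same blockdev call; note the value does not survive a reboot, so it
belongs in rc.local or the equivalent:

blockdev --setra 12288 /dev/sdb    # 12288 x 512-byte sectors = 6MB
blockdev --getra /dev/sdb          # verify; /sys/block/sdb/queue/read_ahead_kb shows the same setting in KB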

But, how much of the above effect is ext3?  How much is it the RAID card?
At the top end, the sequential rate for both concurrent and single
sequential access is in line with what dd can get going through ext3.  But
it is not even close to what you can get going right to the device and
bypassing the file system.

Let's try a different RAID card first.  The disks aren't exactly the same,
and there is no guarantee that the file is positioned near the beginning or
end, but I've got another 12-disk RAID 10, using a 3Ware 9650 card.

Results, as above -- don't conclude this card is faster, the files may have
just been closer to the front of the partition.
dd, beginning of disk: 522MB/sec
dd, end of disk array: 412MB/sec
dd, file read via file system (readahead 49152): 391MB/sec

readahead  |  8 conc read rate  |  1 conc read rate
    49152  |        343         |        392
    16384  |        349         |        379
    12288  |        348         |        387
     8192  |        344         |
     6144  |                    |        376
     4096  |        340         |
     2048  |        319         |
     1024  |        284         |

Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread James Mansion

Greg Smith wrote:
The point I was trying to make there is that even under impossibly 
optimal circumstances, you'd be hard pressed to blow out the disk's 
read cache with seek-dominated data even if you read a lot at each 
seek point.  That idea didn't make it from my head into writing very 
well though.


Isn't there a bigger danger in blowing out the cache on the controller 
and causing premature pageout of its dirty pages?


If you could get the readahead to work on the drive and not return data 
to the controller, that might be dandy, but I'm sceptical.


James




Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread Scott Carey
Sorry, I forgot to mention the Linux kernel version I'm using, etc:

2.6.18-92.1.10.el5 #1 SMP x86_64
CentOS 5.2.

The adaptive read-ahead, as well as other enhancements in the kernel, are
taking place or coming soon in the most recent kernels.  Some distributions
offer the adaptive read-ahead as an add-on (Debian, for example).  This is
an area where much can be improved in Linux; see
http://kerneltrap.org/node/6642
http://kernelnewbies.org/Linux_2_6_23#head-102af265937262a7a21766ae58fddc1a29a5d8d7

I obviously did not test how the new read-ahead stuff impacts these sorts of
tests.

On Thu, Sep 11, 2008 at 12:07 PM, Scott Carey [EMAIL PROTECTED]wrote:

 Hmm, I would expect this tunable to potentially be rather file system
 dependent, and potentially raid controller dependant.  The test was using
 ext2, perhaps the others automatically prefetch or read ahead?   Does it
 vary by RAID controller?

 Well I went and found out, using ext3 and xfs.  I have about 120+ data
 points but here are a few interesting ones before I compile the rest and
 answer a few other questions of my own.

 1:  readahead does not affect pure random I/O -- there seems to be a
 heuristic trigger -- a single process or file probably has to request a
 sequence of linear I/O of some size to trigger it.  I set it to over 64MB of
 read-ahead and random iops remained the same to prove this.
 2:  File system matters more than you would expect.  XFS sequential
 transfers when readahead was tuned had TWICE the sequential throughput of
 ext3, both for a single reader and 8 concurrent readers on 8 different
 files.
 3:  The RAID controller and its configuration make a pretty significant
 difference as well.

 Hardware:
 12 7200RPM SATA (Seagate) in raid 10 on 3Ware 9650 (only ext3)
 12 7200RPM SATA ('nearline SAS' : Seagate ES.2) on PERC 6 in raid 10 (ext3,
 xfs)
 I also have some results with PERC raid 10 with 4x 15K SAS, not reporting
 in this message though

   . . . {snip}


Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread david

On Thu, 11 Sep 2008, Scott Carey wrote:


Drives have their own read-ahead in the firmware.  Many can keep track of 2
or 4 concurrent file accesses.  A few can keep track of more.  This also
plays in with the NCQ or SCSI command queuing implementation.

Consumer drives will often read-ahead much more than server drives optimized
for i/o per second.
The difference in read-ahead sensitivity between the two setups I tested may
be due to one setup using nearline-SAS (SATA, tuned for io-per sec using SAS
firmware) and the other used consumer SATA.
For example, here is one nearline SAS style server tuned drive versus a
consumer tuned one:
http://www.storagereview.com/php/benchmark/suite_v4.php?typeID=10testbedID=4osID=6raidconfigID=1numDrives=1devID_0=354devID_1=348devCnt=2

The Linux readahead setting is _definitely_ in the kernel, definitely uses
and fills the page cache, and from what I can gather, simply issues extra
I/O's to the hardware beyond the last one requested by an app in certain
situations.  It does not make your I/O request larger, it just queues an
extra I/O following your request.


that extra I/O will be merged with your request by the I/O scheduler code 
so that by the time it gets to the drive it will be a single request.


but even if it didn't, most modern drives read the entire cylinder into 
their buffer, so any additional requests to the drive will be satisfied 
from this buffer and not have to wait for the disk itself.


David Lang


On Thu, Sep 11, 2008 at 12:54 PM, James Mansion 
[EMAIL PROTECTED] wrote:


Greg Smith wrote:


The point I was trying to make there is that even under impossibly optimal
circumstances, you'd be hard pressed to blow out the disk's read cache with
seek-dominated data even if you read a lot at each seek point.  That idea
didn't make it from my head into writing very well though.

 Isn't there a bigger danger in blowing out the cache on the controller

and causing premature pageout of its dirty pages?

If you could get the readahead to work on the drive and not return data to
the controller, that might be dandy, but I'm sceptical.

James










Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread Scott Marlowe
On Thu, Sep 11, 2008 at 3:36 PM,  [EMAIL PROTECTED] wrote:
 On Thu, 11 Sep 2008, Scott Carey wrote:

 Drives have their own read-ahead in the firmware.  Many can keep track of
 2
 or 4 concurrent file accesses.  A few can keep track of more.  This also
 plays in with the NCQ or SCSI command queuing implementation.

 Consumer drives will often read-ahead much more than server drives
 optimized
 for i/o per second.
 The difference in read-ahead sensitivity between the two setups I tested
 may
 be due to one setup using nearline-SAS (SATA, tuned for io-per sec using
 SAS
 firmware) and the other used consumer SATA.
 For example, here is one nearline SAS style server tuned drive versus a
 consumer tuned one:

 http://www.storagereview.com/php/benchmark/suite_v4.php?typeID=10testbedID=4osID=6raidconfigID=1numDrives=1devID_0=354devID_1=348devCnt=2

 The Linux readahead setting is _definitely_ in the kernel, definitely uses
 and fills the page cache, and from what I can gather, simply issues extra
 I/O's to the hardware beyond the last one requested by an app in certain
 situations.  It does not make your I/O request larger, it just queues an
 extra I/O following your request.

 that extra I/O will be merged with your request by the I/O scheduler code so
 that by the time it gets to the drive it will be a single request.

 but even if it didn't, most modern drives read the entire cylinder into their
 buffer so any additional requests to the drive will be satisfied from this
 buffer and not have to wait for the disk itself.

Generally speaking I agree, but I would still make a separate logical
partition for pg_xlog so that if the OS fills up the /var/log dir or
something, it doesn't impact the db.
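
A quick way to check whether that separation is already in place on a given box
(paths assumed to be the defaults) is:

df -h /var/log /var/lib/pgsql/data/pg_xlog   # if both lines show the same filesystem, they compete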



Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread david

On Thu, 11 Sep 2008, Scott Marlowe wrote:


On Thu, Sep 11, 2008 at 3:36 PM,  [EMAIL PROTECTED] wrote:

but even if it didn't, most modern drives read the entire cylinder into their
buffer so any additional requests to the drive will be satisfied from this
buffer and not have to wait for the disk itself.


Generally speaking I agree, but I would still make a separate logical
partition for pg_xlog so that if the OS fills up the /var/log dir or
something, it doesn't impact the db.


this is a completely different discussion :-)

while I agree with you in theory, in practice I've seen multiple 
partitions cause far more problems than they have prevented (due to the 
partitions ending up not being large enough and having to be resized after 
they fill up, etc) so I tend to go in the direction of a few large 
partitions.


the only reason I do multiple partitions (besides when the hardware or 
performance considerations require it) is when I can identify that there 
is some data that I would not want to touch on an OS upgrade. I try to make 
it so that an OS upgrade can wipe the OS partitions if necessary.


David Lang




Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread Alan Hodgson
On Thursday 11 September 2008, [EMAIL PROTECTED] wrote:
 while I agree with you in theory, in practice I've seen multiple
 partitions cause far more problems than they have prevented (due to the
 partitions ending up not being large enough and having to be resized
 after they fill up, etc) so I tend to go in the direction of a few large
 partitions.

I used to feel this way until LVM became usable. LVM plus online resizable 
filesystems really makes multiple partitions manageable.
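
For example, growing an undersized logical volume online looks roughly like this
(volume group and LV names are made up; ext3 can be grown while mounted, xfs via
xfs_growfs):

lvextend -L +50G /dev/vg0/pgdata
resize2fs /dev/vg0/pgdata          # online grow for ext3
# xfs_growfs /var/lib/pgsql        # xfs equivalent, run against the mount point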


-- 
Alan



Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread david

On Thu, 11 Sep 2008, Alan Hodgson wrote:


On Thursday 11 September 2008, [EMAIL PROTECTED] wrote:

while I agree with you in theory, in practice I've seen multiple
partitions cause far more problems than they have prevented (due to the
partitions ending up not being large enough and having to be resized
after they fill up, etc) so I tend to go in the direction of a few large
partitions.


I used to feel this way until LVM became usable. LVM plus online resizable
filesystems really makes multiple partitions manageable.


won't the fragmentation of your filesystem across the different LVM 
segments hurt you?


David Lang



Re: [PERFORM] Effects of setting linux block device readahead size

2008-09-11 Thread Scott Carey
I also thought that LVM is unsafe for WAL logs and file system journals when
the disk write cache is enabled -- it doesn't flush the disk write caches
correctly or pass write barriers through.

As pointed out by Greg Smith here:
http://groups.google.com/group/pgsql.performance/browse_thread/thread/9dc43991c1887129
http://lwn.net/Articles/283161/
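
If the WAL has to live on a device where barriers don't get through, one blunt
mitigation is to disable the drive write cache (SATA example with hdparm; device
name assumed -- drives behind a RAID controller are managed with the controller's
own tool instead):

hdparm -W /dev/sdb     # show whether the drive write cache is enabled
hdparm -W0 /dev/sdb    # turn it off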



On Thu, Sep 11, 2008 at 3:41 PM, Alan Hodgson [EMAIL PROTECTED] wrote:

 On Thursday 11 September 2008, [EMAIL PROTECTED] wrote:
  while I agree with you in theory, in practice I've seen multiple
  partitions cause far more problems than they have prevented (due to the
  partitions ending up not being large enough and having to be resized
  after they fill up, etc) so I tend to go in the direction of a few large
  partitions.

 I used to feel this way until LVM became usable. LVM plus online resizable
 filesystems really makes multiple partitions manageable.


 --
 Alan
