Re: [PERFORM] Linux Filesystems again - Ubuntu this time

2010-07-27 Thread Whit Armstrong
Kevin,

While we're on the topic, do you also disable fsync?

We use xfs with battery-backed RAID as well.  We have had no issues with xfs.

I'm curious whether anyone can comment on their experience (good or bad)
using xfs / battery-backed cache / fsync=off.

Thanks,
Whit


On Tue, Jul 27, 2010 at 9:48 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:

 Also xfs has seen quite a bit of development in these later
 kernels, any thoughts on that?

 We've been using xfs for a few years now with good performance and
 no problems other than needing to disable write barriers to get good
 performance out of our battery-backed RAID adapter.

 -Kevin
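
For reference, a minimal sketch of what disabling write barriers looks like
for an xfs data volume (device name and mount point below are made up, and
this only makes sense with a battery-backed write cache):

  # /etc/fstab entry, assuming the data volume is /dev/sdb1
  /dev/sdb1  /var/lib/postgresql  xfs  noatime,nobarrier  0  0

  # or applied to an already-mounted filesystem
  mount -o remount,nobarrier /dev/sdb1 /var/lib/postgresql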



Re: [PERFORM] Linux Filesystems again - Ubuntu this time

2010-07-27 Thread Whit Armstrong
Thanks.

But there's no such risk in turning off write barriers?

I'm only specifying noatime for xfs at the moment.

Did you get a substantial performance boost from disabling write
barriers?  Like 10x, or more like 2x?

Thanks,
Whit



On Tue, Jul 27, 2010 at 1:19 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:

 Basically, you should never use fsync unless you are OK with
 losing everything in the database server if you have an OS or
 hardware failure.

 s/use/disable/

 -Kevin
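
For anyone weighing this trade-off, the relevant postgresql.conf knobs look
roughly like this (8.3-era names; synchronous_commit=off only risks a short
window of recently reported commits on a crash, not cluster corruption the
way fsync=off can):

  # postgresql.conf -- a sketch, not a recommendation to turn fsync off
  fsync = on                 # disabling this risks the whole cluster on OS/hardware failure
  synchronous_commit = off   # softer option: may lose a few recent commits, not the cluster
  full_page_writes = on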




[PERFORM] enum for performance?

2009-06-17 Thread Whit Armstrong
I have a column which only has six states or values.

Is there a size advantage to using an enum for this data type?
Currently I have it defined as a character(1).

This table has about 600 million rows, so it could wind up making a
difference in total size.

Thanks,
Whit
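
A quick way to check empirically (names below are made up): per the docs an
enum value takes 4 bytes on disk, while character(1) holding one ASCII
character is typically 2 bytes; note that per-row header and alignment
padding can swallow the difference on a very narrow table.

  -- run in psql against a scratch database
  CREATE TYPE state AS ENUM ('a','b','c','d','e','f');
  CREATE TABLE t_char (s character(1));
  CREATE TABLE t_enum (s state);
  INSERT INTO t_char SELECT 'a' FROM generate_series(1, 1000000);
  INSERT INTO t_enum SELECT 'a' FROM generate_series(1, 1000000);
  SELECT pg_size_pretty(pg_relation_size('t_char')) AS char1_size,
         pg_size_pretty(pg_relation_size('t_enum')) AS enum_size;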



Re: [PERFORM] partition question for new server setup

2009-04-30 Thread Whit Armstrong
Thanks, Scott.

 I went with ext3 for the OS -- it makes Ops feel a lot better. ext2 for a
 separate xlogs partition, and xfs for the data.
 ext2's drawbacks are not relevant for a small partition with just xlog data,
 but are a problem for the OS.

Can you suggest an appropriate size for the xlogs partition?  These
files are controlled by checkpoint_segments, is that correct?

We have checkpoint_segments set to 500 in the current setup, which is
about 8GB.  So 10 to 15 GB xlogs partition?  Is that reasonable?

-Whit
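
As a rough check, the 8.3 docs give the steady-state ceiling as
(2 + checkpoint_completion_target) * checkpoint_segments + 1 segments of
16MB each, so the worst case is somewhat larger than checkpoint_segments
alone suggests:

  # back-of-the-envelope sizing for checkpoint_segments = 500
  checkpoint_segments=500
  checkpoint_completion_target=0.5
  echo "((2 + $checkpoint_completion_target) * $checkpoint_segments + 1) * 16 / 1024" | bc -l
  # prints ~19.5 (GB), so 10 to 15 GB leaves little headroom for the worst case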



Re: [PERFORM] partition question for new server setup

2009-04-30 Thread Whit Armstrong
Thanks to everyone who helped me arrive at the config for this server.
Here is my first set of benchmarks using the standard pgbench setup.

The benchmark numbers seem pretty reasonable to me, but I don't have a
good feel for what typical numbers are.  Any feedback is appreciated.

-Whit


the server is set up as follows:
6 1TB drives, all Seagate Barracuda ES.2
Dell PERC 6 RAID controller card
RAID 1 volume with the OS, and pg_xlog mounted as a separate partition w/
noatime and data=writeback, both ext3
RAID 10 volume with pg_data as xfs
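
In fstab terms the layout above looks roughly like this (device names and
mount points are made up):

  # RAID 1: OS, plus a separate ext3 partition for pg_xlog
  /dev/sda1  /                    ext3  defaults                  0  1
  /dev/sda2  /var/lib/pg_xlog     ext3  noatime,data=writeback    0  2
  # RAID 10: data directory on xfs
  /dev/sdb1  /var/lib/postgresql  xfs   noatime                   0  2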

nodead...@node3:~$ /usr/lib/postgresql/8.3/bin/pgbench -t 1 -c 10
-U dbadmin test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 1
number of transactions actually processed: 10/10
tps = 5498.740733 (including connections establishing)
tps = 5504.244984 (excluding connections establishing)
nodead...@node3:~$ /usr/lib/postgresql/8.3/bin/pgbench -t 1 -c 10
-U dbadmin test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 1
number of transactions actually processed: 10/10
tps = 5627.047823 (including connections establishing)
tps = 5632.835873 (excluding connections establishing)
nodead...@node3:~$ /usr/lib/postgresql/8.3/bin/pgbench -t 1 -c 10
-U dbadmin test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 1
number of transactions actually processed: 10/10
tps = 5629.213818 (including connections establishing)
tps = 5635.225116 (excluding connections establishing)
nodead...@node3:~$



On Wed, Apr 29, 2009 at 2:58 PM, Scott Carey sc...@richrelevance.com wrote:

 On 4/29/09 7:28 AM, Whit Armstrong armstrong.w...@gmail.com wrote:

 Thanks, Scott.

 I went with ext3 for the OS -- it makes Ops feel a lot better. ext2 for a
 separate xlogs partition, and xfs for the data.
 ext2's drawbacks are not relevant for a small partition with just xlog data,
 but are a problem for the OS.

 Can you suggest an appropriate size for the xlogs partition?  These
 files are controlled by checkpoint_segments, is that correct?

 We have checkpoint_segments set to 500 in the current setup, which is
 about 8GB.  So 10 to 15 GB xlogs partition?  Is that reasonable?


 Yes and no.
 If you are using or plan to ever use log shipping, you'll need more space.
 In most setups, it will keep logs around until shipping has succeeded and it
 has been told to remove them, which allows them to accumulate.
 There may be other reasons why the total number of files there might be
 greater; I'm not an expert in all the possibilities, so others will probably
 have to answer that.

 With a basic install, however, it won't use much more than your calculation
 above.
 You probably want a little breathing room in general, and on most new
 systems today it's not hard to carve out 50GB.  I'd be shocked if the mirror
 you are carving this out of isn't at least 250GB, since it's SATA.

 I will reiterate that on a system your size the xlog throughput won't be a
 bottleneck (fsync latency might be, but RAID cards with battery backup are
 for that).  So the file system choice isn't a big deal once it's on its own
 partition -- the main difference at that point is almost entirely max write
 throughput.





[PERFORM] partition question for new server setup

2009-04-28 Thread Whit Armstrong
I have the opportunity to set up a new postgres server for our
production database.  I've read several times in various postgres
lists about the importance of separating logs from the actual database
data to avoid disk contention.

Can someone suggest a typical partitioning scheme for a postgres server?

My initial thought was to create /var/lib/postgresql as a partition on
a separate set of disks.

However, I can see that the xlog files will be stored here as well:
http://www.postgresql.org/docs/8.3/interactive/storage-file-layout.html

Should the xlog files be stored on a separate partition to improve performance?

Any suggestions would be very helpful.  Or if there is a document that
lays out some best practices for server setup, that would be great.

The database usage will be read heavy (financial data) with batch
writes occurring overnight and occasionally during the day.

server information:
Dell PowerEdge 2970, 8 core Opteron 2384
6 1TB hard drives with a PERC 6i
64GB of ram

We will be running Ubuntu 9.04.

Thanks in advance,
Whit



Re: [PERFORM] partition question for new server setup

2009-04-28 Thread Whit Armstrong
Thanks, Scott.

Just to clarify, you said:

 postgres.  So, my pg_xlog and all OS and logging stuff goes on the
 RAID-10 and the main store for the db goes on the RAID-10.

Is that meant to be that the pg_xlog and all OS and logging stuff go
on the RAID-1 and the real database (the
/var/lib/postgresql/8.3/main/base directory) goes on the RAID-10
partition?

This is very helpful.  Thanks for your feedback.

Additionally are there any clear choices w/ regard to filesystem
types?  Our choices would be xfs, ext3, or ext4.

Is anyone out there running ext4 on a production system?

-Whit



Re: [PERFORM] partition question for new server setup

2009-04-28 Thread Whit Armstrong
  echo noop > /sys/block/hdx/queue/scheduler

can this go into /etc/init.d somewhere?

or does that change stick between reboots?

-Whit


On Tue, Apr 28, 2009 at 2:16 PM, Craig James craig_ja...@emolecules.com wrote:
 Kenneth Marshall wrote:

 Additionally are there any clear choices w/ regard to filesystem
 types?  Our choices would be xfs, ext3, or ext4.

 Well, there's a lot of people who use xfs and ext3.  XFS is generally
 rated higher than ext3 both for performance and reliability.  However,
 we run Centos 5 in production, and XFS isn't one of the blessed file
 systems it comes with, so we're running ext3.  It's worked quite well
 for us.


 The other optimizations are using data=writeback when mounting the
 ext3 filesystem for PostgreSQL and using the elevator=deadline for
 the disk driver. I do not know how you specify that for Ubuntu.

 After reading various articles, I thought that noop was the right choice
 when you're using a battery-backed RAID controller.  The RAID controller is
 going to cache all data and reschedule the writes anyway, so the kernel
 scheduler is irrelevant at best, and can slow things down.

 On Ubuntu, it's

   echo noop > /sys/block/hdx/queue/scheduler

 where hdx is replaced by the appropriate device.

 Craig





Re: [PERFORM] partition question for new server setup

2009-04-28 Thread Whit Armstrong
I see.

Thanks to everyone for replying.  The whole discussion has been very helpful.

Cheers,
Whit


On Tue, Apr 28, 2009 at 3:13 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Whit Armstrong armstrong.w...@gmail.com wrote:
   echo noop > /sys/block/hdx/queue/scheduler

 can this go into /etc/init.d somewhere?

 We set the default for the kernel in the /boot/grub/menu.lst file.  On
 a kernel line, add  elevator=xxx (where xxx is your choice of
 scheduler).

 -Kevin
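
For completeness, the echo into /sys does not survive a reboot, so it either
goes into a boot script or onto the kernel line (device name, paths and
kernel version below are made up):

  # one-off, e.g. from /etc/rc.local
  echo noop > /sys/block/sdb/queue/scheduler

  # or as the default for all devices, on the kernel line in /boot/grub/menu.lst:
  # kernel /boot/vmlinuz-2.6.28-11-server root=/dev/sda1 ro quiet elevator=noop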





Re: [PERFORM] partition question for new server setup

2009-04-28 Thread Whit Armstrong
are there any other xfs settings that should be tuned for postgres?

I see this post mentions allocation groups.  does anyone have
suggestions for those settings?
http://archives.postgresql.org/pgsql-admin/2009-01/msg00144.php

what about raid stripe size?  does it really make a difference?  I
think the default for the perc is 64kb (but I'm not in front of the
server right now).

-Whit
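
If it helps, a hypothetical mkfs sketch that aligns xfs to the RAID stripe
and picks the allocation-group count explicitly (the su/sw values below
assume a 64kB stripe element across 2 effective data spindles -- adjust to
whatever the PERC is actually configured for):

  # hypothetical values; match su/sw to the controller's stripe geometry
  mkfs.xfs -f -d su=64k,sw=2,agcount=16 /dev/sdb1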


On Tue, Apr 28, 2009 at 7:40 PM, Scott Carey sc...@richrelevance.com wrote:
 On 4/28/09 11:16 AM, Craig James craig_ja...@emolecules.com wrote:

 Kenneth Marshall wrote:
 Additionally are there any clear choices w/ regard to filesystem
 types?  Our choices would be xfs, ext3, or ext4.
 Well, there's a lot of people who use xfs and ext3.  XFS is generally
 rated higher than ext3 both for performance and reliability.  However,
 we run Centos 5 in production, and XFS isn't one of the blessed file
 systems it comes with, so we're running ext3.  It's worked quite well
 for us.


 The other optimizations are using data=writeback when mounting the
 ext3 filesystem for PostgreSQL and using the elevator=deadline for
 the disk driver. I do not know how you specify that for Ubuntu.

 After reading various articles, I thought that noop was the right choice
 when you're using a battery-backed RAID controller.  The RAID controller is
 going to cache all data and reschedule the writes anyway, so the kernel
 scheduler is irrelevant at best, and can slow things down.

 On Ubuntu, it's

    echo noop > /sys/block/hdx/queue/scheduler

 where hdx is replaced by the appropriate device.

 Craig


 I've always had better performance from deadline than noop, no matter what
 raid controller I have.  Perhaps with a really good one or a SAN that
 changes (NOT a PERC 6 mediocre thingamabob).

 PERC 6 really, REALLY needs to have the Linux readahead value set to at
 least 1MB per effective spindle to get good sequential read performance.
 Xfs helps with it too, but you can close about half of the ext3 vs xfs
 sequential access performance gap with high readahead settings:

  /sbin/blockdev --setra <value> <device>

 Value is in blocks (512 bytes).

 /sbin/blockdev --getra <device> to see its setting.   Google for more info.
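
As a concrete (made-up) example, 8192 blocks * 512 bytes = 4MB of readahead
on the data array:

  /sbin/blockdev --setra 8192 /dev/sdb
  /sbin/blockdev --getra /dev/sdb   # verify the new setting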




Re: [PERFORM] partition question for new server setup

2009-04-28 Thread Whit Armstrong
Thanks, Scott.

So far, I've followed a pattern similar to Scott Marlowe's setup.  I
have configured 2 disks as a RAID 1 volume, and 4 disks as a RAID 10
volume.  So, the OS and xlogs will live on the RAID 1 vol and the data
will live on the RAID 10 vol.

I'm running the memtest on it now, so we still haven't locked
ourselves into any choices.

regarding your comment:
 6 and 8 disk counts are tough.  My biggest single piece of advice is to have
 the xlogs in a partition separate from the data (not necessarily a different
 raid logical volume), with file system and mount options tuned for each case
 separately.  I've seen this alone improve performance by a factor of 2.5 on
 some file system / storage combinations.

can you suggest mount options for the various partitions?  I'm leaning
towards xfs for the filesystem format unless someone complains loudly
about data corruption on xfs for a recent 2.6 kernel.

-Whit


On Tue, Apr 28, 2009 at 7:58 PM, Scott Carey sc...@richrelevance.com wrote:

 server information:
 Dell PowerEdge 2970, 8 core Opteron 2384
 6 1TB hard drives with a PERC 6i
 64GB of ram

 We're running a similar configuration: PowerEdge 8 core, PERC 6i, but we have
 8 of the 2.5" 10K 384GB disks.

 When I asked the same question on this forum, I was advised to just put all 8
 disks into a single RAID 10, and forget about separating things.  The
 performance of a battery-backed PERC 6i (you did get a battery-backed cache,
 right?) with 8 disks is quite good.

 In order to separate the logs, OS and data, I'd have to split off at least
 two of the 8 disks, leaving only six for the RAID 10 array.  But then my
 xlogs would be on a single disk, which might not be safe.  A more robust
 approach would be to split off four of the disks, put the OS on a RAID 1,
 the xlog on a RAID 1, and the database data on a 4-disk RAID 10.  Now I've
 separated the data, but my primary partition has lost half its disks.

 So, I took the advice, and just made one giant 8-disk RAID 10, and I'm very
 happy with it.  It has everything: Postgres, OS and logs.  But since the RAID
 array is 8 disks instead of 4, the net performance seems quite good.


 If you go this route, there are a few risks:
 1.  If everything is on the same partition/file system, fsyncs from the
 xlogs may cross-pollute to the data.  Ext3 is notorious for this, though
 data=writeback limits the effect.  You especially might not want
 data=writeback on your OS partition.  I would recommend that the OS, data,
 and xlogs + etc live on three different partitions regardless of the number
 of logical RAID volumes.
 2. Cheap raid controllers (PERC, others) will see fsync for an array and
 flush everything that is dirty (not just the partition or file data), which
 is a concern if you aren't using it in write-back with battery backed cache,
 even for a very read heavy db that doesn't need high fsync speed for
 transactions.

 But ... your mileage may vary.  My box has just one thing running on it:
 Postgres.  There is almost no other disk activity to interfere with the
 file-system caching.  If your server is going to have a bunch of other
 activity that generate a lot of non-Postgres disk activity, then this advice
 might not apply.

 Craig


 6 and 8 disk counts are tough.  My biggest single piece of advice is to have
 the xlogs in a partition separate from the data (not necessarily a different
 raid logical volume), with file system and mount options tuned for each case
 separately.  I've seen this alone improve performance by a factor of 2.5 on
 some file system / storage combinations.

