Re: [PERFORM] understanding postgres issues/bottlenecks
At 03:28 PM 1/8/2009, Merlin Moncure wrote:
> On Thu, Jan 8, 2009 at 9:42 AM, Stefano Nichele wrote:
>> Merlin Moncure wrote:
>>> IIRC that's the 'perc 6ir' card...no write caching. You are getting
>>> killed with syncs. If you can restart the database, you can test with
>>> fsync=off comparing load to confirm this. (another way is to compare
>>> select only vs regular transactions on pgbench).
>>
>> I'll try next Saturday.
>
> just be aware of the danger. hard reset (power off) class of failure
> when fsync = off means you are loading from backups.
>
> merlin

That's what redundant power conditioning UPS's are supposed to help prevent ;-)

Merlin is of course absolutely correct that you are taking a bigger risk if you turn fsync off. I would not recommend fsync = off unless you have other safety measures in place to protect against data loss from a power event. (At least for most DB applications.)

...and of course, those lucky few with bigger budgets can use SSDs and not care what fsync is set to.

Ron

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sat, Jan 10, 2009 at 5:40 AM, Ron wrote:
> At 03:28 PM 1/8/2009, Merlin Moncure wrote:
>> just be aware of the danger. hard reset (power off) class of failure
>> when fsync = off means you are loading from backups.
>
> That's what redundant power conditioning UPS's are supposed to help prevent
> ;-)

But of course they can't prevent them, only reduce the likelihood of their occurrence. Everyone who's working in large hosting environments has at least one horror story to tell about a power outage that never should have happened.

> I would not recommend fsync = off if you do not have other safety measures
> in place to protect against data loss because of a power event.
> (At least for most DB applications.)

Agreed. Keep in mind that you'll be losing whatever wasn't transferred to the backup machines.

> ...and of course, those lucky few with bigger budgets can use SSD's and not
> care what fsync is set to.

Would that prevent any corruption if the writes got out of order because of lack of fsync? Or partial writes? Or wouldn't fsync still need to be turned on to keep the data safe?
Re: [PERFORM] understanding postgres issues/bottlenecks
"Scott Marlowe" writes: > On Sat, Jan 10, 2009 at 5:40 AM, Ron wrote: >> At 03:28 PM 1/8/2009, Merlin Moncure wrote: >>> just be aware of the danger . hard reset (power off) class of failure >>> when fsync = off means you are loading from backups. >> >> That's what redundant power conditioning UPS's are supposed to help prevent >> ;-) > > But of course, they can't prevent them, but only reduce the likelihood > of their occurrance. Everyone who's working in large hosting > environments has at least one horror story to tell about a power > outage that never should have happened. Or a system crash. If the kernel panics for any reason when it has dirty buffers in memory the database will need to be restored. >> ...and of course, those lucky few with bigger budgets can use SSD's and not >> care what fsync is set to. > > Would that prevent any corruption if the writes got out of order > because of lack of fsync? Or partial writes? Or wouldn't fsync still > need to be turned on to keep the data safe. I think the idea is that with SSDs or a RAID with a battery backed cache you can leave fsync on and not have any significant performance hit since the seek times are very fast for SSD. They have limited bandwidth but bandwidth to the WAL is rarely an issue -- just latency. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] block device benchmarking
Hi,

I'm fiddling with a hand-made block device benchmarking thingie, with which I want to run random reads and writes of relatively small blocks (somewhat similar to databases). I'm much less interested in measuring throughput than in latency. Besides varying block sizes, I'm also testing with a varying number of concurrent threads and varying read/write ratios. As a result, I'm interested in roughly the following graphs:

* (single thread) i/o latency vs. seek distance
* (single thread) throughput vs. (actuator) position
* (single thread) i/o latency vs. no. of concurrent threads
* total requests per second + throughput vs. no. of concurrent threads
* total requests per second + throughput vs. read/write ratio
* total requests per second + throughput vs. block size
* distribution of access times (histogram)

(Of course, not all of these are relevant for all types of storage.)

Does a tool giving (most of) these measures already exist? Am I missing something interesting? What would you expect from a block device benchmarking tool?

Regards

Markus Wanner
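As a rough illustration, the first of those graphs could be collected with something like the sketch below (a hand-rolled example, not an existing tool; it uses buffered I/O for simplicity, so a real run would need O_DIRECT with aligned buffers, or a raw device, to keep the page cache from hiding the seeks):

```python
import os
import random
import time

def latency_vs_seek_distance(path, block=8192, samples=200):
    """Single-thread read latency, bucketed by seek distance.

    Buckets are log2 of the distance in bytes from the previous read,
    which gives a rough latency-vs-seek-distance curve.
    """
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    buckets = {}  # log2(seek distance in bytes) -> list of latencies
    try:
        pos = 0
        for _ in range(samples):
            target = random.randrange(0, size - block)
            distance = abs(target - pos)
            start = time.perf_counter()
            os.pread(fd, block, target)   # positioned read, no seek syscall
            buckets.setdefault(distance.bit_length(), []).append(
                time.perf_counter() - start)
            pos = target + block
    finally:
        os.close(fd)
    # average latency per seek-distance bucket
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}
```

Run against a large file (or device node) on the storage under test; the per-bucket averages are the points of the latency-vs-distance graph.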
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sat, 10 Jan 2009, Gregory Stark wrote:
>>> ...and of course, those lucky few with bigger budgets can use SSD's and not
>>> care what fsync is set to.
>>
>> Would that prevent any corruption if the writes got out of order
>> because of lack of fsync? Or partial writes? Or wouldn't fsync still
>> need to be turned on to keep the data safe.
>
> I think the idea is that with SSDs or a RAID with a battery backed cache you
> can leave fsync on and not have any significant performance hit since the seek
> times are very fast for SSD. They have limited bandwidth but bandwidth to the
> WAL is rarely an issue -- just latency.

I don't think that this is true. Even if your SSD is battery-backed RAM (as opposed to the flash-based devices that have slower writes than high-end hard drives), you can complete 'writes' to the system RAM faster than the OS can get the data to the drive, so if you don't do an fsync you can still lose a lot in a power outage.

RAID controllers with battery-backed RAM cache will make the fsyncs very cheap (until the cache fills up, anyway).

With SSDs having extremely good read speeds, but poor (at least by comparison) write speeds, I wonder if any of the RAID controllers are going to get a mode where they cache writes, but don't cache reads, leaving all of your cache to handle writes.

David Lang
Re: [PERFORM] understanding postgres issues/bottlenecks
Hi,

da...@lang.hm wrote:
> On Sat, 10 Jan 2009, Gregory Stark wrote:
>> I think the idea is that with SSDs or a RAID with a battery backed
>> cache you can leave fsync on and not have any significant performance
>> hit since the seek times are very fast for SSD. They have limited
>> bandwidth but bandwidth to the WAL is rarely an issue -- just latency.

That's also my understanding.

> with SSDs having extremely good read speeds, but poor (at least by
> comparison) write speeds I wonder if any of the RAID controllers are
> going to get a mode where they cache writes, but don't cache reads,
> leaving all of your cache to handle writes.

My understanding of SSDs so far is that they are not that bad at writing *on average*, but to perform wear-leveling, they sometimes have to shuffle around multiple blocks at once. So there are pretty awful spikes in writing latency (IIRC more than 100 ms has been measured on cheaper disks). A battery-backed cache could theoretically flatten those, as long as your avg. WAL throughput is below the SSD's avg. writing throughput.

Regards

Markus Wanner
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sat, Jan 10, 2009 at 12:00 PM, Markus Wanner wrote:
> da...@lang.hm wrote:
>> On Sat, 10 Jan 2009, Gregory Stark wrote:
>>> I think the idea is that with SSDs or a RAID with a battery backed
>>> cache you can leave fsync on and not have any significant performance
>>> hit since the seek times are very fast for SSD. They have limited
>>> bandwidth but bandwidth to the WAL is rarely an issue -- just latency.
>
> That's also my understanding.
>
>> with SSDs having extremely good read speeds, but poor (at least by
>> comparison) write speeds I wonder if any of the RAID controllers are
>> going to get a mode where they cache writes, but don't cache reads,
>> leaving all of your cache to handle writes.
>
> My understanding of SSDs so far is, that they are not that bad at
> writing *on average*, but to perform wear-leveling, they sometimes have
> to shuffle around multiple blocks at once. So there are pretty awful
> spikes for writing latency (IIRC more than 100ms has been measured on
> cheaper disks).

Multiply it by 10, and apply to both reads and writes, for most cheap SSDs when doing random writes and reads mixed together. Which is why so many discussions specifically mention the Intel X25-M series: they don't suck like that. They keep good access times even under several random read/write threads. A review of the others was posted here a while back, and it was astounding how slow the others became in a mixed read/write benchmark.
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sat, 10 Jan 2009, Markus Wanner wrote:
> da...@lang.hm wrote:
>> On Sat, 10 Jan 2009, Gregory Stark wrote:
>>> I think the idea is that with SSDs or a RAID with a battery backed cache
>>> you can leave fsync on and not have any significant performance hit since
>>> the seek times are very fast for SSD. They have limited bandwidth but
>>> bandwidth to the WAL is rarely an issue -- just latency.
>
> That's also my understanding.
>
>> with SSDs having extremely good read speeds, but poor (at least by
>> comparison) write speeds I wonder if any of the RAID controllers are
>> going to get a mode where they cache writes, but don't cache reads,
>> leaving all of your cache to handle writes.
>
> My understanding of SSDs so far is, that they are not that bad at
> writing *on average*, but to perform wear-leveling, they sometimes have
> to shuffle around multiple blocks at once. So there are pretty awful
> spikes for writing latency (IIRC more than 100ms has been measured on
> cheaper disks).

Well, I have one of those cheap disks.

Brand new out of the box, format the 32G drive, then copy large files to it (~1G per file). This should do almost no wear-leveling, but its write performance is still poor and it has occasional 1 second pauses.

For my initial tests I hooked it up to a USB->SATA adapter, and the write speed is showing about half of what I can get on a 1.5TB SATA drive hooked to the same system. The write speed is fairly comparable to what you can do with slow laptop drives (even ignoring the pauses). Read speed is much better (and I think limited by the USB).

The key thing with any new storage technology (including RAID controllers) is that you need to do your own testing. Treat the manufacturer's specs as ideal conditions, or 'we guarantee that the product will never do better than this' specs.

Imation has a white paper on their site about solid state drive performance that is interesting. Among other things it shows that high-speed SCSI drives are still a significant win in random-write workloads at this point. If I was speccing out a new high-end system I would be looking at and testing something like the following:

SSD for read-mostly items (OS, possibly some indexes)
15K SCSI drives for heavy writing (WAL, indexes, temp tables, etc)
SATA drives for storage capacity (table contents)

David Lang
Re: [PERFORM] understanding postgres issues/bottlenecks
da...@lang.hm writes:
> On Sat, 10 Jan 2009, Markus Wanner wrote:
>
>> My understanding of SSDs so far is, that they are not that bad at
>> writing *on average*, but to perform wear-leveling, they sometimes have
>> to shuffle around multiple blocks at once. So there are pretty awful
>> spikes for writing latency (IIRC more than 100ms has been measured on
>> cheaper disks).

That would be fascinating. And frightening. A lot of people have been recommending these for WAL disks, and this would make them actually *worse* than regular drives.

> well, I have one of those cheap disks.
>
> brand new out of the box, format the 32G drive, then copy large files to it
> (~1G per file). this should do almost no wear-leveling, but it's write
> performance is still poor and it has occasional 1 second pauses.

This isn't similar to the way WAL behaves though. What you're testing is the behaviour when the bandwidth to the SSD is saturated. At that point, some point in the stack -- whether in the SSD, the USB hardware or driver, or the OS buffer cache -- can start to queue up writes. The stalls you see could be the behaviour when that queue fills up and it needs to push back to higher layers.

To simulate WAL you want to transfer smaller volumes of data, well below the bandwidth limit of the drive, fsync the data, then pause a bit and repeat. Time each fsync and see whether the time they take is proportional to the amount of data written in the meantime, or whether they randomly spike upwards.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
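The recipe above might be sketched roughly like this (file path, chunk size, and pause interval are arbitrary; for a meaningful result the file must live on the device under test, not a tmpfs):

```python
import os
import time

def time_fsyncs(path, chunk=8192, iterations=50, pause=0.1):
    """Write a small chunk (well below the drive's bandwidth limit),
    fsync it, and record how long the fsync takes; then sleep briefly
    and repeat.  Flat times suggest the drive handles syncs gracefully;
    random spikes suggest wear-leveling (or cache-flush) stalls.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    latencies = []
    try:
        payload = b"x" * chunk  # a small, WAL-record-sized write
        for _ in range(iterations):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)        # force the data to stable storage
            latencies.append(time.perf_counter() - start)
            time.sleep(pause)   # stay well under the bandwidth limit
    finally:
        os.close(fd)
    return latencies
```

Comparing max() against the average of the returned list, or varying `chunk` and checking whether fsync time scales with it, answers the question Greg poses.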
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sat, 10 Jan 2009, Gregory Stark wrote:
> da...@lang.hm writes:
>> On Sat, 10 Jan 2009, Markus Wanner wrote:
>>> My understanding of SSDs so far is, that they are not that bad at
>>> writing *on average*, but to perform wear-leveling, they sometimes have
>>> to shuffle around multiple blocks at once. So there are pretty awful
>>> spikes for writing latency (IIRC more than 100ms has been measured on
>>> cheaper disks).
>
> That would be fascinating. And frightening. A lot of people have been
> recommending these for WAL disks and this would make them actually *worse*
> than regular drives.
>
>> well, I have one of those cheap disks.
>>
>> brand new out of the box, format the 32G drive, then copy large files to it
>> (~1G per file). this should do almost no wear-leveling, but it's write
>> performance is still poor and it has occasional 1 second pauses.
>
> This isn't similar to the way WAL behaves though. What you're testing is the
> behaviour when the bandwidth to the SSD is saturated. At that point some point
> in the stack, whether in the SSD, the USB hardware or driver, or OS buffer
> cache can start to queue up writes. The stalls you see could be the behaviour
> when that queue fills up and it needs to push back to higher layers.
>
> To simulate WAL you want to transfer smaller volumes of data, well below the
> bandwidth limit of the drive, fsync the data, then pause a bit and repeat.
> Time each fsync and see whether the time they take is proportional to the
> amount of data written in the meantime or whether they randomly spike upwards.

If you have a specific benchmark for me to test I would be happy to do this.

The test that I did is basically the best case for the SSD (more-or-less sequential writes, where the vendors claim that the drives match or slightly outperform the traditional disks). For random writes the vendors put SSDs at fewer IOPS than 5400 rpm drives, let alone 15K rpm drives.

Take a look at this paper: http://www.imation.com/PageFiles/83/Imation-SSD-Performance-White-Paper.pdf

This is not one of the low-performance drives; they include a SanDisk drive in the paper that shows significantly less performance (but the same basic pattern) than the Imation drives.

David Lang
Re: [PERFORM] understanding postgres issues/bottlenecks
At 10:36 AM 1/10/2009, Gregory Stark wrote:
> "Scott Marlowe" writes:
>> On Sat, Jan 10, 2009 at 5:40 AM, Ron wrote:
>>> At 03:28 PM 1/8/2009, Merlin Moncure wrote:
>>>> just be aware of the danger. hard reset (power off) class of failure
>>>> when fsync = off means you are loading from backups.
>>>
>>> That's what redundant power conditioning UPS's are supposed to help prevent
>>> ;-)
>>
>> But of course they can't prevent them, only reduce the likelihood
>> of their occurrence. Everyone who's working in large hosting
>> environments has at least one horror story to tell about a power
>> outage that never should have happened.
>
> Or a system crash. If the kernel panics for any reason when it has dirty
> buffers in memory the database will need to be restored.

A power conditioning UPS should prevent a building-wide or circuit-level bad power event, caused by either dirty power or a power loss, from affecting the host. Within the design limits of the UPS in question, of course.

So the real worry with fsync = off in an environment with redundant, decent UPS's is pretty much limited to host-level HW failures, SW crashes, and unlikely catastrophes like building collapses, lightning strikes, floods, etc. Not that your fsync setting is going to matter much in the event of catastrophes in the physical environment...

Like anything else, there is usually more than one way to reduce risk while at the same time meeting (realistic) performance goals. If you need the performance implied by fsync off, then you have to take other steps to reduce the risk of data corruption down to about the same statistical level as running with fsync on. Or you have to decide that you are willing to live with the increased risk (NOT my recommendation for most DB hosting scenarios).

>>> ...and of course, those lucky few with bigger budgets can use SSD's and not
>>> care what fsync is set to.
>>
>> Would that prevent any corruption if the writes got out of order
>> because of lack of fsync? Or partial writes? Or wouldn't fsync still
>> need to be turned on to keep the data safe.
>
> I think the idea is that with SSDs or a RAID with a battery backed cache you
> can leave fsync on and not have any significant performance hit since the seek
> times are very fast for SSD. They have limited bandwidth but bandwidth to the
> WAL is rarely an issue -- just latency.

Yes, Greg understands what I meant here. In the case of SSDs, the performance hit of fsync = on is essentially zero. In the case of battery-backed RAM caches for RAID arrays, the efficacy is dependent on how the size of the cache compares with the working set of the disk access pattern.

Ron
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sat, 10 Jan 2009, Ron wrote:
> A power conditioning UPS should prevent a building wide or circuit level bad
> power event, caused by either dirty power or a power loss, from affecting
> the host. Within the design limits of the UPS in question of course.
>
> So the real worry with fsync = off in a environment with redundant decent
> UPS's is pretty much limited to host level HW failures, SW crashes, and
> unlikely catastrophes like building collapses, lightning strikes, floods, etc.

I've seen datacenters with redundant UPS's go dark unexpectedly. It's less common, but it does happen.

> Not that your fsync setting is going to matter much in the event of
> catastrophes in the physical environment...

Questionable, but sometimes true. In physical-environment disasters you will lose access to your data for a while, but after the drives are dug out of the rubble (or dried out from the flood) the data can probably be recovered. For crying out loud, they were able to recover most of the data from the hard drives in the latest shuttle disaster.

> Like anything else, there is usually more than one way to reduce risk while
> at the same time meeting (realistic) performance goals.

Very true.

> Yes, Greg understands what I meant here. In the case of SSDs, the
> performance hit of fsync = on is essentially zero.

This is definitely not the case. With fsync off, the data stays in memory and may never end up being sent to the drive. RAM speeds are several orders of magnitude faster than the interfaces to the drives (or even to the RAID controllers in high-speed slots).

It may be that it's fast enough (see the other posts disputing that), but don't think that it's the same.

David Lang

> In the case of battery-backed RAM caches for RAID arrays, the efficacy is
> dependent on how the size of the cache compares with the working set of the
> disk access pattern.
>
> Ron
Re: [PERFORM] understanding postgres issues/bottlenecks
Ron writes:
> At 10:36 AM 1/10/2009, Gregory Stark wrote:
>> Or a system crash. If the kernel panics for any reason when it has dirty
>> buffers in memory the database will need to be restored.
>
> A power conditioning UPS should prevent a building wide or circuit level bad
> power event

Except of course those caused *by* a faulty UPS. Or for that matter by the power supply in the computer or drive array, or someone just accidentally hitting the wrong power button.

I'm surprised people are so confident in their kernels though. I know some computers with uptimes measured in years, but I know far more which don't.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sat, 10 Jan 2009, Luke Lonergan wrote:
> The new MLC based SSDs have better wear leveling tech and don't suffer
> the pauses. Intel X25-M 80 and 160 GB SSDs are both pause-free. See
> Anandtech's test results for details.

They don't suffer the pauses, but they still don't have fantastic write speeds.

David Lang

> Intel's SLC SSDs should also be good enough but they're smaller.
>
> - Luke
Re: [PERFORM] understanding postgres issues/bottlenecks
The new MLC based SSDs have better wear leveling tech and don't suffer the pauses. Intel X25-M 80 and 160 GB SSDs are both pause-free. See Anandtech's test results for details.

Intel's SLC SSDs should also be good enough but they're smaller.

- Luke
Re: [PERFORM] block device benchmarking
Markus Wanner wrote: > Hi, > > I'm fiddling with a hand-made block device based benchmarking thingie, > which I want to run random reads and writes of relatively small blocks > (somewhat similar to databases). I'm much less interested in measuring > throughput, but rather in latency. Besides varying block sizes, I'm also > testing with a varying number of concurrent threads and varying > read/write ratios. As a result, I'm interested in roughly the following > graphs: > > * (single thread) i/o latency vs. seek distance > * (single thread) throughput vs. (accurator) position > * (single thread) i/o latency vs. no of concurrent threads > * total requests per second + throughput vs. no of concurrent threads > * total requests per second + throughput vs. read/write ratio > * total requests per second + throughput vs. block size > * distribution of access times (histogram) > > (Of course, not all of these are relevant for all types of storages.) > > Does there already exist a tool giving (most of) these measures? Am I > missing something interesting? What would you expect from a block device > benchmarking tool? > > Regards > > Markus Wanner > Check out the work of Jens Axboe and Alan Brunelle, specifically the packages "blktrace" and "fio". "blktrace" acts as a "sniffer" for I/O, recording the path of every I/O operation through the block I/O layer. Using another tool in the package, "btreplay/btrecord", you can translate the captured trace into a benchmark that re-issues the I/Os. And the third tool in the package, "btt", does statistical analysis. I don't think you really need "benchmarks" if you can extract this kind of detail from a real application. :) However, if you do want to build a benchmark, "fio" is a customizable benchmark utility. In the absence of real-world traces, you can emulate any I/O activity pattern with "fio". "fio" is what Mark Wong's group has been using to characterize filesystem behavior. 
I'm not sure where the presentations are at the moment, but there is some
of it at http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide

There are also some more generic filesystem benchmarks like "iozone" and
"bonnie++". They're good general tools for comparing filesystems and I/O
subsystems, but the other tools are more useful if you have a specific
workload, for example a PostgreSQL application.

BTW ... I am working on my blktrace howto even as I type this. I don't
have an ETA; that's going to depend on how long it takes me to get the
PostgreSQL benchmarks I'm using to work on my machine. But everything
will be on GitHub at
http://github.com/znmeb/linux_perf_viz/tree/master/blktrace-howto
as it evolves.
Re: [PERFORM] understanding postgres issues/bottlenecks
I believe they write at 200MB/s, which is outstanding for sequential BW.
Not sure about the write latency, though the Anandtech benchmark results
showed high detail and IIRC the write latencies were very good.

- Luke

----- Original Message -----
From: da...@lang.hm
To: Luke Lonergan
Cc: st...@enterprisedb.com ; mar...@bluegap.ch ;
scott.marl...@gmail.com ; rjpe...@earthlink.net ;
pgsql-performance@postgresql.org
Sent: Sat Jan 10 16:03:32 2009
Subject: Re: [PERFORM] understanding postgres issues/bottlenecks

On Sat, 10 Jan 2009, Luke Lonergan wrote:

> The new MLC based SSDs have better wear leveling tech and don't suffer
> the pauses. Intel X25-M 80 and 160 GB SSDs are both pause-free. See
> Anandtech's test results for details.

they don't suffer the pauses, but they still don't have fantastic write
speeds.

David Lang

> Intel's SLC SSDs should also be good enough but they're smaller.
>
> - Luke
>
> ----- Original Message -----
> From: pgsql-performance-ow...@postgresql.org
> To: Gregory Stark
> Cc: Markus Wanner ; Scott Marlowe ; Ron ;
> pgsql-performance@postgresql.org
> Sent: Sat Jan 10 14:40:51 2009
> Subject: Re: [PERFORM] understanding postgres issues/bottlenecks
>
> On Sat, 10 Jan 2009, Gregory Stark wrote:
>
>> da...@lang.hm writes:
>>
>>> On Sat, 10 Jan 2009, Markus Wanner wrote:
>>>
>>>> My understanding of SSDs so far is, that they are not that bad at
>>>> writing *on average*, but to perform wear-leveling, they sometimes
>>>> have to shuffle around multiple blocks at once. So there are pretty
>>>> awful spikes for writing latency (IIRC more than 100ms has been
>>>> measured on cheaper disks).
>>
>> That would be fascinating. And frightening. A lot of people have been
>> recommending these for WAL disks and this would make them actually
>> *worse* than regular drives.
>>
>>> well, I have one of those cheap disks.
>>>
>>> brand new out of the box, format the 32G drive, then copy large
>>> files to it (~1G per file).
>>> this should do almost no wear-leveling, but its write performance is
>>> still poor and it has occasional 1 second pauses.
>>
>> This isn't similar to the way WAL behaves though. What you're testing
>> is the behaviour when the bandwidth to the SSD is saturated. At that
>> point, some point in the stack, whether in the SSD, the USB hardware
>> or driver, or the OS buffer cache, can start to queue up writes. The
>> stalls you see could be the behaviour when that queue fills up and it
>> needs to push back to higher layers.
>>
>> To simulate WAL you want to transfer smaller volumes of data, well
>> below the bandwidth limit of the drive, fsync the data, then pause a
>> bit, and repeat. Time each fsync and see whether the time they take
>> is proportional to the amount of data written in the meantime or
>> whether they randomly spike upwards.
>
> if you have a specific benchmark for me to test I would be happy to do
> this.
>
> the test that I did is basically the best-case for the SSD
> (more-or-less sequential writes, where the vendors claim that the
> drives match or slightly outperform the traditional disks). for random
> writes the vendors put SSDs at fewer IOPS than 5400 rpm drives, let
> alone 15K rpm drives.
>
> take a look at this paper
> http://www.imation.com/PageFiles/83/Imation-SSD-Performance-White-Paper.pdf
>
> this is not one of the low-performance drives; they include a SanDisk
> drive in the paper that shows significantly less performance (but the
> same basic pattern) than the Imation drives.
>
> David Lang
Re: [PERFORM] understanding postgres issues/bottlenecks
da...@lang.hm wrote:
> On Sat, 10 Jan 2009, Luke Lonergan wrote:
>
>> The new MLC based SSDs have better wear leveling tech and don't
>> suffer the pauses. Intel X25-M 80 and 160 GB SSDs are both
>> pause-free. See Anandtech's test results for details.
>
> they don't suffer the pauses, but they still don't have fantastic
> write speeds.
>
> David Lang
>
>> Intel's SLC SSDs should also be good enough but they're smaller.

From what I can see, SLC SSDs are still quite superior for reliability
and (write) performance. However they are too small and too expensive
right now. Hopefully the various manufacturers are working on improving
the size/price issue for SLC, as well as improving the
performance/reliability area for the MLC products.

regards

Mark
Re: [PERFORM] understanding postgres issues/bottlenecks
On Sun, 11 Jan 2009, Mark Kirkwood wrote:

> da...@lang.hm wrote:
>> On Sat, 10 Jan 2009, Luke Lonergan wrote:
>>
>>> The new MLC based SSDs have better wear leveling tech and don't
>>> suffer the pauses. Intel X25-M 80 and 160 GB SSDs are both
>>> pause-free. See Anandtech's test results for details.
>>
>> they don't suffer the pauses, but they still don't have fantastic
>> write speeds.
>>
>>> Intel's SLC SSDs should also be good enough but they're smaller.
>
> From what I can see, SLC SSDs are still quite superior for reliability
> and (write) performance. However they are too small and too expensive
> right now. Hopefully the various manufacturers are working on
> improving the size/price issue for SLC, as well as improving the
> performance/reliability area for the MLC products.

the very nature of the technology means that SLC will never be as cheap
as MLC, and MLC will never be as reliable as SLC.

take a look at
http://www.imation.com/PageFiles/83/SSD-Reliability-Lifetime-White-Paper.pdf
for a good writeup of the technology.

for both technologies the price will continue to drop, and the
reliability and performance will continue to climb, but I don't see
anything that would improve one without the other (well, I could see MLC
gaining a 50% capacity boost if they can get to 3 bits per cell vs the
current 2, but that would come at the cost of reliability again).

for write performance I don't think there is as much of a difference
between the two technologies. today there is a huge difference in most
of the shipping products, but Intel has now demonstrated that it's
mostly due to the controller chip, so I expect much of that difference
to vanish in the next year or so (as new generations of controller chips
ship).

David Lang