[PERFORM] SSD + RAID

2009-11-13 Thread Laszlo Nagy

Hello,

I'm about to buy SSD drive(s) for a database. For decision making, I 
used this tech report:


http://techreport.com/articles.x/16255/9
http://techreport.com/articles.x/16255/10

Here are my concerns:

   * I need at least 32GB disk space, so a DRAM-based SSD is not a real
 option. I would have to buy 8x4GB memory, which costs a fortune. And
 then it would still not have redundancy.
   * I could buy two X25-E drives and have 32GB disk space, and some
 redundancy. This would cost about $1600, not counting the RAID
 controller. It is on the edge.
   * I could also buy many cheaper MLC SSD drives. They cost about
 $140. So even with 10 drives, I'm at $1400. I could put them in
 RAID6, have much more disk space (256GB), high redundancy and
 POSSIBLY good read/write speed. Of course then I need to buy a
 good RAID controller.

My question is about the last option. Are there any good RAID cards that 
are optimized (or can be optimized) for SSD drives? Do any of you have 
experience in using many cheaper SSD drives? Is it a bad idea?


Thank you,

  Laszlo


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] SSD + RAID

2009-11-13 Thread Laszlo Nagy



Note that some RAID controllers (3Ware in particular) refuse to
recognize the MLC drives, in particular, they act as if the OCZ Vertex
series do not exist when connected.

I don't know what they're looking for (perhaps some indication that
actual rotation is happening?) but this is a potential problem: make
sure your adapter can talk to these things!

BTW I have done some benchmarking with Postgresql against these drives
and they are SMOKING fast.
  
I was thinking about ARECA 1320 with 2GB memory + BBU. Unfortunately, I 
cannot find information about using ARECA cards with SSD drives. I'm 
also not sure how they would work together. I guess the RAID cards are 
optimized for conventional disks. They read/write data in bigger blocks 
and they optimize the order of reading/writing for physical cylinders. I 
know for sure that this particular areca card has an Intel dual core IO 
processor and its own embedded operating system. I guess it could be 
tuned for SSD drives, but I don't know how.


I was hoping that with a RAID 6 setup, write speed (which is slower for 
cheaper flash-based SSD drives) would dramatically increase, because 
information is written simultaneously to 10 drives. With a very small block 
size, that would probably be true. But... what if the RAID card uses 
bigger block sizes, and - say - I want to update much smaller blocks in 
the database?
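For intuition, here is a rough back-of-envelope model of that worry (my own simplification with assumed stripe-unit sizes, not vendor data): on RAID 6, updating a database page smaller than the controller's stripe unit forces a read-modify-write of the full data unit plus both parity units.

```python
def raid6_write_amplification(db_page_kb, stripe_unit_kb):
    """Approximate bytes physically written per logical page update on RAID 6.

    Simplifying assumption: one data stripe unit plus two parity units
    (P and Q) are rewritten, each rounded up to a whole stripe unit.
    """
    units_touched = 3  # one data unit + two parity units
    physical_kb = units_touched * max(db_page_kb, stripe_unit_kb)
    return physical_kb / db_page_kb

# An 8 kB PostgreSQL page against a hypothetical 64 kB stripe unit:
print(raid6_write_amplification(8, 64))  # 24.0 (24x amplification)
print(raid6_write_amplification(8, 8))   # 3.0 with a matching stripe unit
```

Under these assumptions, matching the controller's stripe-unit size to the database page size matters far more on parity RAID than on mirrors.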


My other option is to buy two SLC SSD drives and use RAID 1. It would 
cost about the same, but with less redundancy and less capacity. Which is 
faster: 8-10 MLC disks in RAID 6 with a good caching controller, or 
two SLC disks in RAID 1?


Thanks,

  Laszlo




Re: [PERFORM] SSD + RAID

2009-11-13 Thread Marcos Ortiz Valmaseda

This is very fast.
On IT Toolbox there are many whitepapers about it,
specifically in the ERP and DataCenter sections.

We should share all the tests we do on the
project wiki.

Regards



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Scott Marlowe
2009/11/13 Laszlo Nagy gand...@shopzeus.com:
 Hello,

 I'm about to buy SSD drive(s) for a database. For decision making, I used
 this tech report:

 http://techreport.com/articles.x/16255/9
 http://techreport.com/articles.x/16255/10

 Here are my concerns:

   * I need at least 32GB disk space. So DRAM based SSD is not a real
     option. I would have to buy 8x4GB memory, costs a fortune. And
     then it would still not have redundancy.
   * I could buy two X25-E drives and have 32GB disk space, and some
     redundancy. This would cost about $1600, not counting the RAID
     controller. It is on the edge.

I'm not sure a RAID controller brings much of anything to the table with SSDs.

   * I could also buy many cheaper MLC SSD drives. They cost about
     $140. So even with 10 drives, I'm at $1400. I could put them in
     RAID6, have much more disk space (256GB), high redundancy and

I think RAID6 is gonna reduce the throughput due to overhead to
something far less than what a software RAID-10 would achieve.

     POSSIBLY good read/write speed. Of course then I need to buy a
     good RAID controller.

I'm guessing that if you spent whatever money you were gonna spend on
more SSDs you'd come out ahead, assuming you had somewhere to put
them.

 My question is about the last option. Are there any good RAID cards that are
 optimized (or can be optimized) for SSD drives? Do any of you have
 experience in using many cheaper SSD drives? Is it a bad idea?

This I don't know.  Some quick googling shows the Areca 1680ix and
Adaptec 5 Series to be able to handle Samsung SSDs.



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Merlin Moncure
On Fri, Nov 13, 2009 at 9:48 AM, Scott Marlowe scott.marl...@gmail.com wrote:
 I think RAID6 is gonna reduce the throughput due to overhead to
 something far less than what a software RAID-10 would achieve.

I was wondering about this.  I think raid 5/6 might be a better fit
for SSD than traditional drive arrays.  Here's my thinking:

*) flash SSD reads are cheaper than writes.  With 6 or more drives,
less total data has to be written in Raid 5 than Raid 10.  The main
component of the raid 5 performance penalty is that each written
block has to be read first, then written...incurring rotational
latency, etc.   SSD does not have this problem.

*) flash is much more expensive in terms of $/GB of storage.

*) flash (at least the intel stuff) is so fast relative to what we are
used to, that the point of using flash in raid is more for fault
tolerance than performance enhancement.  I don't have data to support
this, but I suspect that even with a relatively small number of the
slower MLC drives in raid, postgres will become cpu bound for most
applications.

merlin



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Heikki Linnakangas
Laszlo Nagy wrote:
* I need at least 32GB disk space. So DRAM based SSD is not a real
  option. I would have to buy 8x4GB memory, costs a fortune. And
  then it would still not have redundancy.

At 32GB database size, I'd seriously consider just buying a server with
a regular hard drive or a small RAID array for redundancy, and stuffing
16 or 32 GB of RAM into it to ensure everything is cached. That's tried
and tested technology.

I don't know how you came to the 32 GB figure, but keep in mind that
administration is a lot easier if you have plenty of extra disk space
for things like backups, dumps+restore, temporary files, upgrades etc.
So if you think you'd need 32 GB of disk space, I'm guessing that 16 GB
of RAM would be enough to hold all the hot data in cache. And if you
choose a server with enough DIMM slots, you can expand easily if needed.

Just my 2 cents, I'm not really an expert on hardware..

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Merlin Moncure
2009/11/13 Heikki Linnakangas heikki.linnakan...@enterprisedb.com:
 Laszlo Nagy wrote:
    * I need at least 32GB disk space. So DRAM based SSD is not a real
      option. I would have to buy 8x4GB memory, costs a fortune. And
      then it would still not have redundancy.

 At 32GB database size, I'd seriously consider just buying a server with
 a regular hard drive or a small RAID array for redundancy, and stuffing
 16 or 32 GB of RAM into it to ensure everything is cached. That's tried
 and tested technology.

lots of ram doesn't help you if:
*) your database gets written to a lot and you have high performance
requirements
*) your data is important

(if either of the above is not true or even partially true, then your
advice is spot on)

merlin



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Scott Carey



On 11/13/09 7:29 AM, Merlin Moncure mmonc...@gmail.com wrote:

 On Fri, Nov 13, 2009 at 9:48 AM, Scott Marlowe scott.marl...@gmail.com
 wrote:
 I think RAID6 is gonna reduce the throughput due to overhead to
 something far less than what a software RAID-10 would achieve.
 
 I was wondering about this.  I think raid 5/6 might be a better fit
 for SSD than traditional drives arrays.  Here's my thinking:
 
 *) flash SSD reads are cheaper than writes.  With 6 or more drives,
 less total data has to be written in Raid 5 than Raid 10.  The main
 component of the raid 5 performance penalty is that each written
 block has to be read first, then written...incurring rotational
 latency, etc.   SSD does not have this problem.
 

For random writes, RAID 5 writes as much as RAID 10 (parity + data), and
more if the raid block size is larger than 8k.  With RAID 6 it writes 50%
more than RAID 10.

For streaming writes RAID 5 / 6 has an advantage however.

For SLC drives, there is really not much of a write performance penalty.
 




Re: [PERFORM] SSD + RAID

2009-11-13 Thread Karl Denninger

Greg Smith wrote:
 In order for a drive to work reliably for database use such as for
 PostgreSQL, it cannot have a volatile write cache.  You either need a
 write cache with a battery backup (and a UPS doesn't count), or to
 turn the cache off.  The SSD performance figures you've been looking
 at are with the drive's write cache turned on, which means they're
 completely fictitious and exaggerated upwards for your purposes.  In
 the real world, that will result in database corruption after a crash
 one day.
If power is unexpectedly removed from the system, this is true.  But
the caches on the SSD controllers are BUFFERS.  An operating system
crash does not disrupt the data in them or cause corruption.  An
unexpected disconnection of the power source from the drive (due to
unplugging it or a power supply failure for whatever reason) is a
different matter.
   No one on the drive benchmarking side of the industry seems to have
 picked up on this, so you can't use any of those figures.  I'm not
 even sure right now whether drives like Intel's will even meet their
 lifetime expectations if they aren't allowed to use their internal
 volatile write cache.

 Here's two links you should read and then reconsider your whole design:
 http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/

 http://petereisentraut.blogspot.com/2009/07/solid-state-drive-benchmarks-and-write.html


 I can't even imagine how bad the situation would be if you decide to
 wander down the "use a bunch of really cheap SSD drives" path; these
 things are barely usable for databases with Intel's hardware.  The
 needs of people who want to throw SSD in a laptop and those of the
 enterprise database market are really different, and if you believe
 doom forecasting like the comments at
 http://blogs.sun.com/BestPerf/entry/oracle_peoplesoft_payroll_sun_sparc
 that gap is widening, not shrinking.
Again, it depends.

With the write cache off on these disks they still are huge wins for
very-heavy-read applications, which many are.  The issue is (as always)
operation mix - if you do a lot of inserts and updates then you suffer,
but a lot of database applications are in the high 90%+ SELECTs both in
frequency and data flow volume.  The lack of rotational and seek latency
in those applications is HUGE.

-- Karl Denninger


Re: [PERFORM] SSD + RAID

2009-11-13 Thread Greg Smith

Karl Denninger wrote:

If power is unexpectedly removed from the system, this is true.  But
the caches on the SSD controllers are BUFFERS.  An operating system
crash does not disrupt the data in them or cause corruption.  An
unexpected disconnection of the power source from the drive (due to
unplugging it or a power supply failure for whatever reason) is a
different matter.
  
As standard operating procedure, I regularly get something writing heavy 
to the database on hardware I'm suspicious of and power the box off 
hard.  If at any time I suffer database corruption from this, the 
hardware is unsuitable for database use; that should never happen.  This 
is what I mean when I say something meets the mythical "enterprise" 
quality.  Companies whose data is worth something can't operate in a 
situation where money has been exchanged because a database commit was 
recorded, only to lose that commit just because somebody tripped over 
the power cord and it was in the buffer rather than on permanent disk.  
That's just not acceptable, and the even bigger danger of the database 
perhaps not coming up altogether even after such a tiny disaster is also 
very real with a volatile write cache.



With the write cache off on these disks they still are huge wins for
very-heavy-read applications, which many are.
Very read-heavy applications would do better to buy a ton of RAM instead 
and just make sure they populate from permanent media (say by reading 
everything in early at sequential rates to prime the cache).  There is 
an extremely narrow use-case where SSDs are the right technology, and 
it's only in a subset even of read-heavy apps where they make sense.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] SSD + RAID

2009-11-13 Thread Karl Denninger
Greg Smith wrote:
 Karl Denninger wrote:
 If power is unexpectedly removed from the system, this is true.  But
 the caches on the SSD controllers are BUFFERS.  An operating system
 crash does not disrupt the data in them or cause corruption.  An
 unexpected disconnection of the power source from the drive (due to
 unplugging it or a power supply failure for whatever reason) is a
 different matter.
   
 As standard operating procedure, I regularly get something writing
 heavy to the database on hardware I'm suspicious of and power the box
 off hard.  If at any time I suffer database corruption from this, the
 hardware is unsuitable for database use; that should never happen. 
 This is what I mean when I say something meets the mythical
 "enterprise" quality.  Companies whose data is worth something can't
 operate in a situation where money has been exchanged because a
 database commit was recorded, only to lose that commit just because
 somebody tripped over the power cord and it was in the buffer rather
 than on permanent disk.  That's just not acceptable, and the even
 bigger danger of the database perhaps not coming up altogether even
 after such a tiny disaster is also very real with a volatile write cache.
Yep.  The plug test is part of my standard "is this stable enough for
something I care about" checkout.
 With the write cache off on these disks they still are huge wins for
 very-heavy-read applications, which many are.
 Very read-heavy applications would do better to buy a ton of RAM
 instead and just make sure they populate from permanent media (say by
 reading everything in early at sequential rates to prime the cache). 
 There is an extremely narrow use-case where SSDs are the right
 technology, and it's only in a subset even of read-heavy apps where
 they make sense.
I don't know about that in the general case - I'd say it depends.

250GB of SSD for read-nearly-always applications is a LOT cheaper than
250GB of ECC'd DRAM.  The write performance issues can be handled by
clever use of controller technology as well (that is, turn off the
drive's write cache and use the BBU on the RAID adapter.)

I have a couple of applications where two 250GB SSD disks in a Raid 1
array with a BBU'd controller, with the disk drive cache off, is all-in
a fraction of the cost of sticking 250GB of volatile storage in a server
and reading in the data set (plus managing the occasional updates) from
stable storage.  It is not as fast as stuffing the 250GB of RAM in a
machine but it's a hell of a lot faster than a big array of small
conventional drives in a setup designed for maximum IO-Ops.

One caution for those thinking of doing this - the incremental
improvement of this setup on PostgreSQL in a WRITE-SIGNIFICANT environment
isn't NEARLY as impressive.  Indeed the performance in THAT case for
many workloads may only be 20 or 30% faster than even reasonably
pedestrian rotating media in a high-performance (lots of spindles and
thus stripes) configuration and it's more expensive (by a lot.)  If you
step up to the fast SAS drives on the rotating side there's little
argument for the SSD at all (again, assuming you don't intend to cheat
and risk data loss.)

Know your application and benchmark it.

-- Karl


Re: [PERFORM] SSD + RAID

2009-11-13 Thread Merlin Moncure
On Fri, Nov 13, 2009 at 12:22 PM, Scott Carey sc...@richrelevance.com wrote:

 For random writes, RAID 5 writes as much as RAID 10 (parity + data), and
 more if the raid block size is larger than 8k.  With RAID 6 it writes 50%
 more than RAID 10.

how does raid 5 write more if the block size is > 8k? raid 10 is also
striped, so has the same problem, right?  IOW, if the block size is 8k
and you need to write 16k sequentially the raid 5 might write out 24k
(two blocks + parity).  raid 10 always writes out 2x your data in
terms of blocks (raid 5 does only in the worst case).  For a SINGLE
block, it's always 2x your data for both raid 5 and raid 10, so what i
said above was not quite correct.
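A tiny model may help frame the disagreement (my simplification: single-block random writes, a stripe unit no larger than the block, and the parity-recompute reads ignored since they carry no rotational cost on flash):

```python
def physical_writes(raid_level, logical_blocks=1):
    """Physical blocks written per random logical block, under the
    simplifying assumptions stated above."""
    per_block = {
        "raid10": 2,  # the block + its mirror
        "raid5": 2,   # the block + one recomputed parity block
        "raid6": 3,   # the block + P parity + Q parity
    }
    return per_block[raid_level] * logical_blocks

for level in ("raid10", "raid5", "raid6"):
    print(level, physical_writes(level))
```

In this model RAID 5 and RAID 10 tie on write volume for single-block random writes, and RAID 6 writes 50% more, matching the figures quoted above; what flash removes is only the latency of the parity read, not the extra writes.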

raid 6 is not going to outperform raid 10 ever IMO.  It's just a
slightly safer raid 5.  I was just wondering out loud if raid 5 might
give similar performance to raid 10 on flash based disks since there
is no rotational latency.  even if it did, I probably still wouldn't
use it...

merlin



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Merlin Moncure
2009/11/13 Greg Smith g...@2ndquadrant.com:
 In order for a drive to work reliably for database use such as for
 PostgreSQL, it cannot have a volatile write cache.  You either need a write
 cache with a battery backup (and a UPS doesn't count), or to turn the cache
 off.  The SSD performance figures you've been looking at are with the
 drive's write cache turned on, which means they're completely fictitious and
 exaggerated upwards for your purposes.  In the real world, that will result
 in database corruption after a crash one day.  No one on the drive
 benchmarking side of the industry seems to have picked up on this, so you
 can't use any of those figures.  I'm not even sure right now whether drives
 like Intel's will even meet their lifetime expectations if they aren't
 allowed to use their internal volatile write cache.

hm.  I never understood why Peter was only able to turn up 400 iops
when others were turning up 4000+ (measured from bonnie).  This would
explain it.

Is it authoritatively known that the Intel drives' true random write
ops are not what they are claiming?  If so, then you are right...flash
doesn't make sense, at least not without a NV cache on the device.

merlin



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Brad Nicholson

Greg Smith wrote:

Karl Denninger wrote:

With the write cache off on these disks they still are huge wins for
very-heavy-read applications, which many are.
Very read-heavy applications would do better to buy a ton of RAM 
instead and just make sure they populate from permanent media (say by 
reading everything in early at sequential rates to prime the cache).  
There is an extremely narrow use-case where SSDs are the right 
technology, and it's only in a subset even of read-heavy apps where 
they make sense.


Out of curiosity, what are those narrow use cases where you think SSDs 
are the correct technology?


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.




Re: [PERFORM] SSD + RAID

2009-11-13 Thread Dave Crooke
Itching to jump in here :-)

There are a lot of things to trade off when choosing storage for a
database: performance for different parts of the workload,
reliability, performance in degraded mode (when a disk dies), backup
methodologies, etc. ... the mistake many people make is to overlook
the sub-optimal operating conditions, failure modes and recovery
paths.

Some thoughts:

- RAID-5 and RAID-6 have poor write performance, and terrible
performance in degraded mode - there are a few edge cases, but in
almost all cases you should be using RAID-10 for a database.

- Like most apps, the ultimate way to make a database perform is to
have most of it (or at least the working set) in RAM, preferably the
DB server buffer cache. This is why big banks run Oracle on an HP
Superdome with 1TB of RAM ... the $15m Hitachi data array is just
backing store :-)

- Personally, I'm an SSD skeptic ... the technology just isn't mature
enough for the data center. If you apply a typical OLTP workload, they
are going to die early deaths. The only case in which they will
materially improve performance is where you have a large data set with
lots of **totally random** reads, i.e. where buffer cache is
ineffective. In the words of TurboTax, this is not common.

- If you're going to use synchronous write with a significant amount
of small transactions, then you need some reliable RAM (not SSD) to
commit log files into, which means a proper battery-backed RAID
controller / external SAN with write-back cache. For many apps though,
a synchronous commit simply isn't necessary: losing a few rows of data
during a crash is relatively harmless. For these apps, turning off
synchronous writes is an often overlooked performance tweak.


In summary, don't get distracted by shiny new objects like SSD and RAID-6 :-)




Re: [PERFORM] SSD + RAID

2009-11-13 Thread Fernando Hevia
 

 -Original Message-
 Laszlo Nagy
 
 My question is about the last option. Are there any good RAID 
 cards that are optimized (or can be optimized) for SSD 
 drives? Do any of you have experience in using many cheaper 
 SSD drives? Is it a bad idea?
 
 Thank you,
 
Laszlo
 

I've never had an SSD to try yet, but I wonder whether software RAID + fsync
on SSD drives could be regarded as a sound solution.
Shouldn't their write performance more than make up for the cost of fsync?

You could benchmark this setup yourself before purchasing a RAID card.
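As a starting point for such a benchmark, a crude fsync-rate probe (a sketch only, not a substitute for a real benchmarking tool; the file name is arbitrary) times sequential 8 kB writes each followed by an fsync, which is roughly the WAL's per-commit pattern:

```python
import os
import time

def fsync_rate(path, n=200, block=b"x" * 8192):
    """Measure synchronous-commit throughput: sequential 8 kB writes,
    each forced to stable storage with fsync."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        start = time.time()
        for _ in range(n):
            os.write(fd, block)
            os.fsync(fd)
        elapsed = time.time() - start
    finally:
        os.close(fd)
    return n / elapsed  # fsync'd writes/sec, a rough commits/sec ceiling

if __name__ == "__main__":
    print("%.0f fsyncs/sec" % fsync_rate("fsync_test.dat"))
```

Note the caveat raised elsewhere in this thread: run it with the drive's volatile write cache disabled, otherwise the number is fictitiously high.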




Re: [PERFORM] Why age (datfrozenxid) in postgres becomes 1073742202 not zero after each vacuum of database.

2009-11-13 Thread Robert Haas
[ removing -jobs from cc list as it is not appropriate for this posting ]

On Thu, Nov 12, 2009 at 3:18 AM, Brahma Prakash Tiwari
brahma.tiw...@inventum.cc wrote:
 Hi all

 Why age (datfrozenxid) in postgres becomes 1073742202 not zero after vacuum
 of database.

 Thanks in advance

I think you're misunderstanding the meaning of the column.  As the
fine manual explains:

Similarly, the datfrozenxid column of a database's pg_database row is
a lower bound on the normal XIDs appearing in that database — it is
just the minimum of the per-table relfrozenxid values within the
database.

http://www.postgresql.org/docs/current/static/routine-vacuuming.html

...Robert



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Greg Smith

Brad Nicholson wrote:
Out of curiosity, what are those narrow use cases where you think 
SSD's are the correct technology?

Dave Crooke did a good summary already, I see things like this:

* You need to have a read-heavy app that's bigger than RAM, but not too 
big so it can still fit on SSD
* You need reads to be dominated by random-access and uncached lookups, 
so that system RAM used as a buffer cache doesn't help you much.
* Writes have to be low to moderate, as the true write speed is much 
lower for database use than you'd expect from benchmarks derived from 
other apps.  And it's better if writes are biased toward adding data 
rather than changing existing pages


As far as what real-world apps have that profile, I like SSDs for small 
to medium web applications that have to be responsive, where the user 
shows up and wants their randomly distributed and uncached data with 
minimal latency. 

SSDs can also be used effectively as second-tier targeted storage for 
things that have a performance-critical but small and random bit as part 
of a larger design that doesn't have those characteristics; putting 
indexes on SSD can work out well for example (and there the write 
durability stuff isn't quite as critical, as you can always drop an 
index and rebuild if it gets corrupted).


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] SSD + RAID

2009-11-13 Thread Merlin Moncure
2009/11/13 Greg Smith g...@2ndquadrant.com:
 As far as what real-world apps have that profile, I like SSDs for small to
 medium web applications that have to be responsive, where the user shows up
 and wants their randomly distributed and uncached data with minimal latency.
 SSDs can also be used effectively as second-tier targeted storage for things
 that have a performance-critical but small and random bit as part of a
 larger design that doesn't have those characteristics; putting indexes on
 SSD can work out well for example (and there the write durability stuff
 isn't quite as critical, as you can always drop an index and rebuild if it
 gets corrupted).


Here's a bonnie++ result for Intel showing 14k seeks:
http://www.wlug.org.nz/HarddiskBenchmarks

bonnie++ only writes data back 10% of the time.  Why is Peter's
benchmark showing only 400 seeks? Is this all attributable to write
barrier? I'm not sure I'm buying that...

merlin



Re: [PERFORM] SSD + RAID

2009-11-13 Thread Greg Smith

Fernando Hevia wrote:

Shouldn't their write performance be more than a trade-off for fsync?
  
Not if you have sequential writes that are regularly fsync'd--which is 
exactly how the WAL writes things out in PostgreSQL.  I think there's a 
potential for SSD to reach a point where they can give good performance 
even with their write caches turned off.  But it will require a more 
robust software stack, like filesystems that really implement the write 
barrier concept effectively for this use-case, for that to happen.
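As a rough illustration (not from the original post): the WAL-style
pattern -- sequential writes, each followed by fsync -- can be timed
with a few lines of Python. A drive that honors fsync is limited by
platter rotation, about 120 syncs/sec at 7200 RPM, so a much higher
number usually means a volatile cache is absorbing the flushes:

```python
import os
import tempfile
import time

def timed_fsync_writes(path, block=8192, count=50):
    """Append `count` blocks sequentially, calling fsync after each
    one -- roughly the WAL write pattern.  Returns fsyncs/second."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    buf = b"\0" * block
    start = time.time()
    for _ in range(count):
        os.write(fd, buf)
        os.fsync(fd)  # ask for the data to reach stable storage
    elapsed = time.time() - start
    os.close(fd)
    return count / elapsed

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
rate = timed_fsync_writes(path)
os.unlink(path)
# A 7200 RPM disk that honors fsync tops out near 120 syncs/sec;
# rates far beyond that suggest a volatile write cache is lying.
print("%.0f fsyncs/sec" % rate)
```

The same idea, verified across an actual power cut, is what tools
like diskchecker.pl test.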


--
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [PERFORM] SSD + RAID

2009-11-13 Thread Kenny Gorman
The FusionIO products are a little different.  They are card-based, rather than 
trying to emulate a traditional disk.  In terms of volatility, they have an on-board 
capacitor that allows power to be supplied until all writes drain.  They do not 
have a cache in front of them like a disk-type SSD might.   I don't sell these 
things, I am just a fan.  I verified all this with the Fusion IO techs before I 
replied.  Perhaps older versions didn't have this functionality?  I am not 
sure.  I have already done some cold power off tests w/o problems, but I could 
up the workload a bit and retest.  I will do a couple of 'pull the cable' tests 
on monday or tuesday and report back how it goes.

Re the performance #'s...  Here is my post:

http://www.kennygorman.com/wordpress/?p=398

-kg

 
In order for a drive to work reliably for database use such as for 
PostgreSQL, it cannot have a volatile write cache.  You either need a 
write cache with a battery backup (and a UPS doesn't count), or to turn 
the cache off.  The SSD performance figures you've been looking at are 
with the drive's write cache turned on, which means they're completely 
fictitious and exaggerated upwards for your purposes.  In the real 
world, that will result in database corruption after a crash one day.  
No one on the drive benchmarking side of the industry seems to have 
picked up on this, so you can't use any of those figures.  I'm not even 
sure right now whether drives like Intel's will even meet their lifetime 
expectations if they aren't allowed to use their internal volatile write 
cache.

Here's two links you should read and then reconsider your whole design: 

http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/
http://petereisentraut.blogspot.com/2009/07/solid-state-drive-benchmarks-and-write.html

I can't even imagine how bad the situation would be if you decide to 
wander down the "use a bunch of really cheap SSD drives" path; these 
things are barely usable for databases with Intel's hardware.  The needs 
of people who want to throw SSD in a laptop and those of the enterprise 
database market are really different, and if you believe doom 
forecasting like the comments at 
http://blogs.sun.com/BestPerf/entry/oracle_peoplesoft_payroll_sun_sparc 
that gap is widening, not shrinking.



Re: [PERFORM] Manual vacs 5x faster than autovacs?

2009-11-13 Thread Craig Ringer
On 13/11/2009 2:29 PM, Dave Crooke wrote:

 Beware that VACUUM FULL locks an entire table at a time :-)

... and often bloats its indexes horribly. Use CLUSTER instead if you
need to chop a table that's massively bloated down to size; it'll be
much faster, and shouldn't leave the indexes in a mess.
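For the archives, the rebuild Craig describes looks like this -- the
table and index names are placeholders:

```sql
-- Rewrite the table in index order and rebuild all of its indexes.
-- "events" and "events_pkey" are placeholder names.
-- (8.4+ syntax; older releases use "CLUSTER events_pkey ON events".)
CLUSTER events USING events_pkey;
-- CLUSTER doesn't update planner statistics, so follow up with:
ANALYZE events;
```

Like VACUUM FULL, CLUSTER takes an exclusive lock on the table for the
duration, so schedule it accordingly.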

I increasingly wonder what the purpose of VACUUM FULL in its current
form is.

--
Craig Ringer



Re: [PERFORM] Manual vacs 5x faster than autovacs?

2009-11-13 Thread Scott Marlowe
On Fri, Nov 13, 2009 at 8:31 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 On 13/11/2009 2:29 PM, Dave Crooke wrote:

 Beware that VACUUM FULL locks an entire table at a time :-)

 ... and often bloats its indexes horribly. Use CLUSTER instead if you
 need to chop a table that's massively bloated down to size; it'll be
 much faster, and shouldn't leave the indexes in a mess.

 I increasingly wonder what the purpose of VACUUM FULL in its current
 form is.

There's been talk of removing it.  It's almost historical in nature
now, but there are apparently one or two situations, like when you're
almost out of disk space, that vacuum full can handle but that a dump
and reload, or cluster, can't manage without extra space.



Re: [PERFORM] Manual vacs 5x faster than autovacs?

2009-11-13 Thread Craig Ringer
On 14/11/2009 11:55 AM, Scott Marlowe wrote:
 On Fri, Nov 13, 2009 at 8:31 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 On 13/11/2009 2:29 PM, Dave Crooke wrote:

 Beware that VACUUM FULL locks an entire table at a time :-)

 ... and often bloats its indexes horribly. Use CLUSTER instead if you
 need to chop a table that's massively bloated down to size; it'll be
 much faster, and shouldn't leave the indexes in a mess.

 I increasingly wonder what the purpose of VACUUM FULL in its current
 form is.
 
 There's been talk of removing it.  It's almost historical in nature
 now, but there are apparently one or two situations, like when you're
 almost out of space, that vacuum full can handle that dumping reload
 or cluster or whatnot can't do without more extra space.

Perhaps it should drop and re-create indexes as well, then? (Or disable
them so they become inconsistent, then REINDEX them - same deal). It'd
run a LOT faster, and the index bloat issue would be gone.

The current form of the command just invites misuse and misapplication.

--
Craig Ringer



Re: [PERFORM] Manual vacs 5x faster than autovacs?

2009-11-13 Thread Scott Marlowe
On Fri, Nov 13, 2009 at 9:45 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 On 14/11/2009 11:55 AM, Scott Marlowe wrote:
 On Fri, Nov 13, 2009 at 8:31 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 On 13/11/2009 2:29 PM, Dave Crooke wrote:

 Beware that VACUUM FULL locks an entire table at a time :-)

 ... and often bloats its indexes horribly. Use CLUSTER instead if you
 need to chop a table that's massively bloated down to size; it'll be
 much faster, and shouldn't leave the indexes in a mess.

 I increasingly wonder what the purpose of VACUUM FULL in its current
 form is.

 There's been talk of removing it.  It's almost historical in nature
 now, but there are apparently one or two situations, like when you're
 almost out of space, that vacuum full can handle that dumping reload
 or cluster or whatnot can't do without more extra space.

 Perhaps it should drop and re-create indexes as well, then? (Or disable
 them so they become inconsistent, then REINDEX them - same deal). It'd
 run a LOT faster, and the index bloat issue would be gone.

 The current form of the command just invites misuse and misapplication.

Yeah, it should be a name that when you're typing it you know you
screwed up to get where you are.  The
opleasemayihavebackthespaceilostwhilelockingmytablesandbloatingmyindexes
command.  No chance you'll run it by mistake either!
