Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-28 Thread Greg Smith

Lars wrote:

Below is a quote from the Pliant datasheet:
No Write Cache:
Pliant EFDs deliver outstanding
write performance
without any dependence on
write cache and thus does
not use battery/supercap.
  


I liked the article The Register wrote about them, with the headline 
Pliant's SSDs are awesome, says Pliant.  Of course they do.  Check out 
the write benchmark figures in the information review at 
http://oliveraaltonen.com/2010/09/29/preliminary-benchmark-results-of-the-pliant-ssd-drives/ 
to see how badly performance suffers on their design from those 
decisions.  The Fusion I/O devices get nearly an order of magnitude more 
write IOPS in those tests.


As far as I've been able to tell, what Pliant does is just push writes 
out all the time without waiting for them to be aligned with block 
sizes, followed by cleaning up the wreckage later via their internal 
automatic maintenance ASICs (it's sort of an always on TRIM 
implementation if I'm guessing right).  That has significant limitations 
both in regards to total write speed as well as device longevity.  For a 
database, I'd much rather have a supercap and get ultimate write 
performance without those downsides.  Depends on the read/write ratio 
though; I could see a heavily read-biased system work well with their 
approach.  Of course, a heavily read-based system would be better served 
by having a ton of RAM instead in most cases.


Could be worst though--they could be misleading about the whole topic of 
write durability like Intel is.  I consider claiming high performance 
when you don't always really have it, what Pliant is doing here, to be a 
much lesser sin than losing data at random and not being clear about 
when that can happen.  I'd like FusionIO to put a big expect your 
server to be down for many minutes after a power interruption warning 
on their drives, too, while I'm wishing for complete vendor transparency 
here.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-20 Thread Andy Colson

On 1/19/2011 6:42 PM, Craig Ringer wrote:

On 01/19/2011 05:09 PM, Lars wrote:

Thanks for the reply!



As others have mentioned, how are you going to be doing your shards?

Hmm... shards might not have been a good word to describe it. I'll
paste what I wrote in another reply:
I used sharding as an expression for partitioning data into several
databases.


sharding or shards is pretty much the standard way that setup is
described. It doesn't come up on the Pg list a lot as most people doing
web-oriented horizontally scaled apps use MySQL or fashionable non-SQL
databases, but it's pretty well known in wider circles.

--
Craig Ringer




Or... PG is just so good we've never had to use more than one database 
server!  :-)


-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-20 Thread Scott Marlowe
On Thu, Jan 20, 2011 at 7:48 AM, Andy Colson a...@squeakycode.net wrote:

 Or... PG is just so good we've never had to use more than one database
 server!  :-)

Hehe, while you can do a lot with one server, there are some scenarios
where sharding is the answer.  I have a horror story about not
sharding when we should have I can tell you over a beer sometime.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-19 Thread Lars
 No idea what mysql thinks a shard is, but in PG we have read-only hot 
 standby's.
I used sharding as an expression for partitioning data into several databases.
Each user in the system is unaware of any other user. The user never accesses 
the private data of another user. Each user could in theory be assigned their 
own database server. This makes it easy to split the 4 users over a number 
of database servers. There are some shared data that is stored in a special 
shared database.

 The standby database is exactly the same as the master (save a bit of
 data that has not been synced yet.)  I assume you know this... but I'd
 really recommend trying out PG's hot-standby and make sure it works the
 way you need (because I bet its different than mysql's).

 Assuming the shared and the sharded databases are totally different
 (lets call them database a and c), with the PG setup you'd have database
 a on one computer, then one master with database b on it (where all
 writes go), then several hot-standby's mirroring database b (that
 support read-only queries).
As our data is easily partitioned into any number of servers we do not plan to 
use replication for load balancing. We do however plan to use it to set up a 
backup site.

  Its pretty hard to guess what your usage pattern is (70% read,
   small columns, no big blobs (like photos), etc)... and even then we'd
 still have to guess.
It's more like 40% read 60% write.

 Not only will I not compare apples to oranges, but I really wont compare
 apples in Canada to oranges in Japan. :-)
Hehe

/Lars

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-19 Thread Lars

 Are you going to RAID the SSD drives at all?
Yes, I was thinking four drives in RAID 10 and a (hot) spare drive...

 Of course this is based on my experience, and I have my fireproof suit since
 I mentioned the word fusionIO :)
Hehe

FusionIO has some impressive stats!
SSD in RAID10 provides redundancy in case of disc failure. How do you handle 
this with fusionIO? Two mirrored cards?

/Lars

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-19 Thread Lars
Thanks for the reply!

MyISAM was chosen back in 2000. I'm not aware of the reasoning behind this 
choice...

Dell claims both the Samsung and the Pliant are safe to use.
Below is a quote from the Pliant datasheet:
No Write Cache:
Pliant EFDs deliver outstanding
write performance
without any dependence on
write cache and thus does
not use battery/supercap.

 As others have mentioned, how are you going to be doing your shards?
Hmm... shards might not have been a good word to describe it. I'll paste what I 
wrote in another reply:
I used sharding as an expression for partitioning data into several databases.
Each user in the system is unaware of any other user. The user never accesses 
the private data of another user. Each user could in theory be assigned their 
own database server. This makes it easy to split the 4 users over a number 
of database servers. There are some shared data that is stored in a special 
shared database.

/Lars

-Ursprungligt meddelande-
Från: mark [mailto:dvlh...@gmail.com]
Skickat: den 19 januari 2011 05:10
Till: Lars
Kopia: pgsql-performance@postgresql.org
Ämne: RE: [PERFORM] Migrating to Postgresql and new hardware

Comments in line, take em for what you paid for em.



 -Original Message-
 From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
 ow...@postgresql.org] On Behalf Of Lars
 Sent: Tuesday, January 18, 2011 3:57 AM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] Migrating to Postgresql and new hardware

 Hi,

 We are in the process of moving a web based application from a MySql to
 Postgresql database.
 Our main reason for moving to Postgresql is problems with MySql
 (MyISAM) table locking.

I would never try and talk someone out of switching but MyISAM? What
version of MySQL and did you pick MyISAM for a good reason or just happened
to end up there?



 We will buy a new set of servers to run the Postgresql databases.

 The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410, 4GB
 RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare).

 One server is used for shared data.
 Four servers are used for sharded data. A user in the system only has
 data in one of the shards.
 There is another server to which all data is replicated but I'll leave
 that one out of this discussion.
 These are dedicated database servers. There are more or less no stored
 procedures. The shared database size is about 20GB and each shard
 database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect
 the size will grow 10%-15% this year. Server load might increase with
 15%-30% this year. This setup is disk I/O bound. The overwhelming
 majority of sql statements are fast (typically single row selects,
 updates, inserts and deletes on primary key) but there are some slow
 long running (10min) queries.

 As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb
 RAM, H700 512MB NV Cache.

One would think you should notice a nice speed improvement, ceteris paribus,
since the X5650 will have -significantly- more memory bandwidth than the
5410s you are used to, and you are going to have a heck of a lot more ram
for things to cache in. I think the H700 is a step up in raid cards as well
but with only 4 disks your probably not maxing out there.



 Dell has offered two alternative SSDs:
 Samsung model SS805 (100GB Solid State Disk SATA 2.5).
 (http://www.plianttechnology.com/lightning_lb.php)
 Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5).
 (http://www.samsung.com/global/business/semiconductor/products/SSD/Prod
 ucts_Enterprise_SSD.html)

The Samsung ones seems to indicate that they have protection in the event of
a power failure, and the pliant does not mention it.

Granted I haven't done or seen any pull the plug under max load tests on
either family, so I got nothing beyond that it is the first thing I have
looked at with every SSD that crosses my path.




 Both are SLC drives. The price of the Pliant is about 2,3 times the
 price of the Samsung (does it have twice the performance?).

 One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung
 drives (4 in RAID 10 + 1 spare).
 Another alternative would be 3 servers (1 shared and 2 shards) with 5
 Pliant drives (4 in RAID 10 + 1 spare). This would be slightly more
 expensive than the first alternative but would be easier to upgrade
 with two new shard servers when it's needed.

As others have mentioned, how are you going to be doing your shards?




 Anyone have experience using the Samsung or the Pliant SSD? Any
 information about degraded performance over time?
 Any comments on the setups? How would an alternative with 15K disks (6
 RAID 10 + 1 spare, or even 10 RAID10 + 1 spare) compare?


You still may find that breaking xlog out to its own logical drive (2 drives
in raid 1) gives a speed improvement to the overall. YMMV - so tinker and
find out before you go deploying.

 How would these alternatives compare in I/O performance

[PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Lars
Hi,

We are in the process of moving a web based application from a MySql to 
Postgresql database.
Our main reason for moving to Postgresql is problems with MySql (MyISAM) table 
locking.
We will buy a new set of servers to run the Postgresql databases.

The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410, 4GB RAM. 
PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare).

One server is used for shared data.
Four servers are used for sharded data. A user in the system only has data in 
one of the shards.
There is another server to which all data is replicated but I'll leave that one 
out of this discussion.
These are dedicated database servers. There are more or less no stored 
procedures. The shared database size is about 20GB and each shard database is 
about 40GB (total of 20 + 40 * 4 = 180GB). I would expect the size will grow 
10%-15% this year. Server load might increase with 15%-30% this year. This 
setup is disk I/O bound. The overwhelming majority of sql statements are fast 
(typically single row selects, updates, inserts and deletes on primary key) but 
there are some slow long running (10min) queries.

As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb RAM, H700 
512MB NV Cache.
Dell has offered two alternative SSDs:
Samsung model SS805 (100GB Solid State Disk SATA 2.5).
(http://www.plianttechnology.com/lightning_lb.php)
Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5). 
(http://www.samsung.com/global/business/semiconductor/products/SSD/Products_Enterprise_SSD.html)

Both are SLC drives. The price of the Pliant is about 2,3 times the price of 
the Samsung (does it have twice the performance?).

One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung drives (4 
in RAID 10 + 1 spare).
Another alternative would be 3 servers (1 shared and 2 shards) with 5 Pliant 
drives (4 in RAID 10 + 1 spare). This would be slightly more expensive than the 
first alternative but would be easier to upgrade with two new shard servers 
when it's needed.

Anyone have experience using the Samsung or the Pliant SSD? Any information 
about degraded performance over time?
Any comments on the setups? How would an alternative with 15K disks (6 RAID 10 
+ 1 spare, or even 10 RAID10 + 1 spare) compare?
How would these alternatives compare in I/O performance compared to the old 
setup?
Anyone care to guess how the two alternatives would compare in performance 
running Postgresql?
How would the hardware usage of Postgresql compare to MySqls?



Regards
/Lars

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Andy Colson

On 1/18/2011 4:56 AM, Lars wrote:

Hi,

We are in the process of moving a web based application from a MySql
to Postgresql database. Our main reason for moving to Postgresql is
problems with MySql (MyISAM) table locking. We will buy a new set of
servers to run the Postgresql databases.

The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410,
4GB RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1
spare).

One server is used for shared data. Four servers are used for sharded
data. A user in the system only has data in one of the shards. There
is another server to which all data is replicated but I'll leave that
one out of this discussion. These are dedicated database servers.
There are more or less no stored procedures. The shared database size
is about 20GB and each shard database is about 40GB (total of 20 + 40
* 4 = 180GB). I would expect the size will grow 10%-15% this year.
Server load might increase with 15%-30% this year. This setup is disk
I/O bound. The overwhelming majority of sql statements are fast
(typically single row selects, updates, inserts and deletes on
primary key) but there are some slow long running (10min) queries.



No idea what mysql thinks a shard is, but in PG we have read-only hot
standby's.

The standby database is exactly the same as the master (save a bit of 
data that has not been synced yet.)  I assume you know this... but I'd 
really recommend trying out PG's hot-standby and make sure it works the 
way you need (because I bet its different than mysql's).


Assuming the shared and the sharded databases are totally different 
(lets call them database a and c), with the PG setup you'd have database 
a on one computer, then one master with database b on it (where all 
writes go), then several hot-standby's mirroring database b (that 
support read-only queries).


As for the hardware, you'd better test it.  Got any old servers you 
could put a real-world workload on?  Or just buy one new server for 
testing?  Its pretty hard to guess what your usage pattern is (70% read, 
 small columns, no big blobs (like photos), etc)... and even then we'd 
still have to guess.


I can tell you, however, having your readers and writers not block each 
other is really nice.


Not only will I not compare apples to oranges, but I really wont compare 
apples in Canada to oranges in Japan. :-)


-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Andy Colson

oops, call them database 'a' and database 'b'.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Strange, John W
Are you going to RAID the SSD drives at all?  You would likely be better off 
with a couple of things.

- increasing ram to 256GB on each server to cache most of the databases. (easy, 
and cheaper than SSD)
- move to fusionIO
- move to SLC based SSD, warning not many raid controllers will get the 
performance out of the SSD's at this time.

Of the three I would suggest #1, and #2, the cost of a SLC SSD raid will cost 
more than the fusionIO drive and still not match the fusionIO drive performance.

Of course this is based on my experience, and I have my fireproof suit since I 
mentioned the word fusionIO :)

- John

-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Lars
Sent: Tuesday, January 18, 2011 4:57 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Migrating to Postgresql and new hardware

Hi,

We are in the process of moving a web based application from a MySql to 
Postgresql database.
Our main reason for moving to Postgresql is problems with MySql (MyISAM) table 
locking.
We will buy a new set of servers to run the Postgresql databases.

The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410, 4GB RAM. 
PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare).

One server is used for shared data.
Four servers are used for sharded data. A user in the system only has data in 
one of the shards.
There is another server to which all data is replicated but I'll leave that one 
out of this discussion.
These are dedicated database servers. There are more or less no stored 
procedures. The shared database size is about 20GB and each shard database is 
about 40GB (total of 20 + 40 * 4 = 180GB). I would expect the size will grow 
10%-15% this year. Server load might increase with 15%-30% this year. This 
setup is disk I/O bound. The overwhelming majority of sql statements are fast 
(typically single row selects, updates, inserts and deletes on primary key) but 
there are some slow long running (10min) queries.

As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb RAM, H700 
512MB NV Cache.
Dell has offered two alternative SSDs:
Samsung model SS805 (100GB Solid State Disk SATA 2.5).
(http://www.plianttechnology.com/lightning_lb.php)
Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5). 
(http://www.samsung.com/global/business/semiconductor/products/SSD/Products_Enterprise_SSD.html)

Both are SLC drives. The price of the Pliant is about 2,3 times the price of 
the Samsung (does it have twice the performance?).

One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung drives (4 
in RAID 10 + 1 spare).
Another alternative would be 3 servers (1 shared and 2 shards) with 5 Pliant 
drives (4 in RAID 10 + 1 spare). This would be slightly more expensive than the 
first alternative but would be easier to upgrade with two new shard servers 
when it's needed.

Anyone have experience using the Samsung or the Pliant SSD? Any information 
about degraded performance over time?
Any comments on the setups? How would an alternative with 15K disks (6 RAID 10 
+ 1 spare, or even 10 RAID10 + 1 spare) compare?
How would these alternatives compare in I/O performance compared to the old 
setup?
Anyone care to guess how the two alternatives would compare in performance 
running Postgresql?
How would the hardware usage of Postgresql compare to MySqls?



Regards
/Lars

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
This communication is for informational purposes only. It is not
intended as an offer or solicitation for the purchase or sale of
any financial instrument or as an official confirmation of any
transaction. All market prices, data and other information are not
warranted as to completeness or accuracy and are subject to change
without notice. Any comments or statements made herein do not
necessarily reflect those of JPMorgan Chase  Co., its subsidiaries
and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase 
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether

Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Mark Felder
On Tue, 18 Jan 2011 16:06:17 -0600, Strange, John W  
john.w.stra...@jpmchase.com wrote:


Of course this is based on my experience, and I have my fireproof suit  
since I mentioned the word fusionIO :)


I'll throw a fire blanket up as well. We have a customer who has been  
running Fusion IO with Postgres for about 2 years. They get amazing  
performance, but also aren't running fsync. They haven't had corruption  
with OS crashes (they're very abusive to their CentOS install), but did  
with a power outage (a UPS of ours went up in smoke; they weren't paying  
for N+1 power). Now their data is mostly static; they run analytics once a  
day and if they have a problem they can reload yesterday's data and run  
the analytics again to get back up to speed. If this is the type of stuff  
you're doing and you can easily get your data back to a sane state by all  
means give FusionIO a whirl!


This customer did discuss this with me in length last time they stopped in  
and also pointed out that FusionIO was announced as being a major part of  
some trading company or bank firm's database performance junk. I don't  
know the details, but I think he said they were out of Chicago. If anyone  
knows what I'm talking about please share the link. Either way, it seems  
that people are actually doing money transactions on FusionIO, so you can  
either take that as comforting reassurance or you can start getting really  
nervous about the stock market :-)



Regards,


Mark


PS, don't turn off fsync unless you know what you're doing.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread mark
Comments in line, take em for what you paid for em.



 -Original Message-
 From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
 ow...@postgresql.org] On Behalf Of Lars
 Sent: Tuesday, January 18, 2011 3:57 AM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] Migrating to Postgresql and new hardware
 
 Hi,
 
 We are in the process of moving a web based application from a MySql to
 Postgresql database.
 Our main reason for moving to Postgresql is problems with MySql
 (MyISAM) table locking.

I would never try and talk someone out of switching but MyISAM? What
version of MySQL and did you pick MyISAM for a good reason or just happened
to end up there?



 We will buy a new set of servers to run the Postgresql databases.
 
 The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410, 4GB
 RAM. PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare).
 
 One server is used for shared data.
 Four servers are used for sharded data. A user in the system only has
 data in one of the shards.
 There is another server to which all data is replicated but I'll leave
 that one out of this discussion.
 These are dedicated database servers. There are more or less no stored
 procedures. The shared database size is about 20GB and each shard
 database is about 40GB (total of 20 + 40 * 4 = 180GB). I would expect
 the size will grow 10%-15% this year. Server load might increase with
 15%-30% this year. This setup is disk I/O bound. The overwhelming
 majority of sql statements are fast (typically single row selects,
 updates, inserts and deletes on primary key) but there are some slow
 long running (10min) queries.
 
 As new server we are thinking of PowerEdge R510, 1 * Xeon X5650, 24Gb
 RAM, H700 512MB NV Cache.

One would think you should notice a nice speed improvement, ceteris paribus,
since the X5650 will have -significantly- more memory bandwidth than the
5410s you are used to, and you are going to have a heck of a lot more ram
for things to cache in. I think the H700 is a step up in raid cards as well
but with only 4 disks your probably not maxing out there.  



 Dell has offered two alternative SSDs:
 Samsung model SS805 (100GB Solid State Disk SATA 2.5).
 (http://www.plianttechnology.com/lightning_lb.php)
 Pliant model LB 150S (149GB Solid State Drive SAS 3Gbps 2.5).
 (http://www.samsung.com/global/business/semiconductor/products/SSD/Prod
 ucts_Enterprise_SSD.html)

The Samsung ones seems to indicate that they have protection in the event of
a power failure, and the pliant does not mention it. 

Granted I haven't done or seen any pull the plug under max load tests on
either family, so I got nothing beyond that it is the first thing I have
looked at with every SSD that crosses my path.



 
 Both are SLC drives. The price of the Pliant is about 2,3 times the
 price of the Samsung (does it have twice the performance?).
 
 One alternative is 5 servers (1 shared and 4 shards) with 5 Samsung
 drives (4 in RAID 10 + 1 spare).
 Another alternative would be 3 servers (1 shared and 2 shards) with 5
 Pliant drives (4 in RAID 10 + 1 spare). This would be slightly more
 expensive than the first alternative but would be easier to upgrade
 with two new shard servers when it's needed.

As others have mentioned, how are you going to be doing your shards?



 
 Anyone have experience using the Samsung or the Pliant SSD? Any
 information about degraded performance over time?
 Any comments on the setups? How would an alternative with 15K disks (6
 RAID 10 + 1 spare, or even 10 RAID10 + 1 spare) compare?


You still may find that breaking xlog out to its own logical drive (2 drives
in raid 1) gives a speed improvement to the overall. YMMV - so tinker and
find out before you go deploying. 

 How would these alternatives compare in I/O performance compared to the
 old setup?
 Anyone care to guess how the two alternatives would compare in
 performance running Postgresql?
 How would the hardware usage of Postgresql compare to MySqls?


I won't hazard a guess on the performance difference between PG w/ Fsync ON
and MySQL running with MyISAM. 

If you can get your OS and PG tuned you should be able to have a database
that can have pretty decent throughput for an OLTP workload. Since that
seems to be the majority of your intended workload. 


-Mark

 
 
 
 Regards
 /Lars
 
 --
 Sent via pgsql-performance mailing list (pgsql-
 performa...@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Craig Ringer
On 18/01/11 18:56, Lars wrote:
 Hi,
 
 We are in the process of moving a web based application from a MySql to 
 Postgresql database.
 Our main reason for moving to Postgresql is problems with MySql (MyISAM) 
 table locking.
 We will buy a new set of servers to run the Postgresql databases.

Most people seem to simply move over to InnoDB when facing these issues,
saving themselves LOTS of pain over MyISAM while minimizing transition
costs. I assume you've rejected that, but I'm interested in why.

 The current setup is five Dell PowerEdge 2950 with 2 *  XEON E5410, 4GB RAM. 
 PERC 5/I 256MB NV Cache, 4 * 10K Disks (3 in RAID 5 + 1 spare).
 
 One server is used for shared data.
 Four servers are used for sharded data. A user in the system only has data in 
 one of the shards.
 There is another server to which all data is replicated but I'll leave that 
 one out of this discussion.

Don't, if you want to have a similar thing going in your Pg deployment
later. Replication in Pg remains ... interesting. An n-to-m (or n-to-1)
replication setup can't be achieved with the built-in replication in
9.0; you need to use things like Slony-I, Bucardo, etc each of which
have their own limitations and quirks.

 These are dedicated database servers. There are more or less no stored 
 procedures. The shared database size is about 20GB and each shard database is 
 about 40GB (total of 20 + 40 * 4 = 180GB). I would expect the size will grow 
 10%-15% this year. Server load might increase with 15%-30% this year. This 
 setup is disk I/O bound. The overwhelming majority of sql statements are fast 
 (typically single row selects, updates, inserts and deletes on primary key) 
 but there are some slow long running (10min) queries.

Since you're sharding (and thus clearly don't need strong cluster-wide
ACID) have you considered looking into relaxed semi-ACID / eventually
consistent database systems? If you're doing lots of simple queries and
few of the kind of heavy lifting reporting queries RDBMSs are great for,
it may be worth considering.

If your app uses a data acesss layer, it should be pretty easy to
prototype implementations on other databases and try them out.

Even if you do go for PostgreSQL, if you're not using memcached yet
you're wasting money and effort. You might get lots more life out of
your hardware with a bit of memcached love.

-- 
System  Network Administrator
POST Newspapers

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Migrating to Postgresql and new hardware

2011-01-18 Thread Scott Marlowe
On Tue, Jan 18, 2011 at 3:56 AM, Lars l...@unifaun.com wrote:

 Any comments on the setups? How would an alternative with 15K disks (6 RAID 
 10 + 1 spare, or even 10 RAID10 + 1 spare) compare?

RAID-10 is going to trounce RAID-5 for writes, which is where you
usually have the most issues.

 How would these alternatives compare in I/O performance compared to the old 
 setup?

Only testing can tell, but I've seen 4 SATA drives in RAID-10 with no
caching or fancy controller beat a 4 disk RAID-5 with BBU controller
more than once.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance