Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-27 Thread Terry Schmitt
Hi Chris,

A couple of comments on the NetApp SAN.
We use NetApp, primarily with Fibre Channel connectivity and FC drives. All of the
Postgres files are located on the SAN and this configuration works well.
We have tried iSCSI, but performance is horrible. Same with SATA drives.
The SAN will definitely be more costly than local drives. It really depends
on what your needs are.
The biggest benefit for me in using a SAN is the special features that
it offers. We use snapshots and flex clones, which are a great way to back up
and clone large databases.

Cheers,
Terry


On Thu, Jul 14, 2011 at 11:34 PM, chris chri...@gmx.net wrote:

 Hi list,

 A NetApp FAS 3040 SAN [1] will be donated to my employer, and we want to
 run our warehouse DB on it. The pg9.0 DB currently comprises ~1.5TB of
 tables, 200GB of indexes, and grows ~5%/month. The DB is not
 update-critical, but frequently sees large read and insert operations.

 My employer is a university with little funds and we have to find a
 cheap way to scale for the next 3 years, so the SAN seems a good chance
 to us. We are now looking for the remaining server parts to maximize DB
 performance on a budget of $4000. I dug up the following
 configuration with the discount we receive from Dell:

  1 x Intel Xeon X5670, 6C, 2.93GHz, 12M Cache
  16 GB (4x4GB) Low Volt DDR3 1066MHz
  PERC H700 SAS RAID controller
  4 x 300 GB 10k SAS 6Gbps 2.5" in RAID 10

 I was thinking to put the WAL and the indexes on the local disks, and
 the rest on the SAN. If funds allow, we might downgrade the disks to
 SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible).

 Any comments on the configuration? Any experiences with iSCSI vs. Fibre
 Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a
 cheap alternative how to connect as many as 16 x 2TB disks as DAS?

 Thanks so much!

 Best,
 Chris

 [1]: http://www.b2net.co.uk/netapp/fas3000.pdf


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread Greg Smith

chris wrote:

My employer is a university with little funds and we have to find a
cheap way to scale for the next 3 years, so the SAN seems a good chance
to us.


A SAN is rarely ever the cheapest way to scale anything; you're paying 
extra for reliability instead.




I was thinking to put the WAL and the indexes on the local disks, and
the rest on the SAN. If funds allow, we might downgrade the disks to
SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible).
  


If you want to keep the bulk of the data on the SAN, this is a 
reasonable way to go, performance-wise.  But be aware that losing the 
WAL means your database is likely corrupted.  That means that much of 
the reliability benefit of the SAN is lost in this configuration.




Any experiences with iSCSI vs. Fibre
Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a
cheap alternative how to connect as many as 16 x 2TB disks as DAS?
  


I've never heard anyone recommend iSCSI if you care at all about
performance, while FC works fine for this sort of job.  The physical
dimensions of 3.5" drives make getting 16 of them in one reasonably
sized enclosure normally just out of reach.  But a Dell PowerVault
MD1000 will give you 15 x 2TB as inexpensively as possible in a single
3U space (well, as cheaply as you want to go--you might build your own
giant box cheaper but I wouldn't recommend it).  I've tested MD1000,
MD1200, and MD1220 arrays before, and always gotten seriously good
performance relative to the dollars spent with that series.  The only
one of these Dell storage arrays I've heard disappointing results from
(two reports so far, though I have not tested it directly yet) is the
MD3220.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD





Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread jesper
   1 x Intel Xeon X5670, 6C, 2.93GHz, 12M Cache
   16 GB (4x4GB) Low Volt DDR3 1066MHz
   PERC H700 SAS RAID controller
   4 x 300 GB 10k SAS 6Gbps 2.5" in RAID 10

Apart from Greg's excellent recommendations, I would strongly suggest
more memory. 16GB in 2011 is really on the low side.

PG uses memory (either shared_buffers or OS cache) to keep
frequently accessed data in. Good recommendations are hard
without knowledge of the data and access patterns, but 64, 128 and 256GB
systems are quite common when you have data that can't all be
in memory at once.
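
To put rough numbers on "can't all be in memory at once", here is a small
sketch that projects the database size from the figures in Chris's original
post (~1.5TB of tables, 200GB of indexes, ~5%/month growth). It assumes the
growth compounds monthly and treats 1TB as 1000GB; the function name is made
up for this example.

```python
def projected_size_gb(start_gb: float, monthly_growth: float, months: int) -> float:
    """Size after `months` of compound growth at `monthly_growth` per month."""
    return start_gb * (1.0 + monthly_growth) ** months


# Figures from the original post: ~1.5 TB of tables plus 200 GB of indexes.
# Assumption: 1 TB = 1000 GB and the 5% growth compounds every month.
start_gb = 1500 + 200

for months in (12, 24, 36):
    size_gb = projected_size_gb(start_gb, 0.05, months)
    print(f"after {months} months: {size_gb / 1000:.1f} TB")
```

Under those assumptions the total lands somewhere near 10TB after three
years, so even a 256GB box would only ever cache a small fraction of it.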

SANs are nice, but I think you can buy a good DAS unit each year
for just the support cost of a NetApp, though you might have gotten a
really good deal there too. You are getting a huge number of
advanced configuration features and potential ways of sharing and..
and .. just see the specs.

.. and if you need those the SAN is a good way to go, but
they do come with a huge price tag.

Jesper




Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread Robert Schnabel


On 7/15/2011 2:10 AM, Greg Smith wrote:

chris wrote:

My employer is a university with little funds and we have to find a
cheap way to scale for the next 3 years, so the SAN seems a good chance
to us.

A SAN is rarely ever the cheapest way to scale anything; you're paying
extra for reliability instead.



I was thinking to put the WAL and the indexes on the local disks, and
the rest on the SAN. If funds allow, we might downgrade the disks to
SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible).


If you want to keep the bulk of the data on the SAN, this is a
reasonable way to go, performance-wise.  But be aware that losing the
WAL means your database is likely corrupted.  That means that much of
the reliability benefit of the SAN is lost in this configuration.



Any experiences with iSCSI vs. Fibre
Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a
cheap alternative how to connect as many as 16 x 2TB disks as DAS?


I've never heard anyone recommend iSCSI if you care at all about
performance, while FC works fine for this sort of job.  The physical
dimensions of 3.5" drives make getting 16 of them in one reasonably
sized enclosure normally just out of reach.  But a Dell PowerVault
MD1000 will give you 15 x 2TB as inexpensively as possible in a single
3U space (well, as cheaply as you want to go--you might build your own
giant box cheaper but I wouldn't recommend it).


I'm curious what people think of these:
http://www.pc-pitstop.com/sas_cables_enclosures/scsase166g.asp

I currently have my database on two of these and for my purpose they 
seem to be fine and are quite a bit less expensive than the Dell 
MD1000.  I actually have three more of the 3G versions with expanders 
for mass storage arrays (RAID0) and haven't had any issues with them in 
the three years I've had them.


Bob






Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread Scott Marlowe
On Fri, Jul 15, 2011 at 12:34 AM, chris chri...@gmx.net wrote:
 I was thinking to put the WAL and the indexes on the local disks, and
 the rest on the SAN. If funds allow, we might downgrade the disks to
 SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible).

Just to add to the conversation, there's no real advantage to putting
WAL on SSD.  Indexes can benefit from SSDs, but WAL is mostly
sequential throughput and for that a pair of SATA 1TB drives at
7200RPM works just fine for most folks.  For example, in one big server
we're running we have 24 drives in a RAID-10 for the /data/base dir
with 4 drives in a RAID-10 for pg_xlog, and those 4 drives tend to
have the same io util % under iostat as the 24 drives under normal
usage.  It takes a special kind of load (lots of inserts happening in
large transactions quickly) for the 4 drive RAID-10 to ever go above
50% util.
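
For anyone who wants to compare two arrays the same way, here is a minimal
sketch of how a %util figure can be derived. It assumes a Linux box, where
iostat bases %util on the per-device "time spent doing I/Os" counter (in
milliseconds) from /proc/diskstats. The function name and the sample counter
values below are made up for illustration, they are not measurements from
the server described above.

```python
def util_percent(busy_ms_before: int, busy_ms_after: int, elapsed_ms: int) -> float:
    """Share of wall-clock time a device was busy between two samples.

    busy_ms_* are readings of a cumulative "time spent doing I/Os" counter,
    elapsed_ms is the wall-clock time between the two readings.
    Counter wrap-around is not handled in this sketch.
    """
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    busy_ms = busy_ms_after - busy_ms_before
    return min(100.0, 100.0 * busy_ms / elapsed_ms)


# Hypothetical counter readings taken 10 seconds apart.
data_util = util_percent(1_200_000, 1_204_500, 10_000)
xlog_util = util_percent(800_000, 804_300, 10_000)
print(f"data array: {data_util:.0f}%  pg_xlog array: {xlog_util:.0f}%")
```

If the pg_xlog array sits at about the same utilization as the data array,
as in the made-up numbers here, faster WAL storage is unlikely to buy much.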



Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread Scott Marlowe
On Fri, Jul 15, 2011 at 10:39 AM, Robert Schnabel
schnab...@missouri.edu wrote:
 I'm curious what people think of these:
 http://www.pc-pitstop.com/sas_cables_enclosures/scsase166g.asp

 I currently have my database on two of these and for my purpose they seem to
 be fine and are quite a bit less expensive than the Dell MD1000.  I actually
 have three more of the 3G versions with expanders for mass storage arrays
 (RAID0) and haven't had any issues with them in the three years I've had
 them.

I have a co-worker who's familiar with them and they seem a lot like
the 16 drive units we use from Aberdeen, which fully outfitted with
15k SAS drives run $5k to $8k depending on the drives etc.



Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread Josh Berkus

 Just to add to the conversation, there's no real advantage to putting
 WAL on SSD.  Indexes can benefit from them, but WAL is mostly
 sequential throughput and for that a pair of SATA 1TB drives at
 7200RPM work just fine for most folks.

Actually, there's a strong disadvantage to putting WAL on SSD.  SSD is
very prone to fragmentation if you're doing a lot of deleting and
replacing files.  I've implemented data warehouses where the database
was on SSD but WAL was still on HDD.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread chris r.
Hi list,

Thanks a lot for your very helpful feedback!

 I've tested MD1000, MD1200, and MD1220 arrays before, and always gotten
 seriously good performance relative to the dollars spent
Great hint, but I'm afraid that's too expensive for us. But it's a great
way to scale over the years, I'll keep that in mind.

I had a look at other server vendors who offer 4U servers with slots for
16 disks for 4k in total (w/o disks), maybe that's an even
cheaper/better solution for us. If you had the choice between 16 x 2TB
SATA vs. a server with some SSDs for WAL/indexes and a SAN (with SATA
disk) for data, what would you choose performance-wise?

Again, thanks so much for your help.

Best,
Chris



Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread Rob Wultsch
On Fri, Jul 15, 2011 at 11:49 AM, chris r. chri...@gmx.net wrote:
 Hi list,

 Thanks a lot for your very helpful feedback!

 I've tested MD1000, MD1200, and MD1220 arrays before, and always gotten
 seriously good performance relative to the dollars spent
 Great hint, but I'm afraid that's too expensive for us. But it's a great
 way to scale over the years, I'll keep that in mind.

 I had a look at other server vendors who offer 4U servers with slots for
 16 disks for 4k in total (w/o disks), maybe that's an even
 cheaper/better solution for us. If you had the choice between 16 x 2TB
 SATA vs. a server with some SSDs for WAL/indexes and a SAN (with SATA
 disk) for data, what would you choose performance-wise?

 Again, thanks so much for your help.

 Best,
 Chris

SATA drives can easily flip bits and Postgres does not checksum data,
so it will not automatically detect corruption for you. I would steer
well clear of SATA unless you are going to be using a filesystem like
ZFS which checksums data. I would hope that a SAN would detect this for
you, but I have no idea.
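
To illustrate what checksumming buys you, here is a toy sketch. It is not
how ZFS or any SAN actually implements it: it just stores a CRC32 next to an
8kB block, flips one bit to simulate silent corruption, and shows that
re-computing the checksum on read exposes the damage. The block contents and
the helper name are made up for the example.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with a single bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


# Made-up 8192-byte block standing in for one database page.
page = bytes(range(256)) * 32
stored_checksum = zlib.crc32(page)  # written alongside the data

# Simulate a drive silently flipping one bit.
damaged = flip_bit(page, 12345)

print("clean read verifies:  ", zlib.crc32(page) == stored_checksum)
print("damaged read verifies:", zlib.crc32(damaged) == stored_checksum)
```

Without the stored checksum, both reads would look equally valid to the
application, which is the problem described above.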


-- 
Rob Wultsch
wult...@gmail.com



Re: [PERFORM] Hardware advice for scalable warehouse db

2011-07-15 Thread Josh Berkus
On 7/14/11 11:34 PM, chris wrote:
 Any comments on the configuration? Any experiences with iSCSI vs. Fibre
 Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a
 cheap alternative how to connect as many as 16 x 2TB disks as DAS?

Here's the problem with iSCSI: on Gigabit Ethernet, your maximum
possible throughput is about 100MB/s, which means that your likely maximum
database throughput (for a seq scan or vacuum, for example) is 30MB/s.
That's about a third of what you can get with good internal RAID.

While multichannel iSCSI is possible, it's hard to configure, and
doesn't really allow you to spread a *single* request across multiple
channels.  So: go with Fibre Channel if you're using a SAN.

iSCSI also has horrible lag times, but you don't care about that so much
for DW.
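
As a back-of-the-envelope sketch of what that means for this warehouse: the
code below estimates how long one sequential pass over the ~1.5TB of tables
from Chris's post would take at 30MB/s, and at roughly three times that
(about 90MB/s, which is only inferred from the "a third" ratio above, not a
measured figure). It assumes 1TB = 1,000,000MB and ignores caching and
parallelism; the function name is made up.

```python
def scan_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to read `size_gb` once at a sustained `throughput_mb_s`."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput_mb_s must be positive")
    seconds = size_gb * 1000 / throughput_mb_s
    return seconds / 3600


table_size_gb = 1500  # ~1.5 TB of tables, from the original post

for label, mb_s in (("iSCSI over GigE", 30), ("internal RAID (inferred)", 90)):
    print(f"{label}: {scan_hours(table_size_gb, mb_s):.1f} hours per full scan")
```

Under these assumptions a full pass takes on the order of 14 hours at
30MB/s versus under 5 hours at 90MB/s.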

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
