Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-09-22 Thread Andrew Hammond
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Rod Taylor wrote:
| I've used both a NetApp and Hitachi based SANs with PostgreSQL. Both
| work as well as expected, but do require some tweeking as they normally
| are not optimized for the datablock size that PostgreSQL likes to deal
| with (8k by default) -- this can make as much as a 50% difference in
| performance levels.
I'm looking for documentation about the datablock size you mentioned above.
My goal is to tune the disk / filesystem on our prototype system. It's
an EMC disk array, so sectors on disk are 512 bytes of usable space.
We've decided to go with RAID 10 since the goal is to maximize
performance. Currently the raid element size is set at 16 sectors which
is 8192 bytes of payload. I've got a sysadmin working on getting XFS
going with 8192 byte blocks. My next task will be to calculate the
amount of space used by XFS for headers etc. to find out how much of
those 8192 bytes can be used for the postgres payload. Then configure
postgres to use datablocks that size. So I'm looking for details on how
to manipulate the size of the datablock.
I'm also not entirely sure how to make the datablocks line up with the
filesystem blocks. Any suggestions on this would be greatly appreciated.
- --
Andrew Hammond416-673-4138[EMAIL PROTECTED]
Database Administrator, Afilias Canada Corp.
CB83 2838 4B67 D40F D086 3568 81FC E7E5 27AF 4A9A
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
iD8DBQFBUeHmgfzn5SevSpoRAu2sAJ4nHHup5lhp4+RcgBPGoJpUFoE1SQCgyvW1
ixyAvqb7ZkB+IIdGb36mpxI=
=uDLW
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-09-22 Thread Rod Taylor
 Rod Taylor wrote:
 | I've used both a NetApp and Hitachi based SANs with PostgreSQL. Both
 | work as well as expected, but do require some tweeking as they normally
 | are not optimized for the datablock size that PostgreSQL likes to deal
 | with (8k by default) -- this can make as much as a 50% difference in
 | performance levels.

 I'm also not entirely sure how to make the datablocks line up with the
 filesystem blocks. Any suggestions on this would be greatly appreciated.

We just played with Veritas settings while running pg_bench on a 200GB
database. I no longer have access to the NetApp, but the settings for
the Hitachi are below.

In tunefstab we have:

read_pref_io=8192,read_nstream=4,write_pref_io=8192,write_nstream=2

In fstab it's:
defaults,mincache=tmpcache,noatime


If you have better settings, please shoot them over so we can try them
out. Perhaps even get someone over there to write a new SAN section in
the Tuning Chapter.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-09-22 Thread Greg Stark

Andrew Hammond [EMAIL PROTECTED] writes:

 My goal is to tune the disk / filesystem on our prototype system. It's
 an EMC disk array, so sectors on disk are 512 bytes of usable space.
 We've decided to go with RAID 10 since the goal is to maximize
 performance. Currently the raid element size is set at 16 sectors which
 is 8192 bytes of payload. 

Do people find it works well to have a stripe size that small? It seems like
it would be better to have it be at least a few filesystem/postgres blocks so
that subsequent reads stand a chance of being sequential and not causing
another spindle to have to seek. Does this depend on whether it's an DSS load
vs an OLTP load? If it's a single query at a time DSS system perhaps small
blocksizes work best to get maximum throughput?

 I've got a sysadmin working on getting XFS going with 8192 byte blocks.

Having your filesystem block size match postgres's block size is probably a
good idea. So 8k blocks is good.

 My next task will be to calculate the amount of space used by XFS for
 headers etc. to find out how much of those 8192 bytes can be used for the
 postgres payload.

No filesystem that I know of uses up space in every block. The overhead is all
stored elsewhere in blocks exclusively contain such overhead data. So just
setting postgres to 8k which the default would work well.

 Then configure postgres to use datablocks that size. So I'm looking for
 details on how to manipulate the size of the datablock.

Look in pg_config_manual.h in src/include. Postgres has to be recompiled to
change it and the database has to be reinitialized. But it could be set to 16k
or 32k. In which case you would probably want to adjust your filesystem to
match. But unless you do experiments you won't know if it would be of any
benefit to change.

 I'm also not entirely sure how to make the datablocks line up with the
 filesystem blocks. Any suggestions on this would be greatly appreciated.

They just will. The files start on a block boundary, so every 8k is a new
block. Postgres stores 8k at a time always.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread bsimon

Hi all,

I've been searching the list for a while but couldn't find any up-to-date information relating to my problem.
We have a production server with postgresql on cygwin that currently deels with about 200 Gigs of data (1 big IDE drive). We plan to move to linux for some reasons I don't have to explain.
Our aim is also to be able to increase our storage capacity up to approximately 1 or 2 terabytes and to speed up our production process. As we are a small microsoft addicted company , we have some difficulties to choose the best configuration that would best meet our needs.
Our production process is based on transaction (mostly huge inserts) and disk access is the main bottlle-neck.

Our main concern is hardware related :

Would NAS or SAN be good solutions ? (I've read that NAS uses NFS which could slow down the transfer rate ??)
Has anyone ever tried one of these with postgresql ? 

I would appreciate any comments.
Thanks in advance.

Benjamin.


Benjamin Simon - Ingénieur Développement Cartographie
http://www.loxane.com
tel : 01 30 40 24 00
Fax : 01 30 40 24 04

LOXANE 
271, Chaussée Jules César 95250 Beauchamp
France

Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread Scott Marlowe
On Tue, 2004-07-20 at 01:52, [EMAIL PROTECTED] wrote:
 Hi all,
 
 I've been searching the list for a while but couldn't find any
 up-to-date information relating to my problem.
 We have a production server with postgresql on cygwin that currently
 deels with about 200 Gigs of data (1 big IDE drive). We plan to move
 to linux for some reasons I don't have to explain.
 Our aim is also to be able to increase our storage capacity up to
 approximately 1 or 2 terabytes and to speed up our production process.
 As we are a small microsoft addicted company , we have some
 difficulties to choose the best configuration that would best meet our
 needs.
 Our production process is based on transaction (mostly huge inserts)
 and disk access is the main bottlle-neck.
 
 Our main concern is hardware related :
 
 Would NAS or SAN be good solutions ? (I've read that NAS uses NFS
 which could slow down the transfer rate ??)
 Has anyone ever tried one of these with postgresql ? 

Your best bet would likely be a large external RAID system with lots o
cache.  Next would be a fast internal RAID card like the LSI Megaraid
cards, with lots of drives and batter backed cache.  Next would be a
SAN, but be careful, there may be issues with some cards and their
drivers under linux, research them well before deciding.  NFS is right
out if you want good performance AND reliability.

The cheapest solution that is likely to meet your needs would be the
internal RAID card with battery backed cache.  


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Réf. : Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread bsimon

Thanks a lot Scott.

It seems that we were totally wrong when considering a network storage solution. I've read your techdoc http://techdocs.postgresql.org/guides/DiskTuningGuide and found many interesting remarks. 
I think that we will know focus on external Raid systems which seem to be relativily affordable compared to NAS or SAN (we would have had the budget for one of these). 
As we don't plan to have more than 5 connections (I.E process), we think SATA drives would fit our requirements. Could this be an issue for an after crash recovery ?
We also hesitate concerning the raid level to use. We are currently comparing raid 1+0 and raid 5 but we have no actual idea on which one to use.

Our priorities are : 
1) performance
2) recovery
3) price
4) back-up 

It could be nice to have any comments from people who have already set up a similar platform, giving some precise details of the hardware configuration :
- brand of the raid device, 
- technology used (SCSI/IDE, RAID level ...), 
- size of the database, number of disks/size of disks ...

Such a knowledge base may be useful to convince people to migrate to opensource cheap reliable solutions. 
Thanks again.

Benjamin.








Scott Marlowe [EMAIL PROTECTED]
Envoyé par : [EMAIL PROTECTED]
20/07/2004 10:20


Pour :[EMAIL PROTECTED]
cc :[EMAIL PROTECTED]
Objet :Re: [PERFORM] NAS, SAN or any alternate solution ?


On Tue, 2004-07-20 at 01:52, [EMAIL PROTECTED] wrote:
 Hi all,
 
 I've been searching the list for a while but couldn't find any
 up-to-date information relating to my problem.
 We have a production server with postgresql on cygwin that currently
 deels with about 200 Gigs of data (1 big IDE drive). We plan to move
 to linux for some reasons I don't have to explain.
 Our aim is also to be able to increase our storage capacity up to
 approximately 1 or 2 terabytes and to speed up our production process.
 As we are a small microsoft addicted company , we have some
 difficulties to choose the best configuration that would best meet our
 needs.
 Our production process is based on transaction (mostly huge inserts)
 and disk access is the main bottlle-neck.
 
 Our main concern is hardware related :
 
 Would NAS or SAN be good solutions ? (I've read that NAS uses NFS
 which could slow down the transfer rate ??)
 Has anyone ever tried one of these with postgresql ? 

Your best bet would likely be a large external RAID system with lots o
cache. Next would be a fast internal RAID card like the LSI Megaraid
cards, with lots of drives and batter backed cache. Next would be a
SAN, but be careful, there may be issues with some cards and their
drivers under linux, research them well before deciding. NFS is right
out if you want good performance AND reliability.

The cheapest solution that is likely to meet your needs would be the
internal RAID card with battery backed cache. 


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html




Réf. : Re: Réf. : Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread bsimon

I must say that cygwin did well (there exists good software on windows, i've found one)... as a prototype ... when I look at the postgresql poll (http://www.postgresql.org/survey.php?View=1SurveyID=11), it seems like I'm not alone !!
Actually, the major problem was the limit of the available allocable memory restricted by cygwin.

We don't plan to wait for the 7.5 win native version of postgresql. It was hard enough to decide moving to linux, I don't want to rollback everything :)
Thanks for the advice, I will definetely have a look at the new version anyway as soon as it is released.

Regards,
Benjamin.







Mark Kirkwood [EMAIL PROTECTED]
20/07/2004 12:04


Pour :[EMAIL PROTECTED]
cc :[EMAIL PROTECTED]
Objet :Re: Réf. : Re: [PERFORM] NAS, SAN or any alternate solution ?




[EMAIL PROTECTED] wrote:


 As we don't plan to have more than 5 connections (I.E process), we 
 think SATA drives would fit our requirements. Could this be an issue 
 for an after crash recovery ?

If you can disable the write ATA write cache, then you have safety. 
Unfortunately many cards under Linux show up as SCSI devices, and you 
can't access this setting. Does anyone know if the newer SATA cards let 
you control this?

You might want to keep and eye on the upcoming native windows port in 
7.5 - It will come with a fearsome array of caveats... but you have been 
running cygwin in production! - and I am inclined to think the native 
port will be more solid than this configuration.

regards

Mark










Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread Grega Bremec
...and on Tue, Jul 20, 2004 at 09:52:56AM +0200, [EMAIL PROTECTED] used the keyboard:
 Hi all,
 
 I've been searching the list for a while but couldn't find any up-to-date 
 information relating to my problem.
 We have a production server with postgresql on cygwin that currently deels 
 with about 200 Gigs of data (1 big IDE drive). We plan to move to linux 
 for some reasons I don't have to explain.
 Our aim is also to be able to increase our storage capacity up to 
 approximately 1 or 2 terabytes and to speed up our production process. As 
 we are a small microsoft addicted company , we have some difficulties to 
 choose the best configuration that would best meet our needs.
 Our production process is based on transaction (mostly huge inserts) and 
 disk access is the main bottlle-neck.
 
 Our main concern is hardware related :
 
 Would NAS or SAN be good solutions ? (I've read that NAS uses NFS which 
 could slow down the transfer rate ??)
 Has anyone ever tried one of these with postgresql ? 
 
 I would appreciate any comments.
 Thanks in advance.

Hello Simon,

We're testing 3ware Escalade 9000, which is a hardware-raid SATA
controller with VERY good support for Linux (including direct access
for S.M.A.R.T. applications, which is a serious problem with other
RAID controllers), featuring RAID levels 0, 1, 10, 5, JBOD, up to
12 SATA channels (that's 3ware Escalade 9500S-12, they also come in
4- and 8-channel versions, up to four cards can be fitted into a
system), up to 1GB battery-backed ECC RAM (128MB out-of-the-box)
and most of all, excellent tuning guides that actually manage to
exceed the scope of merely making you come up with good benchmark
results for that controller in a specific test environment.

Our preliminary tests show that a setup of four 250GB SATA Maxtors
that aren't really qualified as fast drives, in RAID5 can deliver
block writes of 50MB/s, rewrites at about 35MB/s and reads of
approximately 180MB/s, which is rougly 2.5-times the performance
of previous Escalades.

You can find more info on Escalade 9000 series, benchmarks and
other stuff here:

http://www.3ware.com/products/serial_ata9000.asp
http://www.3ware.com/products/benchmarks_sata.asp

http://www.3ware.dk/fileadmin/3ware/documents/Benchmarks/Linux_kernel_2.6_Benchmarking.pdf

Oh, and not to forget - the price for a 3ware 9500S-12, the version
we're testing ranges between EUR1000 and EUR1500, depending on the
contract you have with the reseller and the intended use of the
device. SATA disks are dirt-cheap nowadays, as has been mentioned
before.

I do agree on the reliability of cache-usage setting those drives
report though, it may or may not be true. But one never knows that
for sure with SCSI drives either. At least you can assert that
proper controller cache sizing with drives that usually feature
8MB (!!!) cache, will mostly ensure that even the largest amount
of data that could fit into a hard disk cache of the entire array
(96MB) will still be available in the controller cache after a
power failure, for it to be re-checked and ensured it is properly
written.

Hope this helps,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software  Media
http://www.noviforum.si/


pgptdbYC1z9Fk.pgp
Description: PGP signature


Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread Grega Bremec
 
 Oh, and not to forget - the price for a 3ware 9500S-12, the version
 we're testing ranges between EUR1000 and EUR1500, depending on the
 contract you have with the reseller and the intended use of the
 device. SATA disks are dirt-cheap nowadays, as has been mentioned
 before.
 

Correction, EUR500 and EUR1000, VAT not included. :)

Sorry for the mix-up.
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software  Media
http://www.noviforum.si/


pgpLKhS3Qk0se.pgp
Description: PGP signature


Re: [PERFORM] NAS, SAN or any alternate solution ?

2004-07-20 Thread Joe Conway
[EMAIL PROTECTED] wrote:
Would NAS or SAN be good solutions ? (I've read that NAS uses NFS which 
could slow down the transfer rate ??)

Has anyone ever tried one of these with postgresql ? 
Not (yet) with Postgres, but my company has run ~100GB Oracle database 
on NAS (NetApp) for the past couple of years. We've found it to 
outperform local attached storage, and it has been extremely reliable 
and flexible. Our DBAs wouldn't give it up without a fight.

Joe
---(end of broadcast)---
TIP 8: explain analyze is your friend