Re: [PERFORM] Postgres on RAID5
Josh Berkus wrote:
> A,
>> This is a Sun E450 with dual TI UltraSPARC II processors and 2G of RAM. It is currently running Debian Sarge with a 2.4.27-sparc64-smp custom compiled kernel. Postgres is installed from the Debian package and uses all the configuration defaults.
>
> Please read http://www.powerpostgresql.com/PerfList

I have read that document. Very informative/useful. Thanks.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [PERFORM] Postgres on RAID5
Alex Turner wrote:
> A 14-drive stripe will max out the PCI bus long before anything else; the only reason for a stripe this size is to get the total accessible size up. A 6-drive RAID 10 on a good controller can get up to 400MB/sec, which is pushing the limit of the PCI bus (taken from official 3ware 9500S-8MI benchmarks). 14 drives are not going to beat 6 drives once you've run out of bandwidth on the PCI bus.
>
> The debate on RAID 5 rages onward. The benchmarks I've seen suggest that RAID 5 is consistently slower than RAID 10 with the same number of drives, but others suggest that RAID 5 can be much faster than RAID 10 (see arstechnica.com). (Theoretical performance of RAID 5 is in line with a RAID 0 stripe of N-1 drives; RAID 10 has only N/2 drives in a stripe, so its performance should be nearly double - in theory, of course.)
>
> 35 trans/sec is pretty slow, particularly if they are only one row at a time. I typically get 200-400/sec on our DB server on a bad day - up to 1100 on a fresh database.

Well, by putting the pg_xlog directory on a separate disk/partition, I was able to increase this rate to about 50 or so per second (still pretty far from your numbers). Next I am going to try putting pg_xlog on a RAID 1+0 array and see if that helps.

> I suggested running a bonnie benchmark, or some other IO perftest, to determine whether it's the array itself performing badly or whether there is something wrong with postgresql. If the array isn't kicking out at least 50MB/sec read/write performance, something is wrong. Until you've isolated the problem to either postgres or the array, everything else is simply speculation.
>
> In a perfect world, you would have two 6-drive RAID 10s on two PCI buses, with system tables on a third partition and archive logging on a fourth. Unsurprisingly, this looks a lot like the Oracle recommended minimum config.

Could you please elaborate on this setup a little more? How do you put system tables on a separate partition?
I am still using version 7, and without tablespaces (which is how Oracle controls this), I can't figure out how to put different tables on different partitions. Thanks.

Arshavir

> Also, a note of interest: this is _software_ RAID...
>
> Alex Turner
> netEconomist
>
> On 13 Mar 2005 23:36:13 -0500, Greg Stark [EMAIL PROTECTED] wrote:
>> Arshavir Grigorian [EMAIL PROTECTED] writes:
>>> Hi, I have a RAID5 array (mdadm) with 14 disks + 1 spare. This partition has an Ext3 filesystem which is used by Postgres.
>>
>> People are going to suggest moving to RAID 1+0. I'm unconvinced that RAID 5 across 14 drives shouldn't be able to keep up with RAID 1 across 7 drives, though. It would be interesting to see empirical data.
>>
>> One thing that does scare me is the Postgres transaction log and the ext3 journal both sharing these disks with the data. Ideally both of these things should get (mirrored) disks of their own, separate from the data files.
>>
>> But 2-3s pauses seem disturbing. I wonder whether ext3 is issuing a cache flush on every fsync to get the journal pushed out. This is a new Linux feature that's necessary with IDE but shouldn't be necessary with SCSI.
>>
>> It would be interesting to know whether postgres performs differently with fsync=off. This would even be a reasonable mode to run under for initial database loads. It shouldn't make much of a difference with hardware like this, though. And you should be aware that running in this mode in production would put your data at risk.
>>
>> -- greg

--
Arshavir Grigorian
Systems Administrator/Engineer
M-CAM, Inc.
[EMAIL PROTECTED]
+1 703-682-0570 ext. 432
Contents Confidential
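For anyone wanting to try the xlog separation discussed above: PostgreSQL 7.x has no configuration knob for the WAL location, so the usual approach is a symlink. A minimal sketch, with the postmaster stopped first (the paths below are example assumptions, not from this thread):

```shell
# Relocate the WAL directory onto a dedicated (ideally mirrored) disk.
# PGDATA and XLOGDIR are assumed example paths - adjust for your install.
PGDATA=/var/lib/postgres/data
XLOGDIR=/mnt/xlogdisk/pg_xlog        # target on the dedicated spindle

mv "$PGDATA/pg_xlog" "$XLOGDIR"      # move the existing WAL files over
ln -s "$XLOGDIR" "$PGDATA/pg_xlog"   # postgres transparently follows the link
```

Restart the postmaster afterwards; it neither knows nor cares that pg_xlog is now a symlink.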
Re: [PERFORM] Postgres on RAID5
Alex Turner wrote:
> I would recommend running a bonnie++ benchmark on your array to see whether it's the array/controller/RAID being crap, or whether it's postgres. I have had some very surprising results from arrays that theoretically should be fast but turned out to be very slow.
>
> I would also seriously recommend against a 14-drive RAID 5! This is statistically as likely to fail as a 7-drive RAID 0 (not counting the spare, but rebuilding onto a spare is very hard on the existing drives).

Thanks for the reply. Here are the results of the bonnie test on my array:

./bonnie -s 1 -d .
File './Bonnie.23736', size: 1048576
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
       ---Sequential Output---          ---Sequential Input--   --Random--
       -Per Char- --Block--- -Rewrite-- -Per Char- --Block---   --Seeks---
    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
     1  4762 96.0 46140 78.8 31180 61.0  3810 99.9 71586 67.7  411.8 13.1

On a different note, I am not sure how the probability of a RAID 5 over 15 disks failing is the same as that of a RAID 0 array over 7 disks. RAID 5 can operate in a degraded mode (14 disks - 1 bad); RAID 0, on the other hand, cannot operate with even one bad disk. Am I missing something? Or are you saying running RAID 0 over a set of 2 RAID 1 arrays of 7 drives each? That would work fine, except I cannot afford to lose that much space.

Care to comment on these numbers? Thanks.

Arshavir
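As a rough sanity check of the failure-odds claim: if drive failures are treated as independent events with a per-drive survival probability p (a simplifying assumption that ignores the rebuild-window stress Alex mentions), the two layouts can be compared directly. RAID 0 survives only if every drive survives; RAID 5 survives if at most one drive fails:

```shell
# Illustrative only: p=0.97 is an assumed per-drive survival probability.
awk -v p=0.97 'BEGIN {
    raid0_7  = p^7                      # all 7 drives must survive
    raid5_14 = p^14 + 14*(1-p)*p^13     # zero or exactly one of 14 fails
    printf "7-drive RAID0 survival:  %.4f\n", raid0_7
    printf "14-drive RAID5 survival: %.4f\n", raid5_14
}'
```

With these numbers the 14-drive RAID 5 actually comes out ahead (roughly 0.94 vs 0.81), which supports Arshavir's objection as stated; the picture worsens for RAID 5 once you model the long, stressful rebuild after the first failure.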
[PERFORM] Postgres on RAID5
(rev 14)
:00:04.1 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 14)
:04:02.0 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 03)
:04:03.0 SCSI storage controller: LSI Logic / Symbios Logic 53c875 (rev 03)

/proc/scsi# more scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00  Vendor: SEAGATE  Model: ST39103LCSUN9.0G  Rev: 034A  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 02 Lun: 00  Vendor: SEAGATE  Model: ST39103LCSUN9.0G  Rev: 034A  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 03 Lun: 00  Vendor: SEAGATE  Model: ST39103LCSUN9.0G  Rev: 034A  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00  Vendor: SEAGATE  Model: ST39103LCSUN9.0G  Rev: 034A  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 01 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi1 Channel: 00 Id: 02 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi1 Channel: 00 Id: 03 Lun: 00  Vendor: SEAGATE  Model: ST39103LCSUN9.0G  Rev: 034A  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 00 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 01 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 02 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 03 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi3 Channel: 00 Id: 00 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi3 Channel: 00 Id: 01 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi3 Channel: 00 Id: 02 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi3 Channel: 00 Id: 03 Lun: 00  Vendor: SEAGATE  Model: ST39204LCSUN9.0G  Rev: 4207  Type: Direct-Access  ANSI SCSI revision: 03
Host: scsi4 Channel: 00 Id: 06 Lun: 00  Vendor: TOSHIBA  Model: XM6201TASUN32XCD  Rev: 1103  Type: CD-ROM  ANSI SCSI revision: 02
Host: scsi5 Channel: 00 Id: 00 Lun: 00  Vendor: FUJITSU  Model: MAG3091L SUN9.0G  Rev:  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi5 Channel: 00 Id: 01 Lun: 00  Vendor: FUJITSU  Model: MAG3091L SUN9.0G  Rev:  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi5 Channel: 00 Id: 02 Lun: 00  Vendor: FUJITSU  Model: MAG3091L SUN9.0G  Rev:  Type: Direct-Access  ANSI SCSI revision: 02
Host: scsi5 Channel: 00 Id: 03 Lun: 00  Vendor: FUJITSU  Model: MAG3091L SUN9.0G  Rev:  Type: Direct-Access  ANSI SCSI revision: 02

--
Arshavir Grigorian
Systems Administrator/Engineer
Re: [PERFORM] Postgres on RAID5
Tom Lane wrote:
> Arshavir Grigorian [EMAIL PROTECTED] writes:
>> I have a RAID5 array (mdadm) with 14 disks + 1 spare. This partition has an Ext3 filesystem which is used by Postgres. Currently we are loading a 50G database on this server from a Postgres dump (copy, not insert) and are experiencing very slow write performance (35 records per second).
>
> What PG version is this? What version of pg_dump made the dump file? How are you measuring that write rate (seeing that pg_restore doesn't provide any such info)?

Sorry, I missed the version. Both (the db from which the dump was created and the one it's being loaded onto) run on Pg 7.4. As for the rate: if the restore has been going for X hours and you have Y records loaded, it's not hard to ballpark.

>> Postgres is installed from the Debian package and uses all the configuration defaults.
>
> The defaults are made for a fairly small machine, not big iron. At a minimum you want to kick shared_buffers up to 10K or more.
>
> regards, tom lane

Will do. Thanks.

Arshavir
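For reference, a starting point for a 2 GB machine along the lines Tom suggests might look like the fragment below. The values are illustrative assumptions, not from the thread; in 7.4, shared_buffers and effective_cache_size are counted in 8 kB pages:

```
# postgresql.conf sketch for a 2 GB box (PostgreSQL 7.4-era parameter names)
shared_buffers = 10000          # ~80 MB; the "10K or more" suggested above
sort_mem = 8192                 # KB per sort; raise for big index builds
checkpoint_segments = 16        # fewer checkpoint pauses during bulk loads
effective_cache_size = 131072   # ~1 GB of expected OS cache, in 8 kB pages
```

Restart (or at least SIGHUP for the reloadable settings) after editing; shared_buffers itself requires a restart.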
Re: [PERFORM] Postgres on RAID5
Many thanks for all the responses. I guess there are a lot of things to change and tweak, and I wonder what would be a good benchmarking sample dataset (size, contents). My tables are very large (the smallest is 7+ million records) and take several days to load (if not weeks). It would be nice to have a sample dataset that is large enough to mimic my large datasets, but small enough to load in a short period of time. Any suggestions?

Arshavir
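One cheap alternative to hunting for a canned dataset is to generate synthetic rows and COPY them in, dialing the row count up until load behavior resembles the real tables. A sketch under made-up names (the table, columns, and file path are assumptions for illustration):

```shell
# Generate N rows of comma-delimited data shaped loosely like a document table.
N=1000000
seq 1 "$N" | awk '{ printf "EP%07d,title %d,2005-03-14\n", $1, $1 }' > /tmp/sample.csv

# Then, in psql (7.4 COPY syntax, before the 8.0 CSV mode existed):
#   CREATE TABLE sample (num varchar(30) PRIMARY KEY, titl varchar(500), issdate date);
#   \copy sample from '/tmp/sample.csv' with delimiter ','
# Timing the \copy at a few sizes gives a quick records/sec curve for the array.
```

Because the generator is deterministic, the same file can be loaded on both machines for an apples-to-apples comparison.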
[PERFORM] index use
Hi,

I have a query that, when run on similar tables in 2 different databases, either uses the index on the column (primary key) in the where clause or does a full table scan. The structure of the tables is the same, except that the table where the index does not get used has an extra million rows (22mil vs 23mil).

The 2 boxes where these databases run are very different (a Sparc with SCSI disks and 2G RAM running Solaris 8, and a PC with 128M RAM and an IDE drive running Linux RH9 2.4.20-20.9). I am not sure why that would make a difference, but maybe it does. Also, according to our dba, both tables have been analyzed at about the same time. Any pointers would be much appreciated.

Arshavir

WORKS:

= explain analyze select num from document where num like 'EP100%';
                               QUERY PLAN
-------------------------------------------------------------
 Index Scan using document_pkey on document  (cost=0.00..5.77 rows=1 width=14) (actual time=0.147..0.166 rows=2 loops=1)
   Index Cond: (((num)::text >= 'EP100'::character varying) AND ((num)::text < 'EP101'::character varying))
   Filter: ((num)::text ~~ 'EP100%'::text)
 Total runtime: 0.281 ms
(4 rows)

= \d document
          Table public.document
  Column   |          Type          | Modifiers
-----------+------------------------+-----------
 num       | character varying(30)  | not null
 titl      | character varying(500) |
 isscntry  | character varying(50)  |
 issdate   | date                   |
 filedate  | date                   |
 appnum    | character varying(20)  |
 clnum     | integer                |
 exnum     | integer                |
 exmnr     | character varying(300) |
 agent     | character varying(300) |
 priodate  | date                   |
 prionum   | character varying(100) |
 priocntry | character varying(50)  |
 legalstat | integer                |
Indexes:
    document_pkey primary key, btree (num)
Check constraints:
    document_legalstat CHECK (legalstat > 0 AND legalstat < 6)

DOES NOT WORK:

d5=> EXPLAIN ANALYZE select num from document where num like 'EP100%';
                               QUERY PLAN
--------------------------------------------------------------
 Seq Scan on document  (cost=0.00..804355.12 rows=1 width=14) (actual time=97.235..353286.781 rows=2 loops=1)
   Filter: ((num)::text ~~ 'EP100%'::text)
 Total runtime: 353286.907 ms
(3 rows)

d5=> \d document
          Table public.document
  Column   |          Type          | Modifiers
-----------+------------------------+-----------
 num       | character varying(30)  | not null
 titl      | character varying(500) |
 isscntry  | character varying(50)  |
 issdate   | date                   |
 filedate  | date                   |
 clnum     | integer                |
 exnum     | integer                |
 exmnr     | character varying(300) |
 agent     | character varying(300) |
 priodate  | date                   |
 prionum   | character varying(100) |
 priocntry | character varying(50)  |
 legalstat | integer                |
 appnum    | character varying(20)  |
Indexes:
    document_pkey primary key, btree (num)
Check constraints:
    $1 CHECK (legalstat > 0 AND legalstat < 6)
Re: [PERFORM] index use
Thanks for all the replies. It actually has to do with the locales. The db where the index gets used is running under the C locale vs. the other one, which uses en_US.UTF-8. I guess the db with the wrong locale will need to be waxed and recreated with the correct locale settings. I wonder if there are any plans to make LIKE work with all locales.

Again, many thanks. You guys are great!

Arshavir
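A footnote for anyone who can't afford to re-initdb: if my memory of the release notes is right, 7.4 added the *_pattern_ops operator classes, which let a btree index serve LIKE prefix queries even under a non-C locale. A sketch against the table above (the index name is made up):

```sql
-- Extra index usable by "num LIKE 'EP100%'" in a non-C locale database.
-- varchar_pattern_ops indexes only help LIKE/~ prefix patterns, not < or >.
CREATE INDEX document_num_like ON document (num varchar_pattern_ops);
```

The ordinary document_pkey index still handles equality and range lookups; the pattern-ops index exists purely so the planner can transform the LIKE into an index range scan.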