Re: [PERFORM] Recommended File System Configuration

2004-05-03 Thread Chris Browne
[EMAIL PROTECTED] (James Thornton) writes:
 Back in 2001, there was a lengthy thread on the PG Hackers list about
 PG and journaling file systems
 (http://archives.postgresql.org/pgsql-hackers/2001-05/msg00017.php),
 but there was no decisive conclusion regarding what FS to use. At the
 time the fly in the XFS ointment was that deletes were slow, but this
 was improved with XFS 1.1.

 I think a journaling FS is needed for PG data, since large DBs could
 take hours to recover on a non-journaling FS, but what about WAL files?

If the WAL files are on a small filesystem, it presumably won't take
hours for that filesystem to recover at fsck time.

The results have not been totally conclusive...

 - Several have found JFS to be a bit faster than anything else on
   Linux, but some data loss problems have been experienced;

 - ext2 has the significant demerit that with big filesystems, fsck
   will take forever to run;

 - ext3 appears to be the slowest option out there, and there are some
   stories of filesystem corruption;

 - ReiserFS was designed to be really fast with tiny files, which is not
   the ideal use case for PostgreSQL; the designers there are
   definitely the most aggressive at pushing out bleeding-edge code,
   which isn't likely ideal;

 - XFS is neither fastest nor slowest, but there has been a lack of
   reports of spontaneous data loss under heavy load, which is a
   good thing.  It's not part of official 2.4 kernels, requiring
   backports, but once 2.6 gets more widely deployed, this shouldn't
   be a demerit anymore...

I think that provides a reasonable overview of what has been seen...
-- 
output = reverse(gro.gultn @ enworbbc)
http://cbbrowne.com/info/oses.html
Donny: Are these the Nazis, Walter?
Walter: No, Donny, these men are nihilists. There's nothing to be
afraid of.  -- The Big Lebowski

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[PERFORM] cache table

2004-05-03 Thread Joseph Shraibman
I have a big table with some int fields.  I frequently need to do 
queries like:

SELECT if2, count(*) FROM table WHERE if1 = 20 GROUP BY if2;
The problem is that this is slow and frequently requires a seqscan. I'd 
like to cache the results in a second table and update the counts with 
triggers, but this would (a) require another UPDATE for each 
INSERT/UPDATE, which would slow down adding and updating data, and (b) 
produce a large number of dead rows for vacuum to clear out.

It would also be nice if this small table could be pinned in the pg 
cache somehow.  It doesn't need to store the data on disk, because the 
counts can always be regenerated from scratch.

So what is the best solution to this problem?  I'm sure it must come up 
pretty often.
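For what it's worth, the trigger-maintained summary table described above 
might look roughly like this. This is an untested sketch in the 7.4-era 
style; the column names if1/if2 are the poster's, while counts_cache, 
bump_count, and big_table are hypothetical names, and the DELETE/UPDATE 
cases are omitted:

CREATE TABLE counts_cache (
    if1 integer NOT NULL,
    if2 integer NOT NULL,
    n   bigint  NOT NULL,
    PRIMARY KEY (if1, if2)
);

CREATE FUNCTION bump_count() RETURNS trigger AS '
BEGIN
    -- try to increment an existing counter first
    UPDATE counts_cache SET n = n + 1
     WHERE if1 = NEW.if1 AND if2 = NEW.if2;
    IF NOT FOUND THEN
        INSERT INTO counts_cache VALUES (NEW.if1, NEW.if2, 1);
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER bump_count AFTER INSERT ON big_table
    FOR EACH ROW EXECUTE PROCEDURE bump_count();

The original SELECT then becomes a cheap lookup:
SELECT if2, n FROM counts_cache WHERE if1 = 20;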

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] linux distro for better pg performance

2004-05-03 Thread James Thornton
Joseph Shraibman wrote:
Is raid 5 much faster than raid 10?  On a 4 disk array with 3 data disks 
and 1 parity disk, you have to write 4/3rds the original data, while on 
raid 10 you have to write 2 times the original data, so logically raid 5 
should be faster.
RAID 5 will give you more capacity, but it is usually not recommended 
for write-intensive applications, since each RAID 5 write requires four 
I/O operations: the parity and data disks must be read, the new data is 
compared to the data already on the drive and the changes are noted, 
the new parity is calculated, and both the parity and data disks are 
written. Furthermore, if a disk fails, performance suffers badly, since 
every remaining drive must be read on each I/O in order to reconstruct 
the missing drive's data.
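The four-I/O write penalty can be put in rough numbers. A back-of-the-envelope 
sketch (Python); the 4-drive, 150-IOPS figures are illustrative assumptions, 
not measurements from this thread:

```python
# Back-of-the-envelope random-write throughput, assuming every logical
# write is smaller than a stripe.
# RAID 5:  ~4 disk I/Os per logical write (read data, read parity,
#          write data, write parity).
# RAID 10:  2 disk I/Os per logical write (one to each mirror half).

def random_write_iops(disks: int, iops_per_disk: float, penalty: int) -> float:
    """Aggregate logical write IOPS given a per-write disk I/O penalty."""
    return disks * iops_per_disk / penalty

# Hypothetical 4-drive array of 150-IOPS disks:
print(random_write_iops(4, 150, 4))   # RAID 5
print(random_write_iops(4, 150, 2))   # RAID 10
```

By this crude model a 4-drive RAID 10 sustains about twice the small-write 
rate of a 4-drive RAID 5, which matches the general advice above.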

RAID 0+1 has the same performance and capacity as RAID 1+0 (10), but 
less reliability: a single drive failure turns the whole array into, in 
essence, a RAID 0 array. I don't know why anyone would choose it over 
RAID 10, where multiple disks can fail.

RAID 1 has the same capacity as RAID 10 (n/2), but RAID 10 has better 
performance so if you're going to have more than one drive pair, why not 
go for RAID 10 and get the extra performance from striping?

I have been researching how to configure Postgres for a RAID 10 SAME 
configuration as described in the Oracle paper Optimal Storage 
Configuration Made Easy 
(http://otn.oracle.com/deploy/availability/pdf/oow2000_same.pdf). Has 
anyone delved into this before?

The filesystem choice is also a key element in database performance 
tuning. In another Oracle paper entitled Tuning an Oracle8i Database 
Running Linux 
(http://otn.oracle.com/oramag/webcolumns/2002/techarticles/scalzo_linux02.html), 
Dr. Bert Scalzo says, "The trouble with these tests (for example, 
Bonnie, Bonnie++, Dbench, Iobench, Iozone, Mongo, and Postmark) is that 
they are basic file system throughput tests, so their results generally 
do not pertain in any meaningful fashion to the way relational database 
systems access data files." Instead, he suggests that users 
benchmarking filesystems for database applications use these two 
well-known and widely accepted database benchmarks:

AS3AP (http://www.benchmarkresources.com/handbook/5.html): a scalable, 
portable ANSI SQL relational database benchmark that provides a 
comprehensive set of tests of database-processing power; has built-in 
scalability and portability for testing a broad range of systems; 
minimizes human effort in implementing and running benchmark tests; and 
provides a uniform, metric, straightforward interpretation of the results.

TPC-C (http://www.tpc.org/): an online transaction processing (OLTP) 
benchmark that involves a mix of five concurrent transactions of various 
types and either executes completely online or queries for deferred 
execution. The database comprises nine types of tables, having a wide 
range of record and population sizes. This benchmark measures the number 
of transactions per second.

I encourage you to read the paper -- Dr. Scalzo's results will surprise 
you; however, while he benchmarked ext2, ext3, ReiserFS, JFS, and RAW, 
he did not include XFS.

SGI and IBM did a more detailed study on Linux filesystem performance, 
which included XFS, ext2, ext3 (various modes), ReiserFS, and JFS, and 
the results are presented in a paper entitled Filesystem Performance 
and Scalability in Linux 2.4.17 
(http://oss.sgi.com/projects/xfs/papers/filesystem-perf-tm.pdf). This 
paper goes over the details on how to properly conduct a filesystem 
benchmark and addresses scaling and load more so than Dr. Scalzo's tests.

For further study, I have compiled a list of Linux filesystem resources 
at: http://jamesthornton.com/hotlist/linux-filesystems/.
--

 James Thornton
__
Internet Business Consultant, http://jamesthornton.com
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] linux distro for better pg performance

2004-05-03 Thread Alan Stange
Joseph Shraibman wrote:
J. Andrew Rogers wrote:
Do these features make a difference?  Far more than you would 
imagine. On one postgres server I just upgraded, we went from a 3Ware 
8x7200-RPM
RAID-10 configuration to an LSI 320-2 SCSI 3x10k RAID-5, with 256M
Is raid 5 much faster than raid 10?  On a 4 disk array with 3 data 
disks and 1 parity disk, you have to write 4/3rds the original data, 
while on raid 10 you have to write 2 times the original data, so 
logically raid 5 should be faster. 
I think this comparison is a bit simplistic.  For example, most raid5 
setups have full stripes larger than 8K (the typical IO size in 
postgresql), so one might have to read in portions of the stripe in 
order to compute the parity.  The needed bits might be in some disk or 
controller cache; if they're not, then you lose.  If one is able to 
perform full-stripe writes, then the raid5 config should be faster for 
writes.
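The full-stripe point can be made concrete. Assuming, purely for 
illustration, a 4-disk RAID 5 with a 64 KB chunk per disk (so a 192 KB 
data stripe), an 8 KB postgres page write can never fill a stripe and 
always forces a read-modify-write:

```python
# Disk I/Os for one logical write on RAID 5, depending on whether the
# write covers a full data stripe.  The disk count and 64 KB chunk size
# below are illustrative assumptions, not figures from the thread.

def raid5_write_ios(write_kb: int, disks: int, chunk_kb: int) -> int:
    """I/Os per logical write: a full-stripe write computes parity from
    the new data alone; a smaller write must read-modify-write."""
    data_stripe_kb = (disks - 1) * chunk_kb   # one chunk's worth holds parity
    if write_kb >= data_stripe_kb:
        return disks        # write every data chunk plus the parity chunk
    return 4                # read+write the data chunk, read+write parity

# 4 disks, 64 KB chunks -> 192 KB data stripe:
print(raid5_write_ios(8, 4, 64))     # an 8 KB postgres page
print(raid5_write_ios(192, 4, 64))   # a full stripe: same I/O count, 24x the data
```

With 4 disks both cases happen to cost four I/Os, but the full-stripe 
write moves 24 times as much data per I/O, which is why sequential 
(full-stripe) workloads can favor raid5 while small random writes do not.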

Note also that the mirror has 2 copies of the data, so that the read IOs 
would be divided across 2 (or more) spindles using round robin or a more 
advanced algorithm to reduce seek times. 

Of course, I might be completely wrong...
-- Alan
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] INSERT RULE

2004-05-03 Thread Pailloncy Jean-Gérard
Hi,
I test a configuration where one table is divided in 256 sub-table.
And I use a RULE to offer a single view to the data.
For INSERT I have create 256 rules like:
CREATE RULE ndicti_000 AS ON INSERT TO ndict
WHERE (NEW.word_id & 255) = 000 DO INSTEAD
INSERT INTO ndict_000 VALUES( NEW.url_id, 000, NEW.intag);
CREATE RULE ndicti_001 AS ON INSERT TO ndict
WHERE (NEW.word_id & 255) = 001 DO INSTEAD
INSERT INTO ndict_001 VALUES( NEW.url_id, 001, NEW.intag);
And that works, but it is a bit slow.
I try to do:
CREATE RULE ndicti AS ON INSERT TO ndict
DO INSTEAD INSERT INTO 'ndict_' || (NEW.word_id & 255)
VALUES( NEW.url_id, NEW.word_id, NEW.intag);
I got an error on 'ndict_'.
I could not find the right syntax.
Any help is welcomed.
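The concatenated form fails because a rule's target table must be a 
fixed identifier; a dynamic table name generally needs EXECUTE inside a 
PL/pgSQL function. An untested sketch in the 7.4 quoted-body style 
(the function name ndict_insert is hypothetical; the padding assumes 
the ndict_000 .. ndict_255 naming above):

CREATE FUNCTION ndict_insert(integer, integer, integer) RETURNS void AS '
DECLARE
    suffix text;
BEGIN
    -- pad (word_id & 255) to three digits to match ndict_000 .. ndict_255
    suffix := to_char($2 & 255, ''FM000'');
    EXECUTE ''INSERT INTO ndict_'' || suffix
         || '' VALUES ('' || $1 || '', '' || ($2 & 255) || '', '' || $3 || '')'';
    RETURN;
END;
' LANGUAGE plpgsql;

CREATE RULE ndicti AS ON INSERT TO ndict
    DO INSTEAD SELECT ndict_insert(NEW.url_id, NEW.word_id, NEW.intag);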
Cordialement,
Jean-Gérard Pailloncy
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-05-03 Thread Pailloncy Jean-Gérard
Hello,
We're having a substantial problem with our FreeBSD 5.2 database 
server
running PostgreSQL - it's getting a lot of traffic (figure about 3,000
queries per second), but queries are slow, and it's seemingly waiting 
on
other things than CPU time

Could this be a 5.2 performance issue ?
In spite of certain areas where the 5.x series' performance is known to 
be much better than 4.x (e.g. networking), this may not be manifested 
in practice for a complete application.
(e.g. I am still running 4.9, as it vastly outperformed 5.1 for ~100 
database sessions running queries - note that I have not tried out 
5.2, so am happy to be corrected on this)
I found the same problem.
I use OpenBSD 3.3,
On Pentium 2,4 GHz with 1 Gb RAM, RAID 10.
With PostgreSQL 7.4.1 with a 32 KB block size (to match the ffs and 
RAID block sizes)
With pg_autovacuum daemon from Pg 7.5.

I run a web indexer.
sd0 raid-1 with system pg-log and indexer-log
sd1 raid-10 with pg-data and indexer-data
The sd1 disk achieves between 10 and 40 MB/s in normal operation.
When I get semwait in top, system waits ;-)
Not much disk activity.
Not much log in pg or indexer.
Just wait
What can I do ?
 sudo top  -s1 -S -I
load averages:  4.45,  4.45,  3.86                              11:25:52
97 processes:  1 running, 96 idle
CPU states:  2.3% user,  0.0% nice,  3.8% system,  0.8% interrupt, 93.1% idle
Memory: Real: 473M/803M act/tot  Free: 201M  Swap: 0K/3953M used/tot

  PID USERNAME PRI NICE  SIZE   RES STATE WAIT TIMECPU COMMAND
 2143 postgres  -50 4008K   37M sleep biowai   1:02  1.81% postgres
28662 postgres  140 4060K   37M sleep semwai   0:59  1.17% postgres
25794 postgres  140 4072K   37M sleep semwai   1:30  0.93% postgres
23271 postgres  -50 4060K   37M sleep biowai   1:13  0.29% postgres
14619 root  280  276K  844K run   -0:01  0.00% top
 vmstat -w1 sd0 sd1
 r b w    avm    fre   flt  re  pi  po  fr  sr sd0 sd1   in    sy   cs us sy id
 0 4 0 527412  36288  1850   0   0   0   0   0  26  72  368  8190  588  0  4 96
 0 4 0 527420  36288  1856   0   0   0   0   0   0  86  356  8653  620  2  2 97
 0 4 0 527432  36280  1853   0   0   0   0   0   0  54  321  8318  458  1  3 96
 0 4 0 527436  36248  1864   0   0   0   0   0   0  77  358  8417  539  1  2 97
 0 4 0 522828  40932  2133   0   0   0   0   0   7  70  412 15665  724  2  3 95
 0 4 0 522896  40872  1891   0   0   0   0   0  15  72  340  9656  727  3  5 92
 0 4 0 522900  40872  1841   0   0   0   0   0   0  69  322  8308  536  1  2 98
 0 4 0 522920  40860  1846   0   0   0   0   0   1  69  327  8023  520  2  2 97
 0 4 0 522944  40848  1849   0   0   0   0   0   4  76  336  8035  567  1  2 97
 0 4 0 522960  40848  1843   0   0   0   0   0   0  77  331 14669  587  3  2 95
 0 4 0 522976  40836  1848   0   0   0   0   0   4  81  339  8384  581  1  2 97
 0 4 0 522980  40836  1841   0   0   0   0   0   3  65  320  8068  502  1  4 95
 0 4 0 523000  40824  1848   0   0   0   0   0  14  74  341  8226  564  3  2 95
 0 4 0 523020  40812  1844   0   0   0   0   0   0  67  317  7606  530  2  1 97
 1 4 0 523052  40796  1661   0   0   0   0   0   0  68  315 11603  493  2  2 97
 1 4 0 523056  40800   233   0   0   0   0   0  12  87  341 12550  609  2  2 96
 0 4 0 523076  40788  1845   0   0   0   0   0   0  82  334 12457  626  2  2 96
 0 4 0 523100  40776  1851   0   0   0   0   0   0  91  345 10914  623  2  3 95
 0 4 0 523120  40764  1845   0   0   0   0   0   0  92  343 19213  596  1  5 95
 0 4 0 523136  40752  1845   0   0   0   0   0   0  97  349  8659  605  2  2 96
 0 4 0 523144  40748  4501   0   0   0   0   0  32  78  385 15632  934 25 12 64
 0 4 0 523168  40728  1853   0   0   0   0   0   3  74  335  3965  531  0  2 98

 ps -Upostgresql -Ostart | grep -v idle
  PID STARTED  TT   STAT  TIME COMMAND
 8267 10:53AM  ??  Is  0:00.28 /usr/local/bin/pg_autovacuum -D -L /var/pgsql/autovacuum
23271 10:54AM  ??  I   1:13.56 postmaster: dps dps 127.0.0.1 SELECT (postgres)
28662 10:55AM  ??  I   0:59.98 postmaster: dps dps 127.0.0.1 SELECT (postgres)
25794 10:56AM  ??  D   1:30.48 postmaster: dps dps 127.0.0.1 SELECT (postgres)
 2143 11:02AM  ??  D   1:02.06 postmaster: dps dps 127.0.0.1 DELETE (postgres)
25904 10:52AM  C0- I   0:00.07 /usr/local/bin/postmaster -D /var/pgsql (postgres)
10908 10:52AM  C0- I   0:05.96 postmaster: stats collector process (postgres)
 7045 10:52AM  C0- I   0:05.19 postmaster: stats buffer process (postgres)

 grep -v -E '^#' /var/pgsql/postgresql.conf
tcpip_socket = true
max_connections = 100
shared_buffers = 1024   # 32KB
max_fsm_pages = 100 # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 200 # min 100, ~50 bytes each
wal_buffers = 32# min 4, 8KB each
checkpoint_segments = 16# in logfile segments, min 1, 16MB each
commit_delay = 100  # range 0-100000, in microseconds