Re: [PERFORM] How to improve Postgres performance

2005-08-30 Thread Dennis Bjorklund
On Tue, 30 Aug 2005, Hemant Pandey wrote:

 So please tell me how can i improve database performance through
 configuration parameters. I had tried to change parameters in
 postgresql.conf file but of no avail.
 Now i am trying to Auto Vacuum, but don't know how to run Auto Vacuum.

The most important part is that you need to run VACUUM ANALYZE regularly.
Vacuum can be started each night in a cron job, started from pg_autovacuum
when it thinks it's needed, or started in some other way. In any case, it
has to be run whenever the data in the database has changed enough.
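For the cron approach, a nightly entry along these lines would do it (the
database name and schedule here are only placeholders; vacuumdb ships with
PostgreSQL):

```
# crontab: run VACUUM ANALYZE on "mydb" every night at 03:00
0 3 * * *  vacuumdb --analyze --quiet mydb
```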

The config parameters that are most important, in my experience, are
effective_cache_size and shared_buffers.
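As a sketch (the numbers are illustrative only; on these versions both
settings are in 8 kB pages, and the right values depend on how much RAM
the machine has):

```
# postgresql.conf -- example values for a machine with ~1 GB of RAM
shared_buffers = 10000          # ~80 MB dedicated to PostgreSQL
effective_cache_size = 65536    # ~512 MB; estimate of the OS disk cache
```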

This is a text I like (it's for pg 7.4 but still useful):

  http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] shared buffers

2005-08-30 Thread William Yu

Carlos Henrique Reimer wrote:

I forgot to say that it's a 12GB database...
 
Ok, I'll set shared buffers to 30,000 pages, but even so, shouldn't
meminfo and top show some shared pages?
 
I heard something about Redhat 9 not handling RAM higher than 2GB very
well. Is it right?

Thanks in advance!


RH9, like any 32-bit OS, is limited to 2GB address space w/o special 
tricks. However, it can access more than 2GB for the OS disk cache using 
PAE if you are running the bigmem kernel.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Alvaro Nunes Melo

Hello,

We are about to install a new PostgreSQL server, and despite being a 
very humble configuration compared to the ones we see on the list, it's 
the biggest one we've had till now.


The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main 
doubt is what the best configuration for the disks is. We are thinking 
about using them in a RAID-0 array. Is this the best option? What do you 
suggest on partitioning? Separate partitions for the OS, data and pg_xlog?


We'll have some time to work on performance tests, and if someone is 
interested we can provide our results.


Thanks in advance,
Alvaro



Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Bruno Wolff III
On Tue, Aug 30, 2005 at 09:37:17 -0300,
  Alvaro Nunes Melo [EMAIL PROTECTED] wrote:
 
 The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main 
 doubt is what is the best configuration for the disks. We are thinking 
 about use them in a RAID-0 array. Is this the best option? What do you 
 suggest on partitioning? Separate partitions for the OS, data and pg_xlog?

You don't have a lot of options with just two disks. What are you trying
to accomplish with RAID?

RAID 0 will possibly give you some speedup, while RAID 1 will give you some
fault tolerance and some speedup of reads, but will cost you half your disk
space.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Observation about db response time

2005-08-30 Thread Akshay Mathur
Hello Friends,

We were having a database in pgsql7.4.2. The database was responding
very slowly even after full vacuum analyze (select count(*) from
some_table_having_18000_records was taking 18 Sec).

We took a backup of that db and restored it back. Now the same db on
the same PC is responding fast (the same query is taking 18 ms).

But we can't keep doing that as a solution to the slow response. Has
anybody faced a similar problem? Is this due to any internal problem of
pgsql? Is there any clue to speed up the database?

Regards,

akshay





---
Akshay Mathur
SMTS, Product Verification
AirTight Networks, Inc. (www.airtightnetworks.net)
O: +91 20 2588 1555 ext 205
F: +91 20 2588 1445

Re: [PERFORM] Observation about db response time

2005-08-30 Thread Frank Wiles
On Tue, 30 Aug 2005 18:35:30 +0530
Akshay Mathur [EMAIL PROTECTED] wrote:

 Hello Friends,
  
 We were having a database in pgsql7.4.2 The database was responding
 very slowly even after full vacuum analyze (select count(*) from
 some_table_having_18000_records was taking 18 Sec).
  
 We took a backup of that db and restored it back. Now the same db on
 same PC is responding fast (same query is taking 18 ms).
  
 But we can't do the same as a solution of slow response. Do anybody
 has faced similar problem? Is this due to any internal problem of
 pgsql? Is there any clue to fasten the database?

  This could be because you don't have max_fsm_pages and
  max_fsm_relations set up correctly, or are not doing full vacuums 
  often enough. 
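For illustration only (the real values should be sized from the page
counts reported at the end of VACUUM VERBOSE output):

```
# postgresql.conf
max_fsm_pages = 100000      # must cover the pages freed between vacuums
max_fsm_relations = 1000    # tables and indexes tracked in the free space map
```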

  If your database deletes a ton of data as a matter of course then
  sometimes a full vacuum will not clear up as much space as it could.

  Try increasing those configuration values and doing vacuums more
  often. 

  You should also explore upgrading to the latest 8.0 release, as you
  will no doubt see noticeable speed improvements. 

 -
   Frank Wiles [EMAIL PROTECTED]
   http://www.wiles.org
 -


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Bruno Wolff III
Please keep replies copied to the list so that others may contribute to
and learn from the discussion.

On Tue, Aug 30, 2005 at 10:15:13 -0300,
  Alvaro Nunes Melo [EMAIL PROTECTED] wrote:
 Hello Bruno,
 
 Bruno Wolff III wrote:
 
 On Tue, Aug 30, 2005 at 09:37:17 -0300,
  Alvaro Nunes Melo [EMAIL PROTECTED] wrote:
  
 
 The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main 
 doubt is what is the best configuration for the disks. We are thinking 
 about use them in a RAID-0 array. Is this the best option? What do you 
 suggest on partitioning? Separate partitions for the OS, data and pg_xlog?
 
 Our main goal is performance speedup. Disk space might not be a problem. 
 I've read a lot here about movig pg_xlog to different partitions, and 
 we'll surely make tests to see what configuration might be better.

This isn't a very good mix of hardware for running postgres. Xeons have
some context switching issues for which you will probably see some
speedup in 8.1. (So if you aren't going into production for several
months, you might want to be using 8.1beta.) Having only two disk drives
is also not a good idea.

With what you have, you either want to use raid 0 and not worry too much
about how the disks are partitioned, or use one disk for wal logging
and the other for everything else. There are other people on the list who
can probably give you a better idea of which of these options is likely
to be better in your case. However, they may need to know more about
your raid controller, in particular its model and how much battery-backed
memory it has.



Re: [PERFORM] OSX Performance

2005-08-30 Thread Jeff Trout


On Aug 29, 2005, at 1:57 PM, Tom Lane wrote:


You must have CFLAGS set to empty in your build environment, because
configure will certainly default to -O2 if not overridden.  It works
fine for me on OS X.  Maybe you want to trace through the configure
script and see why it's doing something else?



/me hangs head in shame.

Yes. I'd been futzing with various settings and had CFLAGS set to
empty instead of cleared out.  8.0.3 and -snapshot (8/29) both now
seem to compile with -O2.


Anyway, I tried putting together a nice self-data-producing test case  
but that didn't cause the bug.  So I'm trying to get this dump as  
small as possible (I'll email you a url later).


To tide things over, here's the gprof (and shark) output for my query  
of doom.


linux box:

  6.36  0.41  0.41   240694  0.00  0.00  _bt_compare
  5.97  0.79  0.38   907242  0.00  0.00  AllocSetAlloc
  4.55  1.07  0.29   135008  0.00  0.00  hash_any
  4.16  1.34  0.27   185684  0.00  0.00  MemoryContextAllocZeroAligned
  3.30  1.55  0.21    39152  0.01  0.01  localsub
  2.98  1.74  0.19  1213172  0.00  0.00  AllocSetFreeIndex
  2.83  1.92  0.18    52695  0.00  0.00  nocachegetattr
  2.75  2.10  0.17   134775  0.00  0.00  hash_search
  2.51  2.25  0.16    47646  0.00  0.01  StrategyBufferLookup
  2.28  2.40  0.14    71990  0.00  0.00  fmgr_isbuiltin
  2.20  2.54  0.14    33209  0.00  0.00  _bt_moveright
  1.88  2.66  0.12    78864  0.00  0.00  comparetup_heap
  1.57  2.76  0.10    63485  0.00  0.00  SearchCatCache
  1.41  2.85  0.09    39152  0.00  0.00  timesub
  1.26  2.93  0.08   325246  0.00  0.00  tas
  1.26  3.01  0.08   305883  0.00  0.00  AllocSetFree
  1.26  3.09  0.08   162622  0.00  0.00  LWLockAcquire

and on osx: (self, total, library, func)

29.0%  29.0%  postmaster         _bt_checkkeys
15.6%  15.6%  postmaster         FunctionCall2
10.4%  10.4%  libSystem.B.dylib  __isnand
 9.5%   9.5%  postmaster         timestamp_cmp_internal
 9.3%   9.3%  postmaster         _bt_step
 5.3%   5.3%  postmaster         timestamp_le
 4.9%   4.9%  postmaster         _bt_next
 3.6%   3.6%  postmaster         dyld_stub___isnand
 3.1%   3.1%  postmaster         timestamp_gt
 1.9%   1.9%  postmaster         int4eq
 1.3%   1.3%  postmaster         BufferGetBlockNumber
 0.6%   0.6%  postmaster         LWLockAcquire
 0.5%   0.5%  postmaster         LWLockRelease
 0.4%   0.4%  postmaster         hash_search

On my failed simulated attempt, here's what things looked like (the  
data should have been relatively similar).


linux:

  5.39  0.28  0.28   852086  0.00  0.00  AllocSetAlloc
  4.90  0.53  0.25   130165  0.00  0.00  hash_any
  4.12  0.73  0.21   214061  0.00  0.00  _bt_compare
  4.12  0.94  0.21    39152  0.01  0.01  localsub
  4.02  1.15  0.20   160487  0.00  0.00  MemoryContextAllocZeroAligned
  3.24  1.31  0.17  1157316  0.00  0.00  AllocSetFreeIndex
  3.14  1.48  0.16    64375  0.00  0.00  fmgr_isbuiltin
  2.55  1.60  0.13    56142  0.00  0.00  SearchCatCache
  2.35  1.73  0.12   130076  0.00  0.00  hash_search
  1.76  1.81  0.09    39152  0.00  0.00  timesub
  1.67  1.90  0.09   221469  0.00  0.00  timestamp_cmp_internal
  1.67  1.99  0.09    56069  0.00  0.00  MemoryContextCreate
  1.57  2.06  0.08   145787  0.00  0.00  LWLockRelease
  1.37  2.13  0.07   289119  0.00  0.00  pfree
  1.37  2.21  0.07     8002  0.01  0.02  ExecMakeFunctionResult
  1.37  2.27  0.07     8000  0.01  0.22  ExecInitIndexScan
  1.18  2.33  0.06   291574  0.00  0.00  tas

and on osx: (which runs very fast, usually a couple hundred ms faster  
than the linux box)


5.9%  5.9%  postmaster         LWLockAcquire
5.2%  5.2%  postmaster         AllocSetAlloc
4.9%  4.9%  postmaster         LWLockRelease
3.9%  3.9%  postmaster         hash_any
3.6%  3.6%  postmaster         _bt_compare
2.9%  2.9%  postmaster         hash_search
2.6%  2.6%  postmaster         MemoryContextAllocZeroAligned
2.6%  2.6%  postmaster         ExecInitExpr
2.0%  2.0%  mach_kernel        ml_set_interrupts_enabled
2.0%  2.0%  postmaster         fmgr_info_cxt_security
2.0%  2.0%  postmaster         AllocSetFree
1.6%  1.6%  postmaster         MemoryContextAlloc
1.6%  1.6%  postmaster         FunctionCall2
1.6%  1.6%  postmaster         AllocSetDelete
1.6%  1.6%  libSystem.B.dylib  __isnand

which, to me anyway, looks like basically the same profile.
So there must be something about the exact nature of the data that is  
kicking it in the nuts.


I tried making a 

Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Ron

At 08:37 AM 8/30/2005, Alvaro Nunes Melo wrote:

Hello,

We are about to install a new PostgreSQL server, and despite of 
being a very humble configuration compared to the ones we see in the 
list, it's the biggest one we've got till now.


The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our 
main doubt is what is the best configuration for the disks. We are 
thinking about use them in a RAID-0 array. Is this the best option? 
What do you suggest on partitioning? Separate partitions for the OS, 
data and pg_xlog?


This is _very_ modest HW.  Unless your DB and/or DB load is similarly 
modest, you are not going to be happy with the performance of your DBMS.


At a minimum, for safety reasons you want 4 HDs: 2 for a RAID 1 set 
for the DB, and 2 for a RAID 1 set for the OS + pg_xlog.
2 extra HDs, even SCSI HDs, are cheap.  Especially when compared to 
the cost of corrupted or lost data.


HDs and RAM are cheap enough that you should be able to upgrade in 
more ways, but do at least that upgrade!


Beyond that, the best ways to spend your limited $ are highly 
dependent on your exact DB and its usage pattern.


Ron Peacetree



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Observation about db response time

2005-08-30 Thread Vivek Khera
On Aug 30, 2005, at 9:05 AM, Akshay Mathur wrote:

We were having a database in pgsql7.4.2 The database was responding
very slowly even after full vacuum analyze (select count(*) from
some_table_having_18000_records was taking 18 Sec).

On a 7.4.2 db, there should probably be no index bloat, but there could
be.  Does REINDEX on your tables help?  If not, then VACUUM FULL
followed by REINDEX may help.  The latter should result in nearly the
same as your dump+restore.  And you need to run vacuum often enough to
keep your tables from bloating.  How often that is depends on your
update/delete rate.

Also, updating to 8.0 may help.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
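A sketch of that sequence (the table name is a placeholder for whichever
tables are bloated):

```sql
REINDEX TABLE some_table;         -- try the cheapest fix first

-- If that is not enough, reclaim dead heap space, then rebuild again:
VACUUM FULL ANALYZE some_table;
REINDEX TABLE some_table;
```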

Re: [PERFORM] Need indexes on empty tables for good performance ?

2005-08-30 Thread Chris Travers

Lenard, Rohan (Rohan) wrote:

I've read that indexes aren't used for COUNT(*) and I've noticed 
(7.3.x) with EXPLAIN that indexes never seem to be used on empty 
tables - is there any reason to have indexes on empty tables, or will 
postgresql never use them.


You could add a row, vacuum analyze, delete the row, etc.  Then you 
are fine until you vacuum analyze again ;-)
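Spelled out, the trick looks something like this (hypothetical table; the
point is that ANALYZE records non-zero statistics that survive the DELETE):

```sql
INSERT INTO parent_table VALUES (...);   -- give the planner one real row
VACUUM ANALYZE parent_table;             -- statistics now say "non-empty"
DELETE FROM parent_table;                -- empty again, but stats are unchanged
```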


This is a feature designed to prevent really bad plans when you are 
loading tables with data.  However, you are right.  It can create bad 
plans sometimes.


Any chance one can eventually come up with a way to tell the planner 
that an empty table is expected not to grow?  Otherwise, I can see 
nightmares in a data warehouse environment where you have an empty 
parent table...


Best Wishes,
Chris Travers
Metatron Technology Consulting




[PERFORM] Observation about db response time

2005-08-30 Thread akshay
Hello Friends,
We were having a database in pgsql7.4. The database was responding very
slowly even after full vacuum (select
count(*) from some_table_having_18000_records was taking 18 Sec).

We took a backup of that db and restored it back. Now the same db on
same PC is responding fast (same query is taking 18 ms).

But we can't keep doing that as a solution to the slow response. Has
anybody faced a similar problem? Is this due to any internal problem of
pgsql? Is there any clue to speed up the database?

Regards,

akshay




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] High load and iowait but no disk access

2005-08-30 Thread Rémy Beaumont
We have been trying to pinpoint what originally seemed to be an I/O bottleneck but which now seems to be an issue with either Postgresql or RHES 3.

We have the following test environment on which we can reproduce the problem:

1) Test System A
Dell 6650 Quad Xeon Pentium 4
8 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS270 connected using an FC card using 10 disks

2) Test System B
Dell Dual Xeon Pentium III
2 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS920 connected using an FC card using 28 disks

Our Database size is around 30G. 

The behavior we see is that when running queries that do random reads on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a throughput below 3000kB/s (we usually average 4 kB/s to 8 kB/s on sequential read operations on the netapps).

The stats of the NetApp do confirm that it is sitting idle. Doing an strace on the Postgresql process shows that it is doing seeks and reads.

So my question is: where is this iowait time spent?
Is there a way to pinpoint the problem in more detail?
We are able to reproduce this behavior with Postgresql 7.4.8 and 8.0.3.

I have included the output of top, vmstat, strace and systat from the NetApp, from System B, while running a single query that generates this behavior.

Rémy

top output:
06:27:28  up 5 days, 16:59,  6 users,  load average: 1.04, 1.30, 1.01
72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:        cpu    user    nice  system    irq  softirq  iowait    idle
                 total    2.7%    0.0%    1.0%   0.1%     0.2%   46.0%   49.5%
                 cpu00    0.2%    0.0%    0.2%   0.0%     0.2%    2.2%   97.2%
                 cpu01    5.3%    0.0%    1.9%   0.3%     0.3%   89.8%    1.9%
Mem:  2061696k av, 2043936k used,   17760k free,       0k shrd,    3916k buff
                   1566332k actv,  296648k in_d,   30504k in_c
Swap: 16771584k av,   21552k used, 16750032k free                 1933772k cached

  PID USER      PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
30960 postgres   15   0 13424  10M  9908 D     2.7  0.5   2:00   1 postmaster
30538 root       15   0  1080  764   524 S     0.7  0.0   0:43   0 sshd
    1 root       15   0   496  456   436 S     0.0  0.0   0:08   0 init
    2 root       RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
    3 root       RT   0     0    0     0 SW    0.0  0.0   0:00   1 migration/1
    4 root       15   0     0    0     0 SW    0.0  0.0   0:01   0 keventd
    5 root       34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd/0
    6 root       34  19     0    0     0 SWN   0.0  0.0   0:00   1 ksoftirqd/1
    9 root       15   0     0    0     0 SW    0.0  0.0   0:24   1 bdflush
    7 root       15   0     0    0     0 SW    0.0  0.0   6:53   1 kswapd
    8 root       15   0     0    0     0 SW    0.0  0.0   8:44   1 kscand
   10 root       15   0     0    0     0 SW    0.0  0.0   0:13   0 kupdated
   11 root       25   0     0    0     0 SW    0.0  0.0   0:00   0 mdrecoveryd
   17 root       15   0     0    0     0 SW    0.0  0.0   0:00   0 ahc_dv_0


vmstat output 
procs                      memory      swap          io     system         cpu
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  1  21552  17796   4872 1931928    2    3     3     1   27     6  2  1  7  3
 0  1  21552  18044   4880 1931652    0    0  1652     0  397   512  1  2 50 47
 0  1  21552  17976   4896 1931664    0    0  2468     0  407   552  2  2 50 47
 1  0  21552  17984   4896 1931608    0    0  2124     0  418   538  3  3 48 46
 0  1  21552  18028   4900 1931536    0    0  1592     0  385   509  1  3 50 46
 0  1  21552  18040   4916 1931488    0    0  1620   820  419   581  2  2 50 46
 0  1  21552  17968   4916 1931536    0    4  1708     4  402   554  3  1 50 46
 1  1  21552  18052   4916 1931388    0    0  1772     0  409   531  3  1 49 47
 0  1  21552  17912   4924 1931492    0    0  1772     0  408   565  3  1 48 48
 0  1  21552  17932   4932 1931440    0    4  1356     4  391   545  5  0 49 46
 0  1  21552  18320   4944 1931016    0    4  1500   840  414   571  1  1 48 50
 0  1  21552  17872   4944 1931440    0    0  2116     0  392   496  1  5 46 48
 0  1  21552  18060   4944 1931232    0    0  2232     0  423   597  1  2 48 49
 1  1  21552  17684   4944 1931584    0    0  1752     0  395   537  1  1 50 48
 0  1  21552  18000   4944 1931240    0    0  1576     0  401   549  0  1 50 49


NetApp stats:
CPU  NFS CIFS HTTP  Total  Net kB/s   Disk kB/s   Tape kB/s  Cache Cache  CP  CP Disk  DAFS  FCP iSCSI  FCP kB/s
                            in  out   read write  read write   age   hit time ty util              in   out
 2%    0    0    0    139    0    0   2788     0     0     0     3   96%   0%  -  15%     0  139     0    3  2277
 2%    0    0    0    144    0    0   2504     0     0     0     3   96%   0%  -  18%     0  144     0    3  2150
 2%    0    0    0    130    0    0   2212     0     0     0     3   96%   0%  -  13%     0  130     0    3  1879
 3%    0    0    0    169    0    0   2937    80     0

Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Matthew Nuzum
  On Tue, Aug 30, 2005 at 09:37:17 -0300,
   Alvaro Nunes Melo [EMAIL PROTECTED] wrote:
  The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our main
  doubt is what is the best configuration for the disks. We are thinking
  about use them in a RAID-0 array. Is this the best option? What do you
  suggest on partitioning? Separate partitions for the OS, data and pg_xlog?
  
  Our main goal is performance speedup. Disk space might not be a problem.
  I've read a lot here about movig pg_xlog to different partitions, and
  we'll surely make tests to see what configuration might be better.
 

I've set up several servers with a config like this. It's not ideal,
but there's no reason you can't enjoy the benefits of a snappy
application.

The best results I've had involve dedicating one drive to OS, swap,
logs, tmp and everything else, and dedicating one drive to postgres. If
you use *nix you can mount the second drive as /var/lib/pgsql (or
wherever postgres lives on your server) with noatime as a mount option.

In retrospect, you might have saved the money on the second CPU and
gotten two more hard drives, but if you're running a dual task server
(i.e. LAMP) you may appreciate the second CPU.

The beauty of a server like this is that it puts more of the wizardry
of creating a fast application into the hands of the app developer,
which results in a better db schema, optimized queries and generally
*thinking* about the performance of the code. I personally feel that
to be a very rewarding aspect of my job. (As a hobby I program
microcontrollers that run at 4MHz and have only 256 bytes of RAM, so
that could just be me. ;-)

-- 
Matthew Nuzum
www.bearfruit.org



Re: [PERFORM] Need indexes on empty tables for good performance ?

2005-08-30 Thread Lenard, Rohan (Rohan)



Actually, the indexes on the child table do seem to get used - I just
wanted to make sure there was no penalty not having indexes on the
empty parent tables.

You are right - the parent is the best way to get at the unknown
children ...


From: Thomas F. O'Connell [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 30, 2005 6:15 AM
To: Lenard, Rohan (Rohan)
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Need indexes on empty tables for good performance ?

Rohan,

You should note that in Postgres, indexes are not inherited by child
tables.

Also, it seems difficult to select from a child table whose name you
don't know unless you access the parent. And if you are accessing the
data via the parent, I'm reasonably certain that you will find that
indexes aren't used (even if they exist on the children) as a result of
the way the children are accessed.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-469-5150
615-469-5151 (fax)

On Aug 22, 2005, at 10:41 PM, Lenard, Rohan (Rohan) wrote:

I've read that indexes aren't used for COUNT(*) and I've noticed
(7.3.x) with EXPLAIN that indexes never seem to be used on empty
tables - is there any reason to have indexes on empty tables, or will
postgresql never use them.

This is not as silly as it sounds - with table inheritance you might
have table children with the data and a parent that is empty. It'd be
nice to make sure postgresql knows to never really look at the parent -
especially if you don't know the names of all the children ..

Thoughts?

thx,
  Rohan


Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Rémy Beaumont


On 30-Aug-05, at 12:15, Tom Lane wrote:


=?ISO-8859-1?Q?R=E9my_Beaumont?= [EMAIL PROTECTED] writes:

The stats of the NetApp do confirm that it is sitting idle.


Really?






  CPU   NFS  CIFS  HTTP   TotalNet kB/s   Disk kB/s Tape kB/s
Cache Cache  CP   CP Disk   DAFS   FCP iSCSI   FCP  kB/s
   in   out   read  write  read write
age   hit time  ty util   in   out
   2% 0 0 0 139 0 0   2788  0 0 0
  3   96%   0%  -   15%  0   139 0 3  2277
   2% 0 0 0 144 0 0   2504  0 0 0
  3   96%   0%  -   18%  0   144 0 3  2150
   2% 0 0 0 130 0 0   2212  0 0 0
  3   96%   0%  -   13%  0   130 0 3  1879
   3% 0 0 0 169 0 0   2937 80 0 0
  3   96%   0%  -   13%  0   169 0 4  2718
   2% 0 0 0 139 0 0   2448  0 0 0
  3   96%   0%  -   12%  0   139 0 3  2096


I know zip about NetApps, but doesn't the 8th column indicate pretty
steady disk reads?

Yes, but they are very low.
At 15% usage, it's pretty much sitting idle if you consider that the OS 
reports that one of the processors is spending more than 80% of its 
time in iowait.


Rémy


regards, tom lane





Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Michael Stone

On Mon, Aug 29, 2005 at 09:42:46AM -0400, Rémy Beaumont wrote:
We have been trying to pinpoint what originally seem to be a I/O 
bottleneck but which now seems to be an issue with either Postgresql or 
RHES 3. 


Nope, it's an IO bottleneck.

The behavior we see is that when running queries that do random reads 
on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a 
throughput bellow 3000kB/s 


That's the sign of an IO bottleneck.

The stats of the NetApp do confirm that it is sitting idle. Doing an 
strace on the Postgresql process shows that is it doing seeks and 
reads. 

So my question is where is this iowait time spent ? 


Waiting for the seeks. postgres doesn't do async io, so it requests a
block, waits for it to come in, then requests another block, etc. The
utilization on the netapp isn't going to be high because it doesn't have
a queue of requests and can't do readahead, because the IO is random. The
only way to improve the situation would be to reduce the latency of the
seeks. If I read the numbers right you're only getting about 130
seeks/s, which ain't great. I don't know how much latency the netapp
adds in this configuration; have you tried benchmarking
direct-attach disks?
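That estimate can be sanity-checked with a little arithmetic (assuming 8 kB
single-block reads, PostgreSQL's default page size, and the ~130 ops/s
visible in the NetApp stats above):

```python
seeks_per_sec = 130   # roughly what the filer reports under this load
block_kb = 8          # PostgreSQL reads one 8 kB page per random fetch

# Random-read throughput is seek rate times block size...
throughput_kb = seeks_per_sec * block_kb
print(throughput_kb)          # 1040 kB/s -- a crawl, as observed

# ...and the implied average latency per request:
latency_ms = 1000 / seeks_per_sec
print(round(latency_ms, 1))   # 7.7 ms per seek
```

So the low disk throughput is exactly what ~130 random seeks per second
predicts; the bottleneck is latency, not bandwidth.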

Mike Stone



Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Tom Lane
=?ISO-8859-1?Q?R=E9my_Beaumont?= [EMAIL PROTECTED] writes:
 The stats of the NetApp do confirm that it is sitting idle.

Really?

   CPU   NFS  CIFS  HTTP   TotalNet kB/s   Disk kB/s Tape kB/s 
 Cache Cache  CP   CP Disk   DAFS   FCP iSCSI   FCP  kB/s
in   out   read  write  read write   
 age   hit time  ty util   in   out
2% 0 0 0 139 0 0   2788  0 0 0
   3   96%   0%  -   15%  0   139 0 3  2277
2% 0 0 0 144 0 0   2504  0 0 0
   3   96%   0%  -   18%  0   144 0 3  2150
2% 0 0 0 130 0 0   2212  0 0 0
   3   96%   0%  -   13%  0   130 0 3  1879
3% 0 0 0 169 0 0   2937 80 0 0
   3   96%   0%  -   13%  0   169 0 4  2718
2% 0 0 0 139 0 0   2448  0 0 0
   3   96%   0%  -   12%  0   139 0 3  2096

I know zip about NetApps, but doesn't the 8th column indicate pretty
steady disk reads?

regards, tom lane



Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Tom Lane
=?ISO-8859-1?Q?R=E9my_Beaumont?= [EMAIL PROTECTED] writes:
 On 30-Aug-05, at 12:15, Tom Lane wrote:
 I know zip about NetApps, but doesn't the 8th column indicate pretty
 steady disk reads?

 Yes, but they are very low.

Sure, but that's more or less what you'd expect if the thing is randomly
seeking all over the disk :-(.  Just because it's a NetApp doesn't mean
it's got zero seek time.

You did not say what sort of query this is, but I gather that it's doing
an indexscan on a table that is not at all in index order.  Possible
solutions involve reverting to a seqscan (have you forced the planner to
choose an indexscan here, either directly or by lowering random_page_cost?)
or CLUSTERing the table by the index (which would need to be repeated
periodically, so it's not a great answer).
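On these versions that looks roughly like the following (index and table
names are placeholders; note that CLUSTER takes an exclusive lock while
it rewrites the table):

```sql
CLUSTER some_index ON some_table;   -- rewrite the table in index order
ANALYZE some_table;                 -- refresh planner statistics afterwards
```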

regards, tom lane



Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Rémy Beaumont


On 30-Aug-05, at 12:29, Tom Lane wrote:


=?ISO-8859-1?Q?R=E9my_Beaumont?= [EMAIL PROTECTED] writes:

On 30-Aug-05, at 12:15, Tom Lane wrote:

I know zip about NetApps, but doesn't the 8th column indicate pretty
steady disk reads?



Yes, but they are very low.


Sure, but that's more or less what you'd expect if the thing is
randomly seeking all over the disk :-(.  Just because it's a NetApp
doesn't mean it's got zero seek time.

Per NetApp, the disk utilization percentage they report does include 
seek time, not just read/write operations.
NetApp has been involved in trying to figure out what is going on and 
their claim is that the NetApp filer is not IO bound.




You did not say what sort of query this is, but I gather that it's
doing an indexscan on a table that is not at all in index order.

Yes, most of those queries are doing an indexscan.  It's a fresh 
restore of our production database that we have vacuumed/analyzed.



Possible solutions involve reverting to a seqscan (have you forced the
planner to choose an indexscan here, either directly or by lowering
random_page_cost?)

No.

or CLUSTERing the table by the index (which would need to be repeated
periodically, so it's not a great answer).

Will try to cluster the tables and see if it changes anything.  Still 
doesn't explain what is going on with those seeks.


Thanks,

Rémy



regards, tom lane





Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Joshua D. Drake

Ron wrote:


At 08:37 AM 8/30/2005, Alvaro Nunes Melo wrote:


Hello,

We are about to install a new PostgreSQL server, and despite of being 
a very humble configuration compared to the ones we see in the list, 
it's the biggest one we've got till now.


The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. Our 
main doubt is what is the best configuration for the disks. We are 
thinking about use them in a RAID-0 array. Is this the best option? 
What do you suggest on partitioning? Separate partitions for the OS, 
data and pg_xlog?



This is _very_ modest HW.  Unless your DB and/or DB load is similarly 
modest, you are not going to be happy with the performance of your DBMS.


Well that is a pretty blanket statement. I have many customers who 
happily run on less hardware than what is mentioned above.

It all depends on the application itself and how the database is utilized.

At a minimum, for safety reasons you want 4 HDs: 2 for a RAID 1 set 
for the DB, and 2 for a RAID 1 set for the OS + pg_xlog.
2 extra HDs, even SCSI HDs, is cheap.  Especially when compared to the 
cost of corrupted or lost data.


Your real test is going to be prototyping the performance you need. A 
single RAID 1 mirror (don't use RAID 0) may be more than enough.
However, based on the fact that you specced Xeons, my guess is you
spent money on CPUs when you should have spent money on hard drives.

If you still have the budget, I would suggest considering either what 
Ron suggested or possibly using a 4 drive RAID 10 instead.


If you can't afford to put in a couple more SCSI disks, it may be 
worthwhile to put a software RAID 1 with ATA disks for the OS and swap, 
and then use straight SCSI hardware RAID 1 for the DB. That will allow 
you to push any swap operations off to the OS disks without sacrificing 
the performance and reliability of the database itself.

Sincerely,

Joshua D. Drake




HD's and RAM are cheap enough that you should be able to upgrade in 
more ways, but do at least that upgrade!


Beyond that, the best ways to spend you limited $ are highly dependent 
on your exact DB and its usage pattern.


Ron Peacetree







---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Need indexes on empty tables for good performance ?

2005-08-30 Thread Chris Travers

Lenard, Rohan (Rohan) wrote:

Actually the indexes on the child table do seem to get used - I just 
wanted to make sure there was no penalty not having indexes on the 
empty parent tables.
 
You are right - the parent is the best way to get at the unknown 
children ...


Indexes are created in the inheritance process, iirc.  However, index 
entries are not inherited, which means that index-based unique 
constraints don't properly get inherited.


Best Wishes,
Chris Travers
Metatron Technology Consulting

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Ron

At 12:56 PM 8/30/2005, Joshua D. Drake wrote:

Ron wrote:


At 08:37 AM 8/30/2005, Alvaro Nunes Melo wrote:


Hello,

We are about to install a new PostgreSQL server, and despite of 
being a very humble configuration compared to the ones we see in 
the list, it's the biggest one we've got till now.


The server is a Dual Xeon 3.0 with 2 GB RAM and two SCSI disks. 
Our main doubt is what is the best configuration for the disks. We 
are thinking about use them in a RAID-0 array. Is this the best 
option? What do you suggest on partitioning? Separate partitions 
for the OS, data and pg_xlog?



This is _very_ modest HW.  Unless your DB and/or DB load is 
similarly modest, you are not going to be happy with the 
performance of your DBMS.


Well that is a pretty blanket statement. I have many customers who 
happily run on less hardware than what is mentioned above.

It all depends on the application itself and how the database is utilized.


If your customers run happily on 2 HD's, then IME they have very 
modest DB storage and/or DB performance needs.  For safety reasons, 
the best thing to do if you only have 2 HD's is to run them as a RAID 
1 with everything on them.  The slightly better performing but 
considerably less safe alternative is to put the OS + logs on 1 HD 
and the DB on the other.  Any resemblance to a semi-serious OLTP load 
will reduce either such system to an HD IO bound one with poor IO rates.


If, as above, your DBMS is bounded by the performance of one HD, then 
you are AT BEST getting the raw IO rate of such a device: say 
~70-80MB/s in average sustained raw sequential IO.  File system 
overhead and any seeking behavior will rapidly reduce that number to 
considerably less.  Consider that the CPU - memory IO subsystem is 
easily capable of ~3.2GBps.  So you are talking about slowing the DB 
server to at most ~1/40, maybe even as little as ~1/200, its 
potential under such circumstances.
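The arithmetic behind those ratios can be checked directly (the figures are the approximations quoted above, not measurements; the 16 MB/s seek-bound figure is an illustrative assumption):

```shell
mem_bw=3200    # CPU <-> memory bandwidth in MB/s (~3.2 GBps, as above)
hd_seq=80      # best-case sustained sequential HD read, MB/s
hd_rand=16     # assumed seek-bound throughput, MB/s (illustrative)

echo "sequential: 1/$((mem_bw / hd_seq)) of memory bandwidth"
echo "seek-bound: 1/$((mem_bw / hd_rand)) of memory bandwidth"
```

which reproduces the ~1/40 and ~1/200 figures.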


If your DB can fit completely in RAM and/or does light duty write IO, 
this may not be a serious issue.  OTOH, once you start using those 
HD's to any reasonable extent, most of the rest of the investment 
you've made in server HW is wasted.


As I keep saying, the highest priority in purchasing a DBMS is to 
make sure you have enough HD IO bandwidth.  RAM comes second, and CPU 
is a distant third.



At a minimum, for safety reasons you want 4 HDs: 2 for a RAID 1 set 
for the DB, and 2 for a RAID 1 set for the OS + pg_xlog.
2 extra HDs, even SCSI HDs, is cheap.  Especially when compared to 
the cost of corrupted or lost data.


Your real test is going to be prototyping the performance you need. 
A single RAID 1 mirror (don't use RAID 0) may be more
than enough. However, based on the fact that you speced Xeons my 
guess is you spent money on CPUs when you should have spent money on 
hard drives.


I agree with Josh on both points.  Don't use RAID 0 for persistent 
data unless you like losing data.  Spend more on HDs and RAM and less 
on CPU's (fast FSB is far more important than high clock rate.  In 
general buy the highest FSB with the slowest clock rate.).  In fact, 
if you are that strapped for cash, exchange those 2 SCSI HD's for 
their $ equivalent in SATA HD's.  The extra spindles will be well worth it.



If you still have the budget, I would suggest considering either 
what Ron suggested or possibly using a 4 drive RAID 10 instead.


IME, with only 4 HDs, it's usually better to split them into two 
RAID 1's (one for the db, one for everything else including the logs) 
than it is to put everything on one RAID 10.  YMMV.
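For reference, a sketch of that two-mirror layout with Linux software RAID; the device names are placeholders, and a hardware controller would do the equivalent in its BIOS:

```shell
# Two independent RAID 1 pairs (illustrative device names):
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb   # DB
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd   # OS, logs, pg_xlog
```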



Ron Peacetree



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Josh Berkus
Remy,

 The behavior we see is that when running queries that do random reads
 on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a
 throughput below 3000kB/s (We usually average 40,000 kB/s to 80,000 kB/s
 on sequential read operations on the netapps)

This seems pretty low for a NetApp -- you should be able to manage up to 
180mb/s, if not higher.   Are you sure it's configured correctly?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Woody Woodring



Have you tried a different kernel? We run with a 
netapp over NFS without any issues, but we have seen high IO-wait on other Dell 
boxes (running and not running postgres) and RHES 3. We have 
replaced a Dell PowerEdge 350 running RH 7.3 with a PE750 with more memory 
running RHES3 and it became bogged down with IO waits due to syslog messages 
writing to the disk; the old slower server could handle it fine. I don't know if 
it is a Dell thing or a RH kernel, but we try different kernels on our boxes to 
try to find one that works better. We have not found one that stands out 
over another consistently but we have been moving away from the Update 2 kernel 
(2.4.21-15.ELsmp) due to server lockup issues. Unfortunately we get the 
best disk throughput on our few remaining 7.3 boxes.

Woody

IGLASS Networks
www.iglass.net


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rémy Beaumont
Sent: Monday, August 29, 2005 9:43 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] High load and iowait but no disk access

We have been trying to pinpoint what originally seemed to be an I/O
bottleneck but which now seems to be an issue with either Postgresql or RHES 3.

We have the following test environment on which we can reproduce the problem:

1) Test System A
Dell 6650 Quad Xeon Pentium 4
8 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS270 connected using an FC card using 10 disks

2) Test System B
Dell Dual Xeon Pentium III
2 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS920 connected using an FC card using 28 disks

Our Database size is around 30G.

The behavior we see is that when running queries that do random reads on
disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a
throughput below 3000kB/s (We usually average 40,000 kB/s to 80,000 kB/s
on sequential read operations on the netapps)

The stats of the NetApp do confirm that it is sitting idle. Doing an
strace on the Postgresql process shows that it is doing seeks and reads.

So my question is where is this iowait time spent ?
Is there a way to pinpoint the problem in more details ?
We are able to reproduce this behavior with Postgresql 7.4.8 and 8.0.3

I have included the output of top, vmstat, strace and systat from the
Netapp from System B while running a single query that generates this
behavior.

Rémy

top output:
06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 2.7% 0.0% 1.0% 0.1% 0.2% 46.0% 49.5%
cpu00 0.2% 0.0% 0.2% 0.0% 0.2% 2.2% 97.2%
cpu01 5.3% 0.0% 1.9% 0.3% 0.3% 89.8% 1.9%
Mem: 2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
1566332k actv, 296648k in_d, 30504k in_c
Swap: 16771584k av, 21552k used, 16750032k free 1933772k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
30960 postgres 15 0 13424 10M 9908 D 2.7 0.5 2:00 1 postmaster
30538 root 15 0 1080 764 524 S 0.7 0.0 0:43 0 sshd
1 root 15 0 496 456 436 S 0.0 0.0 0:08 0 init
2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
4 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 keventd
5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
9 root 15 0 0 0 0 SW 0.0 0.0 0:24 1 bdflush
7 root 15 0 0 0 0 SW 0.0 0.0 6:53 1 kswapd
8 root 15 0 0 0 0 SW 0.0 0.0 8:44 1 kscand
10 root 15 0 0 0 0 SW 0.0 0.0 0:13 0 kupdated
11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
17 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 ahc_dv_0

vmstat output
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
0 1 21552 17796 4872 1931928 2 3 3 1 27 6 2 1 7 3
0 1 21552 18044 4880 1931652 0 0 1652 0 397 512 1 2 50 47
0 1 21552 17976 4896 1931664 0 0 2468 0 407 552 2 2 50 47
1 0 21552 17984 4896 1931608 0 0 2124 0 418 538 3 3 48 46
0 1 21552 18028 4900 1931536 0 0 1592 0 385 509 1 3 50 46
0 1 21552 18040 4916 1931488 0 0 1620 820 419 581 2 2 50 46
0 1 21552 17968 4916 1931536 0 4 1708 4 402 554 3 1 50 46
1 1 21552 18052 4916 1931388 0 0 1772 0 409 531 3 1 49 47
0 1 21552 17912 4924 1931492 0 0 1772 0 408 565 3 1 48 48
0 1 21552 17932 4932 1931440 0 4 1356 4 391 545 5 0 49 46
0 1 21552 18320 4944 1931016 0 4 1500 840 414 571 1 1 48 50
0 1 21552 17872 4944 1931440 0 0 2116 0 392 496 1 5 46 48
0 1 21552 18060 4944 1931232 0 0 2232 0 423 597 1 2 48 49
1 1 21552 17684 4944 1931584 0 0 1752 0 395 537 1 1 50 48
0 1 21552 18000 4944 1931240 0 0 1576 0 401 549 0 1 50 49

NetApp stats:
CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP CP Disk DAFS FCP iSCSI FCP kB/s
in out read write read write age hit time ty util in out
2% 0 0 0 139 0 0 2788 0 0 0 3 96% 0% - 15% 0 139 0 3 2277
2% 0 0 0 144 0 0 2504 0 0 0 3 96% 0% - 18% 0 144 0 3 2150
2% 0 0 0 130 0 0 2212 0 0 0 3 96% 0% - 13% 0 130 0 3 1879
3% 0 0 0 169 0 0 2937 80 0 0 3 96% 0% - 13% 0 169 0 4 2718
2% 0 0 0 139 0 0 2448 0 0 0 3 96% 0% - 12% 0 139 0 3 2096
2% 0 0 0 137 0 0 2116 0 0 0 3 96% 0% - 10% 0

Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Rémy Beaumont


On 30-Aug-05, at 14:32, Josh Berkus wrote:


Remy,


The behavior we see is that when running queries that do random reads
on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a
throughput below 3000kB/s (We usually average 40,000 kB/s to 80,000 kB/s
on sequential read operations on the netapps)


This seems pretty low for a NetApp -- you should be able to manage up to
180mb/s, if not higher.   Are you sure it's configured correctly?

Hi Josh,

The config has been reviewed by NetApp. We do get rates higher than 
80MB/s, but on average, that's what we get.


Do you have NetApp filers deployed ?
How many spindles do you have in your volume ?
On which OS are you running Postgres ?

Thanks,

Rémy



--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---

TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Observation about db response time

2005-08-30 Thread george young
 On Sat, 27 Aug 2005 21:28:57 +0530 (IST)
[EMAIL PROTECTED] threw this fish to the penguins:

 Hello Friends,
 We were having a database in pgsql7.4. The database was responding very
 slowly even after full vacuum (select
 count(*) from some_table_having_18000_records was taking 18 Sec).

One comment here: "select count(*)" may seem like a good benchmark, but 
generally it's not.  If your application really depends on this number, fine.
Otherwise, you should measure performance with a real query from your
application.  The "select count(*)" can be very slow because it does
not use indexes.
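For example, rather than timing count(*), time something the application actually issues (the table and column here are invented placeholders):

```sql
-- Hypothetical application query; compare its plan and timing
-- before and after maintenance:
EXPLAIN ANALYZE
SELECT * FROM some_table WHERE customer_id = 42;
```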

 We took a backup of that db and restored it back. Now the same db on
 same PC is responding fast (same query is taking 18 ms).

This sounds like some index is getting gooped up.  If you do a lot of
deleting from tables, your indexes can collect dead space that vacuum
can not reclaim.  Try in SQL "reindex table my_slow_table" for a
suspect table.  In the contrib directory of the postgresql
distribution there is a script called "reindexdb".  You can run this
to reindex your whole database.
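A minimal sketch of that cleanup, using the placeholder table name from above:

```sql
-- Rebuild the suspect table's indexes in place, discarding dead space:
REINDEX TABLE my_slow_table;
```

The contrib reindexdb script just wraps the same command for every table in a database.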

I also wonder about file system slowdowns.  What hardware/OS/filesystem
are you using?


-- George

-- 
Are the gods not just?  Oh no, child.
What would become of us if they were? (CSL)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Rémy Beaumont


On 30-Aug-05, at 14:46, Anjan Dave wrote:

I have seen references of changing the kernel io scheduler at boot 
time…not sure if it applies to RHEL3.0, or will help, but try setting 
‘elevator=deadline’ during boot time or via grub.conf.

That's only for RHEL 4.0.


Have you tried running a simple ‘dd’ on the LUN?

We get amazing performance using dd.
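For reference, the sort of quick sequential probe being discussed looks like this (the device path and size are placeholders; run as root against the actual LUN, and read well past RAM size to defeat caching):

```shell
# Raw sequential read of ~8 GB from the LUN (illustrative path):
dd if=/dev/sdb of=/dev/null bs=1M count=8192
```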

The drives are in RAID10 configuration, right?

NetApp has their own type of raid format (RAID4 aka WAFL)

Rémy

 
Thanks,
Anjan

From: Woody Woodring [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 30, 2005 2:30 PM
To: 'Rémy Beaumont'; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] High load and iowait but no disk access
 
Have you tried a different kernel?  We run with a netapp over NFS 
without any issues, but we have seen high IO-wait on other Dell boxes 
(running and not running postgres) and RHES 3.  We have replaced a 
Dell PowerEdge 350 running RH 7.3 with a PE750 with more memory 
running RHES3 and it became bogged down with IO waits due to syslog 
messages writing to the disk; the old slower server could handle it 
fine.  I don't know if it is a Dell thing or a RH kernel, but we try 
different kernels on our boxes to try to find one that works better.  
We have not found one that stands out over another consistently but we 
have been moving away from the Update 2 kernel (2.4.21-15.ELsmp) due to 
server lockup issues.  Unfortunately we get the best disk throughput 
on our few remaining 7.3 boxes.

 
Woody
 
IGLASS Networks
www.iglass.net
 


From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Rémy 
Beaumont

Sent: Monday, August 29, 2005 9:43 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] High load and iowait but no disk access
We have been trying to pinpoint what originally seemed to be an I/O 
bottleneck but which now seems to be an issue with either Postgresql 
or RHES 3.


 We have the following test environment on which we can reproduce the 
problem:


 1) Test System A
 Dell 6650 Quad Xeon Pentium 4
 8 Gig of RAM
 OS: RHES 3 update 2
 Storage: NetApp FAS270 connected using an FC card using 10 disks

 2) Test System B
 Dell Dual Xeon Pentium III
 2 Gig of RAM
 OS: RHES 3 update 2
 Storage: NetApp FAS920 connected using an FC card using 28 disks

 Our Database size is around 30G.

 The behavior we see is that when running queries that do random reads 
on disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a 
throughput below 3000kB/s (We usually average 40,000 kB/s to 
80,000 kB/s on sequential read operations on the netapps)


 The stats of the NetApp do confirm that it is sitting idle. Doing an 
strace on the Postgresql process shows that it is doing seeks and 
reads.


 So my question is where is this iowait time spent ?
 Is there a way to pinpoint the problem in more details ?
 We are able to reproduce this behavior with Postgresql 7.4.8 and 8.0.3

 I have included the output of top,vmstat,strace and systat from the 
Netapp from System B while running a single query that generates this 
behavior.


 Rémy

 top output:
 06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
 72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
 CPU states: cpu user nice system irq softirq iowait idle
 total 2.7% 0.0% 1.0% 0.1% 0.2% 46.0% 49.5%
 cpu00 0.2% 0.0% 0.2% 0.0% 0.2% 2.2% 97.2%
 cpu01 5.3% 0.0% 1.9% 0.3% 0.3% 89.8% 1.9%
 Mem: 2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
 1566332k actv, 296648k in_d, 30504k in_c
 Swap: 16771584k av, 21552k used, 16750032k free 1933772k cached

 PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
 30960 postgres 15 0 13424 10M 9908 D 2.7 0.5 2:00 1 postmaster
 30538 root 15 0 1080 764 524 S 0.7 0.0 0:43 0 sshd
 1 root 15 0 496 456 436 S 0.0 0.0 0:08 0 init
 2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
 3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
 4 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 keventd
 5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
 6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
 9 root 15 0 0 0 0 SW 0.0 0.0 0:24 1 bdflush
 7 root 15 0 0 0 0 SW 0.0 0.0 6:53 1 kswapd
 8 root 15 0 0 0 0 SW 0.0 0.0 8:44 1 kscand
 10 root 15 0 0 0 0 SW 0.0 0.0 0:13 0 kupdated
 11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
 17 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 ahc_dv_0


 vmstat output
 procs memory swap io system cpu
 r b swpd free buff cache si so bi bo in cs us sy id wa
 0 1 21552 17796 4872 1931928 2 3 3 1 27 6 2 1 7 3
 0 1 21552 18044 4880 1931652 0 0 1652 0 397 512 1 2 50 47
 0 1 21552 17976 4896 1931664 0 0 2468 0 407 552 2 2 50 47
 1 0 21552 17984 4896 1931608 0 0 2124 0 418 538 3 3 48 46
 0 1 21552 18028 4900 1931536 0 0 1592 0 385 509 1 3 50 46
 0 1 21552 18040 4916 1931488 0 0 1620 820 419 581 2 2 50 46
 0 1 21552 17968 4916 1931536 0 4 1708 4 402 554 3 1 50 46
 1 1 21552 18052 4916 1931388 0 0 1772 0 409 531 3 1 49 47
 0 1 21552 17912 4924 1931492 0 0 1772 0 408 565 3 1 48 48
 0 1 21552 17932 4932 1931440 0 4 1356 4 391 545 5 0 

Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Matthew Nuzum
On 8/30/05, Ron [EMAIL PROTECTED] wrote:
 If you still have the budget, I would suggest considering either
 what Ron suggested or possibly using a 4 drive RAID 10 instead.
 
 IME, with only 4 HDs, it's usually better to split them into two
 RAID 1's (one for the db, one for everything else including the logs)
 than it is to put everything on one RAID 10.  YMMV.

This corresponds to what I have observed as well. Of course, we all
know that workloads vary.

Just a note for the OP who has only two drives, there are tools for a
variety of OSs that monitor the S.M.A.R.T. features of the drive and
give an early warning in case it senses impending failure. I've caught
two drives before failure with these types of tools.

Also note that when reading discussions of this nature you must take
into consideration the value of your data. For some people, restoring
from a nightly backup is inconvenient, but not life-or-death. Some
people even do twice-daily backups so that in case of a failure they
can recover with little loss of data. This might be a good way to
mitigate the cost of expensive server hardware. If you cannot afford
to lose any data then you need to consider it imperative to use some
type of RAID setup (not RAID 0) and to achieve great performance
you'll want more than 2 drives.
-- 
Matthew Nuzum
www.bearfruit.org

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread Anjan Dave








I have seen references of changing the kernel io scheduler at boot
time... not sure if it applies to RHEL3.0, or will help, but try setting
'elevator=deadline' during boot time or via grub.conf. Have you tried
running a simple 'dd' on the LUN? The drives are in RAID10
configuration, right?



Thanks,

Anjan









From: Woody Woodring [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 30, 2005 2:30 PM
To: 'Rémy Beaumont'; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] High load and iowait but no disk access





Have you tried a different kernel? We run with a netapp over NFS
without any issues, but we have seen high IO-wait on other Dell boxes
(running and not running postgres) and RHES 3. We have replaced a Dell
PowerEdge 350 running RH 7.3 with a PE750 with more memory running
RHES3 and it became bogged down with IO waits due to syslog messages
writing to the disk; the old slower server could handle it fine. I
don't know if it is a Dell thing or a RH kernel, but we try different
kernels on our boxes to try to find one that works better. We have not
found one that stands out over another consistently but we have been
moving away from the Update 2 kernel (2.4.21-15.ELsmp) due to server
lockup issues. Unfortunately we get the best disk throughput on our
few remaining 7.3 boxes.



Woody



IGLASS Networks

www.iglass.net









From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rémy Beaumont
Sent: Monday, August 29, 2005 9:43 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] High load and iowait but no disk access

We have been trying to pinpoint what originally seemed to be an I/O
bottleneck but which now seems to be an issue with either Postgresql or RHES 3.

We have the following test environment on which we can reproduce the problem:

1) Test System A
Dell 6650 Quad Xeon Pentium 4
8 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS270 connected using an FC card using 10 disks

2) Test System B
Dell Dual Xeon Pentium III
2 Gig of RAM
OS: RHES 3 update 2
Storage: NetApp FAS920 connected using an FC card using 28 disks

Our Database size is around 30G. 

The behavior we see is that when running queries that do random reads on disk,
IOWAIT goes over 80% and actual disk IO falls to a crawl at a throughput below
3000kB/s (We usually average 40,000 kB/s to 80,000 kB/s on sequential read
operations on the netapps)

The stats of the NetApp do confirm that it is sitting idle. Doing an strace on
the Postgresql process shows that it is doing seeks and reads.

So my question is where is this iowait time spent ?
Is there a way to pinpoint the problem in more details ?
We are able to reproduce this behavior with Postgresql 7.4.8 and 8.0.3

I have included the output of top,vmstat,strace and systat from the Netapp from
System B while running a single query that generates this behavior.

Rémy

top output:
06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 2.7% 0.0% 1.0% 0.1% 0.2% 46.0% 49.5%
cpu00 0.2% 0.0% 0.2% 0.0% 0.2% 2.2% 97.2%
cpu01 5.3% 0.0% 1.9% 0.3% 0.3% 89.8% 1.9%
Mem: 2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
1566332k actv, 296648k in_d, 30504k in_c
Swap: 16771584k av, 21552k used, 16750032k free 1933772k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
30960 postgres 15 0 13424 10M 9908 D 2.7 0.5 2:00 1 postmaster
30538 root 15 0 1080 764 524 S 0.7 0.0 0:43 0 sshd
1 root 15 0 496 456 436 S 0.0 0.0 0:08 0 init
2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
4 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 keventd
5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
9 root 15 0 0 0 0 SW 0.0 0.0 0:24 1 bdflush
7 root 15 0 0 0 0 SW 0.0 0.0 6:53 1 kswapd
8 root 15 0 0 0 0 SW 0.0 0.0 8:44 1 kscand
10 root 15 0 0 0 0 SW 0.0 0.0 0:13 0 kupdated
11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
17 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 ahc_dv_0


vmstat output 
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
0 1 21552 17796 4872 1931928 2 3 3 1 27 6 2 1 7 3
0 1 21552 18044 4880 1931652 0 0 1652 0 397 512 1 2 50 47
0 1 21552 17976 4896 1931664 0 0 2468 0 407 552 2 2 50 47
1 0 21552 17984 4896 1931608 0 0 2124 0 418 538 3 3 48 46
0 1 21552 18028 4900 1931536 0 0 1592 0 385 509 1 3 50 46
0 1 21552 18040 4916 1931488 0 0 1620 820 419 581 2 2 50 46
0 1 21552 17968 4916 1931536 0 4 1708 4 402 554 3 1 50 46
1 1 21552 18052 4916 1931388 0 0 1772 0 409 531 3 1 49 47
0 1 21552 17912 4924 1931492 0 0 1772 0 408 565 3 1 48 48
0 1 21552 17932 4932 1931440 0 4 1356 4 391 545 5 0 49 46
0 1 21552 18320 4944 1931016 0 4 1500 840 414 571 1 1 48 50
0 1 21552 17872 4944 1931440 0 0 2116 0 392 496 1 5 46 48
0 1 21552 18060 4944 1931232 0 0 2232 0 423 597 1 2 48 49
1 1 21552 17684 4944 1931584 0 0 1752 0 395 537 1 1 50 48
0 1 

Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Joshua D. Drake




If you still have the budget, I would suggest considering either what 
Ron suggested or possibly using a 4 drive RAID 10 instead.



IME, with only 4 HDs, it's usually better to split them into two 
RAID 1's (one for the db, one for everything else including the logs) 
than it is to put everything on one RAID 10.  YMMV.


Really? That's interesting.  My experience is different, I assume SCSI? 
Software/Hardware Raid?


Sincerely,

Joshua D. Drake





Ron Peacetree




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] High load and iowait but no disk access

2005-08-30 Thread mudfoot
This might be optimal behavior from the hardware.  Random reads are hard to
optimize for--except if you have enough physical memory to hold the entire
dataset.  Cached reads (either in array controller or OS buffer cache) should
return nearly immediately.  But random reads probably aren't cached.  And any
read-ahead algorithms or other types of performance enhancements in the
hardware or OS go out the window--because the behavior isn't predictable.

Each time a drive spindle needs to move to a new track, it requires at least a
couple of milliseconds.  Sequential reads only require this movement
infrequently.  But random reads may be forcing this movement for every IO 
operation.
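Rough numbers make the point: with a few milliseconds per repositioning, a single spindle serving one 8 kB PostgreSQL page per seek tops out at roughly the throughput reported earlier in the thread (the seek time here is an illustrative assumption):

```shell
seek_ms=5      # assumed per-random-IO positioning time, ms
page_kb=8      # PostgreSQL page size, kB

ios_per_sec=$((1000 / seek_ms))                         # ~200 random IOs/s
echo "random-read ceiling: $((ios_per_sec * page_kb)) kB/s"
```

i.e. ~1600 kB/s, the same order of magnitude as the ~2000-3000 kB/s the NetApp was delivering.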

Since the bottleneck in random reads is the physical hard drives themselves,
everything else stands around waiting.  Fancy hardware can optimize everything
else -- writes with write cache, sequential reads with read-ahead and read
cache.  But there's no real solution to a purely random read workload except
perhaps creating different disk groups to help avoid spindle contention.

I like this tool:  http://www.soliddata.com/products/iotest.html
It allows you to select pure workloads (read/write/sequential/random), and it
runs against raw devices, so you bypass the OS buffer cache.  When I've run it
I've always seen sequential activity get much much higher throughput than 
random.

Quoting Anjan Dave [EMAIL PROTECTED]:

 I have seen references of changing the kernel io scheduler at boot time...not
 sure if it applies to RHEL3.0, or will help, but try setting
 'elevator=deadline' during boot time or via grub.conf. Have you tried running
 a simple 'dd' on the LUN? The drives are in RAID10 configuration, right?
 
  
 
 Thanks,
 
 Anjan
 
   _  
 
 From: Woody Woodring [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, August 30, 2005 2:30 PM
 To: 'Rémy Beaumont'; pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] High load and iowait but no disk access
 
  
 
 Have you tried a different kernel?  We run with a netapp over NFS without any
 issues, but we have seen high IO-wait on other Dell boxes (running  and not
 running postgres) and RHES 3.  We have replaced a Dell PowerEdge 350 running
  RH 7.3  with a PE750 with more memory running RHES3 and it became bogged down
  with IO waits due to syslog messages writing to the disk; the old slower
 server could handle it fine.  I don't know if it is a Dell thing or a RH
 kernel, but we try different kernels on our boxes to try to find one that
 works better.  We have not found one that stands out over another
 consistently but we have been moving away from Update 2 kernel
 (2.4.21-15.ELsmp) due to server lockup issues.  Unfortunately we get the best
 disk throughput on our few remaining 7.3 boxes.
 
  
 
 Woody
 
  
 
 IGLASS Networks
 
 www.iglass.net
 
  
 
   _  
 
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Rémy Beaumont
 Sent: Monday, August 29, 2005 9:43 AM
 To: pgsql-performance@postgresql.org
 Subject: [PERFORM] High load and iowait but no disk access
 
 We have been trying to pinpoint what originally seem to be a I/O bottleneck
 but which now seems to be an issue with either Postgresql or RHES 3.
 
 We have the following test environment on which we can reproduce the
 problem:
 
 1) Test System A
 Dell 6650 Quad Xeon Pentium 4
 8 Gig of RAM
 OS: RHES 3 update 2
 Storage: NetApp FAS270 connected using an FC card using 10 disks
 
 2) Test System B
 Dell Dual Xeon Pentium III
  2 Gig of RAM
 OS: RHES 3 update 2
 Storage: NetApp FAS920 connected using an FC card using 28 disks
 
 Our Database size is around 30G. 
 
 The behavior we see is that when running queries that do random reads on
 disk, IOWAIT goes over 80% and actual disk IO falls to a crawl at a
  throughput below 3000kB/s (We usually average 40,000 kB/s to 80,000 kB/s on
 sequential read operations on the netapps)
 
 The stats of the NetApp do confirm that it is sitting idle. Doing an strace
  on the Postgresql process shows that it is doing seeks and reads.
 
 So my question is where is this iowait time spent ?
 Is there a way to pinpoint the problem in more details ?
 We are able to reproduce this behavior with Postgresql 7.4.8 and 8.0.3
 
 I have included the output of top,vmstat,strace and systat from the Netapp
 from System B while running a single query that generates this behavior.
 
 Rémy
 
 top output:
 06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
 72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
 CPU states: cpu user nice system irq softirq iowait idle
 total 2.7% 0.0% 1.0% 0.1% 0.2% 46.0% 49.5%
 cpu00 0.2% 0.0% 0.2% 0.0% 0.2% 2.2% 97.2%
 cpu01 5.3% 0.0% 1.9% 0.3% 0.3% 89.8% 1.9%
 Mem: 2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
 1566332k actv, 296648k in_d, 30504k in_c
 Swap: 16771584k av, 21552k used, 16750032k free 1933772k cached
 
 PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
 30960 postgres 15 0 13424 10M 9908 D 2.7 0.5 2:00 1 

[PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread Markus Benne
We have a highly active table that has virtually all
entries updated every 5 minutes.  Typical size of the
table is 50,000 entries, and entries have grown fat.

We are currently vacuuming hourly, and towards the end
of the hour we are seeing degradation, when compared
to the top of the hour.

Vacuum is slowly killing our system, as it is starting
to take up to 10 minutes, and load at the time of
vacuum is 6+ on a Linux box.  During the vacuum, the
overall system goes unresponsive, then comes back
once vacuum completes.

If we run vacuum less frequently, degradation
continues to the point that we can't keep up with the
throughput, plus vacuum takes longer anyway.

It's becoming quite a pickle :-)

We are thinking of splitting the table in two: the
part that updates often, and the part that updates
infrequently, as we suspect that record size impacts
vacuum.

Any ideas?


Thanks,
Mark

-

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread Tom Lane
Markus Benne [EMAIL PROTECTED] writes:
 We have a highly active table that has virtually all
 entries updated every 5 minutes.  Typical size of the
 table is 50,000 entries, and entries have grown fat.

 We are currently vacuuming hourly, and towards the end
 of the hour we are seeing degradation, when compared
 to the top of the hour.

On something like this, you really need to be vacuuming more often
not less so; I'd think about how to do it every five or ten minutes 
rather than backing off.  With only 50K rows it should really not take
more than a couple of seconds to do the vacuum.  When you wait till
there are 600K dead rows, it's going to take awhile, plus you are
suffering across-the-board performance degradation from all the dead
rows.
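Scheduling that is a one-line cron entry (the table and database names are placeholders):

```shell
# crontab fragment: vacuum-analyze just the hot table every 5 minutes
# */5 * * * *  vacuumdb --analyze --table my_hot_table my_db
```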

If you are using PG 8.0, there are some vacuum cost knobs you can
fiddle with to slow down vacuum so it doesn't impose as much I/O load.
Ideally you could get it to where you could run vacuum as often as
you need to without noticing much impact on foreground processing.
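For reference, the knobs Tom is referring to look something like this in postgresql.conf (PostgreSQL 8.0+). The values below are illustrative starting points, not tuned recommendations:

```conf
# Cost-based vacuum delay -- example values only
vacuum_cost_delay = 10       # ms to nap once the cost limit is hit (0 disables)
vacuum_cost_limit = 200      # accumulated I/O cost that triggers the nap
vacuum_cost_page_hit = 1     # cost of a page found in shared_buffers
vacuum_cost_page_miss = 10   # cost of a page read from disk
vacuum_cost_page_dirty = 20  # cost of a page vacuum has to dirty
```

Raising the delay or lowering the limit makes vacuum gentler but slower; the right trade-off depends on your I/O capacity.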

If you're not using 8.0 ... maybe it's time to update.

Another thing you might want to do is look at vacuum verbose output,
which will give you some idea of the time spent in each step.  It might
be there are specific aspects that could be improved.

 We are thinking of splitting the table in two: the
 part the updates often, and the part the updates
 infrequently as we suspect that record size impacts
 vacuum.

You just said that virtually all rows update constantly --- where's
the infrequent part?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread mark
On Tue, Aug 30, 2005 at 05:29:17PM -0400, Tom Lane wrote:
 Markus Benne [EMAIL PROTECTED] writes:
  We have a highly active table that has virtually all
  entries updated every 5 minutes.  Typical size of the
  table is 50,000 entries, and entries have grown fat.
 ...
  We are thinking of splitting the table in two: the
  part the updates often, and the part the updates
  infrequently as we suspect that record size impacts
  vacuum.
 You just said that virtually all rows update constantly --- where's
 the infrequent part?

I think he means splitting it vertically, instead of horizontally, and
it sounds like an excellent idea, if a large enough portion of each
record is in fact mostly fixed. Otherwise, PostgreSQL is copying data
multiple times, only to have the data expire as part of a dead row.
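A sketch of what such a vertical split might look like in SQL. All table and column names here are hypothetical, purely to illustrate the idea of moving the hot columns into their own narrow table:

```sql
-- Wide, rarely-updated half
CREATE TABLE item_static (
    id      integer PRIMARY KEY,
    descr   text,                 -- fat, mostly-fixed data
    created timestamp
);

-- Narrow, frequently-updated half; vacuuming this stays cheap
CREATE TABLE item_hot (
    id      integer PRIMARY KEY REFERENCES item_static(id),
    status  text,                 -- updated every few minutes
    updated timestamp
);

-- Reads that need both halves now pay a join:
-- SELECT s.descr, h.status FROM item_static s JOIN item_hot h USING (id);
```

The win only materializes if the static half really is wide and really is static; otherwise the join overhead eats the savings, as Tom notes later in this thread.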

I've already started to notice such issues with postgresql - but more
because I'm using low-end hardware, and I'm projecting the effect for
when our database becomes much larger with much higher demand on the
database.

This is the sort of scenario where a database without transactional
integrity would significantly out-perform one designed around it. If
records are fixed sized, and updated in place, these problems would
occur far less often. Is it heresy to suggest MySQL in here? :-)

I switched from MySQL to PostgreSQL several months ago, and haven't
looked back - but they do work differently, and for certain uses, one
can destroy the other. Using a MyISAM table would be the way I would
go with this sort of problem.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] 'Real' auto vacuum?

2005-08-30 Thread Ralph Mason
This is a wild and crazy thought which I am sure is invalid for some 
good reason.


But why can't postgres just vacuum itself as it goes along?

When a row is orphaned it's added to a list of possibly available rows.  
When a new row is needed the list of possible rows is examined and the 
first one with a transaction id less than the lowest running transaction 
id is chosen to be the new row?  These rows can be in a heap so it's 
really fast to find one.
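As a toy illustration of the data structure Ralph describes (this is not how Postgres works today, and the xids and slot locations below are made up), a min-heap keyed by the deleting transaction id makes the "is the oldest dead row reusable?" check O(1) and removal O(log n):

```python
import heapq

# Toy model of the proposed freelist: dead rows enter a min-heap keyed by
# the xid that deleted them; a slot is reusable once that xid is older
# than the oldest transaction still running.
class FreeList:
    def __init__(self):
        self._heap = []  # entries are (deleting_xid, row_location)

    def add_dead(self, xid, loc):
        heapq.heappush(self._heap, (xid, loc))

    def reuse(self, oldest_running_xid):
        # Pop a slot only if no running transaction can still see it
        if self._heap and self._heap[0][0] < oldest_running_xid:
            return heapq.heappop(self._heap)[1]
        return None  # nothing safely reusable; extend the table instead

fl = FreeList()
fl.add_dead(100, "page 1, slot 3")
fl.add_dead(90, "page 7, slot 1")
print(fl.reuse(95))   # row deleted by xid 90 is now invisible to everyone
print(fl.reuse(80))   # xid 100 may still be visible, so nothing is returned
```

The hard parts mark raises - durability of this structure and cross-backend synchronization - are exactly what this toy skips.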


Like magic - no more vacuuming.  No more holes for people to fall into.

Is this an oversimplification of the problem?

Ralph

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] 'Real' auto vacuum?

2005-08-30 Thread mark
On Wed, Aug 31, 2005 at 10:21:20AM +1200, Ralph Mason wrote:
 This is a wild and crazy thought which I am sure is invalid for some 
 good reason.
 
 But why can't postgres just vacuum itself as it goes along?
 
 When a row is orphaned it's added to a list of possibly available rows.  
 When a new row is needed the list of possible rows is examined and the 
 first one with a transaction id less than the lowest running transaction 
 id is chosen to be the new row?  These rows can be in a heap so it's 
 really fast to find one.
 
 Like magic - no more vacuuming.  No more holes for people to fall into.

Yes please. :-)

 Is this an oversimplification of the problem?

But, yeah. It's probably not that easy, especially with really big
databases. Where is this free list stored? How efficient is it to keep
track of the lowest running transaction at all times? How does one
synchronize access to this free list, to ensure that processes don't
block up waiting for access to the free list? Is the free list
journalled to prevent corruption, and the accidental re-use of a still
in use row? And, there would be a cost to scanning this list on every
insert or update.

As an outsider (like you?) I see the current model as a design flaw as
well. A neat and tidy model on paper. Not so nice in real life. The
need to vacuum in batch mode, to keep the database from dying, seems
intuitively bad.

I think there must be answers to this problem. Even simple
optimizations, such as defining a table such that any delete or update
within a table, upon commit, will attempt to vacuum just the rows that
should not be considered free for any new transactions. If it's in
use by an active transaction, oh well. It can be picked up by a batch
run of vacuum. If it's free though - let's do it now.

I think any optimizations we come up with, will be more happily accepted
with a working patch that causes no breakage... :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] 'Real' auto vacuum?

2005-08-30 Thread Josh Berkus
Ralph,

 When a row is orphaned it's added to a list of possibly available rows.
 When a new row is needed the list of possible rows is examined and the
 first one with a transaction id less than the lowest running transaction
 id is chosen to be the new row?  These rows can be in a heap so it's
 really fast to find one.

This is the long-term plan.  However, it's actually a lot harder than it 
sounds.  Patches welcome.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Ron

At 03:27 PM 8/30/2005, Joshua D. Drake wrote:


If you still have the budget, I would suggest considering either 
what Ron suggested or possibly using a 4 drive RAID 10 instead.



IME, with only 4 HDs, it's usually better to split them into 
two RAID 1's (one for the db, one for everything else including the 
logs) than it is to put everything on one RAID 10.  YMMV.


Really? That's interesting.  My experience is different, I assume 
SCSI? Software/Hardware Raid?


The issue exists regardless of technologies used, although the 
technology used does affect when things become an irritation or 
serious problem.


The issue with everything on the same HD set seems to be that under 
light loads anything works reasonably well, but as load increases 
contention between DB table access, OS access, and xlog writes can 
cause performance problems.


In particular, _everything_ else hangs while logs are being written 
with everything on the same HD set.  Thus leaving you with the 
nasty choices of small log writes that cause more seeking behavior, 
and the resultant poor overall HD IO performance, or large log writes 
that basically freeze the server until they are done.


Having the logs on a different HD, and if possible different IO bus, 
reduces this effect to a minimum and seems to be a better choice than 
the shared everything approach.


Although this effect seems largest when there are fewest HDs, the 
general pattern is that one should use as many spindles as one can 
make use of and that they should be as dedicated as possible in their 
purpose(s).  That's why the TPC benchmarked systems tend to have 
literally 100's of HDs, and they tend to be split into very focused purposes.


Ron Peacetree



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread Chris Browne
[EMAIL PROTECTED] (Markus Benne) writes:
 We have a highly active table that has virtually all
 entries updated every 5 minutes.  Typical size of the
 table is 50,000 entries, and entries have grown fat.

 We are currently vacuuming hourly, and towards the end
 of the hour we are seeing degradation, when compared
 to the top of the hour.

You're not vacuuming the table nearly often enough.

You should vacuum this table every five minutes, and possibly more
often than that.

[We have some tables like that, albeit smaller than 50K entries, which
we vacuum once per minute in production...]

 We are thinking of splitting the table in two: the part the updates
 often, and the part the updates infrequently as we suspect that
 record size impacts vacuum.

There's *some* merit to that.

You might discover that there's a hot spot that needs to be vacuumed
once per minute.

But it may be simpler to just hit the table with a vacuum once every
few minutes even though some tuples are seldom updated.
-- 
output = reverse(gro.gultn @ enworbbc)
http://cbbrowne.com/info/spreadsheets.html
Signs  of  a  Klingon  Programmer #3:  By   filing this  TPR you have
challenged the honor of my family. Prepare to die!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] When to do a vacuum for highly active table

2005-08-30 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I think he means splitting it vertically, instead of horizontally, and
 it sounds like an excellent idea, if a large enough portion of each
 record is in fact mostly fixed. Otherwise, PostgreSQL is copying data
 multiple times, only to have the data expire as part of a dead row.

Only up to a point.  Fields that are wide enough to get toasted
out-of-line (multiple Kb) do not get physically copied if there's
a row update that doesn't affect them.  We don't really have enough
information about his table to guess whether there's any point in
manually partitioning the columns, but my leaning would be probably
not --- the overhead in joining the resulting two tables would be
high enough that you'd need a heck of a big improvement to justify it.
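A quick way to check whether a column's values are actually wide enough to be toasted out-of-line (the table and column names are hypothetical; out-of-line storage typically kicks in around the 2 kB mark):

```sql
-- If the average width is well under ~2 kB, the values are stored inline
-- and every update really does recopy them.
SELECT avg(octet_length(big_column)) AS avg_bytes FROM my_table;
```

If the fat fields are below that threshold, Tom's caveat applies in full: they get copied on every update, and the vertical split becomes more attractive.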

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] 'Real' auto vacuum?

2005-08-30 Thread Ralph Mason

[EMAIL PROTECTED] wrote:


But, yeah. It's probably not that easy, especially with really big
databases. Where is this free list stored? How efficient is it to keep
track of the lowest running transaction at all times? How does one
synchronize access to this free list, to ensure that processes don't
block up waiting for access to the free list? Is the free list
journalled to prevent corruption, and the accidental re-use of a still
in use row? And, there would be a cost to scanning this list on every
insert or update.
 

I suspect the freelist could be stored as an index, and handily 
postgres supports those out of the box.  There would be a cost, yes, 
but then what is the cost of adding pages to the file all the time?  I 
guess as with all things there is no one size fits all, so perhaps you 
could turn it off - although I expect for 99.9% of the cases 'on' would 
be the better choice. If it gets broken, there is already the reindex 
code that can fix it. A coherency / fixing / recovery command for a table 
would probably be a useful tool anyway.



As an outsider (like you?) I see the current model as a design flaw as
well. A neat and tidy model on paper. Not so nice in real life. The
need to vacuum in batch mode, to keep the database from dying, seems
intuitively bad.
 

We have a script that vacuums the database every 5 minutes - excessive, 
yes, but it turns out that any less is no good really.  I think this 
is sub-optimal: the DB work keeps running, but the vacuum can slow down 
other tasks.  It also probably flushes data that we need out of 
the page cache, since it has to look at rarely-used data as the 
vacuum runs. Not the most optimal data access pattern I could imagine.



I think there must be answers to this problem. Even simple
optimizations, such as defining a table such that any delete or update
within a table, upon commit, will attempt to vacuum just the rows that
should not be considered free for any new transactions. If it's in
use by an active transaction, oh well. It can be picked up by a batch
run of vacuum. If it's free though - let's do it now.
 

Anything would be good - I think it's the Achilles' heel of postgres. 
Perhaps something simple like that could fix 95% of the problem.



I think any optimizations we come up with, will be more happily accepted
with a working patch that causes no breakage... :-)

 


I am sure they would.

Cheers
Ralph


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Michael Stone

On Tue, Aug 30, 2005 at 07:02:28PM -0400, Ron wrote:
purpose(s).  That's why the TPC benchmarked systems tend to have 
literally 100's of HD's and they tend to be split into very focused 
purposes.


Of course, TPC benchmark systems are constructed such that cost and
storage capacity are irrelevant--in the real world things tend to be
more complicated.

Mike Stone

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Ron

At 08:04 PM 8/30/2005, Michael Stone wrote:

On Tue, Aug 30, 2005 at 07:02:28PM -0400, Ron wrote:
purpose(s).  That's why the TPC benchmarked systems tend to have 
literally 100's of HD's and they tend to be split into very focused purposes.


Of course, TPC benchmark systems are constructed such that cost and 
storage capacity are irrelevant--in the real world things tend to be

more complicated.


The scary thing is that I've worked on RW production systems that 
bore a striking resemblance to a TPC benchmark system.  As you can 
imagine, they uniformly belonged to BIG organizations (read: lot's 'o 
$$$) who were using the systems for mission critical stuff where 
either it was company-existence-threatening for the system to be 
down, or they would lose much $$$ per min of down time, or both.


Financial institutions, insurance companies, central data mines for 
Fortune 2000 companies, etc _all_ build systems that push the state 
of the art in how much storage can be managed and how many HDs, CPUs, 
RAM DIMMs, etc are usable.


Historically, this has been the sole province of Oracle and DB2 on 
the SW side and equally outrageously priced custom HW.  Clearly, I'd 
like to see PostgreSQL change that ;-)


Ron Peacetree



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Michael Stone

On Tue, Aug 30, 2005 at 08:41:40PM -0400, Ron wrote:
The scary thing is that I've worked on RW production systems that 
bore a striking resemblance to a TPC benchmark system.  As you can 
imagine, they uniformly belonged to BIG organizations (read: lot's 'o 
$$$) who were using the systems for mission critical stuff where 
either it was company-existence-threatening for the system to be 
down, or they would lose much $$$ per min of down time, or both.


Yeah, and that market is relevant to someone with one dell server and 2
hard disks how? 


Mike Stone

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] RAID Configuration Sugestion

2005-08-30 Thread Ron

At 08:43 PM 8/30/2005, Michael Stone wrote:

On Tue, Aug 30, 2005 at 08:41:40PM -0400, Ron wrote:
The scary thing is that I've worked on RW production systems that 
bore a striking resemblance to a TPC benchmark system.  As you can 
imagine, they uniformly belonged to BIG organizations (read: lot's 
'o $$$) who were using the systems for mission critical stuff where 
either it was company-existence-threatening for the system to be 
down, or they would lose much $$$ per min of down time, or both.


Yeah, and that market is relevant to someone with one dell server 
and 2 hard disks how?

Because successful small companies that _start_ with one small server 
and 2 HDs grow to _become_ companies that need far more HW; ...and in 
the perfect world their SW scales to their increased needs...


_Without_ exponentially increasing their costs or overhead (as Oracle 
and DB2 currently do)


THIS is the real long term promise of OS DBMS.

Ron Peacetree



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] 'Real' auto vacuum?

2005-08-30 Thread Mindaugas Riauba

  When a row is orphaned it's added to a list of possibly available rows.
  When a new row is needed the list of possible rows is examined and the
  first one with a transaction id less then the lowest running transaction
  id is chosen to be the new row?  These rows can be in a heap so it's
  really fast to find one.

 This is the long-term plan.However, it's actually a lot harder than it
 sounds.  Patches welcome.

  Some ETA? That would be the most welcome addition for us. We
have a few very heavily updated databases where table bloat and constant
vacuuming are killing performance.

  Mindaugas


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly