Re: [PERFORM] query produces 1 GB temp file

2006-10-27 Thread Dirk Lutzebäck




Hi,

I'm sorry, but it looks like my computer has resent older posts from me,
sigh...


Dirk

Alexander Staubo wrote:
While I can't explain why PostgreSQL would use that much
memory, I recommend looking into tweaking the work_mem parameter. This
setting specifies how much memory PostgreSQL may use for certain temporary
data structures (hash tables, sort vectors) before it starts using temporary
files. Quoting the docs:
  
  
  work_mem (integer)

Specifies the amount of memory to be used by internal sort operations
and hash tables before switching to temporary disk files. The value is
specified in kilobytes, and defaults to 1024 kilobytes (1 MB). Note
that for a complex query, several sort or hash operations might be
running in parallel; each one will be allowed to use as much memory as
this value specifies before it starts to put data into temporary files.
Also, several running sessions could be doing such operations
concurrently. So the total memory used could be many times the value of
work_mem; it is necessary to keep this fact in mind when choosing the
value. Sort operations are used for ORDER BY, DISTINCT, and merge
joins. Hash tables are used in hash joins, hash-based aggregation, and
hash-based processing of IN subqueries.
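
As a concrete way to apply that advice, the limit can be raised just for
the session that runs the big sort rather than server-wide. A minimal
sketch (the 256 MB figure is an arbitrary example, not a recommendation;
on 7.4 the parameter is still called sort_mem):

-- Raise the per-sort memory limit only for this session (value in KB).
SET sort_mem = 262144;      -- 7.4 name of the setting
-- SET work_mem = 262144;   -- same setting under its 8.x name
-- ... run the large DISTINCT ON / ORDER BY query here ...
RESET sort_mem;             -- return to the server default afterwards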

  
  
Alexander.
  
  
On Feb 5, 2005, at 18:25, Dirk Lutzebaeck wrote:
  
  
Hi,

here is a query which produces a temp file of over 1 GB in pgsql_tmp. This
is on pgsql 7.4.2, RHEL 3.0, XEON MP machine with 32GB RAM, 300MB
sort_mem and 320MB shared_mem.

Below is the query and results for EXPLAIN and EXPLAIN ANALYZE. All
tables have been analyzed before.

Can someone please explain why the temp file is so huge? I understand
there are a lot of rows. All relevant indices seem to be used.

Thanks in advance,

Dirk


EXPLAIN
SELECT DISTINCT ON (ft.val_9, ft.created, ft.flatid) ft.docstart,
       ft.flatobj, bi.oid, bi.en
FROM bi, en, df AS ft, es
WHERE bi.rc=130170467
  AND bi.en=ft.en
  AND bi.co=117305223
  AND bi.hide=FALSE
  AND ft.en=en.oid
  AND es.en=bi.en
  AND es.co=bi.co
  AND es.spec=122293729
  AND (ft.val_2='DG' OR ft.val_2='SK')
  AND ft.docstart=1
ORDER BY ft.val_9 ASC, ft.created DESC
LIMIT 1000 OFFSET 0;


Limit  (cost=8346.75..8346.78 rows=3 width=1361)
  ->  Unique  (cost=8346.75..8346.78 rows=3 width=1361)
        ->  Sort  (cost=8346.75..8346.76 rows=3 width=1361)
              Sort Key: ft.val_9, ft.created, ft.flatid
              ->  Nested Loop  (cost=0.00..8346.73 rows=3 width=1361)
                    ->  Nested Loop  (cost=0.00..5757.17 rows=17 width=51)
                          ->  Nested Loop  (cost=0.00..5606.39 rows=30 width=42)
                                ->  Index Scan using es_sc_index on es  (cost=0.00..847.71 rows=301 width=8)
                                      Index Cond: ((spec = 122293729) AND (co = 117305223::oid))
                                ->  Index Scan using bi_env_index on bi  (cost=0.00..15.80 rows=1 width=42)
                                      Index Cond: ("outer".en = bi.en)
                                      Filter: ((rc = 130170467::oid) AND (co = 117305223::oid) AND (hide = false))
                          ->  Index Scan using en_oid_index on en  (cost=0.00..5.01 rows=1 width=9)
                                Index Cond: ("outer".en = en.oid)
                    ->  Index Scan using df_en on df ft  (cost=0.00..151.71 rows=49 width=1322)
                          Index Cond: ("outer".en = ft.en)
                          Filter: (((val_2 = 'DG'::text) OR (val_2 = 'SK'::text)) AND (docstart = 1))
(17 rows)



--


EXPLAIN ANALYZE gives:



Limit  (cost=8346.75..8346.78 rows=3 width=1361) (actual time=75357.465..75679.964 rows=1000 loops=1)
  ->  Unique  (cost=8346.75..8346.78 rows=3 width=1361) (actual time=75357.459..75675.371 rows=1000 loops=1)
        ->  Sort  (cost=8346.75..8346.76 rows=3 width=1361) (actual time=75357.448..75499.263 rows=22439 loops=1)
              Sort Key: ft.val_9, ft.created, ft.flatid
              ->  Nested Loop  (cost=0.00..8346.73 rows=3 width=1361) (actual time=34.104..18016.005 rows=703677 loops=1)
                    ->  Nested Loop  (cost=0.00..5757.17 rows=17 width=51) (actual time=0.467..3216.342 rows=48563 loops=1)
                          ->  Nested Loop  (cost=0.00..5606.39 rows=30 width=42) (actual time=0.381..1677.014 rows=48563 loops=1)
                                ->  Index Scan using es_sc_index on es  (cost=0.00..847.71 rows=301 width=8) (actual time=0.184..46.519 rows=5863 loops=1)
                                      Index Cond: ((spec = 122293729) AND (co = 117305223::oid))
                                ->  Index Scan using bi_env_index on bi  (cost=0.00..15.80 rows=1 width=42) (actual time=0.052..0.218 rows=8 loops=5863)
                                      Index Cond: ("outer".en = bi.en)
                                      Filter: ((rc = 130170467::oid) AND (co = 117305223::oid) AND (hide = false))
                          ->  Index Scan using en_oid_index on en  (cost=0.00..5.01 rows=1 width=9) (actual time=0.015..0.019 rows=1 loops=48563)
                                Index Cond: ("outer".en = en.oid)
                    ->  Index Scan using df_en on df ft  (cost=0.00..151.71 rows=49 width=1322) (actual time=0.038..0.148 rows=14 loops=48563)
                          Index Cond: ("outer".en = ft.en)
                          Filter: (((val_2 = 'DG'::text) OR (val_2 = 'SK'::text)) AND (docstart = 1))

Total 
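
For what it's worth, the numbers in the EXPLAIN ANALYZE above already
account for a temp file of roughly that size; a back-of-the-envelope
estimate only, not an exact accounting of the on-disk sort format:

-- Rows fed into the Sort node:     703,677 (actual rows of the Nested Loop)
-- Estimated width at that node:    ~1,361 bytes per row
-- 703,677 rows x ~1,361 bytes  ~=  958 MB of sort input,
-- far above the 300 MB sort_mem, so the sort spills to pgsql_tmp.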

[PERFORM] Performance problems on 4-way AMD Opteron 875 (dual core)

2005-08-05 Thread Dirk Lutzebäck




[[I'm
posting this on behalf of my co-worker who cannot post to this list at
the moment]]

Hi,


I have installed PostgreSQL on a 4-way AMD Opteron 875 (dual core) and
the performance isn't at the expected level.


Details:

The "old" server is a 4-way XEON MP 3.0 GHz with 4MB L3 cache, 32 GB
RAM (PC1600) and local FC-RAID 10. Hyper-Threading is off. (DL580)

The "old" server is using Red Hat Enterprise Linux 3 Update 5.

The "new" server is a 4-way Opteron 875 with 1 MB L2 cache, 32 GB RAM
(PC3200) and the same local FC-RAID 10. (HP DL585)

The "new" server is using Red Hat Enterprise Linux 4 (with the latest
x86_64 kernel from Red Hat - 2.6.9-11.ELsmp #1 SMP Fri May 20 18:25:30
EDT 2005 x86_64)

I use PostgreSQL version 8.0.3.


The issue is that the Opteron is slower than the XEON MP under high load.
I have created a test with parallel queries which are typical for my
application. The queries range from small queries (0.1 seconds) to
larger queries using joins (15 seconds).

The test starts parallel clients. Each client runs the queries in a
random order. The test makes sure that a client always uses the same
random order to get valid results.


Here are the numbers of queries which the server finished in a fixed
period of time.

I used a PostgreSQL 8.1 snapshot from last week compiled as a 64-bit binary
for DL585-64bit.

I used PostgreSQL 8.0.3 compiled as a 32-bit binary for DL585-32bit and
DL580.

During the tests everything which is needed is in the file cache; there
was no read activity.

Context switch spikes are over 5 during the test on both servers.
My feeling is that the XEON has slightly more context switches.





PostgreSQL params:

max_locks_per_transaction = 256

shared_buffers = 4

effective_cache_size = 384

work_mem = 30

maintenance_work_mem = 512000

wal_buffers = 32

checkpoint_segments = 24



I was expecting twice as many queries on the DL585. The DL585 with
PostgreSQL 8.0.3 32-bit melts down earlier than the XEON in production
use. Please compare 4 clients and 8 clients: with 4 clients the Opteron
is in front, and with 8 clients the XEON does not melt down as much as
the Opteron.


I don't have any idea what causes this. Benchmarks like SAP's SD 2-tier
show that the DL585 can handle nearly three times more load than the
DL580 with the XEON 3.0. We chose the 4-way Opteron 875 based on such
benchmarks to replace the 4-way XEON MP.


Does anyone have comments or ideas on where I should focus my work?


I guess the shared buffers cause the meltdown when too many clients are
accessing the same data.

I don't understand why the 4-way XEON MP 3.0 can deal with this better
than the 4-way Opteron 875.

The system load on the Opteron is never over 3.0. The XEON MP has a
load up to 4.0.


Should I try other settings for PostgreSQL in postgresql.conf?

Should I try other settings for the compilation?


I will compile the latest PostgreSQL 8.1 snapshot for 32-bit to evaluate
the new shared buffer code from Tom.

I think the 64-bit build is slow because my queries are CPU intensive.


Can someone provide a commercial support contact for this issue?


Sven.







Re: [PERFORM] Performance problems on 4/8way Opteron (dualcore) HP

2005-07-31 Thread Dirk Lutzebäck
Does anybody know if Red Hat is already supporting this patch in an
enterprise version?


Regards,

Dirk



J. Andrew Rogers wrote:

On 7/29/05 10:46 AM, Josh Berkus josh@agliodbs.com wrote:


does anybody have experience with this machine (4x 875 dual core Opteron
CPUs)?


Nope.   I suspect that you may be the first person to report in on
dual-cores.  There may be special compile issues with dual-cores that
we've not yet encountered.




There was recently a discussion of similar types of problems on a couple of
the supercomputing lists, regarding surprisingly substandard performance
from large dual-core opteron installations.

The problem as I remember it boiled down to the Linux kernel handling
memory/process management very badly on large dual core systems --
pathological NUMA behavior.  However, this problem has apparently been fixed
in Linux v2.6.12+, and using the more recent kernel on large dual core
systems generated *massive* performance improvements on these systems for
the individuals with this issue.  Using the patched kernel, one gets the
performance most people were expecting.

The v2.6.12+ kernels are a bit new, but they contain a very important
performance patch for systems like the one above.  It would definitely be
worth testing if possible.


J. Andrew Rogers





---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Performance problems on 4/8way Opteron (dualcore) HP

2005-07-31 Thread Dirk Lutzebäck

Hi Jeff,

which box are you running precisely and which OS/kernel?

We need to run 32-bit because we need failover to a 32-bit XEON system
(DL580). If this does not work out we probably need to switch to 64-bit
(dump/restore) and run another 64-bit failover box too.


Regards,

Dirk



Jeffrey W. Baker wrote:

On Fri, 2005-07-29 at 10:46 -0700, Josh Berkus wrote:


Dirk,



does anybody have experience with this machine (4x 875 dual core Opteron
CPUs)?



I'm using dual 275s without problems.


Nope.   I suspect that you may be the first person to report in on 
dual-cores.  There may be special compile issues with dual-cores that 
we've not yet encountered.



Doubtful.  However you could see improvements using recent Linux kernel
code.  There have been some patches for optimizing scheduling and memory
allocations.

However, if you are running this machine in 32-bit mode, why did you
bother paying $14,000 for your CPUs?  You will get FAR better
performance in 64-bit mode.  64-bit mode will give you 30-50% better
performance on PostgreSQL loads, in my experience.  Also, if I remember
correctly, the 32-bit x86 kernel doesn't understand Opteron NUMA
topology, so you may be seeing poor memory allocation decisions.

-jwb


We run RHEL 3.0, 32-bit, and under high load it is a drag. We
mostly run memory-demanding queries. Context switches are pretty much
around 20.000 on average, with no cs spikes when we run many processes in
parallel. Actually we only see two processes in the running state! When
there are only a few processes running, context switches go much higher.
At the moment we are much slower than with a 4-way XEON box (DL580).


Um, that was a bit incoherent.  Are you seeing a CS storm or aren't you?




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] Performance problems on 4/8way Opteron (dualcore) HP DL585

2005-07-29 Thread Dirk Lutzebäck

Hi,

does anybody have experience with this machine (4x 875 dual core Opteron
CPUs)? We run RHEL 3.0, 32-bit, and under high load it is a drag. We
mostly run memory-demanding queries. Context switches are pretty much
around 20.000 on average, with no cs spikes when we run many processes in
parallel. Actually we only see two processes in the running state! When
there are only a few processes running, context switches go much higher.
At the moment we are much slower than with a 4-way XEON box (DL580).


We are running 8.0.3 compiled with -mathlon flags.

Regards,

Dirk

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Optimizer seems to be way off, why?

2005-07-20 Thread Dirk Lutzebäck

Hi,

I do not understand the following explain output (pgsql 8.0.3):

explain analyze
select b.e from b, d
where b.r=516081780 and b.c=513652057 and b.e=d.e;

QUERY PLAN

Nested Loop  (cost=0.00..1220.09 rows=1 width=4) (actual time=0.213..2926.845 rows=324503 loops=1)
  ->  Index Scan using b_index on b  (cost=0.00..1199.12 rows=1 width=4) (actual time=0.104..17.418 rows=3293 loops=1)
        Index Cond: (r = 516081780::oid)
        Filter: (c = 513652057::oid)
  ->  Index Scan using d_e_index on d  (cost=0.00..19.22 rows=140 width=4) (actual time=0.009..0.380 rows=99 loops=3293)
        Index Cond: ("outer".e = d.e)
Total runtime: 3638.783 ms
(7 rows)

Why is the row estimate for b_index and the nested loop 1? It is
actually 3293 and 324503, respectively.


I did VACUUM ANALYZE before and I also increased the STATISTICS TARGET 
on b.e to 500. No change.
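
For reference, the statistics-target change mentioned above corresponds to
the following statements (the value 500 is taken from the sentence above):

-- Bump the per-column statistics target and re-analyze:
ALTER TABLE b ALTER COLUMN e SET STATISTICS 500;
ANALYZE b;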


Here is the size of the tables:

select count(oid) from b;
3532161

select count(oid) from b where r=516081780 and c=513652057;
 3293

select count(oid) from d;
117270


Regards,

Dirk

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Optimizer seems to be way off, why?

2005-07-20 Thread Dirk Lutzebäck

Richard Huxton wrote:

Dirk Lutzebäck wrote:


Hi,

I do not understand the following explain output (pgsql 8.0.3):

explain analyze
select b.e from b, d
where b.r=516081780 and b.c=513652057 and b.e=d.e;

QUERY PLAN
 

Nested Loop  (cost=0.00..1220.09 rows=1 width=4) (actual time=0.213..2926.845 rows=324503 loops=1)
  ->  Index Scan using b_index on b  (cost=0.00..1199.12 rows=1 width=4) (actual time=0.104..17.418 rows=3293 loops=1)
        Index Cond: (r = 516081780::oid)
        Filter: (c = 513652057::oid)
  ->  Index Scan using d_e_index on d  (cost=0.00..19.22 rows=140 width=4) (actual time=0.009..0.380 rows=99 loops=3293)
        Index Cond: ("outer".e = d.e)
Total runtime: 3638.783 ms
(7 rows)

Why is the row estimate for b_index and the nested loop 1? It is
actually 3293 and 324503, respectively.



I'm guessing (and that's all it is) that b.r and b.c have a higher
correlation than the planner is expecting. That is, it expects the
b.c=... condition to reduce the number of matching rows much more than it
actually does.


Try a query just on WHERE b.r=516081780 and see if it gets the estimate 
right for that.


If it's a common query, it might be worth an index on (r,c)

--
  Richard Huxton
  Archonet Ltd



Thanks Richard, dropping the condition on b.c now gives better estimates (it
also uses a different index now), although they are still not accurate (off
by a factor of 10). This query is embedded in a larger query which now got a
1000-fold speedup (!) because I can drop b.c entirely, since it is redundant.


Though, why can't the planner see this correlation? I think somebody
said the planner does not know about multi-column correlations; is that
right?
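
A minimal way to see that independence assumption at work, sketched against
the b table from this thread (the two-column index at the end is the one
Richard suggested; its name is made up):

-- Compare the planner's estimate for each condition alone with the
-- estimate for the combination; the planner multiplies the two
-- selectivities as if r and c were independent, so correlated columns
-- get underestimated.
EXPLAIN ANALYZE SELECT e FROM b WHERE r = 516081780;
EXPLAIN ANALYZE SELECT e FROM b WHERE c = 513652057;
EXPLAIN ANALYZE SELECT e FROM b WHERE r = 516081780 AND c = 513652057;

-- Richard's suggestion, if the combined query is common:
CREATE INDEX b_r_c_index ON b (r, c);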


Regards,

Dirk


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] SURVEY: who is running postgresql on 8 or more CPUs?

2005-06-02 Thread Dirk Lutzebäck

Hi,

I just got one reply for this survey. Is almost nobody using postgresql
on 8+ CPU machines?


Regards,

Dirk

Dirk Lutzebäck wrote:


Hi,

I would like to start a little survey of who is running postgresql on an
8-way or larger machine (Intel, Sparc, AMD, no matter). Purpose: find out
how postgresql runs in high-performance areas.


Please fillout:

Machine (Vendor, Product):
Architecture (Intel/Sparc/AMD/IBM):
Processors (Type/Number/GHz):
RAM:
Operating System:
PostgreSQL Version:
Database size (GB):
Disk system:
Type of application:
Your email contact:
Willing to answer questions in this group:
Comments:


Please answer here or to me. I will compile the results and feed them back
here.


Regards,

Dirk


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly



--
Dirk Lutzebäck [EMAIL PROTECTED] Tel +49.30.5362.1635 Fax .1638
CTO AEC/communications GmbH, Berlin, Germany, http://www.aeccom.com


---(end of broadcast)---
TIP 6: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] SURVEY: who is running postgresql on 8 or more CPUs?

2005-06-02 Thread Dirk Lutzebäck

Hi Dawid,

postgresql is open source and we also want it to be used in high
performance areas. What's wrong with people telling which machines
they use it on? I don't care about business details, but technical details
would be quite interesting. In the end it is interesting to know how you
need to tune postgresql on high-end machines and how well they perform
on the different high-end platforms. This is meant to be more of a field
study and not a benchmark. We know that the Opteron performs well, but what
are people actually using in high performance areas? Does postgresql run
on an E1? Who did it?


Regards,

Dirk

Dawid Kuroczko wrote:


On 6/2/05, Dirk Lutzebäck [EMAIL PROTECTED] wrote:
 


I just got one reply for this survey. Is almost nobody using postgresql
on 8+ CPU machines?
   



My guess is when someone is using PostgreSQL on an 8+ machine, she's
in a highly competitive (or sensitive) market and either cannot give
the company's work details to everyone or simply doesn't want to.

Probably if you asked 'I am thinking about buying an 8-way Opteron
box, does PostgreSQL have problems with such hardware' you
would get a response.

But surveys are awfully close to statistics and many people simply
don't like them.  (They say that 46.7% of statistics are just made
up ;-)).

  Regards,
 Dawid
 



--
Dirk Lutzebäck [EMAIL PROTECTED] Tel +49.30.5362.1635 Fax .1638
CTO AEC/communications GmbH, Berlin, Germany, http://www.aeccom.com


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


[PERFORM] SURVEY: who is running postgresql on 8 or more CPUs?

2005-05-31 Thread Dirk Lutzebäck

Hi,

I would like to start a little survey of who is running postgresql on an
8-way or larger machine (Intel, Sparc, AMD, no matter). Purpose: find out
how postgresql runs in high-performance areas.


Please fillout:

Machine (Vendor, Product):
Architecture (Intel/Sparc/AMD/IBM):
Processors (Type/Number/GHz):
RAM:
Operating System:
PostgreSQL Version:
Database size (GB):
Disk system:
Type of application:
Your email contact:
Willing to answer questions in this group:
Comments:


Please answer here or to me. I will compile the results and feed them back here.

Regards,

Dirk


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-20 Thread Dirk Lutzebäck
Dirk Lutzebaeck wrote:

c) Dual XEON DP, non-bigmem, HT on, E7500 Intel chipset (Supermicro):
performs well and I could not observe context switch peaks here (one
user active), almost no extra semop calls.

Did Tom's test here: with 2 processes I reach 200k+ CS with peaks up to
300k CS. Bummer. Josh, I don't think you can bash the ServerWorks
chipset here, nor bigmem.

Dirk



---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: RESOLVED: Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-19 Thread Dirk Lutzebäck
Josh, I cannot reproduce the excessive semop() calls on a Dual XEON DP on a
non-bigmem kernel, HT on. It would be interesting to know whether the problem
is related to the XEON MP (as Tom wrote) or to bigmem.

Josh Berkus wrote:

Dirk,

 

I'm not sure if this semop() problem is still an issue, but the database
behaves a bit out of bounds in this situation, i.e. consuming system
resources with semop() calls at 95% while tables are locked very often and
for longer periods.
   

It would be helpful to us if you could test this with the indexes disabled on 
the non-Bigmem system.   I'd like to eliminate Bigmem as a factor, if 
possible.

 



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


RESOLVED: Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-16 Thread Dirk Lutzebäck
Tom, Josh,

I think we have the problem resolved after I found the following note
from Tom:

  A large number of semops may mean that you have excessive contention
  on some lockable resource, but I don't have enough info to guess what
  resource.

This was the key to look at: we were missing all indices on a table which
is used heavily and does lots of locking. After recreating the missing
indices the production system performed normally. No more excessive
semop() calls, load way below 1.0, CS over 20.000 very rare, mostly in the
thousands realm or less.
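
A hedged sketch of how such a missing index can show up in the statistics
views, assuming the stats collector is enabled (stats_start_collector and
stats_row_level in postgresql.conf of this vintage):

-- Tables read heavily via sequential scans are candidates for missing
-- indices; sort by tuples read sequentially.
SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
FROM pg_stat_user_tables
ORDER BY seq_tup_read DESC
LIMIT 10;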

This is quite a relief, but I am sorry that the problem was so stupid and
you wasted some time, although Tom said he had also seen excessive
semop() calls on another Dual XEON system.

Hyperthreading was turned off so far but will be turned on again in the
next few days. I don't expect any problems then.

I'm not sure if this semop() problem is still an issue, but the database
behaves a bit out of bounds in this situation, i.e. consuming system
resources with semop() calls at 95% while tables are locked very often and
for longer periods.

Thanks for your help,

Dirk

At last here is the current vmstat 1 excerpt where the problem has been 
resolved:



procs ---memory-- ---swap-- -io --system-- 
cpu
r  b   swpd   free   buff  cache   si   sobibo   incs us sy 
id wa
1  0   2308 232508 201924 697653200   136   464  628   812  5  
1 94  0
0  0   2308 232500 201928 69766280096   296  495   484  4  
0 95  0
0  1   2308 232492 201928 697662800 0   176  347   278  1  
0 99  0
0  0   2308 233484 201928 69765960040   580  443   351  8  
2 90  0
1  0   2308 233484 201928 69766960076   692  792   651  9  
2 88  0
0  0   2308 233484 201928 697669600 020  13234  0  
0 100  0
0  0   2308 233484 201928 697669600 076  17790  0  
0 100  0
0  1   2308 233484 201928 697669600 0   216  321   250  4  
0 96  0
0  0   2308 233484 201928 697669600 0   116  417   240  8  
0 92  0
0  0   2308 233484 201928 69767840048   600  403   270  8  
0 92  0
0  0   2308 233464 201928 69768600076   452 1064  2611 14  
1 84  0
0  0   2308 233460 201932 69769000032   256  587   587 12  
1 87  0
0  0   2308 233460 201932 69769320032   188  379   287  5  
0 94  0
0  0   2308 233460 201932 697693200 0 0  103 8  0  
0 100  0
0  0   2308 233460 201932 697693200 0 0  10214  0  
0 100  0
0  1   2308 233444 201948 697693200 0   348  300   180  1  
0 99  0
1  0   2308 233424 201948 69769480016   380  739   906  4  
2 93  0
0  0   2308 233424 201948 69770320068   260  724   987  7  
0 92  0
0  0   2308 231924 201948 69771280096   344 1130   753 11  
1 88  0
1  0   2308 231924 201948 697724800   112   324  687   628  3  
0 97  0
0  0   2308 231924 201948 697724800 0   192  575   430  5  
0 95  0
1  0   2308 231924 201948 697724800 0   264  208   124  0  
0 100  0
0  0   2308 231924 201948 69772640016   272  380   230  3  
2 95  0
0  0   2308 231924 201948 697726400 0 0  104 8  0  
0 100  0
0  0   2308 231924 201948 697726400 048  25892  1  
0 99  0
0  0   2308 231816 201948 697748400   212   268  456   384  2  
0 98  0
0  0   2308 231816 201948 697748400 088  453   770  0  
0 99  0
0  0   2308 231452 201948 697768000   196   476  615   676  5  
0 94  0
0  0   2308 231452 201948 697768000 0   228  431   400  2  
0 98  0
0  0   2308 231452 201948 697768000 0 0  23758  3  
0 97  0
0  0   2308 231448 201952 697768000 0 0  36584  2  
0 97  0
0  0   2308 231448 201952 697768000 040  246   108  1  
0 99  0
0  0   2308 231448 201952 6960096   352  606  1026  4  
2 94  0
0  0   2308 231448 201952 69600 0   240  295   266  5  
0 95  0



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] Toooo many context switches (maybe SLES8?)

2004-04-15 Thread Dirk Lutzebäck
Hi,

we have a complex modperl database application using postgresql 7.4.1 on
a new Dual Xeon MP machine with SLES8 which seems to generate too many
context switches (way more than 100.000) under higher load (meaning system
load > 2). System response times slow down significantly then. We have
tuned parameters for weeks now but could not come up with better
results. It seems that we had better performance on an older Dual
XEON DP machine running on RedHat 7.3.

Here is the config:

database machine on SuSE SLES 8:

  F-S Primergy RX600
  2x XEON MP 2.5GHz
  8GB RAM
  Hardware Raid 1+0 140GB
  Kernel 2.4.21-169-smp
  Postgresql 7.4.1 (self compiled) with
  max_connections = 170
  shared_buffers = 4
  effective_cache_size = 80
  sort_mem = 3
  vacuum_mem = 42
  max_fsm_relations = 2000
  max_fsm_pages = 20
  random_page_cost = 4
  checkpoint_segments = 24
  wal_buffers = 32
modperl application machine on RH 7.3:

  F-S Primergy RX200
  2x XEON DP 2.4 GHz
  4 GB RAM
  Kernel 2.4.18-10smp, RedHat 7.3
  Apache 1.3.27 setup:
  MinSpareServers 15
  MaxSpareServers 30
  StartServers 15
  MaxClients 80
  MaxRequestsPerChild 100
vmstat 1 excerpt:

procs ---memory-- ---swap-- -io --system-- 
cpu
r  b   swpd   free   buff  cache   si   sobibo   incs us sy 
id wa
1  0   4868 242372 179488 69423160012 8   18 9  6  
2 92  0
2  1   4868 242204 179488 69425000064   500  701 117921 35 
18 48  0
0  1   4868 242032 179392 69415600016   316  412 132295 28 
25 47  0
1  0   4872 242396 179164 693377600   128   276  474 69708 21 
24 56  0
3  0   4872 242536 179164 693380800 0   240  412 113643 27 
27 46  0
2  0   4872 242872 179092 69317080048  1132  521 127916 24 
24 53  0
0  0   4876 242876 179092 69275120048   532  504 117868 32 
21 47  0
0  0   4876 242504 179096 692756000 0   188  412 127147 34 
20 47  0
1  0   4876 242152 179096 69278560096   276  529 117684 28 
23 49  0
2  0   4876 242864 179096 69283840088   560  507 135717 38 
19 43  0
1  0   4876 242848 179096 69285200064   232  433 151380 32 
20 48  0
4  0   4876 242832 179144 69289160016 10380 2913 112583 28 
20 52  0
4  0   4876 242720 179144 692924000   196 0  329 154821 32 
18 50  0
3  2   4876 243576 179144 692940800 0   460  451 160287 29 
18 52  0
3  0   4876 243292 179180 69294680016   436  614 51894 15  
5 80  0
0  0   4876 243884 179180 692958000 0   236  619 154168 29 
21 49  0
2  1   4876 243864 179180 692986000   128   380  493 155903 31 
19 50  0
2  0   4876 244720 179180 69302760016  1208  561 129336 27 
16 56  0
2  0   4876 247204 179180 693030000 0 0  361 146268 33 
20 47  0
3  0   4876 248620 179180 693037200 0   168  346 155915 32 
12 56  0
2  0   4876 250476 179180 693043600 0   184  328 163842 35 
20 46  0
0  0   4876 250496 179180 69306520048   260  450 144930 31 
15 53  0
1  0   4876 252236 179180 69307320016   244  577 167259 35 
15 50  0
0  0   4876 252236 179180 693078000 0   464  622 165488 31 
15 54  0
1  0   4876 252268 179180 693081200 0   132  460 153381 34 
15 52  0
2  0   4876 252268 179180 693096400 0   216  312 141009 31 
19 50  0
1  0   4876 252264 179180 693098000 056  275 153143 33 
20 47  0
2  0   4876 252212 179180 69312120096   296  400 133982 32 
18 50  0
1  0   4876 252264 179180 693133200 0   300  416 136034 32 
18 50  0
1  1   4876 252264 179180 693133200 0   236  377 143300 34 
22 44  0
4  0   4876 254876 179180 693137200 0   124  446 118117 34 
20 45  0
1  0   4876 254876 179180 69314920016   144  462 140499 38 
16 46  0
2  0   4876 255860 179180 69315720016   144  674 126250 33 
20 47  0
1  0   4876 255860 179180 69317880048   264  964 115679 36 
13 51  0
3  0   4876 255864 179180 693180400 0   100  597 127619 36 
19 46  0
5  1   4876 255864 179180 69319240072   352  559 151620 34 
18 48  0
2  0   4876 255860 179184 69321000096   120  339 137821 34 
20 47  0
0  0   4876 255860 179184 693215600 8   168  469 125281 36 
21 43  0
2  0   4876 256092 179184 693244400   112   328  446 137939 34 
19 48  0
2  0   4876 256092 179184 69324840016   184  382 141800 35 
16 49  0
3  0   4876 256464 179184 69327160016   356  448 134238 30 
18 51  0
5  0   4876 256464 179184 69328920096   600  476 142838 34 
20 46  0
1  0   4876 256464 179184 69330120016   176  589 138546 35 
22 43  0
2  0   4876 256436 179184 6933096006076  396 93110 42 
17 41  0
1  0   4876 256464 179184 693348400   212   276  442 83060 45 
11 44  0
5  0   4876 

Re: [PERFORM] Toooo many context switches (maybe SLES8?)

2004-04-15 Thread Dirk Lutzebäck
Joe, do you know where I should look in the 7.4.2 code to find this out?

Dirk

Joe Conway wrote:

Dirk Lutzebäck wrote:

postgresql 7.4.1

a new Dual Xeon MP

too many context switches (way more than 100.000) under higher load
(meaning system load > 2).


I believe this was fixed in 7.4.2, although I can't seem to find it in 
the release notes.

Joe




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster