Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-21 Thread Guillaume Smet
On 3/16/06, Tom Lane [EMAIL PROTECTED] wrote:
> Can you try strace'ing some of the backend processes while the system is
> behaving like this?  I suspect what you'll find is a whole lot of
> delaying select() calls due to high contention for spinlocks ...

As announced, we migrated our production server from 7.4.8 to
8.1.3 this morning. We did some strace'ing before the migration and
you were right about the select() calls: we saw a lot of them even when
the database was not heavily loaded (about one every 3-4 lines of output).

After the upgrade, we see the expected behaviour: scalability is more
linear and CPU load grows when the database is heavily loaded (with no
idle CPU left in that case). We also have fewer context switches.

8.1.3 is definitely far better on a quad Xeon MP, and I recommend the
upgrade to anyone with this sort of problem.

Tom, thanks for your great work on this problem.

--
Guillaume

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-18 Thread Kenneth Marshall
On Thu, Mar 16, 2006 at 11:45:12AM +0100, Guillaume Smet wrote:
> Hello,
>
> We have been experiencing performance problems with a quad Xeon MP and
> PostgreSQL 7.4 for a year now. Our context switch rate is not so high,
> but the load of the server is stuck at 4 even under very high load and
> we have 60% CPU idle even in this case. Our database fits in RAM and
> we don't have any I/O problem. I saw this post from Tom Lane
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00249.php
> and several other references to problems with Xeon MP, and I suspect our
> problems are related to this.
> We tried to put our production load on a standard dual Xeon on Monday
> and it performs far better with the same configuration parameters.
>
> I know that Tom has done work on multiprocessor support for
> PostgreSQL 8.1, but I didn't find any information on whether it
> solves the problem with Xeon MP or not.
>
> My question is: should we expect switching to 8.1 to resolve our
> problem, or will we still have problems and should we consider
> a hardware change? We will try to upgrade next Tuesday so we will have
> the real answer soon, but if anyone has any experience or information
> on this, it would be very welcome.
>
> Thanks for your help.
>
> --
> Guillaume

Guillaume,

We had a similar problem with poor performance on a Xeon DP and 
PostgreSQL 7.4.x. 8.0 came out in time for preliminary testing but
it did not solve the problem and our production systems went live
using a different database product. We are currently testing against
8.1.x and the seemingly bizarre lack of performance is gone. I would
suspect that a quad-processor box would have the same issue. I would
definitely recommend giving 8.1 a try.

Ken



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Richard Huxton

Guillaume Smet wrote:
> Hello,
>
> We have been experiencing performance problems with a quad Xeon MP and
> PostgreSQL 7.4 for a year now.

I had a similar issue with a client the other week.

> Our context switch rate is not so high,
> but the load of the server is stuck at 4 even under very high load and
> we have 60% CPU idle even in this case. Our database fits in RAM and
> we don't have any I/O problem.

Actually, I think that's part of the problem - it's the memory bandwidth.

> I saw this post from Tom Lane
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00249.php
> and several other references to problems with Xeon MP, and I suspect our
> problems are related to this.

You should be seeing context-switching jump dramatically if it's the
classic multi-Xeon problem. There's a point at which it seems to just
escalate without a corresponding jump in activity.

> We tried to put our production load on a standard dual Xeon on Monday
> and it performs far better with the same configuration parameters.
>
> I know that Tom has done work on multiprocessor support for
> PostgreSQL 8.1, but I didn't find any information on whether it
> solves the problem with Xeon MP or not.

I checked with Tom last week. Thread starts below:
  http://archives.postgresql.org/pgsql-hackers/2006-02/msg01118.php

He's of the opinion that 8.1.3 will be an improvement.

> My question is: should we expect switching to 8.1 to resolve our
> problem, or will we still have problems and should we consider
> a hardware change? We will try to upgrade next Tuesday so we will have
> the real answer soon, but if anyone has any experience or information
> on this, it would be very welcome.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Guillaume Smet
Richard,

> You should be seeing context-switching jump dramatically if it's the
> classic multi-Xeon problem. There's a point at which it seems to just
> escalate without a corresponding jump in activity.

No, we don't have very high context switching in our case, even when
the database is very slow. By very slow, I mean that pages which load
in a few seconds in the normal case (load between 3 and 4) take
several minutes (up to 5-10 minutes) to be generated in the worst case
(load at 4 but really bad performance).
Looking at our CPU load graph over one year, the CPU load was
never higher than 5, even in the worst cases...

> I checked with Tom last week. Thread starts below:
> http://archives.postgresql.org/pgsql-hackers/2006-02/msg01118.php
>
> He's of the opinion that 8.1.3 will be an improvement.

Thanks for pointing me to this thread; I searched in -performance, not
in -hackers, as the original thread was in -performance. We have planned
a migration to 8.1.3, so we'll see what happens with this version.

Do you plan to test it before next Tuesday? If so, I'm interested in
your results. I'll post our results here as soon as we complete the
upgrade.

--
Guillaume



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Sven Geisler

Hi Guillaume,

I had a similar issue last summer. Could you please provide details
about your XEON MP server and some statistics (context switches/load/CPU
usage)?


I tried different servers (x86) with different results. I saw a
difference between XEON MP with and without EM64T. Memory bandwidth
also makes a difference.


What version of XEON MP does your server have?
Which type of RAM does your server have?
Do you use Hyperthreading?

Could you also provide details of the XEON DP?

Regards
Sven.


--
Sven Geisler [EMAIL PROTECTED] Tel +49.30.5362.1627 Fax .1638
Senior Developer, AEC/communications GmbH, Berlin, Germany



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Guillaume Smet
Sven,

On 3/16/06, Sven Geisler [EMAIL PROTECTED] wrote:
> What version of XEON MP does your server have?

The server is a Dell 6650 from the end of 2004 with 4 Xeon MP 2.2 GHz and
2MB cache per processor.

Here is the information from Dell:
4x PROCESSOR, 80532, 2.2GHZ, 2MB cache, 400Mhz, SOCKET F
8x DUAL IN-LINE MEMORY MODULE, 512MB, 266MHz

> Do you use Hyperthreading?

No, we don't use it.

> Could you also provide details of the XEON DP?

The only problem is that the Xeon DP is installed with a 2.6 kernel
and PostgreSQL 8.1.3 (it is used to test the migration from 7.4 to
8.1.3), so it's very difficult to really compare the two behaviours.

It's a Dell 2850 with:
2 x PROCESSOR, 80546K, 2.8G, 1MB cache, XEON NOCONA, 800MHz
4 x DUAL IN-LINE MEMORY MODULE, 1GB, 400MHz

This server is obviously newer than the other one.

--
Guillaume



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Guillaume Smet
On 3/16/06, Sven Geisler [EMAIL PROTECTED] wrote:
> Hi Guillaume,
>
> I had a similar issue last summer. Could you please provide details
> about your XEON MP server and some statistics (context switches/load/CPU
> usage)?

I forgot the statistics:
CPU load is usually between 1 and 4.
CPU usage is usually 40% on each processor, and sometimes, when the
server completely hangs, it grows to 60%.
Here is a top output of the server at this time:
 15:21:17  up 138 days, 13:25,  1 user,  load average: 1.29, 1.25, 1.38
82 processes: 81 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   25.7%    0.0%    3.9%   0.0%     0.3%    0.1%   69.7%
           cpu00   29.3%    0.0%    4.7%   0.1%     0.5%    0.0%   65.0%
           cpu01   20.7%    0.0%    1.9%   0.0%     0.3%    0.0%   76.8%
           cpu02   25.5%    0.0%    5.5%   0.0%     0.1%    0.3%   68.2%
           cpu03   27.3%    0.0%    3.3%   0.0%     0.1%    0.1%   68.8%
Mem:  3857224k av, 3298580k used,  558644k free,       0k shrd,  105172k buff
      2160124k actv,  701304k in_d,   56400k in_c
Swap: 4281272k av,    6488k used, 4274784k free                 2839348k cached

We currently have between 3000 and 13000 context switches/s, with an
average of about 5000, judging by eye.

Here is a top output I captured on November 17 when the server completely
hung (several minutes for each page of the website); it is typical
of this server's behaviour:
 17:08:41  up 19 days, 15:16,  1 user,  load average: 4.03, 4.26, 4.36
288 processes: 285 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   59.0%    0.0%    8.8%   0.2%     0.0%    0.0%   31.9%
           cpu00   52.3%    0.0%   13.3%   0.9%     0.0%    0.0%   33.3%
           cpu01   65.7%    0.0%    7.6%   0.0%     0.0%    0.0%   26.6%
           cpu02   58.0%    0.0%    7.6%   0.0%     0.0%    0.0%   34.2%
           cpu03   60.0%    0.0%    6.6%   0.0%     0.0%    0.0%   33.3%
Mem:  3857224k av, 3495880k used,  361344k free,       0k shrd,   92160k buff
      2374048k actv,  463576k in_d,   37708k in_c
Swap: 4281272k av,   25412k used, 4255860k free                 2173392k cached

As you can see, load is stuck at 4, with no iowait and CPU idle of 30%.

vmstat showed 5000 context switches/s on average, so we had no context
switch storm.
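
For anyone who wants a number rather than a visual estimate, here is a minimal sketch of averaging the context-switch column from captured vmstat output. It assumes the usual layout in which a header line names the columns and one of them is called "cs"; the function name and the sample rows are made up for illustration and are not measurements from this server. Note that the first data row vmstat prints is normally an average since boot, so you may want to drop it.

```python
def average_context_switches(vmstat_text):
    """Return the mean of the `cs` column in vmstat output, or None if absent."""
    cs_index = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "cs" in fields:
            # Column-name header line; vmstat repeats it periodically.
            cs_index = fields.index("cs")
            continue
        if cs_index is None or len(fields) <= cs_index:
            # Banner line ("procs ---memory--- ...") or a truncated row.
            continue
        if not fields[cs_index].isdigit():
            continue
        samples.append(int(fields[cs_index]))
    return sum(samples) / len(samples) if samples else None


# Made-up sample, only to show the expected shape of the input.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  0  25412 361344  92160 2173392   0    0     0    12 1100  4800 59  9 32  0
 4  0  25412 361300  92160 2173400   0    0     0     8 1150  5200 60  8 32  0
"""

if __name__ == "__main__":
    print(average_context_switches(SAMPLE))
```

The column is located by name rather than by position because the column order differs between vmstat versions.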



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
> Here is a top output I captured on November 17 when the server completely
> hung (several minutes for each page of the website); it is typical
> of this server's behaviour:
>  17:08:41  up 19 days, 15:16,  1 user,  load average: 4.03, 4.26, 4.36
> 288 processes: 285 sleeping, 3 running, 0 zombie, 0 stopped
> CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
>            total   59.0%    0.0%    8.8%   0.2%     0.0%    0.0%   31.9%
>            cpu00   52.3%    0.0%   13.3%   0.9%     0.0%    0.0%   33.3%
>            cpu01   65.7%    0.0%    7.6%   0.0%     0.0%    0.0%   26.6%
>            cpu02   58.0%    0.0%    7.6%   0.0%     0.0%    0.0%   34.2%
>            cpu03   60.0%    0.0%    6.6%   0.0%     0.0%    0.0%   33.3%
> Mem:  3857224k av, 3495880k used,  361344k free,       0k shrd,   92160k buff
>       2374048k actv,  463576k in_d,   37708k in_c
> Swap: 4281272k av,   25412k used, 4255860k free                 2173392k cached
>
> As you can see, load is stuck at 4, with no iowait and CPU idle of 30%.

Can you try strace'ing some of the backend processes while the system is
behaving like this?  I suspect what you'll find is a whole lot of
delaying select() calls due to high contention for spinlocks ...

regards, tom lane



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Sven Geisler

Hi Guillaume,

Guillaume Smet wrote:
> The server is a Dell 6650 from the end of 2004 with 4 Xeon MP 2.2 GHz and
> 2MB cache per processor.
>
> Here is the information from Dell:
> 4x PROCESSOR, 80532, 2.2GHZ, 2MB cache, 400Mhz, SOCKET F
> 8x DUAL IN-LINE MEMORY MODULE, 512MB, 266MHz

>> Could you also provide details of the XEON DP?
>
> The only problem is that the Xeon DP is installed with a 2.6 kernel
> and PostgreSQL 8.1.3 (it is used to test the migration from 7.4 to
> 8.1.3), so it's very difficult to really compare the two behaviours.
>
> It's a Dell 2850 with:
> 2 x PROCESSOR, 80546K, 2.8G, 1MB cache, XEON NOCONA, 800MHz
> 4 x DUAL IN-LINE MEMORY MODULE, 1GB, 400MHz

Did you compare 7.4 on a 4-way with 8.1 on a 2-way?
How many queries and clients did you use to test the performance?
How much faster is the XEON DP?

I think you can expect your XEON DP to be faster on a single query
because its CPU and RAM are faster. The overall performance can also be
better on your XEON DP if you only have a few clients.

I guess the newer hardware and the newer PostgreSQL version together
explain the better performance.


Regards
Sven.



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Guillaume Smet
On 3/16/06, Sven Geisler [EMAIL PROTECTED] wrote:
> Did you compare 7.4 on a 4-way with 8.1 on a 2-way?

I know there are too many parameters changing between the two servers,
but I can't really change anything before Tuesday. On Tuesday, we will
be able to compare both servers with the same software.

> How many queries and clients did you use to test the performance?

Googlebot is indexing this site, generating 2-3 Mbit/s of traffic, so
we use Googlebot to stress this server. There were a lot of clients
and a lot of queries.

> How much faster is the XEON DP?

Well, under high load, PostgreSQL scales well on the DP (load at 40,
queries slower but still performing well) and is awfully slow on the
MP box.



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Guillaume Smet
On 3/16/06, Tom Lane [EMAIL PROTECTED] wrote:
> Can you try strace'ing some of the backend processes while the system is
> behaving like this?  I suspect what you'll find is a whole lot of
> delaying select() calls due to high contention for spinlocks ...

Tom,

I think we can try to do it.

You mean strace -p pid, with pid being one of the postgres backend
processes, not the postmaster itself, don't you? Do we need other options?
Which pattern should we expect? I'm not really familiar with strace
and its output.

Thanks for your help.



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
> You mean strace -p pid, with pid being one of the postgres backend
> processes, not the postmaster itself, don't you?

Right, pick a couple that are accumulating CPU time.

> Do we need other options?

strace will generate a *whole lot* of output to stderr.  I usually do
something like
	strace -p pid 2>outfile
and then control-C it after a few seconds.

> Which pattern should we expect?

What we want to find out is if there's a lot of select()s and/or
semop()s shown in the result.  Ideally there wouldn't be any, but
I fear that's not what you'll find.
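
The check described above can be sketched as a small script that tallies system call names in a captured strace file and reports what share of them are select() or semop(). This is only an illustration: the function names are hypothetical, the regular expression assumes plain single-process output where each line starts with the call name, and the sample lines are made up rather than taken from a real trace.

```python
import re
from collections import Counter

# Syscall name at the start of a line, e.g. "select(0, NULL, ...) = 0".
SYSCALL_RE = re.compile(r"\s*([a-z_][a-z0-9_]*)\(")


def count_syscalls(strace_lines):
    """Count how often each system call appears in strace output lines."""
    counts = Counter()
    for line in strace_lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def wait_ratio(counts, names=("select", "semop")):
    """Fraction of traced calls that are select()/semop() waits."""
    total = sum(counts.values())
    waits = sum(counts[name] for name in names)
    return waits / total if total else 0.0


# Made-up lines in the general shape of strace output, for illustration only.
SAMPLE = """\
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
read(5, "..."..., 8192) = 8192
semop(32769, 0xbfffe2a0, 1) = 0
send(8, "..."..., 64, 0) = 64
"""

if __name__ == "__main__":
    # For a real capture: counts = count_syscalls(open("outfile"))
    counts = count_syscalls(SAMPLE.splitlines())
    print(counts.most_common(), wait_ratio(counts))
```

A ratio near zero is the ideal case mentioned above; a large share of select() or semop() calls would point to backends waiting on locks.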

regards, tom lane



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Sven Geisler

Hi Guillaume,

Guillaume Smet wrote:
>> How much faster is the XEON DP?
>
> Well, under high load, PostgreSQL scales well on the DP (load at 40,
> queries slower but still performing well) and is awfully slow on the
> MP box.

I know what you mean by awfully slow.
I think your application is facing contention. The contention gets
worse the more CPUs you have. PostgreSQL 8.1 addresses contention on
multiprocessor servers, as you mentioned before.

I guess you will see that your 4-way XEON MP isn't that bad if you
compare both servers with the same PostgreSQL version.


Regards
Sven.



Re: [PERFORM] PostgreSQL and Xeon MP

2006-03-16 Thread Guillaume Smet
On 3/16/06, Tom Lane [EMAIL PROTECTED] wrote:
> What we want to find out is if there's a lot of select()s and/or
> semop()s shown in the result.  Ideally there wouldn't be any, but
> I fear that's not what you'll find.

OK, I'll try to do it on Monday before our upgrade, then see what
happens with PostgreSQL 8.1.3.

Thanks for your help.



Re: [PERFORM] Postgresql and xeon.

2005-05-30 Thread Josh Berkus
Eric,

> What about Xeon and PostgreSQL? I have been told that
> PostgreSQL wouldn't perform as well when running
> on Xeon processors due to some cache trick that PostgreSQL
> uses.

Search the archives of this list.   This has been discussed ad nauseam.
www.pgsql.ru

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Postgresql and xeon.

2005-05-30 Thread Steinar H. Gunderson
On Mon, May 30, 2005 at 09:19:40AM -0700, Josh Berkus wrote:
> Search the archives of this list.   This has been discussed ad nauseam.
> www.pgsql.ru

I must admit I still haven't really understood it -- I know that it appears
on multiple operating systems, on multiple architectures, but mostly with Xeon
CPUs, and that it's probably related to the poor memory bandwidth between the
CPUs, but that's about it. I've read the threads I could find on the list
archives, but I've yet to see somebody pinpoint exactly what in PostgreSQL is
causing this.

Last time someone claimed this was basically understood and just a lot of
work to fix, I asked for pointers to a more detailed analysis, but nobody
answered.  Care to explain? :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Postgresql and xeon.

2005-05-30 Thread Eric Lauzon


Same here: the archive references give only an overview, with no real data
on where and why. For the record, we run PG 7.4.8 on kernel 2.6 with the
preemptive scheduler, on a dual Xeon 3.2 GHz with 6 GB of RAM.


Eric Lauzon
[Recherche & Développement]
Above Sécurité / Above Security
Tél  : (450) 430-8166
Fax : (450) 430-1858 



Re: [PERFORM] Postgresql and xeon.

2005-05-30 Thread Tom Lane
Steinar H. Gunderson [EMAIL PROTECTED] writes:
> I must admit I still haven't really understood it -- I know that it appears
> on multiple operating systems, on multiple architectures, but mostly with Xeon
> CPUs, and that it's probably related to the poor memory bandwidth between the
> CPUs, but that's about it. I've read the threads I could find on the list
> archives, but I've yet to see somebody pinpoint exactly what in PostgreSQL is
> causing this.

The problem appears to be that heavy contention for a spinlock is
extremely expensive on multiprocessor Xeons --- apparently, the CPUs
waste tremendous amounts of time passing around exclusive ownership
of the memory cache line containing the spinlock.  While any SMP system
is likely to have some issues here, the Xeons seem to be particularly
bad at it.

In the case that was discussed extensively last spring, the lock that
was causing the problem was the BufMgrLock.  Since 8.0 we've rewritten
the buffer manager in hopes of reducing contention, but I don't know
if the problem is really gone or not.  The buffer manager is hardly the
only place with the potential for heavy contention...
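
To make the spin-then-sleep pattern concrete, here is a toy model in Python. It is not PostgreSQL's code and it cannot reproduce the cache-line traffic described above; it only shows the shape of the logic: try the lock, spin a bounded number of times, then sleep and retry. The class name, the spin limit, and the delay are arbitrary assumptions chosen for illustration. The sleep step plays the role of a delay that would show up in a system call trace when contention is high.

```python
import threading
import time


class ToySpinLock:
    """Toy spin-then-sleep lock; NOT PostgreSQL's actual implementation."""

    def __init__(self, spins_per_delay=100, delay_seconds=0.001):
        # A non-blocking acquire on this Lock stands in for test-and-set.
        self._flag = threading.Lock()
        self.spins_per_delay = spins_per_delay
        self.delay_seconds = delay_seconds

    def acquire(self):
        """Acquire the lock and return how many times we had to sleep."""
        delays = 0
        spins = 0
        while not self._flag.acquire(blocking=False):  # "test-and-set" failed
            spins += 1
            if spins >= self.spins_per_delay:
                time.sleep(self.delay_seconds)  # back off instead of spinning
                delays += 1
                spins = 0
        return delays

    def release(self):
        self._flag.release()


def contended_delays(hold_seconds=0.05):
    """Hold the lock in one thread and count the sleeps of a waiting thread."""
    lock = ToySpinLock(spins_per_delay=10)
    lock.acquire()
    result = []
    waiter = threading.Thread(target=lambda: result.append(lock.acquire()))
    waiter.start()
    time.sleep(hold_seconds)  # keep the lock busy while the waiter retries
    lock.release()
    waiter.join()
    lock.release()  # the waiter still holds the lock; release it on its behalf
    return result[0]


if __name__ == "__main__":
    print("sleeps while uncontended:", ToySpinLock().acquire())
    print("sleeps while contended:", contended_delays())
```

An uncontended acquire never sleeps, while a waiter facing a held lock sleeps repeatedly; with many backends fighting over one lock, those sleeps add up to idle CPU time even though every process has work to do.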

regards, tom lane
