[PERFORM] Performance problems on 4-way AMD Opteron 875 (dual core)

2005-08-05 Thread Dirk Lutzebäck




[[I'm posting this on behalf of my co-worker, who cannot post to this
list at the moment]]

Hi,


I have installed PostgreSQL on a 4-way AMD Opteron 875 (dual core) and
the performance is not at the expected level.


Details:

The "old" server is a 4-way XEON MP 3.0 GHz with 4MB L3 cache, 32 GB
RAM (PC1600) and local FC-RAID 10. Hyper-Threading is off. (DL580)

The "old" server is using Red Hat Enterprise Linux 3 Update 5.

The "new" server is a 4-way Opteron 875 with 1 MB L2 cache, 32 GB RAM
(PC3200) and the same local FC-RAID 10. (HP DL585)

The "new" server is using Red Hat Enterprise Linux 4 (with the latest
x86_64 kernel from Red Hat - 2.6.9-11.ELsmp #1 SMP Fri May 20 18:25:30
EDT 2005 x86_64)

I use PostgreSQL version 8.0.3.


The issue is that the Opteron is slower than the XEON MP under high load.
I have created a test with parallel queries that are typical for my
application. The queries range from small ones (0.1 seconds) to larger
ones using joins (15 seconds).

The test starts parallel clients. Each client runs the queries in a
random order. The test makes sure that a given client always uses the
same random order, to get valid, repeatable results.
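A minimal sketch of how such a test could be structured, not the actual
test program. It assumes a fixed per-client seed is what keeps each
client's order stable. The names client_order, run_benchmark and
run_query are hypothetical; run_query(client_id, sql) stands in for
whatever sends one statement over that client's own connection.

```python
import random
import threading
import time


def client_order(queries, client_id, base_seed=12345):
    """Return the fixed pseudo-random query order for one client."""
    # Seeding per client makes the shuffle identical on every run
    # (base_seed is an arbitrary illustrative constant).
    rng = random.Random(base_seed + client_id)
    order = list(queries)
    rng.shuffle(order)
    return order


def run_benchmark(queries, run_query, n_clients, duration_s):
    """Run n_clients threads for duration_s seconds.

    Returns a list with the number of finished queries per client.
    """
    counts = [0] * n_clients
    deadline = time.monotonic() + duration_s

    def worker(client_id):
        order = client_order(queries, client_id)
        i = 0
        # Cycle through this client's fixed order until time is up.
        while time.monotonic() < deadline:
            run_query(client_id, order[i % len(order)])
            counts[client_id] += 1  # each thread writes only its own slot
            i += 1

    threads = [threading.Thread(target=worker, args=(c,))
               for c in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts
```

Calling run_benchmark once with n_clients=4 and once with n_clients=8
on each machine, and summing the returned counts, gives the "queries
finished in a fixed period" figure used for the comparison.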


Here is the number of queries each server finished in a fixed
period of time.

I used a PostgreSQL 8.1 snapshot from last week, compiled as a 64-bit
binary, for DL585-64bit.

I used PostgreSQL 8.0.3, compiled as a 32-bit binary, for DL585-32bit
and DL580.

During the tests everything that is needed was in the file cache. I
did not see any read activity.

Context switch spikes are over 5 during the test on both servers.
My impression is that the XEON has slightly more context switches.
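For anyone who wants to compare the context-switch rate on both
machines more precisely: the sketch below samples it directly. It
assumes a Linux system where /proc/stat contains a cumulative "ctxt"
line (the same counter vmstat uses for its "cs" column). The function
names are illustrative and this is not what was used in the test above.

```python
import time


def parse_ctxt(stat_text):
    """Extract the cumulative context-switch counter from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def context_switches_per_second(interval_s=1.0, path="/proc/stat"):
    """Sample the counter twice and return the average rate in between."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return (after - before) / interval_s
```

Logging this value once per second during a test run on each server
would show whether the spikes really differ between the two.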





PostgreSQL params:

max_locks_per_transaction = 256

shared_buffers = 4

effective_cache_size = 384

work_mem = 30

maintenance_work_mem = 512000

wal_buffers = 32

checkpoint_segments = 24



I was expecting twice as many queries on the DL585. The DL585 with
PostgreSQL 8.0.3 32bit melts down earlier than the XEON in production
use. Please compare 4 clients and 8 clients. With 4 clients the Opteron
is ahead, and with 8 clients the XEON does not melt down as much as
the Opteron.


I have no idea what causes this. Benchmarks like SAP's SD 2-tier
show that the DL585 can handle nearly three times the load of the
DL580 with XEON 3.0. We chose the 4-way Opteron 875 based on such
benchmarks to replace the 4-way XEON MP.


Does anyone have comments or ideas on where I should focus my work?


My guess is that the shared buffers cause the meltdown when too many
clients are accessing the same data.

I don't understand why the 4-way XEON MP 3.0 can deal with this better
than the 4-way Opteron 875.

The system load on the Opteron is never over 3.0. The XEON MP has a
load up to 4.0.


Should I try other settings for PostgreSQL in postgresql.conf?

Should I try other settings for the compilation?


I will compile the latest PostgreSQL 8.1 snapshot as a 32-bit binary to
evaluate the new shared buffer code from Tom.

I think the 64-bit build is slow because my queries are CPU intensive.


Can someone provide a commercial support contact for this issue?


Sven.







Re: [PERFORM] Performance problems on 4-way AMD Opteron 875 (dual core)

2005-08-05 Thread Michael Stone

On Fri, Aug 05, 2005 at 01:11:31PM +0200, Dirk Lutzebäck wrote:
> I will compile the latest PostgreSQL 8.1 snapshot for 32bit to evaluate
> the new shared buffer code from Tom.
>
> I think, the 64bit is slow because my queries are CPU intensive.


Have you actually tried it or are you guessing? If you're guessing, then
compile it as a 64 bit binary and benchmark that.

Mike Stone
