[PERFORM] Performance problems on 4-way AMD Opteron 875 (dual core)
[[I'm posting this on behalf of my co-worker who cannot post to this list at the moment]]

Hi,

I have installed PostgreSQL on a 4-way AMD Opteron 875 (dual core) and the performance isn't at the expected level.

Details:

The "old" server is a 4-way XEON MP 3.0 GHz with 4MB L3 cache, 32 GB RAM (PC1600) and local FC-RAID 10. Hyper-Threading is off. (DL580) The "old" server is using Red Hat Enterprise Linux 3 Update 5.

The "new" server is a 4-way Opteron 875 with 1 MB L2 cache, 32 GB RAM (PC3200) and the same local FC-RAID 10. (HP DL585) The "new" server is using Red Hat Enterprise Linux 4 (with the latest x86_64 kernel from Red Hat - 2.6.9-11.ELsmp #1 SMP Fri May 20 18:25:30 EDT 2005 x86_64).

I use PostgreSQL version 8.0.3.

The issue is that the Opteron is slower than the XEON MP under high load. I have created a test with parallel queries which are typical for my application. The queries range from small queries (0.1 seconds) to larger queries using joins (15 seconds). The test starts parallel clients. Each client runs the queries in a random order. The test makes sure that a client always uses the same random order to get valid results.

Here is the number of queries each server finished in a fixed period of time.

I used a PostgreSQL 8.1 snapshot from last week compiled as a 64bit binary for DL585-64bit. I used PostgreSQL 8.0.3 compiled as a 32bit binary for DL585-32bit and DL580.

During the tests everything that is needed is in the file cache. I didn't see any read activity. Context switch spikes are over 5 during the test on both servers. My feeling is that the XEON has slightly more context switches.

PostgreSQL params:

    max_locks_per_transaction = 256
    shared_buffers = 4
    effective_cache_size = 384
    work_mem = 30
    maintenance_work_mem = 512000
    wal_buffers = 32
    checkpoint_segments = 24

I was expecting twice as many queries on the DL585. The DL585 with PostgreSQL 8.0.3 32bit melts down earlier than the XEON in production use. Please compare 4 clients and 8 clients.
With 4 clients the Opteron is in front, but with 8 clients the XEON doesn't melt down as much as the Opteron. I don't have any idea what causes this.

Benchmarks like SAP's SD 2-tier show that the DL585 can handle nearly three times more load than the DL580 with XEON 3.0. We chose the 4-way Opteron 875 based on such benchmarks to replace the 4-way XEON MP.

Does anyone have comments or ideas on where I should focus my work?

I guess the shared buffers cause the meltdown when too many clients are accessing the same data. I don't understand why the 4-way XEON MP 3.0 can deal with this better than the 4-way Opteron 875. The system load on the Opteron is never over 3.0. The XEON MP has a load of up to 4.0.

Should I try other settings for PostgreSQL in postgresql.conf? Should I try other settings for the compilation?

I will compile the latest PostgreSQL 8.1 snapshot for 32bit to evaluate the new shared buffer code from Tom. I think the 64bit build is slow because my queries are CPU intensive.

Can someone provide a commercial support contact for this issue?

Sven.
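For readers who want to reproduce this kind of test, the procedure described in the message (parallel clients, each running the queries in a random order that is always the same for a given client, counted over a fixed period) might be sketched roughly as follows. This is not the poster's actual test harness. The names `QUERIES`, `query_order`, `run_clients` and `fake_query` are hypothetical, and `fake_query` only sleeps; a real run would replace it with a function that sends SQL to the server.

```python
import random
import threading
import time

# Hypothetical stand-ins for the real workload: in the original test these
# would be SQL statements sent to PostgreSQL; here they are just labels.
QUERIES = ["small_lookup", "medium_report", "large_join"]


def query_order(client_id, queries):
    """Return a shuffled query order that is always the same for a given client."""
    rng = random.Random(client_id)  # seeding with the client id makes the order repeatable
    order = list(queries)
    rng.shuffle(order)
    return order


def run_clients(num_clients, duration_s, run_query, queries=QUERIES):
    """Run num_clients threads for duration_s seconds; return queries finished per client."""
    finished = [0] * num_clients
    deadline = time.monotonic() + duration_s

    def client(client_id):
        order = query_order(client_id, queries)
        i = 0
        while time.monotonic() < deadline:
            run_query(order[i % len(order)])
            finished[client_id] += 1  # each thread only writes its own slot
            i += 1

    threads = [threading.Thread(target=client, args=(c,)) for c in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished


def fake_query(name):
    """Placeholder for a database call; sleeps briefly instead of running SQL."""
    time.sleep(0.001)


if __name__ == "__main__":
    for clients in (4, 8):
        counts = run_clients(clients, duration_s=0.2, run_query=fake_query)
        print(clients, "clients:", sum(counts), "queries finished")
```

Seeding a private `random.Random` with the client id keeps each client's order fixed between runs, so the 4-client and 8-client numbers from different machines stay comparable. Threads are assumed to be sufficient here because the real work would happen in the database server, not in the client.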
Re: [PERFORM] Performance problems on 4-way AMD Opteron 875 (dual core)
On Fri, Aug 05, 2005 at 01:11:31PM +0200, Dirk Lutzebäck wrote:
> I will compile the latest PostgreSQL 8.1 snapshot for 32bit to evaluate
> the new shared buffer code from Tom.
> I think, the 64bit is slow because my queries are CPU intensive.

Have you actually tried it or are you guessing? If you're guessing, then compile it as a 64 bit binary and benchmark that.

Mike Stone