Re: [PERFORM] disk I/O problems and Solutions
- Alan McKay alan.mc...@gmail.com escreveu:
> CentOS / PostgreSQL shop over here. Our system: IBM 3650 - quad 2GHz
> E5405 Xeon, 8K SAS RAID controller, 6 x 300G 15K RPM SAS drives.
> /dev/sda - 2 drives configured as a RAID1, 300G, for the OS
> /dev/sdb - 3 drives configured as RAID5, 600G, for the DB
> 1 drive as a global hot spare
> /dev/sdb is the one that is maxing out.

What are you calling "maxing out"? Excess IOPS, MB/s, or high response times? Each of these calls for a different approach when looking for a solution.

> Is there a public library somewhere of disk subsystems and their
> performance figures? Done with some semblance of a standard benchmark?

You should try the iostat or sar utilities. Both can give you complete reports of your online disk activity, and they are probably the tools used in the backend by whatever frontend you have. It's important to understand that the utilization percentage they show is the CPU time spent in I/O operations. If you see 100% you should worry, but not too desperately. What matters most, in my view, is the disk operation response time and the queue size: if those numbers keep increasing, your database performance will suffer. Always check the iostat man page to understand what the numbers mean.

> One thing I am thinking of in the immediate term is taking the RAID5 +
> hot spare and converting it to RAID10 with the same amount of storage.
> Will that perform much better?

Usually yes for write operations, because the RAID controller doesn't have to calculate parity. You'll also see some improvement in seek time, and your database will be snappier if you have an OLTP application. RAID5 can handle more IOPS otherwise; it can be good for your pg_xlog directory, but the amount of disk space needed for WAL is small.

> In general we are planning to move away from RAID5 toward RAID10.
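The RAID10-vs-RAID5 write behavior above follows from the classic write-penalty rule of thumb: a small random write costs about 4 back-end I/Os on RAID5 (read data, read parity, write data, write parity) but only 2 on RAID10 (one per mirror side). A minimal sketch of the arithmetic for the arrays discussed, with an assumed (not measured) per-drive IOPS figure:

```python
# Rough effective random-write IOPS for a small array, using the classic
# RAID write-penalty rule of thumb. The per-drive figure is a ballpark
# assumption for a 15K RPM SAS drive, not a measurement of this system.

def effective_write_iops(n_drives, iops_per_drive, write_penalty):
    # Total back-end IOPS divided by the back-end I/Os each logical
    # write costs on this RAID level.
    return n_drives * iops_per_drive / write_penalty

DRIVE_IOPS_15K = 175  # assumed per-spindle random IOPS

raid5 = effective_write_iops(3, DRIVE_IOPS_15K, write_penalty=4)   # 3-drive RAID5
raid10 = effective_write_iops(4, DRIVE_IOPS_15K, write_penalty=2)  # 4-drive RAID10

print(f"RAID5  (3 drives): ~{raid5:.0f} random-write IOPS")
print(f"RAID10 (4 drives): ~{raid10:.0f} random-write IOPS")
```

Under these assumptions the 4-drive RAID10 sustains roughly 2.5x the random-write rate of the 3-drive RAID5, which is why converting the RAID5 + hot spare usually pays off for OLTP writes.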
> We also have on order an external IBM array (don't have the exact name
> on hand, but the model number was 3000) with 12 drive bays. We ordered
> it with just 4 x SATA II drives, and were going to put it on a
> different system as a RAID10. These are just 7200 RPM drives - the
> goal was cheaper storage, because the SAS drives are about twice as
> much per drive, and a 300G drive versus the 1T SATA II drives. IIRC
> the SATA II drives are about $200 each and the 300G SAS drives about
> $500 each.

I think it's a good choice.

> So I have 2 thoughts with this 12 disk array. One is to fill it up
> with 12 x cheap SATA II drives and hope that even though the spin rate
> is a lot slower, the fact that it has more drives will make it perform
> better. But somehow I am doubtful about that. The other thought is to
> bite the bullet and fill it up with 300G SAS drives. Any thoughts
> here? Recommendations on what to do with a tight budget?

Take your new storage system when it arrives, make it a RAID10, and administer it with LVM in Linux. If you need more performance later, you will be able to stripe across RAID arrays.

Regards,
Flavio Henrique A. Gurgel
Consultor -- 4Linux
tel. 55-11-2125.4765 fax. 55-11-2125.4777
www.4linux.com.br
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
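Whether 12 slower spindles can beat fewer fast ones is mostly an aggregate-IOPS question. A rough sketch using the drive prices quoted in the thread and rule-of-thumb per-drive IOPS figures (assumptions, not benchmarks of these specific drives):

```python
# Aggregate random-read IOPS and cost per IOPS for the two fill-out
# options. Per-drive IOPS figures are rule-of-thumb assumptions; prices
# are the ones quoted in the thread.

IOPS_SATA_7200 = 80   # assumed, ~7200 RPM SATA II
IOPS_SAS_15K = 175    # assumed, ~15K RPM SAS
PRICE_SATA = 200      # USD, from the post
PRICE_SAS = 500       # USD, from the post

def array_read_iops(n_drives, per_drive_iops):
    # In RAID10, reads can be served from any spindle.
    return n_drives * per_drive_iops

for name, iops, price in [("12 x SATA II 7200", IOPS_SATA_7200, PRICE_SATA),
                          ("12 x SAS 15K     ", IOPS_SAS_15K, PRICE_SAS)]:
    total = array_read_iops(12, iops)
    cost = 12 * price
    print(f"{name}: ~{total} read IOPS, ${cost}, ${cost / total:.2f}/IOPS")
```

Under these assumptions the SAS fill-out delivers roughly twice the raw IOPS, but per IOPS-dollar the SATA option comes out slightly cheaper, so the right answer depends on whether the budget or the throughput target is the hard constraint.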
Re: [PERFORM] Continuent (was: Postgres Clustering)
- Alan McKay alan.mc...@gmail.com escreveu:
> Hmmm. Anyone out there have the Continuent solution working with
> PostgreSQL? If so, what release? We're at 8.3 right now.

I have tested Sequoia 2.10.10 with a high transaction rate database on good servers with plenty of memory. Since that was an OLTP system, performance dropped to as low as 30%. I can't recommend their solution for an OLAP system either, because I never tested it in that situation. Configuration of Sequoia is quite complicated, and I think a very good database administrator is needed to keep it working correctly and the nodes synchronized. Sequoia is also very complicated for running DDL and DML scripts, since your scripts have to be written for Sequoia, not for PostgreSQL.

If log shipping works for you, try Slony. Your slaves can serve as read-only databases and you can distribute some of the load.

Flavio
Re: [PERFORM] Scalability in postgres
- Scott Marlowe scott.marl...@gmail.com escreveu:
> On Thu, May 28, 2009 at 12:50 PM, Fabrix fabrix...@gmail.com wrote:
>> Hi. Has someone had some experience of bad performance with Postgres
>> on a server with many processors?

I have. I have had problems with a server with 4 dual-core Hyper-Threading CPUs (16 logical CPUs), and with another server that has 8 quad-core CPUs (32 cores). The second one gives me only about 1.5x the performance of the first.

> What model CPUs and chipset on the mobo, I wonder?

>> Monitoring (nmon, htop, vmstat) shows that everything is fine
>> (memory, HD, eth, etc.) except that the processors regularly climb to
>> 100%. I can see that the processes are waiting for CPU time. Does
>> Postgres have problems of locking or performance degradation with
>> many CPUs? Any comments?

> Looks like a context switch storm, which was pretty common on older
> Xeon CPUs. I imagine with enough pg processes running on enough CPUs
> it could still be a problem.

I would ask for your kernel version. uname -a, please? It was possible to make the context switching work better with 2.4.24 with kswapd patched around here; 1600 connections are working fine at this moment. Try to lower your memory requirements too: the Linux kernel needs some room to page and scale up. Otherwise, install more memory.

Flavio
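A context-switch storm shows up as a runaway "cs" column in vmstat. A minimal, Linux-only sketch (assuming /proc/stat is readable, as it is on the kernels discussed here) that samples the same kernel counter vmstat reads:

```python
# Sample the kernel's total context-switch counter (the source of
# vmstat's "cs" column). Linux-only sketch; /proc/stat contains a line
# like "ctxt 123456789" with the total switches since boot.
import time

def parse_ctxt(stat_text):
    # Pull the ctxt counter out of /proc/stat's contents.
    for line in stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    return None

def ctxt_per_sec(interval=1.0):
    # Sample the counter twice and report switches per second.
    with open("/proc/stat") as f:
        before = parse_ctxt(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_ctxt(f.read())
    return (after - before) / interval
```

Calling `ctxt_per_sec()` on the affected server while the CPUs spike at 100% is the quick check: a rate that jumps by an order of magnitude under load, while useful work stalls, is the storm symptom Scott describes.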
Re: [PERFORM] Scalability in postgres
>> I would ask for your kernel version. uname -a, please?

> Sure, and thanks for your answer, Flavio...
>
> $ uname -a
> Linux SERVIDOR-A 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
> $ cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 5.2 (Tikanga)

I had the same problem you're describing, with Debian Etch and 2.6.18, when the system needed more than 1000 connections.

>> It was possible to make the context switching work better with 2.4.24
>> with kswapd patched around here; 1600 connections are working fine at
>> this moment.
> 2.4 is very old, is it not?

My mistake: it is 2.6.24. We had to apply the kswapd patch too. That is especially important if you see your system % going as high as 99% in top and you start losing control of the machine. I have read that 2.6.28 has this patch accepted in mainline.

>> Try to lower your memory requirements too: the Linux kernel needs
>> some room to page and scale up. Otherwise, install more memory.
> How much? I already have a lot of memory installed in the server: 128GB.

Here we have 16GB. I had to limit PostgreSQL's memory requirements (shared_buffers + max_connections * work_mem) to about 40% of RAM. effective_cache_size was not an issue, and about 30% of RAM is working fine, though the right cache setting depends on your workload. Since we have fast queries with low memory requirements for sorting and nested loops, 1.5MB of work_mem was enough here. 2GB of shared_buffers worked like a charm, but it's too low for the indexes I work with, and I'm planning to increase it when I have more RAM.

Flavio
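Flavio's sizing rule can be checked with simple arithmetic; the numbers below are the ones he gives for his 16GB machine. Note that this "worst case" is actually optimistic, since a single complex query can allocate work_mem more than once (roughly once per sort or hash step).

```python
# Worst-case backend memory under the 40%-of-RAM rule, using the figures
# from the post: 16GB RAM, 2GB shared_buffers, 1600 connections,
# work_mem = 1.5MB. All units in MB.

MB = 1
GB = 1024 * MB

ram = 16 * GB
shared_buffers = 2 * GB
max_connections = 1600
work_mem = 1.5 * MB

worst_case = shared_buffers + max_connections * work_mem
fraction = worst_case / ram

print(f"worst case: {worst_case / GB:.2f} GB = {fraction:.0%} of RAM")
assert fraction < 0.40  # comfortably within the 40% rule of thumb
```

With these settings the theoretical worst case is about 4.3GB, roughly 27% of RAM, which is why 1600 connections work fine on a 16GB box while the same connection count with a large work_mem would not.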
Re: [PERFORM] work_mem in high transaction rate database
- Scott Marlowe scott.marl...@gmail.com escreveu:
> Oh my lord, that is a foot gun waiting to go off. Assuming 2k
> connections, and somehow a fair number of them went active with big
> sorts, you'd be able to exhaust all physical memory with about 8 to 16
> connections. Lower work_mem now. To something like 1 to 4 Meg. Do not
> pass go. If some oddball query really needs a lot of work_mem, and
> benchmarks show something larger helps, consider raising the work_mem
> setting for that one query to something under 1G (way under 1G) that
> makes it noticeably faster. Don't allocate more than a test shows you
> helps.

Thanks a lot, Scott. That's what I thought in the beginning, but I was doubtful, since the documentation is a bit vague on this point and several bloggers talk about increasing this value up to 250MB. I really think that distinguishing between regular, non-pooled distributed applications and pooled application servers makes a big difference on this point.

> Vacuum does not block transactions. Unless you're dropping tables or
> something.

I'll try to separate things and check whether the DELETE queries are related here.

> (...) what you think they mean. Post the first 20 or so lines from top
> to show us.

Unfortunately I can't. The data there is very sensitive (it's a public company here in Brazil) and the server is operated only by selected personnel. I just ask for information and give written recommendations. Anyway, I'm going to pay more attention to this topic. This is a very interesting deployment of PostgreSQL (3 large databases, heavy load, things growing all the time) and I'll let you all know what happens when tuning it. I'll report back after lowering work_mem and changing the related settings.

Thanks,
Flavio
[PERFORM] work_mem in high transaction rate database
Hello all

In a dedicated server with 16 cores and 16GB of RAM running PostgreSQL 8.2.5 we have a database with basically the following kinds of transactions:
- short transactions with a couple of updates and inserts that run all day;
- batch data loads with hundreds of inserts that run several times a day;
- one delete of thousands of rows after each batch;
- selects are made when users need reports; low concurrency here.

Today max_connections is ~2500, where the application is a cluster of JBoss servers with a pool a bit smaller than that total.

work_mem = 1GB
maintenance_work_mem = 1GB
shared_buffers = 4GB

autovacuum takes a long time running on the largest tables (3 large tables out of 50), causing some connections to have to wait for it to finish before transacting again. I see a few processes (connections) using 10-20% of total system memory and the others using no more than 1%.

What I want to ask is: is it better to keep work_mem as high as it is today, or is it a safe bet to try reducing it, for example to 1 or 2MB, so I can keep the distribution of memory more balanced among all connections?

Thanks!
Flavio Henrique A. Gurgel
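For scale, the worst case implied by these settings is easy to compute (a minimal sketch using only the figures from the post; it is even optimistic, since one query can use work_mem once per sort or hash step):

```python
# How quickly work_mem = 1GB can exhaust this server. Figures are from
# the post: 16GB RAM, shared_buffers = 4GB, max_connections ~2500.

ram_gb = 16
shared_buffers_gb = 4
work_mem_gb = 1
max_connections = 2500

free_gb = ram_gb - shared_buffers_gb
sorts_to_exhaust = free_gb // work_mem_gb

print(f"~{sorts_to_exhaust} concurrent 1GB sorts exhaust RAM, "
      f"yet {max_connections} connections are allowed")

# With work_mem lowered to 2MB, even all 2500 connections sorting at
# once stay bounded:
bounded_gb = max_connections * 2 / 1024
print(f"2500 connections x 2MB work_mem = {bounded_gb:.1f} GB worst case")
```

About a dozen active sorts out of 2500 allowed connections can consume all remaining memory at the current setting, whereas at 2MB the all-connections worst case stays under 5GB; this is the arithmetic behind the "foot gun" warning in the replies.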