Re: [HACKERS] PostgreSQL performance issues
On Mon, 2007-10-22 at 05:13 -0700, Tiago J. Adami wrote:

> Hi all, I'm working for a Brazilian company developing and maintaining
> an ERP application that uses PostgreSQL as its main OLTP database
> system. We're about to start a migration to IBM DB2 because of many
> performance issues. I searched for a solution to these performance
> problems and can't find anything on other web forums.

If you are migrating to DB2 then I'm guessing you'll have spent some time discussing things with a DB2 consultant/pre-sales guy. I've noticed they've started to implement some PostgreSQL features now and probably have a list of Ten Questions to Ask the PostgreSQL Community. It's a standard sales technique. I bet they haven't mentioned the things that Postgres does better than DB2 and that they are playing catch-up on.

If you are having problems with PostgreSQL you should do a little more than search the forums. If you want to compare two products, you shouldn't just ask an expert in one product; you should ask an expert in each product. I'm available to help, but so are many others; my understanding is that the Brazilian PostgreSQL community has many good people. You may need to pay for the advice, but remember that the pre-sales advice on DB2 isn't free either: you just pay for it later.

The database schema you describe is very likely to be within the capabilities of PostgreSQL on reasonable hardware. Limiting your hardware so you can use the free version of DB2 could well put a strain on your implementation.

I would also note that running any database on RAID-0 is going to lose your data fairly soon, whichever database you choose.

I hope that helps with your general issues.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] PostgreSQL performance issues
Deblauwe Gino wrote:

> a) I didn't see a REINDEX in your mail. That's why a backup and a
> restore work and a vacuum doesn't:
> http://www.postgresql.org/docs/current/static/sql-reindex.html
> Do this at least daily with that many inserts.

Hello. I am sorry to say that this advice does not sound correct. He is talking about 10,000 transactions and 15,000 new rows per table per day. That should not require a daily reindex of 12 indexes. It has to be something else, and more information will help to find the problem.

1 GB of RAM may be too little for a 40 GB+ database. It is difficult to say more without seeing the EXPLAIN output for the problematic SELECTs. It would also be good to know some of the memory values in postgresql.conf.

I can count on one hand the times I have had to run a REINDEX to fix something in the past few years. We have databases with 9 million+ transactions and 2 million+ inserts/updates a day, and we have never had to run a REINDEX to get good performance.

PS: RAID-0 for a database is a disaster waiting to happen.

regards
--
Rafael Martinez, [EMAIL PROTECTED]
Center for Information Technology Services
University of Oslo, Norway
PGP Public Key: http://folk.uio.no/rafael/
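For reference, the diagnostics Rafael asks for can be gathered from psql in a couple of commands. A minimal sketch, where some_slow_view is a hypothetical stand-in for one of the problematic views:

  -- Show the actual execution plan and timings for a problematic SELECT
  EXPLAIN ANALYZE SELECT * FROM some_slow_view LIMIT 100;

  -- Report the memory-related settings currently in effect
  SHOW shared_buffers;
  SHOW work_mem;
  SHOW maintenance_work_mem;
  SHOW effective_cache_size;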
Re: [HACKERS] PostgreSQL performance issues
On Mon, 2007-10-22 at 15:40 +0200, Cédric Villemain wrote:

> Does postgresql use posix_fallocate ?

No.

-Neil
Re: [HACKERS] PostgreSQL performance issues
Tiago J. Adami wrote:

> The issue topics:
>
> 1) As the database grows at our customers' sites, performance degrades.
> After one week of use, I/O on the database is extremely high. It
> appears that VACUUM FULL and/or VACUUM ANALYZE doesn't work on these
> databases.

That sounds like a classic case of not running vacuum often enough. VACUUM FULL shouldn't be necessary in normal operation. Have you investigated which queries are causing the I/O? What version of PostgreSQL are you running? In recent versions, just enabling autovacuum does a reasonable job in most scenarios.

Check that you don't have any connections forgotten in "idle in transaction" state. That would prevent VACUUM from recovering dead space.

> 2) We have a very complex view built on top of other views. When we
> cancel a simple SELECT on this top-level view (expecting to return a
> maximum of 100 rows, for example), the PostgreSQL process enters an
> infinite loop (we left it for more than 4 days and the loop didn't
> stop), using 100% of all processors on the server.

PostgreSQL has a multi-process, single-thread architecture, which means that only a single CPU can be executing a single query at a time. That makes it hard to believe that cancelling a query uses 100% of *all* processors. Have you tried EXPLAINing that query to see why it takes so long? Can you post the query and the EXPLAIN output?

> 3) On these servers, disk usage grows only slightly relative to the
> records loaded into the database. For example, after restoring a
> backup, the database directory takes about 40 GB (with all indexes
> created). After one week of use, with about 500,000 new records in the
> tables, the database size grows to about 42 GB, but on Windows 2003
> Server we can see heavy disk fragmentation (maybe this happens on
> Linux too).

Do you think the fragmentation is causing you problems? Do you run large sequential scans regularly? I suppose you could run a defragmenter if you suspect fragmentation is behind the increase in I/O.

--
Heikki Linnakangas
EnterpriseDB  http://www.enterprisedb.com
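To check for forgotten sessions, pg_stat_activity can be queried directly. A minimal sketch for the 8.x releases current at the time (later releases renamed these columns; on 9.2+ you would filter on state = 'idle in transaction' instead):

  -- List backends sitting inside an open transaction but not running a
  -- query; such sessions prevent VACUUM from reclaiming dead rows
  SELECT procpid, usename, current_query
    FROM pg_stat_activity
   WHERE current_query = '<IDLE> in transaction';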
Re: [HACKERS] PostgreSQL performance issues
Tiago J. Adami <[EMAIL PROTECTED]> writes:

> The issue topics:
>
> 1) As the database grows at our customers' sites, performance degrades.
> After one week of use, I/O on the database is extremely high. It
> appears that VACUUM FULL and/or VACUUM ANALYZE doesn't work on these
> databases.

VACUUM FULL is a last-resort utility for recovering from a bad situation. It shouldn't be necessary under normal operation. The intended mode is to run VACUUM (or VACUUM ANALYZE) frequently, possibly several times a day, to maintain the data density.

How frequently are you running VACUUM (or VACUUM ANALYZE)? How many updates and deletes are you executing in that interval? If you run VACUUM (or VACUUM ANALYZE) interactively, what does it print at the end of the operation?

> 2) We have a very complex view built on top of other views. When we
> cancel a simple SELECT on this top-level view (expecting to return a
> maximum of 100 rows, for example), the PostgreSQL process enters an
> infinite loop (we left it for more than 4 days and the loop didn't
> stop), using 100% of all processors on the server.

That does sound bad. Would it be possible to attach to the process when it's spinning and get a backtrace? Also, what version is this, precisely? Have you taken all the bug-fix updates for the major version you're using?

> 3) On these servers, disk usage grows only slightly relative to the
> records loaded into the database. For example, after restoring a
> backup, the database directory takes about 40 GB (with all indexes
> created). After one week of use, with about 500,000 new records in the
> tables, the database size grows to about 42 GB, but on Windows 2003
> Server we can see heavy disk fragmentation (maybe this happens on
> Linux too).

Postgres does extend files as needed, and some filesystems may deal with this better than others. I think this is something we don't know much about on Windows.

You might find that running CLUSTER on the fragmented tables improves matters. CLUSTER effectively does a full vacuum too, so it would leave you in a good situation to monitor the growth and the vacuum frequency necessary from that point forward. The downsides are that CLUSTER locks the table while it runs, and it requires enough space to store a whole second copy of the table and its indexes.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
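A minimal sketch of the maintenance Gregory describes; the table and index names (orders, orders_pkey) are hypothetical:

  -- Routine maintenance, run frequently: reclaim dead row versions and
  -- refresh planner statistics, printing a per-table summary at the end
  VACUUM VERBOSE ANALYZE;

  -- Rewrite one fragmented table in index order (pre-8.3 syntax; later
  -- releases also accept CLUSTER orders USING orders_pkey). This locks
  -- the table and needs space for a second copy of it and its indexes.
  CLUSTER orders_pkey ON orders;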
Re: [HACKERS] PostgreSQL performance issues
Tiago J. Adami wrote:

> Hi all, I'm working for a Brazilian company developing and maintaining
> an ERP application that uses PostgreSQL as its main OLTP database
> system. We're about to start a migration to IBM DB2 because of many
> performance issues. I searched for a solution to these performance
> problems and can't find anything on other web forums. I'll put them in
> this post as topics, but first I'll describe the ERP database's schema:
>
> - More than 200 tables;
> - About 10 tables with about 10,000 transactions and 15,000 new rows
>   per day;
> - Each of these 10 tables has at least 12 indexes and 3 triggers;
> - Many of our customers' servers use AMD64 processors running Red Hat
>   Enterprise (with ext3), but we have some using Xeon processors and
>   Windows 2003 Server (NTFS), and the issues are the same;
> - All servers have at least 1 GB of dedicated RAM, with no
>   virtualization;
> - All servers use at least 2 disks in RAID 0 (Ultra-SCSI disks);
> - Database encoding: LATIN1.
>
> The issue topics:
>
> 1) As the database grows at our customers' sites, performance degrades.
> After one week of use, I/O on the database is extremely high. It
> appears that VACUUM FULL and/or VACUUM ANALYZE doesn't work on these
> databases.
>
> 2) We have a very complex view built on top of other views. When we
> cancel a simple SELECT on this top-level view (expecting to return a
> maximum of 100 rows, for example), the PostgreSQL process enters an
> infinite loop (we left it for more than 4 days and the loop didn't
> stop), using 100% of all processors on the server.
>
> 3) On these servers, disk usage grows only slightly relative to the
> records loaded into the database. For example, after restoring a
> backup, the database directory takes about 40 GB (with all indexes
> created). After one week of use, with about 500,000 new records in the
> tables, the database size grows to about 42 GB, but on Windows 2003
> Server we can see heavy disk fragmentation (maybe this happens on
> Linux too).
>
> 4) VACUUM FULL and/or VACUUM ANALYZE appear not to work properly.
>
> The temporary (but time-consuming) solution: I have found one procedure
> that solves these problems for a short time. It is as follows:
>
> 1) Create a database dump into a .SQL plain-text file;
> 2) Drop the original OLTP database;
> 3) Create a new database using the original name;
> 4) Restore the .SQL file with psql.
>
> The cost of using the PostgreSQL database in our software has reached
> very high levels, so we're evaluating the same database schema and data
> on other databases such as IBM DB2 9, and these issues do not appear
> there. However, we need to solve these problems on PgSQL, as many
> customers will not migrate to DB2. Can anyone help me?

a) I didn't see a REINDEX in your mail. That's why a backup and a restore work and a vacuum doesn't:
http://www.postgresql.org/docs/current/static/sql-reindex.html
Do this at least daily with that many inserts.

b) Which version of Postgres do you run? 8.2 is a lot faster than previous versions, and 8.3 will again be a lot faster when it comes out.

c) Fragmentation won't happen on Linux as far as I know, but if you want to be faster, why do you use the slowest filesystem ever? ReiserFS is a lot better. On Windows: defragment frequently. It's said that with NTFS it's not necessary anymore, but I don't believe it.
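For completeness, the command Gino recommends looks like this (the table and index names are hypothetical, and note that Rafael's reply above disputes whether a daily REINDEX should be needed at this volume):

  -- Rebuild all indexes on one heavily-updated table; REINDEX locks the
  -- table against writes while it runs
  REINDEX TABLE movements;

  -- Or rebuild a single index
  REINDEX INDEX movements_customer_idx;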
Re: [HACKERS] PostgreSQL performance issues
Gregory Stark wrote:

> [...]
> Postgres does extend files as needed, and some filesystems may deal
> with this better than others. I think this is something we don't know
> much about on Windows.

Hmm, kernel 2.6.23 introduces fallocate... (I am perhaps about to relaunch a flamewar.) Does postgresql use posix_fallocate ?