Re: [HACKERS] PostgreSQL performance issues

2007-10-23 Thread Simon Riggs
On Mon, 2007-10-22 at 05:13 -0700, Tiago J. Adami wrote:

 Hi all, I'm working for a Brazilian company developing and
 maintaining an ERP application that uses PostgreSQL as its main OLTP
 database system. We're about to start migrating to IBM DB2 because of
 many performance issues. I searched for solutions to these performance
 problems and couldn't find anything on other web forums.

If you are migrating to DB2 then I'm guessing you'll have spent some
time discussing things with a DB2 consultant/pre-sales guy. I've noticed
they've started to implement some PostgreSQL features now and probably
have a list of Ten Questions to Ask the PostgreSQL Community. It's a
standard sales technique. I bet they haven't mentioned the things that
Postgres does better than DB2 and that they are playing catch-up on. 

If you are having problems with PostgreSQL you should do a little more
than search the forums. If you want to compare two products then you
shouldn't just ask an expert in one product, you should ask an expert in
each product. I'm available to help, but so are many others; my
understanding is that the Brazilian PostgreSQL community has many good
people. You may need to pay for the advice, but remember the pre-sales
advice on DB2 isn't free either, you just pay for it later.

The database schema you describe is very likely to be within the
capabilities of PostgreSQL on reasonable hardware. Limiting your
hardware so you can use the free version of DB2 could well put a strain
on your implementation. I would also note that running any database
using RAID-0 is going to lose your data fairly soon, whichever database
you choose.

I hope that helps your general issues.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PostgreSQL performance issues

2007-10-23 Thread Rafael Martinez
Deblauwe Gino wrote:

 a) I didn't see a reindex in your mail.  That's why a backup and a
 restore work and a vacuum doesn't
 http://www.postgresql.org/docs/current/static/sql-reindex.html
 Do this at least daily with that many inserts
 

Hello

I'm sorry to say that this advice does not sound correct.

He is talking about 10,000 transactions and 15,000 new rows per
table per day. That should not require a daily reindex of 12 indexes. It
has to be something else; more information will help to find the problem.

1 GB of RAM may be too little for a 40 GB+ database. It is difficult to
say more without seeing the EXPLAIN output for the problematic selects.
It would also be good to know some of the memory settings in
postgresql.conf.
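
For reference, the kind of information asked for here can be gathered from a psql session, something like the sketch below (the slow query itself is a placeholder; on 7.4 the work_mem parameter is called sort_mem):

```sql
-- Plan and actual timings for one of the problematic selects:
EXPLAIN ANALYZE SELECT ...;   -- paste the slow query here

-- The memory-related settings in question:
SHOW shared_buffers;
SHOW work_mem;
SHOW effective_cache_size;
SHOW max_fsm_pages;
```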

I can count on one hand the times I have had to run a reindex to
fix something in the past years. We have databases with 9 million+
transactions and 2 million+ inserts/updates a day, and we have never had
to run a reindex to get good performance.

PS.- RAID-0 for a database is a disaster waiting to happen.

regards
-- 
 Rafael Martinez, [EMAIL PROTECTED]
 Center for Information Technology Services
 University of Oslo, Norway

 PGP Public Key: http://folk.uio.no/rafael/



Re: [HACKERS] PostgreSQL performance issues

2007-10-23 Thread Neil Conway
On Mon, 2007-10-22 at 15:40 +0200, Cédric Villemain wrote:
 Does postgresql use posix_fallocate ?

No.

-Neil





Re: [HACKERS] PostgreSQL performance issues

2007-10-22 Thread Heikki Linnakangas
Tiago J. Adami wrote:
 The issue topics:
 1) As our customers' databases grow, performance degrades. After one
 week of use, the I/O on the database is extremely high. It appears that
 VACUUM FULL and/or VACUUM ANALYZE doesn't work on these databases.

That sounds like a classic case of not running vacuum often
enough. VACUUM FULL shouldn't be necessary in normal operation. Have
you investigated which queries are causing the I/O?

What version of PostgreSQL are you running? In recent versions, just
enabling autovacuum does a reasonable job in most scenarios.
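
On 8.1/8.2, autovacuum is off by default and depends on the statistics collector. A minimal postgresql.conf sketch (parameter names as of 8.2; the threshold values are examples only, not recommendations):

```
# required for autovacuum on 8.1/8.2
stats_start_collector = on
stats_row_level = on
autovacuum = on
# vacuum a table once roughly 20% of its rows are dead (example values)
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
```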

Check that you don't have any connections forgotten in the idle in
transaction state. That would prevent VACUUM from recovering dead space.
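
Connections stuck in that state can be spotted in pg_stat_activity. A sketch for 8.x (requires stats_command_string = on; in later major versions the relevant columns are pid and state instead):

```sql
SELECT procpid, datname, usename, query_start
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction';
```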

 2) We have a very complex view built on other views. When we cancel a
 simple SELECT on this top-level view (expected to return at most 100
 rows, for example), the PostgreSQL process enters an infinite loop (we
 left it for more than 4 days and it never stopped), using 100% of all
 processors on the server.

PostgreSQL has a multi-process, single-thread architecture, which means
that only a single CPU can be executing a single query at a time. That
makes it hard to believe that canceling a query uses 100% of *all*
processors.

Have you tried EXPLAINing that query to see why it takes so long? Can you
post the query and the EXPLAIN output?

 3) On these servers, disk usage grows much less than the volume of
 records loaded into the database. For example, after restoring a
 backup, the database directory is about 40 GB (with all indexes
 created). After one week of use, and about 500,000 new records, the
 database grows to about 42 GB, but on Windows 2003 Server we can see
 heavy disk fragmentation (maybe this occurs on Linux too).

Do you think the fragmentation is causing you problems? Do you do large
sequential scans regularly? I suppose you could run a defragmenter if
you suspect that's behind the increase in I/O.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PostgreSQL performance issues

2007-10-22 Thread Gregory Stark
Tiago J. Adami [EMAIL PROTECTED] writes:

 The issue topics:
 1) As our customers' databases grow, performance degrades. After one
 week of use, the I/O on the database is extremely high. It appears that
 VACUUM FULL and/or VACUUM ANALYZE doesn't work on these databases.

VACUUM FULL is a last-resort utility for recovering from a bad situation. It
shouldn't be necessary under normal operation. The intended mode is to run
VACUUM (or VACUUM ANALYZE) frequently -- possibly several times a day -- to
maintain the data density.

How frequently are you running VACUUM (or VACUUM ANALYZE)? How many updates
and deletes are you executing in that interval?

If you run VACUUM (or VACUUM ANALYZE) interactively what does it print at the
end of the operation?
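
For example (table name hypothetical):

```sql
-- one busy table, with per-index detail in the output:
VACUUM VERBOSE ANALYZE invoices;

-- database-wide; on 8.x the last lines report free-space-map usage
-- against the max_fsm_pages / max_fsm_relations settings:
VACUUM VERBOSE;
```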

 2) We have a very complex view built on other views. When we cancel a
 simple SELECT on this top-level view (expected to return at most 100
 rows, for example), the PostgreSQL process enters an infinite loop (we
 left it for more than 4 days and it never stopped), using 100% of all
 processors on the server.

That does sound bad. Would it be possible to attach to the process when it's
spinning and get a back trace? Also, what version is this precisely? Have you
taken all the bug-fix updates for the major version you're using?

 3) On these servers, disk usage grows much less than the volume of
 records loaded into the database. For example, after restoring a
 backup, the database directory is about 40 GB (with all indexes
 created). After one week of use, and about 500,000 new records, the
 database grows to about 42 GB, but on Windows 2003 Server we can see
 heavy disk fragmentation (maybe this occurs on Linux too).

Postgres does extend files as needed and some filesystems may deal better with
this than others. I think this is something we don't know much about on
Windows.

You might find running a CLUSTER on the fragmented tables improves matters.
CLUSTER effectively does a full vacuum too so it would leave you in a good
situation to monitor the growth and vacuum frequency necessary from that point
forward too. The downsides are that CLUSTER locks the table while it runs and
it requires enough space to store a whole second copy of the table and its
indexes.
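
A sketch with hypothetical names (note that on 8.2 and earlier the syntax is CLUSTER index ON table, and that CLUSTER does not update planner statistics, so follow it with ANALYZE):

```sql
-- rewrite the table in index order, reclaiming dead space
CLUSTER invoices_pkey ON invoices;   -- 8.2-and-earlier syntax
ANALYZE invoices;                    -- refresh planner statistics
```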

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] PostgreSQL performance issues

2007-10-22 Thread Deblauwe Gino

Tiago J. Adami wrote:
Hi all, I'm working for a Brazilian company developing and
maintaining an ERP application that uses PostgreSQL as its main OLTP
database system. We're about to start migrating to IBM DB2 because of
many performance issues. I searched for solutions to these performance
problems and couldn't find anything on other web forums. I'll list
them in this post as topics, but first I'll describe the ERP
database's schema:
- More than 200 tables;
- About 10 tables with about 10,000 transactions and 15,000 new rows
per day;
- These 10 tables have at least 12 indexes and 3 triggers each;
- Many of our customers' servers use AMD64 processors running Red Hat
Enterprise (with ext3), but some use Xeon processors and Windows 2003
Server (NTFS), and the issues are the same;
- All servers have at least 1 GB of dedicated RAM, with no
virtualization;
- All servers use at least 2 disks in RAID 0 (Ultra-SCSI disks);
- Database encoding: LATIN1.

The issue topics:
1) As our customers' databases grow, performance degrades. After one
week of use, the I/O on the database is extremely high. It appears
that VACUUM FULL and/or VACUUM ANALYZE doesn't work on these
databases.
2) We have a very complex view built on other views. When we cancel a
simple SELECT on this top-level view (expected to return at most 100
rows, for example), the PostgreSQL process enters an infinite loop (we
left it for more than 4 days and it never stopped), using 100% of all
processors on the server.
3) On these servers, disk usage grows much less than the volume of
records loaded into the database. For example, after restoring a
backup, the database directory is about 40 GB (with all indexes
created). After one week of use, and about 500,000 new records, the
database grows to about 42 GB, but on Windows 2003 Server we can see
heavy disk fragmentation (maybe this occurs on Linux too).
4) VACUUM FULL and/or VACUUM ANALYZE don't appear to work properly.

The temporary (but expensive) solution: I have found one action that
solves these problems for a short time. It is as follows:
1) Create a database dump into a .SQL plain-text file;
2) Drop the original OLTP database;
3) Create a new database with the original name;
4) Restore the .SQL file with psql.

The cost of using PostgreSQL with our software has risen to very high
levels, so we're evaluating the same database schema and data on other
databases such as IBM DB2 9, where these issues don't occur. However,
we need to solve these problems on PostgreSQL, as many customers will
not migrate to DB2. Can anyone help me?


a) I didn't see a reindex in your mail. That's why a backup and a
restore work and a vacuum doesn't.

http://www.postgresql.org/docs/current/static/sql-reindex.html
Do this at least daily with that many inserts.
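
For what it's worth, a sketch of the command (names hypothetical; note that REINDEX takes an exclusive lock on the table while it runs):

```sql
REINDEX TABLE invoices;     -- rebuild all indexes on one table
-- or, connected to that database as a superuser:
REINDEX DATABASE erp_db;
```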

b) Which version of Postgres do you run? 8.2 is a lot faster than
previous versions, and 8.3 will be a lot faster again when it comes out.


c) Fragmentation won't happen on Linux as far as I know, but if you want
to be faster, why do you use the slowest filesystem ever?
ReiserFS is a lot better. On Windows, defragment frequently; it's said
that with NTFS it's no longer necessary, but I don't believe it.






Re: [HACKERS] PostgreSQL performance issues

2007-10-22 Thread Cédric Villemain

Gregory Stark wrote:

 Postgres does extend files as needed and some filesystems may deal
 better with this than others. I think this is something we don't know
 much about on Windows.

Hmm, kernel 2.6.23 introduces fallocate...
(I am perhaps about to relaunch a flamewar.)
Does PostgreSQL use posix_fallocate?




