[PERFORM] Reply: [PERFORM] Postgresql update op is very very slow
I know the problem: there are about 35 million rows, which take about 12 GB of disk space, and checkpoint_segments is set to 64. The update runs in a single transaction, which quickly fills up the checkpoint segments and forces frequent checkpoints. Checkpoints cost a lot of resources, so the update becomes slower and slower, and the bgwriter won't write the pages because the transaction hasn't committed yet.

Creating a new table may be a quick solution, but it is not appropriate in some cases. If we could commit every 1000 rows per round, that might resolve the problem. But PG does not support transactions within a function yet?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Heikki Linnakangas
Sent: 2008-06-25 18:11
To: jay
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Postgresql update op is very very slow

jay wrote:
> I've a table with about 34601755 rows. When I execute 'update msg_table set type=0;' it is very very slow, costs several hours, and still does not complete. Why is postgresql so slow? Is it the PG MVCC problem?

Possibly. Because of MVCC, a full-table update will actually create a new version of each row.

I presume that's a one-off query, or a seldom-run batch operation, and not something your application needs to do often. In that case, you could drop all indexes and recreate them after the update, which should help a lot:

BEGIN;
DROP INDEX <index name>, <index name 2>, ...; -- for each index
UPDATE msg_table SET type = 0;
CREATE INDEX ...; -- Recreate indexes
COMMIT;

Or even better, instead of using UPDATE, do a SELECT INTO a new table, drop the old one, and rename the new one in its place. That has the advantage that the new table doesn't contain the old row versions, so you don't need to vacuum right away to reclaim the space.
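The SELECT INTO approach described above might look roughly like this (a hedged sketch; the column list, index names, and constraints are illustrative, not from the original thread):

```sql
BEGIN;
-- Rewrite the table with the new value baked in; "id" and "body" are
-- hypothetical columns standing in for the real schema.
SELECT id, body, 0 AS type
INTO msg_table_new
FROM msg_table;

DROP TABLE msg_table;
ALTER TABLE msg_table_new RENAME TO msg_table;

-- SELECT INTO copies none of the indexes, constraints, triggers, or
-- grants of the original table, so recreate them explicitly here.
CREATE INDEX msg_table_type_idx ON msg_table (type);
COMMIT;
```

Because the new table is written fresh, it contains no dead row versions, which is why no immediate VACUUM is needed afterwards.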
Actually, there's an even more clever trick to do roughly the same thing:

ALTER TABLE msg_table ALTER COLUMN type TYPE int4 USING 0;

(assuming type is int4; replace with the actual data type if necessary)

This will rewrite the table, similar to a DROP + CREATE, and rebuild all indexes. But all in one command.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Re: [PERFORM] Reply: [PERFORM] Postgresql update op is very very slow
2008/6/26 jay <[EMAIL PROTECTED]>:
> If we could commit every 1000 rows per round, it may resolve the problem. But PG does not support transactions within a function yet?

Yeah, transaction control is not supported inside functions. There are some hacks using dblink to do transactions inside functions. You may want to check that out.

I had suggested another hack in the past for very simplistic updates, for when you are sure that the tuple length does not change between updates and you are prepared to handle a half-updated table if there is a crash or failure in between. Maybe for your case, where you are updating a single column of the entire table and setting it to some default value for all the rows, it may work fine. But please be aware of the data consistency issues before you try it. And it must be a once-in-a-lifetime kind of hack.

http://postgresql-in.blogspot.com/2008/04/postgresql-in-place-update.html

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB   http://www.enterprisedb.com
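The dblink hack mentioned above works because each dblink call runs on its own connection, and so commits independently of the calling function's transaction. A hedged sketch (the connection string, batch size of 1000, and the assumed integer key column "id" are all illustrative):

```sql
-- Assumes the dblink contrib module is installed.
CREATE OR REPLACE FUNCTION batch_update_type() RETURNS void AS $$
DECLARE
    lo integer := 0;
    hi integer;
BEGIN
    SELECT max(id) INTO hi FROM msg_table;
    WHILE lo < hi LOOP
        -- Each dblink_exec uses a separate connection, so each batch
        -- commits on its own, keeping any single transaction small.
        PERFORM dblink_exec('dbname=mydb',
            'UPDATE msg_table SET type = 0 '
            || 'WHERE id > ' || lo || ' AND id <= ' || (lo + 1000));
        lo := lo + 1000;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

Note that the batches are not atomic as a whole: a failure partway through leaves earlier batches committed, which is exactly the trade-off being made to avoid one huge transaction.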
Re: [PERFORM] Reply: [PERFORM] Postgresql update op is very very slow
jay <[EMAIL PROTECTED]> writes:
> I know the problem: there are about 35 million rows, which take about 12 GB of disk space, and checkpoint_segments is 64. The update runs in one transaction, which quickly fills the checkpoint segments and forces frequent checkpoints; checkpoints cost a lot of resources, so the update becomes slower and slower, and the bgwriter won't write because it's not committed yet. Creating a new table may be a quick solution, but it's not appropriate in some cases. If we could commit every 1000 rows per round, it may resolve the problem.

No, that's utterly unrelated. Transaction boundaries have nothing to do with checkpoints.

regards, tom lane
[PERFORM] Re: [PERFORM] Re: [PERFORM] Reply: [PERFORM] Postgresql update op is very very slow
2008/6/26 Pavan Deolasee <[EMAIL PROTECTED]>:
> 2008/6/26 jay <[EMAIL PROTECTED]>:
>> If we could commit every 1000 rows per round, it may resolve the problem. But PG does not support transactions within a function yet?
>
> Yeah, transaction control is not supported inside functions. There are some hacks using dblink to do transactions inside functions. You may want to check that out.

That's if you need autonomous transactions. For most people, savepoints and exception catching seem to be an acceptable form of transaction control.

> I had suggested another hack in the past for very simplistic updates, when you are sure that the tuple length does not change between updates and you are ready to handle a half-updated table if there is a crash or failure in between. [...]
> http://postgresql-in.blogspot.com/2008/04/postgresql-in-place-update.html

In a way, that's what pg_bulkloader does.
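The savepoint-and-catch style mentioned above is available inside functions through PL/pgSQL exception blocks: a BEGIN ... EXCEPTION block runs in an implicit subtransaction, so a failure rolls back only that block rather than the whole function. A minimal sketch (the function and column names are illustrative):

```sql
CREATE OR REPLACE FUNCTION safe_set_type(p_id integer) RETURNS boolean AS $$
BEGIN
    -- The inner block establishes an implicit savepoint.
    BEGIN
        UPDATE msg_table SET type = 0 WHERE id = p_id;
        RETURN true;
    EXCEPTION WHEN OTHERS THEN
        -- Only the inner block's work is rolled back; the rest of the
        -- outer transaction survives.
        RETURN false;
    END;
END;
$$ LANGUAGE plpgsql;
```

This gives error isolation within a transaction, but it is not the same as autonomous transactions: nothing commits until the outer transaction does, which is why it does not solve jay's commit-every-1000-rows problem.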
[PERFORM] Re: [PERFORM] Reply: [PERFORM] Postgresql update op is very very slow
2008/6/26 Tom Lane <[EMAIL PROTECTED]>:
> jay <[EMAIL PROTECTED]> writes:
>> I know the problem: there are about 35 million rows [...] If we could commit every 1000 rows per round, it may resolve the problem.
>
> No, that's utterly unrelated. Transaction boundaries have nothing to do with checkpoints.

True. But if you update, say, 1000 rows at a time and vacuum between batches, you can keep the bloat to something reasonable.

On another note, I haven't seen anyone suggest adding the appropriate WHERE clause to keep from updating rows that already match. That's cheap compared to updating the whole table, even if a large chunk of the rows aren't a match, i.e.:

... SET col = 0 WHERE col <> 0;

That should be the first thing you reach for in this situation, if it can help.
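Putting the two suggestions together, a batched run might look like this (a sketch only; the key column "id" and the batch boundaries are assumptions, and each statement must be run outside any enclosing transaction so that the batches commit separately):

```sql
-- Batch 1: skip rows that already have the target value, then reclaim
-- the dead row versions before moving on.
UPDATE msg_table SET type = 0
WHERE type <> 0 AND id BETWEEN 1 AND 10000;
VACUUM msg_table;

-- Batch 2, and so on for the rest of the key range.
UPDATE msg_table SET type = 0
WHERE type <> 0 AND id BETWEEN 10001 AND 20000;
VACUUM msg_table;
```

The WHERE type <> 0 filter means rows already at the target value produce no new row versions at all, and vacuuming between batches lets each batch reuse the space freed by the previous one, keeping table bloat roughly bounded by one batch's worth of dead rows.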