Re: [PERFORM] vacuum performance on insert

2010-08-06 Thread Sean Chen
Hi, thank you for the reply.

I ran a number of tests to try to make sense of this.

Whether I ran with or without a VACUUM first, the numbers of disk I/O
operations, cache hits, etc. gathered from the pg_stat tables for the
insertions were pretty much the same.

So I don't see VACUUM reducing disk I/O operations.
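
For reference, the counters I mean are the per-table ones in the
statistics views; a minimal sketch of the kind of query I ran
('my_table' is a placeholder):

    SELECT heap_blks_read, heap_blks_hit,    -- blocks read from disk vs. found in cache
           idx_blks_read, idx_blks_hit       -- the same split for the table's indexes
      FROM pg_statio_user_tables
     WHERE relname = 'my_table';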

Now, about what you mentioned below: do you know what the cost is for
Postgres to request new disk space from the OS?

I'm seeing a big performance difference with VACUUM, but I need proof
that the problem was the request for new space rather than disk I/O,
etc., since I would think the disk could be expensive as well.
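
One way to test that (a sketch; 'my_table' is a placeholder) is to
compare the table's on-disk size before and after the insert: if the
VACUUM let the INSERT re-use freed space, the size should stay roughly
flat, while any growth is space newly requested from the OS:

    SELECT pg_relation_size('my_table');   -- before the INSERT
    -- ... run the INSERT ...
    SELECT pg_relation_size('my_table');   -- after; the difference is newly allocated space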

Thanks,
Sean

On Thu, Aug 5, 2010 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Kevin Grittner kevin.gritt...@wicourts.gov writes:
>> Sean Chen zysc...@gmail.com wrote:
>>> 1, delete records ...
>>> 2, insert records ...
>>>
>>> if I add a VACUUM ANALYZE in between these two steps, will it help
>>> the performance of the insert?
>>
>> Assuming there are no long-running transactions which would still be
>> able to see the deleted rows, a VACUUM between those statements
>> would allow the INSERT to re-use the space previously occupied by
>> the deleted rows, rather than possibly needing to allocate new space
>> from the OS.
>
> But on the other side of the coin, the ANALYZE step is probably not very
> helpful there.  Better to do that after you've loaded the new data.
>
>                        regards, tom lane




Re: [PERFORM] vacuum performance on insert

2010-08-05 Thread Kevin Grittner
Sean Chen zysc...@gmail.com wrote:

> 1, delete records ...
> 2, insert records ...
>
> if I add a VACUUM ANALYZE in between these two steps, will it help
> the performance of the insert?

Assuming there are no long-running transactions which would still be
able to see the deleted rows, a VACUUM between those statements
would allow the INSERT to re-use the space previously occupied by
the deleted rows, rather than possibly needing to allocate new space
from the OS.
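
A minimal sketch of that ordering (table and column names here are
hypothetical):

    DELETE FROM my_table WHERE batch_id = 41;           -- step 1: delete records
    VACUUM my_table;                                    -- mark the dead rows' space as reusable
    INSERT INTO my_table SELECT * FROM staging_table;   -- step 2: insert into the freed space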
 
-Kevin



Re: [PERFORM] vacuum performance on insert

2010-08-05 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
> Sean Chen zysc...@gmail.com wrote:
>> 1, delete records ...
>> 2, insert records ...
>>
>> if I add a VACUUM ANALYZE in between these two steps, will it help
>> the performance of the insert?
>
> Assuming there are no long-running transactions which would still be
> able to see the deleted rows, a VACUUM between those statements
> would allow the INSERT to re-use the space previously occupied by
> the deleted rows, rather than possibly needing to allocate new space
> from the OS.

But on the other side of the coin, the ANALYZE step is probably not very
helpful there.  Better to do that after you've loaded the new data.
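
Putting the two suggestions together, the whole sequence would look
something like this (names are hypothetical):

    DELETE FROM my_table WHERE batch_id = 41;
    VACUUM my_table;                                    -- between the steps: enables space re-use
    INSERT INTO my_table SELECT * FROM staging_table;
    ANALYZE my_table;                                   -- after the load: stats reflect the new data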

regards, tom lane



Re: [PERFORM] VACUUM Performance

2006-10-20 Thread Tom Lane
Steve Oualline [EMAIL PROTECTED] writes:
> Question:   I have a big table with 120,000,000 records.
> Let's assume that I DELETE 4,000,000 records, VACUUM FULL, and REINDEX.
> Now I have the same table, but with 240,000,000 records.
> I DELETE 8,000,000 records, VACUUM FULL, and REINDEX.
> Should the second operation with twice the data take twice the time as
> the first?

At least.  If you intend to reindex all the indexes, consider instead
doing
DROP INDEX(es)
VACUUM FULL
re-create indexes
as this avoids the very large amount of effort that VACUUM FULL puts
into index maintenance --- effort that's utterly wasted if you then
reindex.
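
Spelled out in SQL, that recipe would look something like this (index,
table, and column names are hypothetical):

    DROP INDEX my_table_col_idx;
    VACUUM FULL my_table;         -- compacts the table with no index maintenance left to do
    CREATE INDEX my_table_col_idx ON my_table (col);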

CLUSTER and some forms of ALTER TABLE can accomplish a table rewrite
with less hassle than the above, although strictly speaking they violate
MVCC by discarding recently-dead tuples.
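
For example, the CLUSTER form of that rewrite (hypothetical names; this
is the pre-8.3 syntax, which newer releases spell
CLUSTER my_table USING my_table_col_idx):

    CLUSTER my_table_col_idx ON my_table;   -- rewrites the table in index order and rebuilds its indexes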

regards, tom lane
