Re: [HACKERS] Resumable vacuum proposal and design overview

2007-03-01 Thread Heikki Linnakangas
Jim C. Nasby wrote: On Wed, Feb 28, 2007 at 10:14:24PM +, Heikki Linnakangas wrote: cache instead. In the index scan phase, it's randomly accessed, but if the table is clustered, it's in fact not completely random access. In the 2nd vacuum pass, the array is scanned sequentially again. I'm

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-03-01 Thread Jim C. Nasby
On Wed, Feb 28, 2007 at 10:14:24PM +, Heikki Linnakangas wrote: > cache instead. In the index scan phase, it's randomly accessed, but if > the table is clustered, it's in fact not completely random access. In > the 2nd vacuum pass, the array is scanned sequentially again. I'm not Only if th

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-03-01 Thread Zeugswetter Andreas ADI SD
> One imho important (not necessarily mandatory) aspect of HOT > is, that it does parts of what vacuum would usually do. > > Thus: > 1. resume, load ctid list > 2. continue filling ctid list > 3. remove index tuples for these ctids (* problem *) > > You have just removed index

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-03-01 Thread Zeugswetter Andreas ADI SD
> I admit that the implementation is much complex, but I can > not see any big problems to save the dead tuples out and read > it in again(like two phase commit does). Why do we need to > hold the lock and transaction? We can open the lock and > abandon the transaction ID, vacuum can take the

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Tom Lane
Galy Lee <[EMAIL PROTECTED]> writes: > Let's come to the core issue we care about: do we need the stop-on-dime > feature to stop vacuum immediately? As my previous opinion: if there > are some problems for long running vacuum, yes we *did need* to stop > vacuum immediately. There's always SIGINT.

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Galy Lee
Simon Riggs wrote: > Galy, please hear that people like your idea and understand your use > case, but just don't like all of the proposal, just the main thrust of > it. The usual way is that > (people that agree + amount of your exact idea remaining) = 100% Thank you. I am glad to hear that. :)

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Heikki Linnakangas
Gregory Stark wrote: "Simon Riggs" <[EMAIL PROTECTED]> writes: How much memory would it save during VACUUM on a 1 billion row table with 200 million dead rows? Would that reduce the number of cycles a normal non-interrupted VACUUM would perform? It would depend on how many dead tuples you hav

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Gregory Stark
"Simon Riggs" <[EMAIL PROTECTED]> writes: > How much memory would it save during VACUUM on a 1 billion row table > with 200 million dead rows? Would that reduce the number of cycles a > normal non-interrupted VACUUM would perform? It would depend on how many dead tuples you have per-page. If you
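For a rough sense of scale in the question quoted above, here is a minimal sketch of the arithmetic. It assumes each dead tuple is tracked as one 6-byte item pointer (the size of PostgreSQL's ItemPointerData) in a flat array, and that one index-cleanup cycle runs each time that array fills. The function names and the memory budgets are hypothetical, chosen only for illustration.

```python
TID_BYTES = 6  # assumption: one 6-byte item pointer per dead tuple


def dead_tuple_array_bytes(dead_tuples: int) -> int:
    """Bytes needed to hold every dead tuple pointer in a flat array."""
    return dead_tuples * TID_BYTES


def index_scan_cycles(dead_tuples: int, mem_budget_bytes: int) -> int:
    """Index-cleanup cycles needed if the array is flushed whenever it fills."""
    capacity = mem_budget_bytes // TID_BYTES  # pointers that fit in the budget
    if capacity <= 0:
        raise ValueError("memory budget too small to hold a single pointer")
    # Integer ceiling division avoids floating-point rounding.
    return -(-dead_tuples // capacity)


dead = 200_000_000
print(dead_tuple_array_bytes(dead))          # bytes for the whole list
print(index_scan_cycles(dead, 256 * 2**20))  # cycles with a 256 MiB budget
print(index_scan_cycles(dead, 2 * 2**30))    # cycles with a 2 GiB budget
```

Under these assumptions the full list for 200 million dead rows needs about 1.2 GB, so any budget below that forces more than one pass over the indexes.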

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Simon Riggs
On Wed, 2007-02-28 at 09:38 +, Heikki Linnakangas wrote: > Tom Lane wrote: > > Galy Lee <[EMAIL PROTECTED]> writes: > >> If we can stop at any point, we can make maintenance memory large > >> sufficient to contain all of the dead tuples, then we only need to > >> clean index for once. No matter

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Simon Riggs
On Wed, 2007-02-28 at 11:19 +0100, Zeugswetter Andreas ADI SD wrote: > > > The things I wanted to say is that: > > > If we can stop at any point, we can make maintenance memory large > > > sufficient to contain all of the dead tuples, then we only need to > > > clean index for once. No matter how

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Zeugswetter Andreas ADI SD
> > You haven't explained how saving the dead-tuple-list could be done in > > a safe manner and it seems risky to me. > > The files are placed in a new directory $PGDATA/pg_vacuum > with the name: spcNode.dbNode.relNode for each relations > which have been interrupted during vacuum. > > It h

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Zeugswetter Andreas ADI SD
> > The things I wanted to say is that: > > If we can stop at any point, we can make maintenance memory large > > sufficient to contain all of the dead tuples, then we only need to > > clean index for once. No matter how many times vacuum > stops, indexes > > are cleaned for once. > > I agree

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Galy Lee
Simon Riggs wrote: > You haven't explained how saving the dead-tuple-list could be done > in a safe manner and it seems risky to me. The files are placed in a new directory $PGDATA/pg_vacuum with the name: spcNode.dbNode.relNode for each relations which have been interrupted during vacuum. It
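The naming scheme described above can be sketched as follows. Only the directory name pg_vacuum and the spcNode.dbNode.relNode pattern come from the message; the helper name, the data directory and the numeric identifiers are hypothetical.

```python
from pathlib import PurePosixPath


def vacuum_state_path(pgdata: str, spc_node: int, db_node: int, rel_node: int) -> PurePosixPath:
    """Build the per-relation file path under $PGDATA/pg_vacuum.

    Hypothetical helper: it only illustrates the naming scheme from the
    proposal and is not taken from any PostgreSQL source.
    """
    file_name = f"{spc_node}.{db_node}.{rel_node}"
    return PurePosixPath(pgdata) / "pg_vacuum" / file_name


# Example values are made up for illustration.
print(vacuum_state_path("/var/lib/pgsql/data", 1663, 16384, 16390))
```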

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Heikki Linnakangas
Tom Lane wrote: Galy Lee <[EMAIL PROTECTED]> writes: If we can stop at any point, we can make maintenance memory large sufficient to contain all of the dead tuples, then we only need to clean index for once. No matter how many times vacuum stops, indexes are cleaned for once. I beg your pardon

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes: > On Wed, 2007-02-28 at 13:53 +0900, Galy Lee wrote: >> In the current implementation of concurrent vacuum, the third is not >> satisfied obviously, the first issue comes to my mind is the >> lazy_truncate_heap, it takes AccessExclusiveLock for a long time,

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread Simon Riggs
On Wed, 2007-02-28 at 13:53 +0900, Galy Lee wrote: > > Tom Lane wrote: > > Huh? There is no extra cost in what I suggested; it'll perform > > exactly the same number of index scans that it would do anyway. > > The things I wanted to say is that: > If we can stop at any point, we can make mainten

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-28 Thread tomas
On Mon, Feb 26, 2007 at 01:39:40PM -0500, Tom Lane wrote: [...] > Or were you speaking of the pg_class.reltuples count? Yes (modulo my warning, that is) Regards -- tomás

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Tom Lane
Galy Lee <[EMAIL PROTECTED]> writes: > If we can stop at any point, we can make maintenance memory large > sufficient to contain all of the dead tuples, then we only need to > clean index for once. No matter how many times vacuum stops, > indexes are cleaned for once. I beg your pardon? You're th

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Galy Lee
Tom Lane wrote: > Huh? There is no extra cost in what I suggested; it'll perform > exactly the same number of index scans that it would do anyway. The things I wanted to say is that: If we can stop at any point, we can make maintenance memory large sufficient to contain all of the dead tuples,

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Tom Lane
Galy Lee <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> ... or set a flag to stop at the next cycle-completion point. > The extra cost to clean indexes may prevent this approach to work in > practices. Huh? There is no extra cost in what I suggested; it'll perform exactly the same number of in

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Galy Lee
Tom Lane wrote: >One problem with it is that a too-small target would result in vacuum >proceeding to scan indexes after having accumulated only a few dead >tuples, resulting in increases (potentially enormous ones) in the total >work needed to vacuum the table completely. Yeah. This is also my bi

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Simon Riggs
On Tue, 2007-02-27 at 12:23 -0500, Tom Lane wrote: > "Matthew T. O'Connor" <[EMAIL PROTECTED]> writes: > > Tom Lane wrote: > >> It occurs to me that we may be thinking about this the wrong way > >> entirely. Perhaps a more useful answer to the problem of using a > >> defined maintenance window is

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Tom Lane
"Matthew T. O'Connor" <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> It occurs to me that we may be thinking about this the wrong way >> entirely. Perhaps a more useful answer to the problem of using a >> defined maintenance window is to allow VACUUM to respond to changes in >> the vacuum cost d

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Matthew T. O'Connor
Tom Lane wrote: "Simon Riggs" <[EMAIL PROTECTED]> writes: On Tue, 2007-02-27 at 10:37 -0600, Jim C. Nasby wrote: ... The idea would be to give vacuum a target run time, and it would monitor how much time it had remaining, taking into account how long it should take to scan the indexes based on

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes: > On Tue, 2007-02-27 at 10:37 -0600, Jim C. Nasby wrote: >> ... The idea would be to give vacuum a target run time, and it >> would monitor how much time it had remaining, taking into account how >> long it should take to scan the indexes based on how long
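The target-run-time idea quoted above can be sketched as a simple check. The preview is truncated before it says how the index scan time would be estimated, so the sketch takes that estimate as a given input. The function and parameter names are hypothetical, and this is not how any released VACUUM behaves.

```python
def should_stop_collecting(elapsed_s: float, target_s: float,
                           est_index_scan_s: float,
                           est_heap_cleanup_s: float = 0.0) -> bool:
    """Return True when the heap scan should stop gathering dead tuples.

    Assumption: the remaining window must still cover one pass over the
    indexes plus the second heap pass, both supplied as estimates.
    """
    remaining_s = target_s - elapsed_s
    return remaining_s <= est_index_scan_s + est_heap_cleanup_s


# Two-hour window, index pass estimated at 30 minutes.
print(should_stop_collecting(3600, 7200, 1800))  # one hour left
print(should_stop_collecting(6000, 7200, 1800))  # twenty minutes left
```

As the reply in the thread points out, a target that is too small would trigger index scans after only a few dead tuples have been collected, which increases the total work needed to vacuum the table.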

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Simon Riggs
On Tue, 2007-02-27 at 10:37 -0600, Jim C. Nasby wrote: > On Tue, Feb 27, 2007 at 11:44:28AM +0900, Galy Lee wrote: > > For example, there is one table: > >- The table is a hundreds GBs table. > >- It takes 4-8 hours to vacuum such a large table. > >- Enabling cost-based delay may make i

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-27 Thread Jim C. Nasby
On Tue, Feb 27, 2007 at 11:44:28AM +0900, Galy Lee wrote: > For example, there is one table: >- The table is a hundreds GBs table. >- It takes 4-8 hours to vacuum such a large table. >- Enabling cost-based delay may make it last for 24 hours. >- It can be vacuumed during night time

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-26 Thread Tom Lane
Galy Lee <[EMAIL PROTECTED]> writes: > For example, there is one table: >- The table is a hundreds GBs table. >- It takes 4-8 hours to vacuum such a large table. >- Enabling cost-based delay may make it last for 24 hours. >- It can be vacuumed during night time for 2-4 hours. > It

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-26 Thread Galy Lee
Simon Riggs wrote: >>old dead tuple list. If the system manages the dead tuple list we may >>need to keep such files around for long periods, which doesn't sound >>great either. The system manages such files. The files are kept in location like $PGDATA/pg_vacuum. They are removed when CLUSTER, DR

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-26 Thread Tom Lane
[EMAIL PROTECTED] writes: > WARNING: I don't really know what I'm talking about -- but considering > that vacuuming a large table across several "maintenance windows" is > one of the envisioned scenarios, it might make sense to try to update > the statistics (i.e. to do partially step 7) on each p

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-26 Thread tomas
On Mon, Feb 26, 2007 at 06:21:50PM +0900, Galy Lee wrote: > Hi > > We are developing a new feature for vacuum, here is a brief overview > about it. [...] > Concurrent vacuum mainly has the following steps to vacuum a table: > > 1. scan heap to co

Re: [HACKERS] Resumable vacuum proposal and design overview

2007-02-26 Thread Simon Riggs
On Mon, 2007-02-26 at 18:21 +0900, Galy Lee wrote: > This implementation accepts stop request at *blocks level* in step 1-4. > > D) How to stop and resume > > - stop: > > When vacuum stop in step 1-4, vacuum perform following things: > vacuum saves dead tuple list, the heap block number

[HACKERS] Resumable vacuum proposal and design overview

2007-02-26 Thread Galy Lee
Hi We are developing a new feature for vacuum, here is a brief overview about it. Introduction A) What is it? This feature gives vacuum a resumable capability. Vacuum can remember the point at which it stops, then resume the interrupted vacuum operation from that point next time. The SQL