Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-28 Thread Chester Kustarz
On Fri, 21 Nov 2003, Matthew T. O'Connor wrote: Do you know of an easy way to get a count of the total pages used by a whole cluster? Select sum(relpages) from pg_class. You might want to exclude indexes from this calculation. Some large read only tables might have indexes larger than the

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-25 Thread Andrew Sullivan
On Fri, Nov 21, 2003 at 07:51:17PM -0500, Greg Stark wrote: The second vacuum waits for the lock to become available. If the situation got really bad there could end up being a growing queue of vacuums waiting. Those of us who have run into this know that the situation got really bad is

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-22 Thread Alvaro Herrera Munoz
On Fri, Nov 21, 2003 at 04:24:25PM -0500, Matthew T. O'Connor wrote: I don't want to add tables to existing databases, as I consider that clutter and I never like using tools that clutter my production databases. [...] Actually, this might be a necessary addition as pg_autovacuum

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Robert Treat
On Thu, 2003-11-20 at 19:40, Matthew T. O'Connor wrote: I'm open to discussion on changing the defaults. Perhaps what it would be better to use some non-linear (perhaps logorithmic) scaling factor. So that you wound up with something roughly like this: #tuples activity% for vacuum 1k

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Josh Berkus [EMAIL PROTECTED] writes: BTW, do we have any provisions to avoid overlapping vacuums? That is, to prevent a second vacuum on a table if an earlier one is still running? Yes, VACUUM takes a lock that prevents another VACUUM on the same

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Robert Treat wrote: Just thinking out loud here, so disregard if you think its chaff but... if we had a system table pg_avd_defaults [snip] As long as pg_autovacuum remains a contrib module, I don't think any changes to the system catelogs will be make. If pg_autovacuum is deemed ready to

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote: Matthew, True, but I think it would be one hour once, rather than 30 minutes 4 times. Well, generally it would be about 6-8 times at 2-4 minutes each. Are you saying that you can vacuum a 1 million row table in 2-4 minutes? While a vacuum of the same table with an

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Shridhar Daithankar
Matthew T. O'Connor wrote: But we track tuples because we can compare against the count given by the stats system. I don't know of a way (other than looking at the FSM, or contrib/pgstattuple ) to see how many dead pages exist. I think making pg_autovacuum dependent of pgstattuple is very good

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew, As long as pg_autovacuum remains a contrib module, I don't think any changes to the system catelogs will be make. If pg_autovacuum is deemed ready to move out of contrib, then we can talk about the above. But we could create a config file that would store stuff in a flatfile table,

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote: Matthew, But we could create a config file that would store stuff in a flatfile table, OR we could add our own system table that would be created when one initializes pg_avd. I don't want to add tables to existing databases, as I consider that clutter and I never like

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew, Actually, this might be a necessary addition as pg_autovacuum currently suffers from the startup transients that the FSM used to suffer from, that is, it doesn't remember anything that happened the last time it ran. A pg_autovacuum database could also be used to store thresholds

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote: Matthew, I don't see how a seperate database is better than a table in the databases., except that it means scanning only one table and not one per database. For one thing, making it a seperate database could make it hard to back up and move your database+pg_avd

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew, Basically, I don't like the idea of modifying users databases, besides, in the long run most of what needs to be tracked will be moved to the system catalogs. I kind of consider the pg_autvacuum database to equivalent to the changes that will need to be made to the system

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote: Matthew, I certainly agree that less than 10% would be excessive, I still feel that 10% may not be high enough though. That's why I kinda liked the sliding scale I mentioned earlier, because I agree that for very large tables, something as low as 10% might be useful,

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: BTW, do we have any provisions to avoid overlapping vacuums? That is, to prevent a second vacuum on a table if an earlier one is still running? Yes, VACUUM takes a lock that prevents another VACUUM on the same table. regards, tom

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
Josh Berkus wrote: Shridhar, However I do not agree with this logic entirely. It pegs the next vacuum w.r.t current table size which is not always a good thing. No, I think the logic's fine, it's the numbers which are wrong. We want to vacuum when updates reach between 5% and 15% of total

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Matthew T. O'Connor
Shridhar Daithankar wrote: Josh Berkus wrote: Shridhar, However I do not agree with this logic entirely. It pegs the next vacuum w.r.t current table size which is not always a good thing. Ok, what do you recommend? The point of two separate variables allows you to specify if you want

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote: Shridhar Daithankar wrote: I will submit a patch that would account deletes in analyze threshold. Since you want to delay the analyze, I would calculate analyze count as deletes are already accounted for in the analyze

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
On Thursday 20 November 2003 20:29, Shridhar Daithankar wrote: On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote: Shridhar Daithankar wrote: I will submit a patch that would account deletes in analyze threshold. Since you want to delay the analyze, I would calculate analyze

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Matthew T. O'Connor
Shridhar Daithankar wrote: On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote: Shridhar Daithankar wrote: I am still wary of inverting vacuum analyze frequency. You think it is better to set inverted default rather than documenting it? I think inverting the vacuum and

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Josh Berkus
Matthew, For small tables, you don't need to vacuum too often. In the testing I did a small table ~100 rows, didn't really show significant performance degredation until it had close to 1000 updates. This is accounted for by using the threshold value. That way small tables get vacuumed

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Josh Berkus
Shridhar, I would say -V 0.2-0.4 could be great as well. Fact to emphasize is that thresholds less than 1 should be used. Yes, but not thresholds, scale factors of less than 1.0. Thresholds should still be in the range of 100 to 1000. I will submit a patch that would account deletes in

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Tom Lane
Chester Kustarz [EMAIL PROTECTED] writes: i have some tables which are insert only. i do not want to vacuum them because there are never any dead tuples in them and the vacuum grows the indexes. Those claims cannot both be true. In any case, plain vacuum cannot grow the indexes --- only a

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Matthew T. O'Connor
Tom Lane wrote: Chester Kustarz [EMAIL PROTECTED] writes: vacuum is to reclaim dead tuples. this means it depends on update and delete. analyze depends on data values/distribution. this means it depends on insert, update, and delete. thus the dependencies are slightly different between the 2

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Matthew T. O'Connor
Josh Berkus wrote: Matthew, For small tables, you don't need to vacuum too often. In the testing I did a small table ~100 rows, didn't really show significant performance degredation until it had close to 1000 updates. This is accounted for by using the threshold value. That way

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Josh Berkus
Matthew, 110% of a 1.1 million row table is updated, though, that vaccuum will take an hour or more. True, but I think it would be one hour once, rather than 30 minutes 4 times. Well, generally it would be about 6-8 times at 2-4 minutes each. This is one of the things I had hoped to add

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-19 Thread Shridhar Daithankar
Josh Berkus wrote: Shridhar, I was looking at the -V/-v and -A/-a settings in pgavd, and really don't understand how the calculation works. According to the readme, if I set -v to 1000 and -V to 2 (the defaults) for a table with 10,000 rows, pgavd would only vacuum after 21,000 rows had

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-19 Thread Josh Berkus
Shridhar, However I do not agree with this logic entirely. It pegs the next vacuum w.r.t current table size which is not always a good thing. No, I think the logic's fine, it's the numbers which are wrong. We want to vacuum when updates reach between 5% and 15% of total rows. NOT when