Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-24 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,
 

For small tables, you don't need to vacuum too often.  In the testing I
did, a small table (~100 rows) didn't really show significant performance
degradation until it had close to 1000 updates. 
   

This is accounted for by using the threshold value.   That way small tables 
get vacuumed less often. However, the way large tables work is very different 
and I think your strategy shows a lack of testing on large active tables.
 

Probably more true than I would like to think...

For large tables,
vacuum is so expensive, that you don't want to do it very often, and
scanning the whole table when there is only 5% wasted space is not very
helpful.
   

5% is probably too low, you're right ... in my experience, performance 
degradation starts to set in at 10-15% updates to, for example, a 1.1 million 
row table, particularly since users tend to request the most recently updated 
rows.   As long as we have the I/O issues that Background Writer and ARC are 
intended to solve, though, I can see being less aggressive on the defaults; 
perhaps 20% or 25%.   If you wait until 110% of a 1.1 million row table is 
updated, though, that vacuum will take an hour or more.
 

True, but I think it would be one hour once, rather than 30 minutes 4 times.

Additionally, you are not thinking of this in terms of an overall database 
maintenance strategy.   Lazy Vacuum needs to stay below the threshold of the 
Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to 
your databases.   With proper configuration of pg_avd, vacuum_mem and FSM 
values, it should be possible to never run a VACUUM FULL again, and as of 7.4 
never run a REINDEX again either.  
 

This is one of the things I had hoped to add to pg_autovacuum, but never 
got to.  In addition to just the information from the stats collector on 
inserts updates and deletes, pg_autovacuum should also look at the FSM, 
and make decisions based on it.  Anyone looking for a project?

But this means running vacuum frequently enough that your max_fsm_pages 
threshold is never reached.   Which for a large database is going to have to 
be more frequently than 110% updates, because setting 20,000,000 
max_fsm_pages will eat your RAM.
 

Again, I think the only way to do this efficiently is to look at the 
FSM.  Otherwise the only way to make sure you keep the FSM populated is 
to run vacuum more than needed.

Yes, I set the defaults a little high perhaps so as to err on the
side of caution.  I didn't want people to say pg_autovacuum kills the
performance of my server.  A small table will get vacuumed, just not
until it has reached the threshold.  So a table with 100 rows will get
vacuumed after 1200 updates / deletes.  
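The arithmetic behind that 1200 figure (the default base of 1000 plus a scaling factor of 2 per tuple) can be sketched as follows. This is only an illustration: the function name is mine, and pg_autovacuum itself is a C daemon reading live stats, not this Python:

```python
# Sketch of the vacuum-threshold rule discussed above. The defaults
# (base 1000, scaling factor 2) are the ones under discussion; the
# function name is illustrative, not pg_autovacuum's actual API.
def vacuum_threshold(num_tuples, base=1000, scaling_factor=2):
    """Updates + deletes needed before a table is vacuumed."""
    return base + scaling_factor * num_tuples

# A 100-row table is vacuumed after 1000 + 2 * 100 = 1200 updates/deletes.
print(vacuum_threshold(100))  # 1200
```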
   

Ok, I can see that for small tables.
 

In my testing it showed that
there was no major performance problems  until you reached several
thousand updates / deletes.
   

Sure.  But several thousand updates can be only 2% of a very large table.
 

But I can't imagine that 2% makes any difference on a large table.  In 
fact I would think that 10-15% would hardly be noticeable; beyond that 
I'm not sure.

HUH?  analyze is very very cheap compared to vacuum.  Why not do it more
often?
   

Because nothing is cheap if it's not needed.   

Analyze is needed only as often as the *aggregate distribution* of data in the 
tables changes.   Depending on the application, this could be frequently, but 
far more often (in my experience running multiple databases for several 
clients) the data distribution of very large tables changes very slowly over 
time.  
 

Valid points, and again I think this points to the fact that 
pg_autovacuum needs to be more configurable.  Being able to set 
different thresholds for different tables will help considerably.  In 
fact, you may find that some tables should have a vac threshold much 
larger than the analyze threshold, while other tables might want the 
opposite.

One client's database, for example, that I have running VACUUM from cron 
scripts runs on this schedule for the main tables:
VACUUM only: twice per hour
VACUUM ANALYZE: twice per day
 

I would be surprised if you can notice the difference between a vacuum 
analyze and a vacuum, especially on large tables.

On the other hand, I've another client's database where most activity involves 
updates to entire classes of records.   They run ANALYZE at the end of every 
transaction.

So if you're going to have a separate ANALYZE schedule at all, it should be 
slightly less frequent than VACUUM for large tables.   Either that, or drop 
the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead 
of having 2 separate schedules.
 

I think you need two separate schedules.  There are lots of times where 
a vacuum doesn't help, and an analyze is all that is needed, and an 
analyze is MUCH cheaper than a vacuum.

BUT  now I see how you arrived at the logic you did.  If you're testing 
only on small tables, and not 

Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-24 Thread Matthew T. O'Connor
Tom Lane wrote:

Chester Kustarz [EMAIL PROTECTED] writes:
 

vacuum is to reclaim dead tuples. this means it depends on update and
delete. analyze depends on data values/distribution. this means it depends on
insert, update, and delete. thus the dependencies are slightly different
between the 2 operations, and so you can come up with use-cases that
justify running either more frequently.
   

Agreed.
 

And that is why pg_autovacuum looks at insert, update and delete when 
deciding to do an analyze, but only looks at update and delete when 
deciding to do a vacuum.  In addition, this is why pg_autovacuum was 
given knobs so that the vacuum and analyze thresholds can be set 
independently.
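The decision rule Matthew describes can be sketched like this (an illustrative sketch; the names are mine, not pg_autovacuum's C symbols, and the real daemon polls the stats collector rather than taking counts as arguments):

```python
# Illustrative sketch of the two independent decisions described above:
# analyze counts inserts + updates + deletes, while vacuum counts only
# updates + deletes. Thresholds are tunable knobs, set independently.
def needs_analyze(inserts, updates, deletes, analyze_threshold):
    return inserts + updates + deletes >= analyze_threshold

def needs_vacuum(updates, deletes, vacuum_threshold):
    return updates + deletes >= vacuum_threshold

# Inserts push a table toward analyze but never toward vacuum.
print(needs_analyze(900, 100, 0, 1000))  # True
print(needs_vacuum(100, 0, 1000))        # False
```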

Matthew

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-24 Thread Alvaro Herrera Munoz
On Fri, Nov 21, 2003 at 04:24:25PM -0500, Matthew T. O'Connor wrote:

 I don't want to add tables to existing databases, as I consider that 
 clutter and I never like using tools that clutter my production 
 databases.  [...]
 
 Actually, this might be a necessary addition as pg_autovacuum currently 
 suffers from the startup transients that the FSM used to suffer from, 
 that is, it doesn't remember anything that happened the last time it 
 ran.  A pg_autovacuum database could also be used to store thresholds 
 and counts from the last time it ran.

You could use the same approach the FSM uses: keep a file with the data,
PGDATA/base/global/pg_fsm.cache.  You don't need the data to be in a table
after all ...

-- 
Alvaro Herrera ([EMAIL PROTECTED])
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Robert Treat
On Thu, 2003-11-20 at 19:40, Matthew T. O'Connor wrote:
 I'm open to discussion on changing the defaults.  Perhaps it would 
 be better to use some non-linear (perhaps logarithmic) scaling factor, 
 so that you wound up with something roughly like this:
 
 #tuples   activity% for vacuum
 1k        100%
 10k        70%
 100k       45%
 1M         20%
 10M        10%
 100M        8%
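One way to realize that kind of falloff is to interpolate the rough targets above on a log scale. This is only a sketch of the idea being floated, not anything pg_autovacuum does; the anchor points are the table's own numbers:

```python
import math

# Illustrative sketch: interpolate the rough targets from the table above
# on a log10(tuples) scale to get a per-table vacuum activity percentage.
# pg_autovacuum itself uses a linear base + scaling-factor threshold.
ANCHORS = [(3, 100), (4, 70), (5, 45), (6, 20), (7, 10), (8, 8)]

def vacuum_percent(num_tuples):
    x = math.log10(max(num_tuples, 1))
    if x <= ANCHORS[0][0]:
        return ANCHORS[0][1]          # at or below 1k rows: 100%
    if x >= ANCHORS[-1][0]:
        return ANCHORS[-1][1]         # at or above 100M rows: 8%
    for (x0, y0), (x1, y1) in zip(ANCHORS, ANCHORS[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

print(vacuum_percent(1_000))      # 100
print(vacuum_percent(100_000))    # 45
```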
 


Just thinking out loud here, so disregard if you think it's chaff, but...
if we had a system table pg_avd_defaults that held what we generally
consider the best default percentages based on reltuples/pages, and
added a column to pg_class (could be some place better but..) which
could hold an overriding percentage, you could then have a column added
to pg_stat_all_tables called vacuum_percentage, which would be a
coalesce of the override percentage or the default percentages based on
rel_tuples (or rel_pages).  This would give autovacuum a place to look
for each table as to when it should vacuum, and gives administrators the
option to tweak it on a per table basis if they find they need a
specific table to vacuum at a different rate than the standard.   

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL




Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,
 

True, but I think it would be one hour once, rather than 30 minutes 4
times.
   

Well, generally it would be about 6-8 times at 2-4 minutes each.
 

Are you saying that you can vacuum a 1 million row table in 2-4 
minutes?  While a vacuum of the same table with an additional 1 million 
dead tuples would take an hour?

This is one of the things I had hoped to add to pg_autovacuum, but never
got to.  In addition to just the information from the stats collector on
inserts updates and deletes, pg_autovacuum should also look at the FSM,
and make decisions based on it.  Anyone looking for a project?
   

Hmmm ... I think that's the wrong approach.  Once your database is populated, 
it's very easy to determine how to set the FSM for a given pg_avd level.   If 
you're vacuuming after 20% updates, for example, just set fsm_pages to 20% of 
the total database pages plus growth  safety margins.
 

Ok.

I'd be really reluctant to base pg_avd frequency on the fsm settings instead.  
What if the user loads 8GB of data but leaves fsm_pages at the default of 
10,000?  You can't do much with that; you'd have to vacuum if even 1% of the 
data changed.

Ok, but as you said above it's very easy to set the FSM once you know 
your db size.

The other problem is that calculating data pages from a count of 
updates+deletes would require pg_avd to keep more statistics and do more math 
for every table.  Do we want to do this?
 

I would think the math is simple enough to not be a big problem.  Also, 
I did not recommend looking blindly at the FSM as our guide, rather 
consulting it as another source of information as to when it would be 
useful to vacuum.  I don't have a good plan as to how to incorporate 
this data, but to a large extent the FSM already tracks table activity 
and gives us the most accurate answer about storage growth (short of 
using something like contrib/pgstattuple, which takes nearly the same 
amount of time as an actual vacuum).

But I can't imagine that 2% makes any difference on a large table.  In
fact I would think that 10-15% would hardly be noticeable, beyond that
I'm not sure.
   

I've seen performance lag at 10% of records, especially in tables where both 
update and select activity focus on one subset of the table (calendar tables, 
for example).
 

Ok.

Valid points, and again I think this points to the fact that
pg_autovacuum needs to be more configurable.  Being able to set
different thresholds for different tables will help considerably.  In
fact, you may find that some tables should have a vac threshold much
larger than the analyze threshold, while other tables might want the
opposite.
   

Sure.  Though I think we can make the present configuration work with a little 
adjustment of the numbers.   I'll have a chance to test on production 
databases soon.
 

I look forward to hearing results from your testing.

I would be surprised if you can notice the difference between a vacuum
analyze and a vacuum, especially on large tables.
   

It's substantial for tables with high statistics settings.   A 1,000,000 row 
table with 5 columns set to statistics=250 can take 3 minutes to analyze on a 
medium-grade server.
 

In my testing, I never changed the default statistics settings.

I think you need two separate schedules.  There are lots of times where
a vacuum doesn't help, and an analyze is all that is needed
   

Agreed.  And I've just talked to a client who may want to use pg_avd's ANALYZE 
scheduling but not use vacuum at all.   BTW, I think we should have a setting 
for this; for example, if -V is -1, don't vacuum.
 

That would be nice.  Easy to add, and something I never thought of.

I'm open to discussion on changing the defaults.  Perhaps it would
be better to use some non-linear (perhaps logarithmic) scaling factor.
   

That would be cool, too.Though a count of data pages would be a better 
scale than a count of rows, and equally obtainable from pg_class.
 

But we track tuples because we can compare against the count given by 
the stats system.  I don't know of a way (other than looking at the FSM, 
or contrib/pgstattuple ) to see how many dead pages exist.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Shridhar Daithankar
Matthew T. O'Connor wrote:

But we track tuples because we can compare against the count given by 
the stats system.  I don't know of a way (other than looking at the FSM, 
or contrib/pgstattuple ) to see how many dead pages exist.
I think making pg_autovacuum dependent of pgstattuple is very good idea.

Probably it might be a good idea to extend pgstattuple to return pages that are 
excessively contaminated and clean them ASAP. Step by step getting closer to 
daemonized vacuum.

 Shridhar



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Shridhar Daithankar wrote:

Matthew T. O'Connor wrote:

But we track tuples because we can compare against the count given by 
the stats system.  I don't know of a way (other than looking at the 
FSM, or contrib/pgstattuple ) to see how many dead pages exist.
I think making pg_autovacuum dependent on pgstattuple is a very good idea. 
Only if pgstattuple can become much cheaper than it is now.  Based on 
the testing I did when I wrote pg_autovacuum, pgstattuple cost nearly 
the same amount as a regular vacuum.  Given that, what have we gained 
from that work?  Wouldn't it just be better to run a vacuum and actually 
reclaim space rather than running pgstattuple, and just look and see if 
there is free space to be reclaimed?

Perhaps we could use pgstattuple occasionally to see if we are doing a 
good job of keeping the amount of dead space to a reasonable level, but 
I'm still not really sure about this.

Probably it might be a good idea to extend pgstattuple to return pages 
that are excessively contaminated and clean them ASAP. Step by step 
getting closer to daemonized vacuum.
I don't know of any way to clean a particular set of pages.  This is 
something that has been talked about (partial vacuums and such), but I 
think Tom has raised several issues with it, I don't remember the 
details.  Right now the only tool we have to reclaim space is vacuum, a 
whole table at a time.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew,

 As long as pg_autovacuum remains a contrib module, I don't think any
 changes to the system catalogs will be made.  If pg_autovacuum is
 deemed ready to move out of contrib, then we can talk about the above.

But we could create a config file that would store stuff in a flatfile table, 
OR we could add our own system table that would be created when one 
initializes pg_avd.

Just an idea.  Mind you, I'm not so sure that we want to focus immediately on 
per-table settings.   I think that we want to get the automatic settings 
working fairly well first; a lot of new DBAs would use the per-table settings 
to shoot themselves in the foot.  So we need to be able to make a strong 
recommendation to try the automatic settings first.

 Are you saying that you can vacuum a 1 million row table in 2-4
 minutes?  While a vacuum of the same table with an additional 1 million
 dead tuples would take an hour?

I'm probably exaggerating.  I do know that I can vacuum a fairly clean 1-5 
million row table in less than 4 minutes.   I've never let such a table get 
to 50% dead tuples, so I don't really know how long that takes.  Call me a 
coward if you like ...

 I'd be really reluctant to base pg_avd frequency on the fsm settings
  instead. What if the user loads 8GB of data but leaves fsm_pages at the
  default of 10,000?  You can't do much with that; you'd have to vacuum if
  even 1% of the data changed.

 Ok, but as you said above it's very easy to set the FSM once you know
 your db size.

Actually, thinking about this I realize that PG_AVD and the Perl-based 
postgresql.conf configuration script I was working on (darn, who was doing 
that with me?) need to go together.   With pg_avd, setting max_fsm_pages is 
very easy; without it it's a bit of guesswork.

So I think we can do this:  for 'auto' settings:

If max_fsm_pages is between 13% and 100% of the total database pages, then set 
the vacuum scale factor to match 3/4 of the fsm_pages setting, e.g.
database = 18,000,000 data pages;
max_fsm_pages = 3,600,000;
set vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15

If max_fsm_pages is less than 13% of database pages, issue a warning to the 
user (log it, if possible) and set scale factor to 0.1.   If it's greater 
than 100% set it to 1 and leave it alone.
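Josh's proposed 'auto' rule can be sketched as follows. An illustrative sketch only: the function name is mine, and db_pages is assumed to be the cluster-wide page count (sum of relpages from pg_class):

```python
# Sketch of the proposed 'auto' vacuum-scale-factor rule described above.
# Assumption: db_pages is the total data pages across the cluster.
def auto_vacuum_scale_factor(max_fsm_pages, db_pages):
    ratio = max_fsm_pages / db_pages
    if ratio < 0.13:
        # real implementation would also warn the user / log it
        return 0.1
    if ratio > 1.0:
        return 1.0
    return ratio * 3 / 4  # scale factor = 3/4 of the FSM coverage

# Josh's example: 18M data pages, max_fsm_pages = 3.6M -> 0.2 * 3/4 = 0.15
print(auto_vacuum_scale_factor(3_600_000, 18_000_000))
```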

 I don't have a good plan as to how to incorporate
 this data, but to a large extent the FSM already tracks table activity
 and gives us the most accurate answer about storage growth (short of
 using something like contrib/pgstattuple which takes nearly the same
 amount of time as an actual vacuum)

I don't really think we need to do dynamic monitoring at this point.   It 
would be a lot of engineering to check data page pollution without having 
significant performance impact.   It's doable, but something I think we 
should hold off until version 3.  It would mean hacking the FSM, which is a 
little beyond me right now.

 In my testing, I never changed the default statistics settings.

Ah.  Well, a lot of users do to resolve query problems.

 But we track tuples because we can compare against the count given by
 the stats system.  I don't know of a way (other than looking at the FSM,
 or contrib/pgstattuple ) to see how many dead pages exist.

No, but for scaling you don't need the dynamic count of tuples or of dead 
tuples; pg_class holds a reasonably accurate count of pages per table as of 
the last vacuum.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,
 

But we could create a config file that would store stuff in a flatfile table, 
OR we could add our own system table that would be created when one 
initializes pg_avd.
 

I don't want to add tables to existing databases, as I consider that 
clutter and I never like using tools that clutter my production 
databases.  I had considered using a pg_autovacuum database that, if 
found, would store customized settings for individual tables / 
databases.  Dunno if this is a good idea, but it might make a good 
stopgap until people are comfortable modifying the system catalogs for 
autovacuum. 

Actually, this might be a necessary addition as pg_autovacuum currently 
suffers from the startup transients that the FSM used to suffer from, 
that is, it doesn't remember anything that happened the last time it 
ran.  A pg_autovacuum database could also be used to store thresholds 
and counts from the last time it ran.

Just an idea.  Mind you, I'm not so sure that we want to focus immediately on 
per-table settings.   I think that we want to get the automatic settings 
working fairly well first; a lot of new DBAs would use the per-table settings 
to shoot themselves in the foot.  So we need to be able to make a strong 
recommendation to try the automatic settings first.
 

I agree in principle; the question is what the best settings are.  I still 
think it will be hard to find a one-size-fits-all, but I'm sure we can 
do better than what we have.

Actually, thinking about this I realize that PG_AVD and the Perl-based 
postgresql.conf configuration script I was working on (darn, who was doing 
that with me?) need to go together.   With pg_avd, setting max_fsm_pages is 
very easy; without it it's a bit of guesswork.

So I think we can do this:  for 'auto' settings:

If max_fsm_pages is between 13% and 100% of the total database pages, then set 
the vacuum scale factor to match 3/4 of the fsm_pages setting, e.g.
database = 18,000,000 data pages;
max_fsm_pages = 3,600,000;
set vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15
 

Where are you getting 13% from?  Do you know of an easy way to get a 
count of the total pages used by a whole cluster?  I guess we can just 
iterate over all the tables in all the databases and sum up the total 
num of pages.  We already iterate over them all, we just don't sum it up.

If max_fsm_pages is less than 13% of database pages, issue a warning to the 
user (log it, if possible) and set scale factor to 0.1.   If it's greater 
than 100% set it to 1 and leave it alone.
 

Again I ask where the 13% is coming from, and also where the 0.1 is coming from?  
I assume these are your best guesses right now, but not more than that.  
I do like the concept though, as long as we find good values for 
min_fsm_percentage and min_autovac_scaling_factor.

But we track tuples because we can compare against the count given by
the stats system.  I don't know of a way (other than looking at the FSM,
or contrib/pgstattuple ) to see how many dead pages exist.
   

No, but for scaling you don't need the dynamic count of tuples or of dead 
tuples; pg_class holds a reasonably accurate count of pages per table as of 
the last vacuum.
 

Which we already keep a copy of inside of pg_autovacuum, and update 
after we issue a vacuum.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew,

 Actually, this might be a necessary addition as pg_autovacuum currently 
 suffers from the startup transients that the FSM used to suffer from, 
 that is, it doesn't remember anything that happened the last time it 
 ran.  A pg_autovacuum database could also be used to store thresholds 
 and counts from the last time it ran.

I don't see how a separate database is better than a table in the databases, 
except that it means scanning only one table and not one per database.   For 
one thing, making it a separate database could make it hard to back up and 
move your database + pg_avd config.

But I don't feel strongly about it.

 Where are you getting 13% from? 

13% * 3/4 ~~ 10%

And I think both of us agree that vacuuming tables with less than 10% changes 
is excessive and could lead to problems on its own, like overlapping vacuums.

  Do you know of an easy way to get a 
 count of the total pages used by a whole cluster?

SELECT sum(relpages) FROM pg_class;

 I do like the concept though as long as we find good values for 
 min_fsm_percentage and min_autovac_scaling_factor.

See above.  I propose 0.13 and 0.1

 Which we already keep a copy of inside of pg_autovacuum, and update 
 after we issue a vacuum.

Even easier then.

BTW, do we have any provisions to avoid overlapping vacuums?  That is, to 
prevent a second vacuum on a table if an earlier one is still running?

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,

 

I don't see how a separate database is better than a table in the databases, 
except that it means scanning only one table and not one per database.   For 
one thing, making it a separate database could make it hard to back up and 
move your database + pg_avd config.
 

Basically, I don't like the idea of modifying users' databases; besides, 
in the long run most of what needs to be tracked will be moved to the 
system catalogs.  I kind of consider the pg_autovacuum database to be 
equivalent to the changes that will need to be made to the system catalogs.

I guess it could make it harder to backup if you are moving your 
database between clusters.  Perhaps, if you create a pg_autovacuum 
schema inside of your database, then we could use that.  I just 
don't like tools that drop things into your database.

Where are you getting 13% from? 
   

13% * 3/4 ~~ 10%

And I think both of us agree that vacuuming tables with less than 10% changes 
is excessive and could lead to problems on its own, like overlapping vacuums.

 

I certainly agree that less than 10% would be excessive, I still feel 
that 10% may not be high enough though.   That's why I kinda liked the 
sliding scale I mentioned earlier, because I agree that for very large 
tables, something as low as 10% might be useful, but most tables in a 
database would not be that large.

Do you know of an easy way to get a 
count of the total pages used by a whole cluster?
   

Select sum(relpages) from pg_class.

 

duh

BTW, do we have any provisions to avoid overlapping vacuums?  That is, to 
prevent a second vacuum on a table if an earlier one is still running?

 

Only that pg_autovacuum isn't smart enough to kick off more than one 
vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table 
and waits for it to finish, then checks the next table in its list to 
see if it needs to be vacuumed; if so, it does it and waits for that 
vacuum to finish.   There was some discussion of issuing concurrent 
vacuum against different tables, but it was decided that since vacuum is 
I/O bound, it would only make sense to issue concurrent vacuums that 
were on different spindles, which is not something I wanted to get 
into.  Also, given the recent talk about how vacuum is still such a 
performance hog, I can't imagine what multiple concurrent vacuums would 
do to performance.  Maybe as 7.5 develops and many of the vacuum 
performance issues are addressed, we can revisit this question.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Josh Berkus
Matthew,

 Basically, I don't like the idea of modifying users' databases; besides, 
 in the long run most of what needs to be tracked will be moved to the 
 system catalogs.  I kind of consider the pg_autovacuum database to be 
 equivalent to the changes that will need to be made to the system catalogs.

OK.  As I said, I don't feel strongly about it.

 I certainly agree that less than 10% would be excessive, I still feel 
 that 10% may not be high enough though.   That's why I kinda liked the 
 sliding scale I mentioned earlier, because I agree that for very large 
 tables, something as low as 10% might be useful, but most tables in a 
 database would not be that large.

Yes, but I thought that we were taking care of that through the threshold 
value?

A sliding scale would also be OK.   However, that would definitely require a 
leap to storing per-table pg_avd statistics and settings.

 Only that pg_autovacuum isn't smart enough to kick off more than one 
 vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table 
 and waits for it to finish, then checks the next table in its list to 
 see if it needs to be vacuumed; if so, it does it and waits for that 
 vacuum to finish. 

OK, then, we just need to detect the condition of the vacuums piling up 
because they are happening too often.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Matthew T. O'Connor
Josh Berkus wrote:

Matthew,
 

I certainly agree that less than 10% would be excessive, I still feel 
that 10% may not be high enough though.   That's why I kinda liked the 
sliding scale I mentioned earlier, because I agree that for very large 
tables, something as low as 10% might be useful, but most tables in a 
database would not be that large.
   

Yes, but I thought that we were taking care of that through the threshold 
value?
 

Well, the threshold is a combination of the base value and the scaling 
factor (which you are proposing is 0.1), so the threshold is base + 
(scaling factor)(num of tuples).  So with the default base of 1000 and 
your 0.1 you would have this:

Num Rows     Threshold   Percent
    1,000        1,100      110%
   10,000        2,000       20%
  100,000       11,000       11%
1,000,000      101,000       10%

I don't like how that looks, hence the thought of some non-linear 
scaling factor that would still allow the percent to reach 10%, but at a 
slower rate.  Perhaps just a larger base value would suffice, but I think 
small table performance is going to suffer much above 1000.  Anyone else 
have an opinion on the table above?  Good / Bad / Indifferent?
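The two options Matthew weighs (a non-linear curve versus simply raising the base value) can be compared with a quick sketch; the function name and the larger base of 10,000 are hypothetical choices of mine, used only to restate the arithmetic above:

```python
# Sketch of the effective vacuum percentage for a given base value and
# scaling factor (0.1, as proposed above). Compares the default base of
# 1000 against a larger, hypothetical base of 10000.
def effective_percent(num_rows, base=1000, scaling_factor=0.1):
    threshold = base + scaling_factor * num_rows
    return 100 * threshold / num_rows

for rows in (1_000, 10_000, 100_000, 1_000_000):
    print(rows,
          round(effective_percent(rows), 1),               # base 1000
          round(effective_percent(rows, base=10_000), 1))  # base 10000
```

A larger base mainly protects small tables (their percentage stays high) while large tables still converge toward the 10% floor set by the scaling factor.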

A sliding scale would also be OK.   However, that would definitely require a 
leap to storing per-table pg_avd statistics and settings.

 

I don't think it would, it would correlate the scaling factor with the 
number of tuples, no per-table settings required.

Only that pg_autovacuum isn't smart enough to kick off more than one 
vacuum at a time.  Basically, pg_autovacuum issues a vacuum on a table 
and waits for it to finish, then checks the next table in its list to 
see if it needs to be vacuumed; if so, it does it and waits for that 
vacuum to finish. 
   

OK, then, we just need to detect the condition of the vacuums piling up 
because they are happening too often.

 

That would be good to look into at some point, especially if vacuum is 
going to get slower as a result of the page loop delay patch that has 
been floating around.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:
 Shridhar Daithankar wrote:
  I will submit a patch that would account deletes in analyze threshold.
  Since you want to delay the analyze, I would calculate analyze count as

 deletes are already accounted for in the analyze threshold.

Yes. My bad. Deletes are not accounted for when initializing the analyze 
count, but they are used later.

  I am still wary of inverting vacuum analyze frequency. You think it is
  better to set an inverted default rather than documenting it?

 I think inverting the vacuum and analyze frequency is wrong.

Me too. ATM all I can think of is the patch attached. Josh, is it sufficient 
for you?..:-)

Matthew, I am confused about one thing. Why would autovacuum count updates 
while checking the analyze threshold? Updates do not change the statistics, 
right? (w.r.t. line 1072, pg_autovacuum.c). For updating statistics, only 
inserts + deletes should suffice, shouldn't it?

Other than that, I think autovacuum does everything it can.

Comments?

 Shridhar
*** README.pg_autovacuum.orig	Thu Nov 20 19:58:29 2003
--- README.pg_autovacuum	Thu Nov 20 20:26:39 2003
***
*** 141,150 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) > AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) > VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
--- 141,150 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) >= AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) >= VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
***
*** 158,163 
--- 158,186 
  and running ANALYZE more often should not substantially degrade system
  performance.
  
+ Examples:
+ 
+ The following table shows typical usage of pg_autovacuum settings.
+ These are provided so that a DBA has a starting point while
+ tuning pg_autovacuum.
+ 
+ Vacuum is triggered by updates and deletes, so in the case of vacuum the
+ last column indicates the total number of updates and deletes required
+ to trigger a vacuum. In the case of analyze, the operations counted are the
+ total number of inserts, updates, and deletes.
+ 
+ Base		Scaling factor	Records		No. of Operations
+ 1,000		1		10,000		11,000
+ 1,000		2		10,000		21,000
+ 1,000		0.5		10,000		 6,000
+ 1,000		0.1		10,000		 2,000
+ 
+ Although analyze is a cheaper operation than vacuum,
+ it might be needed less often. The default is to analyze twice as often as
+ vacuum, but that might be too aggressive for some installations. Such
+ installations are advised to tune their analyze threshold separately, rather
+ than relying upon the default behaviour.
+ 
  Sleeping:
  -
  

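The threshold arithmetic in the README hunk above is simple enough to make concrete. The sketch below is illustrative Python, not the actual pg_autovacuum.c code, and the function name is mine; it just reproduces the example table's numbers.

```python
# Sketch of pg_autovacuum's trigger arithmetic as described in the README
# hunk above.  The function name is illustrative, not the real C identifier.
def threshold(base_value, scaling_factor, n_tuples):
    """base + (scaling factor * number of tuples), as in Vacuum/AnalyzeThreshold."""
    return base_value + scaling_factor * n_tuples

# Reproduce the README's example table (base 1,000; 10,000 records):
for factor in (1, 2, 0.5, 0.1):
    ops = threshold(1000, factor, 10000)
    print(f"factor {factor}: triggers after {ops:,.0f} operations")
# factor 1 -> 11,000; factor 2 -> 21,000; factor 0.5 -> 6,000; factor 0.1 -> 2,000
```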


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
On Thursday 20 November 2003 20:29, Shridhar Daithankar wrote:
 On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:
  Shridhar Daithankar wrote:
   I will submit a patch that would account for deletes in the analyze threshold.
   Since you want to delay the analyze, I would calculate analyze count as
 
  deletes are already accounted for in the analyze threshold.

 Yes. My bad. Deletes are not accounted for when initializing the analyze
 count, but they are used later.

   I am still wary of inverting vacuum analyze frequency. You think it is
   better to set inverted default rather than documenting it?
 
  I think inverting the vacuum and analyze frequency is wrong.

 Me too. ATM all I can think of is the attached patch. Josh, is it sufficient
 for you?..:-)

Use this one instead. A warning is added for too-aggressive vacuuming. If it is OK 
with everybody, we can send it to the patches list.

 Shridhar
*** README.pg_autovacuum.orig	Thu Nov 20 19:58:29 2003
--- README.pg_autovacuum	Thu Nov 20 20:35:34 2003
***
*** 141,163 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) > AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) > VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
! vacuum_base_value + (vacuum_scaling_factor * number of tuples in the table)
! 
  AnalyzeThreshold is equal to:
! analyze_base_value + (analyze_scaling_factor * number of tuples in the table)
! 
  The AnalyzeThreshold defaults to half of the VacuumThreshold since it
  represents a much less expensive operation (approx 5%-10% of vacuum),
  and running ANALYZE more often should not substantially degrade system
  performance.
  
  Sleeping:
  -
  
--- 141,191 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) >= AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) >= VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
! vacuum_base_value + (vacuum_scaling_factor * number of tuples in the 
! table)
  AnalyzeThreshold is equal to:
! analyze_base_value + (analyze_scaling_factor * number of tuples in the 
! table)
  The AnalyzeThreshold defaults to half of the VacuumThreshold since it
  represents a much less expensive operation (approx 5%-10% of vacuum),
  and running ANALYZE more often should not substantially degrade system
  performance.
  
+ Examples:
+ 
+ The following table shows typical usage of pg_autovacuum settings.
+ These are provided so that a DBA has a starting point while
+ tuning pg_autovacuum.
+ 
+ Vacuum is triggered by updates and deletes, so in the case of vacuum the
+ last column indicates the total number of updates and deletes required
+ to trigger a vacuum. In the case of analyze, the operations counted are the
+ total number of inserts, updates, and deletes.
+ 
+ Base		Scaling factor	Records		No. of Operations
+ 1,000		1		10,000		11,000
+ 1,000		2		10,000		21,000
+ 1,000		0.5		10,000		 6,000
+ 1,000		0.1		10,000		 2,000
+ 
+ Although analyze is a cheaper operation than vacuum,
+ it might be needed less often. The default is to analyze twice as often as
+ vacuum, but that might be too aggressive for some installations. Such 
+ installations are advised to tune their analyze threshold separately, rather 
+ than relying upon the default behaviour.
+ 
+ Furthermore, for aggressive vacuum/analyze behaviour, it is recommended that
+ the scaling factor be set to less than 1. However, too-aggressive settings can
+ adversely affect the performance of normal database operations. Do not apply
+ such settings to production databases without prior testing.
+ 
  Sleeping:
  -
  



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Josh Berkus
Matthew,

 For small tables, you don't need to vacuum too often.  In the testing I
 did, a small table of ~100 rows didn't really show significant performance
 degradation until it had close to 1000 updates. 

This is accounted for by using the threshold value.   That way small tables 
get vacuumed less often. However, the way large tables work is very different 
and I think your strategy shows a lack of testing on large active tables.

 For large tables,
 vacuum is so expensive, that you don't want to do it very often, and
 scanning the whole table when there is only 5% wasted space is not very
 helpful.

5% is probably too low, you're right ... in my experience, performance 
degradation starts to set in at 10-15% updates to, for example, a 1.1 million 
row table, particularly since users tend to request the most recently updated 
rows.   As long as we have the I/O issues that Background Writer and ARC are 
intended to solve, though, I can see being less aggressive on the defaults; 
perhaps 20% or 25%.   If you wait until 110% of a 1.1 million row table is 
updated, though, that vacuum will take an hour or more.

Additionally, you are not thinking of this in terms of an overall database 
maintenance strategy.   Lazy Vacuum needs to stay below the threshold of the 
Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to 
your databases.   With proper configuration of pg_avd, vacuum_mem and FSM 
values, it should be possible to never run a VACUUM FULL again, and as of 7.4 
never run a REINDEX again either.  

But this means running vacuum frequently enough that your max_fsm_pages 
threshold is never reached.   Which for a large database is going to have to 
be more frequently than 110% updates, because setting 20,000,000 
max_fsm_pages will eat your RAM.

 Yes, I set the defaults a little high perhaps so as to err on the
 side of caution.  I didn't want people to say pg_autovacuum kills the
 performance of my server.  A small table will get vacuumed, just not
 until it has reached the threshold.  So a table with 100 rows will get
 vacuumed after 1200 updates / deletes.  

Ok, I can see that for small tables.

 In my testing it showed that
 there were no major performance problems until you reached several
 thousand updates / deletes.

Sure.  But several thousand updates can be only 2% of a very large table.

 HUH?  analyze is very very cheap compared to vacuum.  Why not do it more
 often?

Because nothing is cheap if it's not needed.   

Analyze is needed only as often as the *aggregate distribution* of data in the 
tables changes.   Depending on the application, this could be frequently, but 
far more often (in my experience running multiple databases for several 
clients) the data distribution of very large tables changes very slowly over 
time.  

One client's database, for example, that I have running VACUUM from cron 
scripts runs on this schedule for the main tables:
VACUUM only: twice per hour
VACUUM ANALYZE: twice per day

On the other hand, I've another client's database where most activity involves 
updates to entire classes of records.   They run ANALYZE at the end of every 
transaction.

So if you're going to have a separate ANALYZE schedule at all, it should be 
slightly less frequent than VACUUM for large tables.   Either that, or drop 
the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead 
of having 2 separate schedules.

BUT  now I see how you arrived at the logic you did.  If you're testing 
only on small tables, and not vacuuming them until they reach 110% updates, 
then you *would* need to analyze more frequently. This is because of your 
threshold value ... you'd want to analyze the small table as soon as even 30% 
of its rows changed.

So the answer is to dramatically lower the threshold for the small tables.

 What I think I am hearing is that people would like very much to be able
 to tweak the settings of pg_autovacuum for individual tables / databases
 etc. 

Not from me you're not.   Though that would be nice, too.

So, my suggested defaults based on our conversation above:

Vacuum threshold: 1000 records
Vacuum scale factor:  0.2
Analyze threshold:  50 records
Analyze scale factor: 0.3

-- 
Josh Berkus
Aglio Database Solutions
San Francisco
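To see what these suggested defaults would mean in practice, here is a quick sketch applying the README's threshold formula to the two table sizes discussed in this thread; the constant names are hypothetical, not actual pg_autovacuum options:

```python
# Josh's suggested defaults above, applied via the README's threshold formula.
# Constant names are hypothetical, not real pg_autovacuum knobs.
def threshold(base, scale, n_tuples):
    return base + scale * n_tuples

VAC_BASE, VAC_SCALE = 1000, 0.2   # vacuum threshold / scale factor
ANA_BASE, ANA_SCALE = 50, 0.3     # analyze threshold / scale factor

for rows in (100, 1_100_000):     # the small and large tables discussed above
    vac = threshold(VAC_BASE, VAC_SCALE, rows)
    ana = threshold(ANA_BASE, ANA_SCALE, rows)
    print(f"{rows:>9,} rows: vacuum after {vac:,.0f} ops, analyze after {ana:,.0f} ops")
#       100 rows: vacuum after 1,020 ops, analyze after 80 ops
# 1,100,000 rows: vacuum after 221,000 ops, analyze after 330,050 ops
```

So a 100-row table would be analyzed long before it is vacuumed, while a large table still vacuums at roughly 20% churn.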



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Chester Kustarz
On Thu, 20 Nov 2003, Josh Berkus wrote:
 Additionally, you are not thinking of this in terms of an overall database
 maintanence strategy.   Lazy Vacuum needs to stay below the threshold of the
 Free Space Map (max_fsm_pages) to prevent creeping bloat from setting in to
 your databases.   With proper configuration of pg_avd, vacuum_mem and FSM
 values, it should be possible to never run a VACUUM FULL again, and as of 7.4
 never run an REINDEX again either.

is there any command you can run to see how much of the FSM is filled? is
there any way to tell which tables are filling it?

 Analyze is needed only as often as the *aggregate distribution* of data in the
 tables changes.   Depending on the application, this could be frequently, but
 far more often (in my experience running multiple databases for several
 clients) the data distribution of very large tables changes very slowly over
 time.

analyze does 2 things for me:
1. gets reasonable aggregate statistics
2. generates STATISTICS # of bins for the most frequent hitters

(2) is very important for me. my values typically seem to have power-law
like distributions. i need enough bins to reach a cross-over point where
the last bin is frequent enough to make an index scan useful. also,
i want enough bins so that the planner can choose index a or b for:
select * from foo where a=n and b=m;

the selectivity of either index depends not only on the average selectivity
of index a or index b, but on n and m as well. for example, 1M row table:

value   % of rows
v1  23
v2  12
v3  4.5
v4  4
v5  3.5
...

you can see that picking an index for =v1 would be poor. picking the
20th most common value would be 0.5% selective. much better. of course
this breaks down for more complex operators, but = is fairly common.
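The effect Chester describes can be sketched numerically. The percentages are his example figures for the 1M-row table; the 1% cutoff and the ~0.5% guess for values outside the statistics are illustrative stand-ins for the planner's real costing:

```python
# Chester's example distribution for a 1M-row table (% of rows per value).
top_values = {"v1": 23.0, "v2": 12.0, "v3": 4.5, "v4": 4.0, "v5": 3.5}

def index_scan_pays_off(value, cutoff_pct=1.0, rare_pct=0.5):
    """Crude stand-in for the planner: an index scan wins below the cutoff.

    Values not among the collected most-common values are assumed ~0.5%
    selective, like the 20th most common value in Chester's example.
    """
    return top_values.get(value, rare_pct) < cutoff_pct

print(index_scan_pays_off("v1"))    # False: =v1 touches 23% of rows
print(index_scan_pays_off("v20"))   # True: ~0.5% selective
```

This is why enough STATISTICS bins matter: without them, the planner cannot tell v1 apart from the long tail.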

 So if you're going to have a seperate ANALYZE schedule at all, it should be
 slightly less frequent than VACUUM for large tables.   Either that, or drop
 the idea, and simplify pg_avd by running VACUUM ANALYZE all the time instead
 of having 2 seperate schedules.

i have some tables which are insert only. i do not want to vacuum them
because there are never any dead tuples in them and the vacuum grows the
indexes. plus it is very expensive (the tables grow rather large.) after they
expire i drop the whole table to make room for a newer one (making sort
of a rolling log with many large tables.)

i need to analyze them every so often so that the planner knows that
there is 1 row, 100 rows, 100k rows, or 1M. the funny thing is
that because i never vacuum the tables, the relpages on the index never
grows. don't know if this affects anything (this is on 7.2.3).

vacuum is to reclaim dead tuples. this means it depends on update and
delete. analyze depends on data values/distribution. this means it depends on
insert, update, and delete. thus the dependencies are slightly different
between the 2 operations, and so you can come up with use-cases that
justify running either more frequently.

i am not sure how failed transactions fit into this though, not that i think
anybody ever has very many. maybe big rollbacks during testing?





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Tom Lane
Chester Kustarz [EMAIL PROTECTED] writes:
 i have some tables which are insert only. i do not want to vacuum them
 because there are never any dead tuples in them and the vacuum grows the
 indexes.

Those claims cannot both be true.  In any case, plain vacuum cannot grow
the indexes --- only a VACUUM FULL that moves a significant number of
rows could cause index growth.

 vacuum is to reclaim dead tuples. this means it depends on update and
 delete. analyze depends on data values/distribution. this means it depends on
 insert, update, and delete. thus the dependencies are slightly different
 between the 2 operations, and so you can come up with use-cases that
 justify running either more frequently.

Agreed.

regards, tom lane



Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Chester Kustarz
On Thu, 20 Nov 2003, Tom Lane wrote:
 Those claims cannot both be true.  In any case, plain vacuum cannot grow
 the indexes --- only a VACUUM FULL that moves a significant number of
 rows could cause index growth.

er, yeah. you're right of course. having flashbacks of vacuum full.





Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Josh Berkus
Matthew,

  110% of a 1.1 million row table is updated, though, that vacuum will
  take an hour or more.

 True, but I think it would be one hour once, rather than 30 minutes 4
 times.

Well, generally it would be about 6-8 times at 2-4 minutes each.

 This is one of the things I had hoped to add to pg_autovacuum, but never
 got to.  In addition to just the information from the stats collector on
 inserts updates and deletes, pg_autovacuum should also look at the FSM,
 and make decisions based on it.  Anyone looking for a project?

Hmmm ... I think that's the wrong approach.  Once your database is populated, 
it's very easy to determine how to set the FSM for a given pg_avd level.   If 
you're vacuuming after 20% updates, for example, just set fsm_pages to 20% of 
the total database pages plus growth and safety margins.
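Put into numbers, that rule of thumb looks like this; every figure below is a hypothetical illustration chosen to match an 8GB database, not a recommendation:

```python
# Back-of-envelope max_fsm_pages sizing per the rule above.
# All numbers are hypothetical illustrations.
page_size = 8192                               # PostgreSQL's default block size
db_pages = 8 * 1024**3 // page_size            # an 8GB database: 1,048,576 pages
vacuum_fraction = 0.20                         # vacuuming after 20% updates
margin = 1.5                                   # growth + safety margin
max_fsm_pages = int(db_pages * vacuum_fraction * margin)
print(max_fsm_pages)                           # 314572 -- far above the 10,000 default
```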

I'd be really reluctant to base pg_avd frequency on the fsm settings instead.  
What if the user loads 8GB of data but leaves fsm_pages at the default of 
10,000?  You can't do much with that; you'd have to vacuum if even 1% of the 
data changed.

The other problem is that calculating data pages from a count of 
updates+deletes would require pg_avd to keep more statistics and do more math 
for every table.  Do we want to do this?

 But I can't imagine that 2% makes any difference on a large table.  In
 fact I would think that 10-15% would hardly be noticeable, beyond that
 I'm not sure.

I've seen performance lag at 10% of records, especially in tables where both 
update and select activity focus on one subset of the table (calendar tables, 
for example).

 Valid points, and again I think this points to the fact that
 pg_autovacuum needs to be more configurable.  Being able to set
 different thresholds for different tables will help considerably.  In
 fact, you may find that some tables should have a vac threshold much
 larger than the analyze threshold, while other tables might want the
 opposite.

Sure.  Though I think we can make the present configuration work with a little 
adjustment of the numbers.   I'll have a chance to test on production 
databases soon.

 I would be surprised if you can notice the difference between a vacuum
 analyze and a vacuum, especially on large tables.

It's substantial for tables with high statistics settings.   A 1,000,000 row 
table with 5 columns set to statistics=250 can take 3 minutes to analyze on a 
medium-grade server.

 I think you need two separate schedules.  There are lots of times where
 a vacuum doesn't help, and an analyze is all that is needed

Agreed.  And I've just talked to a client who may want to use pg_avd's ANALYZE 
scheduling but not use vacuum at all.   BTW, I think we should have a setting 
for this; for example, if -V is -1, don't vacuum.

 I'm open to discussion on changing the defaults.  Perhaps it would
 be better to use some non-linear (perhaps logarithmic) scaling factor.
 So that you wound up with something roughly like this:

 #tuples   activity% for vacuum
 1k        100%
 10k        70%
 100k       45%
 1M         20%
 10M        10%
 100M        8%

That would be cool, too.  Though a count of data pages would be a better 
scale than a count of rows, and equally obtainable from pg_class.
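Matthew's table can be approximated by interpolating between his example points on a log scale; this is only a sketch of the shape he proposes, not an actual pg_autovacuum formula:

```python
import math

# Matthew's example points: (table size in tuples, activity % for vacuum).
POINTS = [(1e3, 100), (1e4, 70), (1e5, 45), (1e6, 20), (1e7, 10), (1e8, 8)]

def activity_pct(n_tuples):
    """Log-linear interpolation over the example table."""
    if n_tuples <= POINTS[0][0]:
        return POINTS[0][1]
    if n_tuples >= POINTS[-1][0]:
        return POINTS[-1][1]
    for (x0, y0), (x1, y1) in zip(POINTS, POINTS[1:]):
        if n_tuples <= x1:
            t = math.log10(n_tuples / x0) / math.log10(x1 / x0)
            return y0 + t * (y1 - y0)

print(activity_pct(1_000))      # 100
print(activity_pct(1_000_000))  # 20.0
```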

 Thanks for the lucid feedback / discussion.  autovacuum is a feature
 that, despite its simple implementation, has generated a lot of
 feedback from users, and I would really like to see it become something
 closer to what it should be.

Well, I hope to help now.  Until very recently, I've not had a chance to 
seriously look at pg_avd and test it in production.   Now that I do, I'm 
interested in improving it.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco
