Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-26 Thread Kenneth Marshall
On Wed, Jan 24, 2007 at 07:30:05PM -0500, Tom Lane wrote:
 Kenneth Marshall [EMAIL PROTECTED] writes:
  Not that I am aware of. Even extending the relation by one additional
  block can make a big difference in performance
 
 Do you have any evidence to back up that assertion?
 
 It seems a bit nontrivial to me --- not the extension part exactly, but
 making sure that the space will get used promptly.  With the current
 code the backend extending a relation will do subsequent inserts into
 the block it just got, which is fine, but there's no mechanism for
 remembering that any other newly-added blocks are available --- unless
 you wanted to push them into the FSM, which could work but the current
 FSM code doesn't support piecemeal addition of space, and in any case
 there's some question in my mind about the concurrency cost of increasing
 FSM traffic even more.
 
 In short, it's hardly an unquestionable improvement, so we need some
 evidence.
 
   regards, tom lane
 

My comment was based purely on the reduction in fragmentation of the
file backing the relation, a result I have seen repeatedly in
file-related data processing. It does sound much more complicated to make the
additional space available to other backends. If one backend was doing
many inserts, it might still be of value even for just that backend. As
you mention, testing is needed to see if there is enough value in this
process.
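
To make the sharing problem concrete, here is a minimal sketch of extending
by several blocks and publishing the spares for other backends. All names
here are hypothetical stand-ins, not PostgreSQL code, and the array stands
in for shared memory or the FSM:

    /* Sketch: extend a relation by several blocks at once, keep one block
     * for this backend, and publish the rest where other backends could
     * find them. The array below stands in for shared memory or the FSM. */
    #include <stdio.h>

    #define EXTEND_BLOCKS 8          /* blocks added per extension */
    #define MAX_SPARE     1024

    static long spare_blocks[MAX_SPARE];   /* hypothetical shared free list */
    static int  n_spare = 0;

    /* Pretend to grow the file; returns the first newly added block. */
    static long file_extend(long *nblocks, int count)
    {
        long first = *nblocks;
        *nblocks += count;
        return first;
    }

    /* Extend by EXTEND_BLOCKS; the extending backend inserts into the
     * first new block, the remainder become available to everyone. */
    static long relation_extend(long *nblocks)
    {
        long first = file_extend(nblocks, EXTEND_BLOCKS);
        for (int i = 1; i < EXTEND_BLOCKS && n_spare < MAX_SPARE; i++)
            spare_blocks[n_spare++] = first + i;
        return first;
    }

    int main(void)
    {
        long nblocks = 100;              /* current relation size in blocks */
        long mine = relation_extend(&nblocks);
        printf("inserting into block %ld; %d spare blocks published\n",
               mine, n_spare);
        return 0;
    }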

Ken




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-25 Thread Bruce Momjian
Jim C. Nasby wrote:
 On Mon, Jan 22, 2007 at 12:17:39PM -0800, Ron Mayer wrote:
  Gregory Stark wrote:
   
   Actually no. A while back I did experiments to see how fast reading a file
   sequentially was compared to reading the same file sequentially but skipping
   x% of the blocks randomly. The results were surprising (to me) and depressing.
   The breakeven point was about 7%. [...]
   
   The theory online was that as long as you're reading one page from each disk
   track you're going to pay the same seek overhead as reading the entire track.
  
  Could one take advantage of this observation in designing the DSM?
  
  Instead of a separate bit representing every page, having each bit
  represent 20 or so pages might be a more useful unit.  It sounds
  like the time spent reading would be similar; while the bitmap
  would be significantly smaller.
 
 If we extended relations by more than one page at a time we'd probably
 have a better shot at the blocks on disk being contiguous and all read
 at the same time by the OS.

Actually, there is evidence that adding only a single page to the end
causes a lot of contention for that last page, and that adding a few
might be better.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-25 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Jim C. Nasby wrote:
 If we extended relations by more than one page at a time we'd probably
 have a better shot at the blocks on disk being contiguous and all read
 at the same time by the OS.

 Actually, there is evidence that adding only a single page to the end
 causes a lot of contention for that last page, and that adding a few
 might be better.

Evidence where?  The code is designed so that the last page *isn't*
shared --- go read the comments in hio.c sometime.

regards, tom lane



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-25 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Jim C. Nasby wrote:
  If we extended relations by more than one page at a time we'd probably
  have a better shot at the blocks on disk being contiguous and all read
  at the same time by the OS.
 
  Actually, there is evidence that adding only a single page to the end
  causes a lot of contention for that last page, and that adding a few
  might be better.
 
 Evidence where?  The code is designed so that the last page *isn't*
 shared --- go read the comments in hio.c sometime.

I was talking about the last page of a table, where INSERTs all cluster
on that last page and cause lots of page locking.  hio.c does look like
it avoids that problem.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-24 Thread Kenneth Marshall
On Tue, Jan 23, 2007 at 09:01:41PM -0600, Jim Nasby wrote:
 On Jan 22, 2007, at 6:53 PM, Kenneth Marshall wrote:
 The default should
 be approximately the OS standard read-ahead amount.
 
 Is there anything resembling a standard across the OSes we support?  
 Better yet, is there a standard call that allows you to find out what  
 the read-ahead setting is?
 --
 Jim Nasby[EMAIL PROTECTED]
 EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
 
Not that I am aware of. Even extending the relation by one additional
block can make a big difference in performance and should easily fall
within every read-ahead in use today. Alternatively, a GUC variable that
defaults to a small power-of-two number of PostgreSQL blocks, with the
default arrived at by testing.
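
As a sketch of the mechanics (illustrative only; the GUC name below is made
up, and real code would go through the storage manager rather than writing
the file directly):

    /* Sketch: append power-of-two chunks of zero blocks to a data file in
     * one write() each, so the filesystem has a chance to lay the blocks
     * out contiguously. relation_extension_blocks is a made-up GUC. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLCKSZ 8192
    static int relation_extension_blocks = 8;   /* hypothetical GUC */

    static void extend_by_blocks(int fd, int n)
    {
        char *zeros = calloc((size_t) n, BLCKSZ);
        if (write(fd, zeros, (size_t) n * BLCKSZ) < 0)
            perror("write");
        free(zeros);
    }

    int main(void)
    {
        int fd = open("demo_relation", O_CREAT | O_WRONLY | O_APPEND, 0600);
        if (fd < 0) { perror("open"); return 1; }
        /* grows the file by 64 kB in one call instead of eight */
        extend_by_blocks(fd, relation_extension_blocks);
        close(fd);
        return 0;
    }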

Ken



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-24 Thread Kenneth Marshall
On Mon, Jan 22, 2007 at 05:11:03PM -0600, Jim C. Nasby wrote:
 On Mon, Jan 22, 2007 at 12:17:39PM -0800, Ron Mayer wrote:
  Gregory Stark wrote:
   
   Actually no. A while back I did experiments to see how fast reading a file
   sequentially was compared to reading the same file sequentially but skipping
   x% of the blocks randomly. The results were surprising (to me) and depressing.
   The breakeven point was about 7%. [...]
   
   The theory online was that as long as you're reading one page from each disk
   track you're going to pay the same seek overhead as reading the entire track.
  
  Could one take advantage of this observation in designing the DSM?
  
  Instead of a separate bit representing every page, having each bit
  represent 20 or so pages might be a more useful unit.  It sounds
  like the time spent reading would be similar; while the bitmap
  would be significantly smaller.
 
 If we extended relations by more than one page at a time we'd probably
 have a better shot at the blocks on disk being contiguous and all read
 at the same time by the OS.
 -- 
 Jim Nasby[EMAIL PROTECTED]
 EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
 
Yes, most OSes have some read-ahead when reading a file from disk. Any
increment over 1 would be an improvement. If you used a counter with
a time-based decrement function, you could increase the amount by which
the relation is extended based on temporal proximity: if it has been
extended several times recently, increase the size of the new
extension to reduce the overhead even further. The default should
be approximately the OS standard read-ahead amount.
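
A minimal sketch of that time-decayed counter (all names and constants
are illustrative, not a worked-out design):

    /* Sketch: extensions that cluster in time push the next extension
     * size up; idle periods let the counter decay back down. */
    #include <stdio.h>
    #include <time.h>

    #define DECAY_SECONDS 10    /* halve the counter per 10 s of inactivity */
    #define MAX_EXTEND    64    /* cap, in blocks */

    static double ext_counter = 0;
    static time_t last_extend = 0;

    static int next_extension_size(void)
    {
        time_t now = time(NULL);
        if (last_extend != 0)
            for (time_t t = last_extend; t + DECAY_SECONDS <= now; t += DECAY_SECONDS)
                ext_counter /= 2;          /* time-based decrement */
        ext_counter += 1;                  /* count this extension */
        last_extend = now;

        int blocks = 1;                    /* round up to a power of two */
        while (blocks < (int) ext_counter && blocks < MAX_EXTEND)
            blocks *= 2;
        return blocks;
    }

    int main(void)
    {
        for (int i = 0; i < 6; i++)
            printf("extension %d: %d block(s)\n", i, next_extension_size());
        return 0;
    }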

Ken




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-24 Thread Tom Lane
Kenneth Marshall [EMAIL PROTECTED] writes:
 Not that I am aware of. Even extending the relation by one additional
 block can make a big difference in performance

Do you have any evidence to back up that assertion?

It seems a bit nontrivial to me --- not the extension part exactly, but
making sure that the space will get used promptly.  With the current
code the backend extending a relation will do subsequent inserts into
the block it just got, which is fine, but there's no mechanism for
remembering that any other newly-added blocks are available --- unless
you wanted to push them into the FSM, which could work but the current
FSM code doesn't support piecemeal addition of space, and in any case
there's some question in my mind about the concurrency cost of increasing
FSM traffic even more.

In short, it's hardly an unquestionable improvement, so we need some
evidence.

regards, tom lane



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-23 Thread Jim Nasby

On Jan 22, 2007, at 6:53 PM, Kenneth Marshall wrote:

The default should
be approximately the OS standard read-ahead amount.


Is there anything resembling a standard across the OSes we support?  
Better yet, is there a standard call that allows you to find out what  
the read-ahead setting is?

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Simon Riggs
On Sun, 2007-01-21 at 14:26 -0600, Jim C. Nasby wrote:
 On Sun, Jan 21, 2007 at 11:39:45AM +0000, Heikki Linnakangas wrote:
  Russell Smith wrote:
  Strange idea that I haven't researched: Given vacuum can't be run in a
  transaction, is it possible at a certain point to quit the current
  transaction and start another one?  There has been much chat and now a
  TODO item about allowing multiple vacuums to not starve small tables.
  But if a big table has a long-running vacuum, the vacuum of the small
  table won't be effective anyway, will it?  If vacuum of a big table were
  done in multiple transactions you could reduce the effect of a
  long-running vacuum.  I'm not sure how this affects the rest of the
  system, though.
  
  That was fixed by Hannu Krosing's patch in 8.2 that made vacuum
  ignore other vacuums in the oldest xmin calculation.
 
 And IIRC in 8.1 every time vacuum finishes a pass over the indexes it
 will commit and start a new transaction. 

err...It doesn't do this now and IIRC didn't do that in 8.1 either.

 That's still useful even with
 Hannu's patch in case you start a vacuum with maintenance_work_mem too
 small; you can abort the vacuum some time later and at least some of the
 work it's done will get committed.

True, but not recommended, for a variety of reasons.

The reason is not intermediate commits, but just that the work of VACUUM
is mostly non-transactional in nature, apart from the various catalog
entries when it completes.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Heikki Linnakangas

Russell Smith wrote:
2. Index cleanup is the most expensive part of vacuum.  So doing a 
partial vacuum actually means more I/O as you have to do index cleanup 
more often.


I don't think that's usually the case. Index(es) are typically only a 
fraction of the size of the table, and since 8.2 we do index vacuums in 
a single scan in physical order. In fact, in many applications the index 
is mostly cached and the index scan doesn't generate any I/O at all.


I believe the heap scans are the biggest issue at the moment.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Russell Smith wrote:
  2. Index cleanup is the most expensive part of vacuum.  So doing a 
  partial vacuum actually means more I/O as you have to do index cleanup 
  more often.
 
 I don't think that's usually the case. Index(es) are typically only a 
 fraction of the size of the table, and since 8.2 we do index vacuums in 
 a single scan in physical order. In fact, in many applications the index 
 is mostly cached and the index scan doesn't generate any I/O at all.

Are _all_ the indexes cached?  I would doubt that.  Also, for a typical
table, what percentage is the size of all indexes combined?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Heikki Linnakangas

Bruce Momjian wrote:

Heikki Linnakangas wrote:

Russell Smith wrote:
2. Index cleanup is the most expensive part of vacuum.  So doing a 
partial vacuum actually means more I/O as you have to do index cleanup 
more often.
I don't think that's usually the case. Index(es) are typically only a 
fraction of the size of the table, and since 8.2 we do index vacuums in 
a single scan in physical order. In fact, in many applications the index 
is mostly cached and the index scan doesn't generate any I/O at all.


Are _all_ the indexes cached?  I would doubt that.


Well, depends on your schema, of course. In many applications, yes.


 Also, for typical
table, what percentage is the size of all indexes combined?


Well, there's no such thing as a typical table. As an anecdote, here are
the ratios (total size of all indexes of a table)/(size of corresponding
heap) for the bigger tables from a DBT-2 run I have at hand:


Stock:      1190470/68550 = 6%
Order_line: 950103/274372 = 29%
Customer:   629011/(5711+20567) = 8%

In any case, for the statement "Index cleanup is the most expensive part
of vacuum" to be true, your indexes would have to take up 2x as much
space as the heap, since the heap is scanned twice. I'm sure there are
databases like that out there, but I don't think it's the common case.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Bruce Momjian wrote:
  Heikki Linnakangas wrote:
  Russell Smith wrote:
  2. Index cleanup is the most expensive part of vacuum.  So doing a 
  partial vacuum actually means more I/O as you have to do index cleanup 
  more often.
  I don't think that's usually the case. Index(es) are typically only a 
  fraction of the size of the table, and since 8.2 we do index vacuums in 
  a single scan in physical order. In fact, in many applications the index 
  is mostly cached and the index scan doesn't generate any I/O at all.
  
  Are _all_ the indexes cached?  I would doubt that.
 
 Well, depends on your schema, of course. In many applications, yes.
 
   Also, for typical
  table, what percentage is the size of all indexes combined?
 
 Well, there's no such thing as a typical table. As an anecdote, here are
 the ratios (total size of all indexes of a table)/(size of corresponding
 heap) for the bigger tables from a DBT-2 run I have at hand:
 
 Stock:      1190470/68550 = 6%
 Order_line: 950103/274372 = 29%
 Customer:   629011/(5711+20567) = 8%
 
 In any case, for the statement "Index cleanup is the most expensive part
 of vacuum" to be true, your indexes would have to take up 2x as much
 space as the heap, since the heap is scanned twice. I'm sure there are
 databases like that out there, but I don't think it's the common case.

I agree that index cleanup isn't > 50% of vacuum.  I was trying to figure
out how small, and it seems about 15% of the total table, which means if
we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
80%, assuming 5% of the table is scanned.
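
For what it's worth, one back-of-the-envelope reading of those numbers
(my own arithmetic, counting cost in whole-heap units and using the ~15%
index figure above):

    full vacuum:    2.00 (heap scanned twice)
                  + 0.15 (index cleanup)                  = 2.15
    bitmap vacuum:  0.10 (2 x 5% of the heap)
                  + 0.15 (indexes still scanned in full)  = 0.25

    1 - 0.25/2.15 = ~88%, a saving on the order of the 80% above.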

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Simon Riggs
On Mon, 2007-01-22 at 12:18 -0500, Bruce Momjian wrote:
 Heikki Linnakangas wrote:
  
  In any case, for the statement "Index cleanup is the most expensive part
  of vacuum" to be true, your indexes would have to take up 2x as much
  space as the heap, since the heap is scanned twice. I'm sure there are
  databases like that out there, but I don't think it's the common case.
 
 I agree that index cleanup isn't > 50% of vacuum.  I was trying to figure
 out how small, and it seems about 15% of the total table, which means if
 we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
 80%, assuming 5% of the table is scanned.

Clearly keeping track of what needs vacuuming will lead to a more
efficient VACUUM. Your math applies to *any* design that uses some form
of book-keeping to focus in on the hot spots.

On a separate thread, Heikki has raised a different idea for VACUUM.

Heikki's idea asks an important question: where and how should DSM
information be maintained? Up to now everybody has assumed that it would
be maintained when DML took place and that the DSM would be a
transactional data structure (i.e. on-disk). Heikki's idea requires
similar bookkeeping requirements to the original DSM concept, but the
interesting aspect is that the DSM information is collected off-line,
rather than being an overhead on every statement's response time.

That idea seems extremely valuable to me.

One of the main challenges is how we cope with large tables that have a
very fine spray of updates against them. A DSM bitmap won't help with
that situation, regrettably.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes:

 I agree that index cleanup isn't > 50% of vacuum.  I was trying to figure
 out how small, and it seems about 15% of the total table, which means if
 we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
 80%, assuming 5% of the table is scanned.

Actually no. A while back I did experiments to see how fast reading a file
sequentially was compared to reading the same file sequentially but skipping
x% of the blocks randomly. The results were surprising (to me) and depressing.
The breakeven point was about 7%.

That is, if you assume that only 5% of the table will be scanned and you
arrange to do it sequentially then you should expect the i/o to be marginally
faster than just reading the entire table. Vacuum does do some cpu work and
wouldn't have to consult the clog as often, so it would still be somewhat
faster.

The theory online was that as long as you're reading one page from each disk
track you're going to pay the same seek overhead as reading the entire track.
I also had some theories involving linux being confused by the seeks and
turning off read-ahead but I could never prove them.

In short, to see big benefits you would have to have a much smaller percentage
of the table being read. That shouldn't be taken to mean that the DSM is a
loser. There are plenty of use cases where tables can be extremely large and
have only very small percentages that are busy. The big advantage of the DSM
is that it takes the size of the table out of the equation and replaces it
with the size of the busy portion of the table. So updating a single record in
a terabyte table has the same costs as updating a single record in a kilobyte
table.

Sadly that's not quite true due to indexes, and due to the size of the bitmap
itself. But going back to your numbers it does mean that if you update a
single row out of a terabyte table then we'll be removing about 85% of the i/o
(minus the i/o needed to read the DSM, about .025%). If you update about 1%
then you would be removing substantially less, and once you get to about 10%
then you're back where you started.
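
For anyone wanting to repeat the experiment, a minimal sketch of the kind
of test program involved (my own reconstruction, not the original):

    /* Sketch: read a file block by block, randomly keeping only the given
     * fraction of blocks. Compare its wall-clock time against a full
     * sequential read, dropping the OS cache between runs. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLCKSZ 8192

    static long skim_file(const char *path, double keep)
    {
        char buf[BLCKSZ];
        long blocks_read = 0;
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return -1; }

        long nblocks = lseek(fd, 0, SEEK_END) / BLCKSZ;
        for (long b = 0; b < nblocks; b++)
        {
            if (drand48() >= keep)
                continue;                       /* skip this block */
            if (pread(fd, buf, BLCKSZ, (off_t) b * BLCKSZ) < 0)
                perror("pread");
            blocks_read++;
        }
        close(fd);
        return blocks_read;
    }

    int main(int argc, char **argv)
    {
        if (argc < 2)
        {
            fprintf(stderr, "usage: %s file [fraction-to-read]\n", argv[0]);
            return 1;
        }
        double keep = (argc > 2) ? atof(argv[2]) : 0.07;
        printf("read %ld blocks\n", skim_file(argv[1], keep));
        return 0;
    }

If the 7% figure holds, running this under time(1) with a fraction of 0.07
against a file larger than RAM should take about as long as a full
sequential read.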


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Bruce Momjian

Yep, agreed on the random I/O issue.  The larger question is if you have
a huge table, do you care to reclaim 3% of the table size, rather than
just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
to take a lot of time, but vacuuming to reclaim 3% three times seems like
it is going to be more expensive than just vacuuming the 10% once.  And
vacuuming to reclaim 1% ten times seems even more expensive.  The
partial vacuum idea is starting to look like a loser to me again.

---

Gregory Stark wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 
  I agree that index cleanup isn't > 50% of vacuum.  I was trying to figure
  out how small, and it seems about 15% of the total table, which means if
  we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
  80%, assuming 5% of the table is scanned.
 
 Actually no. A while back I did experiments to see how fast reading a file
 sequentially was compared to reading the same file sequentially but skipping
 x% of the blocks randomly. The results were surprising (to me) and depressing.
 The breakeven point was about 7%.
 
 That is, if you assume that only 5% of the table will be scanned and you
 arrange to do it sequentially then you should expect the i/o to be marginally
 faster than just reading the entire table. Vacuum does do some cpu work and
 wouldn't have to consult the clog as often, so it would still be somewhat
 faster.
 
 The theory online was that as long as you're reading one page from each disk
 track you're going to pay the same seek overhead as reading the entire track.
 I also had some theories involving linux being confused by the seeks and
 turning off read-ahead but I could never prove them.
 
 In short, to see big benefits you would have to have a much smaller percentage
 of the table being read. That shouldn't be taken to mean that the DSM is a
 loser. There are plenty of use cases where tables can be extremely large and
 have only very small percentages that are busy. The big advantage of the DSM
 is that it takes the size of the table out of the equation and replaces it
 with the size of the busy portion of the table. So updating a single record in
 a terabyte table has the same costs as updating a single record in a kilobyte
 table.
 
 Sadly that's not quite true due to indexes, and due to the size of the bitmap
 itself. But going back to your numbers it does mean that if you update a
 single row out of a terabyte table then we'll be removing about 85% of the i/o
 (minus the i/o needed to read the DSM, about .025%). If you update about 1%
 then you would be removing substantially less, and once you get to about 10%
 then you're back where you started.
 
 
 -- 
   Gregory Stark
   EnterpriseDB  http://www.enterprisedb.com
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Gregory Stark

Bruce Momjian [EMAIL PROTECTED] writes:

 Yep, agreed on the random I/O issue.  The larger question is if you have
 a huge table, do you care to reclaim 3% of the table size, rather than
 just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
 to take a lot of time, but vacuuming to reclaim 3% three times seems like
 it is going to be more expensive than just vacuuming the 10% once.  And
 vacuuming to reclaim 1% ten times seems even more expensive.  The
 partial vacuum idea is starting to look like a loser to me again.

Well, the answer is of course: it depends.

If you maintain the dead space at a steady state averaging 1.5% instead of 5%
your table is 3.33% smaller on average. If this is a DSS system that will
translate into running your queries 3.33% faster. It will take a lot of
vacuums before they hurt more than a 3%+ performance drop.
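
One way to get a figure in that range (my own arithmetic, assuming table
size scales as 1/(1 - dead_fraction)):

    size with 5.0% dead space: 1/0.950 = 1.0526
    size with 1.5% dead space: 1/0.985 = 1.0152
    1.0152 / 1.0526 = 0.964, i.e. roughly 3.5% smaller,
    in the same ballpark as the 3.33% above.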

If it's an OLTP system then it's harder to figure. A 3.33% increase in data
density will translate to a higher cache hit rate, but how much higher depends
on a lot of factors. In our experiments we actually got a bigger boost in these
kinds of situations than I expected (I expected something comparable to the
3.33% improvement). So it could be even more than 3.33%. But like I said, it
depends. If you already have the whole database cached you won't see any
improvement. If you are right on the cusp you could see a huge benefit.

It sounds like you're underestimating the performance drain 10% wasted space
has. If we found out that one routine was unnecessarily taking 10% of the cpu
time it would be an obvious focus of attention. 10% wasted space is going to
work out to about 10% of the i/o time.

It also sounds like we're still focused on the performance impact in absolute
terms. I'm much more interested in changing the performance characteristics so
they're predictable and scalable. It doesn't matter much if your 1kb table is
100% slower than necessary but it does matter if your 1TB table needs 1,000x
as much vacuuming as your 1GB table even if it's getting the same update
traffic.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Simon Riggs
On Mon, 2007-01-22 at 13:27 -0500, Bruce Momjian wrote:
 Yep, agreed on the random I/O issue.  The larger question is if you have
 a huge table, do you care to reclaim 3% of the table size, rather than
 just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
 to take a lot of time, but vacuuming to reclaim 3% three times seems like
 it is going to be more expensive than just vacuuming the 10% once.  And
 vacuuming to reclaim 1% ten times seems even more expensive.  The
 partial vacuum idea is starting to look like a loser to me again.

Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread...

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Alvaro Herrera
Bruce Momjian wrote:
 
 Yep, agreed on the random I/O issue.  The larger question is if you have
 a huge table, do you care to reclaim 3% of the table size, rather than
 just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
 to take a lot of time, but vacuuming to reclaim 3% three times seems like
 it is going to be more expensive than just vacuuming the 10% once.  And
 vacuuming to reclaim 1% ten times seems even more expensive.  The
 partial vacuum idea is starting to look like a loser to me again.

But if the partial vacuum is able to clean the busiest pages and reclaim
useful space, currently-running transactions will be able to use that
space and thus not have to extend the table.  Not that extension is a
problem in itself, but it'll keep your working set smaller.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Richard Huxton

Bruce Momjian wrote:

Yep, agreed on the random I/O issue.  The larger question is if you have
a huge table, do you care to reclaim 3% of the table size, rather than
just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
to take a lot of time, but vacuuming to reclaim 3% three times seems like
it is going to be more expensive than just vacuuming the 10% once.  And
vacuuming to reclaim 1% ten times seems even more expensive.  The
partial vacuum idea is starting to look like a loser to me again.


Buying a house with a 25-year mortgage is much more expensive than just 
paying cash too, but you don't always have a choice.


Surely the key benefit of the partial vacuuming thing is that you can at 
least do something useful with a large table if a full vacuum takes 24 
hours and you only have 4 hours of idle I/O.


It's also occurred to me that all the discussion of scheduling way back 
when isn't directly addressing the issue. What most people want (I'm 
guessing) is to vacuum *when the user-workload allows* and the 
time-tabling is just a sysadmin first-approximation at that.


With partial vacuuming possible, we can arrange things with just three 
thresholds and two measurements:

  Measurement 1 = system workload
  Measurement 2 = a per-table "requires vacuuming" value
  Threshold 1 = workload at which we do more vacuuming
  Threshold 2 = workload at which we do less vacuuming
  Threshold 3 = point at which a table is considered worth vacuuming.

Once every 10 seconds, the manager compares the current workload to the
thresholds and starts a new vacuum, kills one, or does nothing. New
vacuum processes keep getting started as long as there is spare workload
capacity and tables that need vacuuming.


Now the trick of course is how you measure system workload in a 
meaningful manner.
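
A minimal sketch of how that manager loop might look (the two measurements
are stubs, and every name here is illustrative; making the measurements
real is, as you say, the hard part):

    /* Sketch of the ten-second manager loop described above. */
    #include <stdio.h>
    #include <unistd.h>

    #define MORE_VACUUM_THRESHOLD 0.30   /* Threshold 1 */
    #define LESS_VACUUM_THRESHOLD 0.90   /* Threshold 2 */
    #define TABLE_NEEDS_VACUUM    0.50   /* Threshold 3 */
    #define NTABLES               10

    static double system_workload(void)   { return 0.20; }            /* stub */
    static double vacuum_score(int table) { return table == 0 ? 0.7 : 0.1; }
    static void   start_vacuum(int table) { printf("vacuum %d\n", table); }
    static void   kill_one_vacuum(void)   { printf("killing one\n"); }

    int main(void)
    {
        for (;;)
        {
            double load = system_workload();
            if (load > LESS_VACUUM_THRESHOLD)
                kill_one_vacuum();               /* shed vacuum work */
            else if (load < MORE_VACUUM_THRESHOLD)
            {
                for (int t = 0; t < NTABLES; t++)
                    if (vacuum_score(t) > TABLE_NEEDS_VACUUM)
                    {
                        start_vacuum(t);         /* one more worker */
                        break;
                    }
            }
            /* between the two workload thresholds: do nothing */
            sleep(10);
        }
    }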


--
  Richard Huxton
  Archonet Ltd



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Heikki Linnakangas

Gregory Stark wrote:

Bruce Momjian [EMAIL PROTECTED] writes:


I agree that index cleanup isn't > 50% of vacuum.  I was trying to figure
out how small, and it seems about 15% of the total table, which means if
we have bitmap vacuum, we can conceivably reduce vacuum load by perhaps
80%, assuming 5% of the table is scanned.


Actually no. A while back I did experiments to see how fast reading a file
sequentially was compared to reading the same file sequentially but skipping
x% of the blocks randomly. The results were surprising (to me) and depressing.
The breakeven point was about 7%.


Note that with uniformly random updates, you will have dirtied every page
of the table before you get anywhere near 5% of dead space. So we have to
assume a non-uniform distribution of updates for the DSM to be of any help.


And if we assume non-uniform distribution, it's a good bet that the 
blocks that need vacuuming are also not randomly distributed. In fact, 
they might very well all be in one cluster, so that scanning that 
cluster is indeed sequential I/O.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Heikki Linnakangas

Kenneth Marshall wrote:

On Mon, Jan 22, 2007 at 06:42:09PM +0000, Simon Riggs wrote:

Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread...


There may be other functions that could leverage a similar sort of
infrastructure. For example, a long DB mining query could be registered
with the system. Then as the pieces of the table/database are brought in
to shared memory during the normal daily DB activity they can be acquired
without forcing the DB to run a very I/O expensive query when waiting a
bit for the results would be acceptable. As long as we are thinking
piggyback.


Yeah, I had the same idea when we discussed synchronizing sequential 
scans. The biggest difference is that with queries, there's often a user 
waiting for the query to finish, but with vacuum we don't care so much 
how long it takes.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian wrote:
  
  Yep, agreed on the random I/O issue.  The larger question is if you have
  a huge table, do you care to reclaim 3% of the table size, rather than
  just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
  to take a lot of time, but vacuuming to reclaim 3% three times seems like
  it is going to be more expensive than just vacuuming the 10% once.  And
  vacuuming to reclaim 1% ten times seems even more expensive.  The
  partial vacuum idea is starting to look like a loser to me again.
 
 But if the partial vacuum is able to clean the busiest pages and reclaim
 useful space, currently-running transactions will be able to use that
 space and thus not have to extend the table.  Not that extension is a
 problem in itself, but it'll keep your working set smaller.

Yes, but my point is that if you are trying to avoid vacuuming the
table, I am afraid the full index scan is going to be painful too.  I
can see corner cases where partial vacuum is a win (I only have 4 hours
of idle I/O), but for the general case I am still worried that partial
vacuum will not be that useful as long as we have to scan the indexes.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Ron Mayer
Gregory Stark wrote:
 
 Actually no. A while back I did experiments to see how fast reading a file
 sequentially was compared to reading the same file sequentially but skipping
 x% of the blocks randomly. The results were surprising (to me) and depressing.
 The breakeven point was about 7%. [...]
 
 The theory online was that as long as you're reading one page from each disk
 track you're going to pay the same seek overhead as reading the entire track.

Could one take advantage of this observation in designing the DSM?

Instead of a separate bit representing every page, having each bit
represent 20 or so pages might be a more useful unit.  It sounds
like the time spent reading would be similar; while the bitmap
would be significantly smaller.
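
For illustration, a minimal sketch of such a coarse bitmap, with one bit
covering 20 pages (all names are hypothetical, not an actual DSM design):

    /* Sketch of a coarse dead space map: one bit covers CHUNK pages, so a
     * single dead tuple marks its whole 20-page neighbourhood for scanning. */
    #include <stdio.h>

    #define CHUNK   20                        /* pages per bit */
    #define NPAGES  100000
    #define NBITS   ((NPAGES + CHUNK - 1) / CHUNK)

    static unsigned char dsm[(NBITS + 7) / 8];   /* 625 bytes for 100k pages */

    static void mark_page_dirty(long page)
    {
        long bit = page / CHUNK;
        dsm[bit / 8] |= (unsigned char) (1 << (bit % 8));
    }

    static int chunk_needs_vacuum(long chunk)
    {
        return (dsm[chunk / 8] >> (chunk % 8)) & 1;
    }

    int main(void)
    {
        mark_page_dirty(42);                  /* dead tuple on page 42 ... */
        for (long c = 0; c < NBITS; c++)      /* ... vacuum scans pages 40-59 */
            if (chunk_needs_vacuum(c))
                printf("scan pages %ld..%ld\n", c * CHUNK, (c + 1) * CHUNK - 1);
        return 0;
    }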



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Joris Dobbelsteen
-----Original Message-----
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Gregory Stark
Sent: Monday, 22 January 2007 19:41
To: Bruce Momjian
Cc: Heikki Linnakangas; Russell Smith; Darcy Buskermolen; 
Simon Riggs; Alvaro Herrera; Matthew T. O'Connor; Pavan 
Deolasee; Christopher Browne; pgsql-general@postgresql.org; 
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [GENERAL] Autovacuum Improvements

Bruce Momjian [EMAIL PROTECTED] writes:

 Yep, agreed on the random I/O issue.  The larger question is if you
 have a huge table, do you care to reclaim 3% of the table size, rather
 than just vacuum it when it gets to 10% dirty?  I realize the vacuum
 is going to take a lot of time, but vacuuming to reclaim 3% three times
 seems like it is going to be more expensive than just vacuuming the
 10% once.  And vacuuming to reclaim 1% ten times seems even more
 expensive.  The partial vacuum idea is starting to look like a loser
 to me again.

Well, the answer is of course: it depends.

If you maintain the dead space at a steady state averaging 
1.5% instead of 5% your table is 3.33% smaller on average. If 
this is a DSS system that will translate into running your 
queries 3.33% faster. It will take a lot of vacuums before 
they hurt more than a 3%+ performance drop.

Good, this means a DSS system will mostly do table scans (right?). So
probably you should witness the 'table scan' statistic and rows fetched
approaching the end of the universe (at least compared to
inserts/updates/deletes)?

If it's an OLTP system then it's harder to figure. A 3.33% increase in
data density will translate to a higher cache hit rate, but how much
higher depends on a lot of factors. In our experiments we actually got a
bigger boost in these kinds of situations than I expected (I expected
something comparable to the 3.33% improvement). So it could be even more
than 3.33%. But like I said, it depends. If you already have the whole
database cached you won't see any improvement. If you are right on the
cusp you could see a huge benefit.

These tables have high insert, update and delete rates, probably a lot
of index scans? I believe the workload on table scans should be (close
to) none.

Are you willing to share some of this measured data? I'm quite
interested in such figures.

It sounds like you're underestimating the performance drain 
10% wasted space has. If we found out that one routine was 
unnecessarily taking 10% of the cpu time it would be an 
obvious focus of attention. 10% wasted space is going to work 
out to about 10% of the i/o time.

It also sounds like we're still focused on the performance 
impact in absolute terms. I'm much more interested in changing 
the performance characteristics so they're predictable and 
scalable. It doesn't matter much if your 1kb table is 100% 
slower than necessary but it does matter if your 1TB table 
needs 1,000x as much vacuuming as your 1GB table even if it's 
getting the same update traffic.

Or rather, the vacuuming should pay back.
A nice metric might be: cost_of_not_vacuuming / cost_of_vacuuming.
Obviously, the higher the better.

- Joris Dobbelsteen



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Kenneth Marshall
On Mon, Jan 22, 2007 at 06:42:09PM +0000, Simon Riggs wrote:
 On Mon, 2007-01-22 at 13:27 -0500, Bruce Momjian wrote:
  Yep, agreed on the random I/O issue.  The larger question is if you have
  a huge table, do you care to reclaim 3% of the table size, rather than
  just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
  to take a lot of time, but vacuuming to reclaim 3% three times seems like
  it is going to be more expensive than just vacuuming the 10% once.  And
  vacuuming to reclaim 1% ten times seems even more expensive.  The
  partial vacuum idea is starting to look like a loser to me again.
 
 Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread...
 
 -- 
   Simon Riggs 
   EnterpriseDB   http://www.enterprisedb.com
 

There may be other functions that could leverage a similar sort of
infrastructure. For example, a long DB mining query could be registered
with the system. Then as the pieces of the table/database are brought in
to shared memory during the normal daily DB activity they can be acquired
without forcing the DB to run a very I/O expensive query when waiting a
bit for the results would be acceptable. As long as we are thinking
piggyback.

Ken




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Kenneth Marshall
On Mon, Jan 22, 2007 at 07:24:20PM +0000, Heikki Linnakangas wrote:
 Kenneth Marshall wrote:
 On Mon, Jan 22, 2007 at 06:42:09PM +0000, Simon Riggs wrote:
 Hold that thought! Read Heikki's Piggyback VACUUM idea on new thread...
 
 There may be other functions that could leverage a similar sort of
 infrastructure. For example, a long DB mining query could be registered
 with the system. Then as the pieces of the table/database are brought in
 to shared memory during the normal daily DB activity they can be acquired
 without forcing the DB to run a very I/O expensive query when waiting a
 bit for the results would be acceptable. As long as we are thinking
 piggyback.
 
 Yeah, I had the same idea when we discussed synchronizing sequential 
 scans. The biggest difference is that with queries, there's often a user 
 waiting for the query to finish, but with vacuum we don't care so much 
 how long it takes.
 
Yes, but with trending and statistical analysis you may not need the
exact answer ASAP. An approximate answer based on a fraction of the
information would be useful. Also, what if queries could be run without
impacting the production uses of a database? One might imagine having a
query with results that converge as the table is processed during normal
use.

Ken



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Jim C. Nasby
On Mon, Jan 22, 2007 at 12:17:39PM -0800, Ron Mayer wrote:
 Gregory Stark wrote:
  
  Actually no. A while back I did experiments to see how fast reading a file
  sequentially was compared to reading the same file sequentially but skipping
  x% of the blocks randomly. The results were surprising (to me) and depressing.
  The breakeven point was about 7%. [...]
  
  The theory online was that as long as you're reading one page from each disk
  track you're going to pay the same seek overhead as reading the entire track.
 
 Could one take advantage of this observation in designing the DSM?
 
 Instead of a separate bit representing every page, having each bit
 represent 20 or so pages might be a more useful unit.  It sounds
 like the time spent reading would be similar; while the bitmap
 would be significantly smaller.

If we extended relations by more than one page at a time we'd probably
have a better shot at the blocks on disk being contiguous and all read
at the same time by the OS.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Steve Atkins


On Jan 22, 2007, at 11:16 AM, Richard Huxton wrote:


Bruce Momjian wrote:
Yep, agreed on the random I/O issue.  The larger question is if you have
a huge table, do you care to reclaim 3% of the table size, rather than
just vacuum it when it gets to 10% dirty?  I realize the vacuum is going
to take a lot of time, but vacuuming to reclaim 3% three times seems like
it is going to be more expensive than just vacuuming the 10% once.  And
vacuuming to reclaim 1% ten times seems even more expensive.  The
partial vacuum idea is starting to look like a loser to me again.


Buying a house with a 25-year mortgage is much more expensive than  
just paying cash too, but you don't always have a choice.


Surely the key benefit of the partial vacuuming thing is that you  
can at least do something useful with a large table if a full  
vacuum takes 24 hours and you only have 4 hours of idle I/O.


It's also occurred to me that all the discussion of scheduling way  
back when isn't directly addressing the issue. What most people  
want (I'm guessing) is to vacuum *when the user-workload allows*  
and the time-tabling is just a sysadmin first-approximation at that.


Yup. I'd really like for my app to be able to say "Hmm. No
interactive users at the moment, no critical background tasks. Now
would be a really good time for the DB to do some maintenance." but
also to be able to interrupt the maintenance process if some new
users or other system load show up.


With partial vacuuming possible, we can arrange things with just  
three thresholds and two measurements:

  Measurement 1 = system workload
  Measurement 2 = a per-table "requires vacuuming" value
  Threshold 1 = workload at which we do more vacuuming
  Threshold 2 = workload at which we do less vacuuming
  Threshold 3 = point at which a table is considered worth vacuuming.

Once every 10 seconds, the manager compares the current workload to
the thresholds and starts a new vacuum, kills one, or does nothing.
New vacuum processes keep getting started as long as there is spare
workload capacity and tables that need vacuuming.


Now the trick of course is how you measure system workload in a  
meaningful manner.


I'd settle for a "start maintenance" / "stop maintenance" API.
Anything else (for instance the heuristics you suggest above) would  
definitely be gravy.


It's not going to be simple to do, though, I don't think.

Cheers,
  Steve




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-22 Thread Martijn van Oosterhout
On Mon, Jan 22, 2007 at 05:51:53PM +0000, Gregory Stark wrote:
 Actually no. A while back I did experiments to see how fast reading a file
 sequentially was compared to reading the same file sequentially but skipping
 x% of the blocks randomly. The results were surprising (to me) and depressing.
 The breakeven point was about 7%.

I assume this means you were reading 7% of the blocks, not skipping 7%
of the blocks, when you broke even?

I presume by break-even you mean it took just as long, time-wise. But
did it have the same effect on system load? If reading only 7% of the
blocks allows the drive to complete other requests more quickly then
it's beneficial, even if the vacuum takes longer.

This may be a silly thought, I'm not sure how drives handle multiple
requests...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-21 Thread Heikki Linnakangas

Russell Smith wrote:
Strange idea that I haven't researched: Given vacuum can't be run in a
transaction, is it possible at a certain point to quit the current
transaction and start another one?  There has been much chat and now a
TODO item about allowing multiple vacuums to not starve small tables.
But if a big table has a long-running vacuum, the vacuum of the small
table won't be effective anyway, will it?  If vacuum of a big table were
done in multiple transactions you could reduce the effect of a
long-running vacuum.  I'm not sure how this affects the rest of the
system, though.


That was fixed by Hannu Krosing's patch in 8.2 that made vacuum
ignore other vacuums in the oldest xmin calculation.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-21 Thread Simon Riggs
On Sat, 2007-01-20 at 09:41 +1100, Russell Smith wrote:
 Darcy Buskermolen wrote: 
  [snip] 
  
  Another thought, is it at all possible to do a partial vacuum?  i.e. spend
  the next 30 minutes vacuuming foo table, and update the fsm with what we
  have learned over the 30 mins, even if we have not done a full table scan?

 There was a proposal for this, but it was dropped on 2 grounds.
 1. Partial vacuum would mean that parts of the table are missed; the
 user could never vacuum certain parts and transaction wraparound would
 get you.  You may also have other performance issues as you forgot
 certain parts of the table.

Partial vacuum would still be possible if you remembered where you got
to in the VACUUM and then started from that same point next time. It
could then go to the end of the table and wrap back around.
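
A minimal sketch of that bookkeeping (illustrative names and numbers, not
PostgreSQL code):

    /* Sketch of the resume point: each partial VACUUM scans a budget of
     * blocks starting where the previous one stopped, wrapping around. */
    #include <stdio.h>

    #define REL_BLOCKS  1000
    #define SCAN_BUDGET  300     /* blocks this round can afford */

    static long resume_block = 0;   /* would be remembered per table */

    static void partial_vacuum(void)
    {
        for (long i = 0; i < SCAN_BUDGET; i++)
        {
            long blk = (resume_block + i) % REL_BLOCKS;   /* wrap to start */
            (void) blk;            /* scan blk for dead tuples here */
        }
        resume_block = (resume_block + SCAN_BUDGET) % REL_BLOCKS;
        printf("next partial vacuum starts at block %ld\n", resume_block);
    }

    int main(void)
    {
        for (int round = 0; round < 4; round++)
            partial_vacuum();      /* four rounds cover the whole table */
        return 0;
    }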

 2. Index cleanup is the most expensive part of vacuum.  So doing a
 partial vacuum actually means more I/O as you have to do index cleanup
 more often.

Again, not necessarily. A large VACUUM can currently perform more than
one set of index scans, so if you chose the right stopping place for a
partial VACUUM you need never incur any additional work. It might even
save effort in the long run.

I'm not necessarily advocating partial VACUUM, just pointing out that
the problems you raise need not be barriers to implementation, should
that be considered worthwhile.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-21 Thread Martijn van Oosterhout
On Sun, Jan 21, 2007 at 12:24:38PM +0000, Simon Riggs wrote:
 Partial vacuum would still be possible if you remembered where you got
 to in the VACUUM and then started from that same point next time. It
 could then go to the end of the table and wrap back around.

ISTM the Dead Space Map would give you this automatically, since that's
your memory... Once you have the DSM to track where the dead pages are,
you can set it up to target clusters first, thus giving maximum bang
for buck.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-21 Thread Jim C. Nasby
On Sun, Jan 21, 2007 at 11:39:45AM +0000, Heikki Linnakangas wrote:
 Russell Smith wrote:
 Strange idea that I haven't researched: Given vacuum can't be run in a
 transaction, is it possible at a certain point to quit the current
 transaction and start another one?  There has been much chat and now a
 TODO item about allowing multiple vacuums to not starve small tables.
 But if a big table has a long-running vacuum, the vacuum of the small
 table won't be effective anyway, will it?  If vacuum of a big table were
 done in multiple transactions you could reduce the effect of a
 long-running vacuum.  I'm not sure how this affects the rest of the
 system, though.
 
 That was fixed by Hannu Krosing's patch in 8.2 that made vacuum
 ignore other vacuums in the oldest xmin calculation.

And IIRC in 8.1 every time vacuum finishes a pass over the indexes it
will commit and start a new transaction. That's still useful even with
Hannu's patch in case you start a vacuum with maintenance_work_mem too
small; you can abort the vacuum some time later and at least some of the
work it's done will get committed.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-19 Thread Simon Riggs
On Tue, 2007-01-16 at 07:16 -0800, Darcy Buskermolen wrote:
 On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote:
  elein wrote:
   Have you made any consideration of providing feedback on autovacuum to
   users? Right now we don't even know what tables were vacuumed when and
   what was reaped.  This might actually be another topic.
 
  I'd like to hear other people's opinions on Darcy Buskermolen's proposal
  to have a log table, on which we'd register what we ran, at what
  time, how long it lasted, how many tuples it cleaned, etc.  I feel
  having it on the regular text log is useful but it's not good enough.
  Keep in mind that in the future we may want to peek at that collected
  information to be able to take better scheduling decisions (or at least
  inform the DBA that he sucks).
 
  Now, I'd like this to be a VACUUM thing, not autovacuum.  That means
  that manually-run vacuums would be logged as well.
 
 Yes I did intend this thought for vacuum, not strictly autovacuum.

I agree, for all VACUUMs: we need a log table.

The only way we can get a feedback loop on what has come before is by
remembering what happened. Simply logging it is interesting, but not
enough.

There is some complexity there, because with many applications a small
table gets VACUUMed every few minutes, so the log table would become a
frequently updated table itself. I'd also suggest that we might want to
take account of the number of tuples removed by btree pre-split VACUUMs
also.

I also like the idea of a single scheduler and multiple child workers.

The basic architecture is clear and obviously beneficial. What worries
me is how the scheduler will work; there seems to be as many ideas as we
have hackers. I'm wondering if we should provide the facility of a
pluggable scheduler? That way you'd be able to fine tune the schedule to
both the application and to the business requirements. That would allow
integration with external workflow engines and job schedulers, for when
VACUUMs need to not-conflict with external events.

If no scheduler has been defined, just use a fairly simple default.

The three main questions are
- what is the maximum size of VACUUM that can start *now*
- can *this* VACUUM start now?
- which is the next VACUUM to run?

If we have an API that allows those 3 questions to be asked, then a
scheduler plug-in could supply the answers. That way any complex
application rules (table A is available for VACUUM now for next 60 mins,
table B is in constant use so we must use vacuum_delay), external events
(long running reports have now finished, OK to VACUUM), time-based rules
(e.g. first Sunday of the month 00:00 - 04:00 is scheduled downtime,
first 3 days of the each month is financial accounting close) can be
specified. 
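
To make that concrete, here is a minimal sketch (my own illustration, not
a proposed API) of a hook structure answering those three questions, with
a fairly simple default scheduler behind it:

    /* Sketch of a pluggable scheduler answering the three questions above.
     * The struct and all names are illustrative only. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct VacuumSchedulerHooks
    {
        long (*max_vacuum_size_now)(void);        /* largest VACUUM allowed now */
        bool (*can_vacuum_start_now)(int table);  /* may *this* VACUUM start?  */
        int  (*next_table_to_vacuum)(void);       /* which VACUUM runs next?   */
    } VacuumSchedulerHooks;

    /* Default: anything may run; round-robin over tables. */
    static long default_max_size(void)   { return 1L << 30; }
    static bool default_can_start(int t) { (void) t; return true; }
    static int  default_next_table(void) { static int t = 0; return t++ % 4; }

    static VacuumSchedulerHooks scheduler = {
        default_max_size, default_can_start, default_next_table
    };

    int main(void)
    {
        int t = scheduler.next_table_to_vacuum();
        if (scheduler.can_vacuum_start_now(t))
            printf("vacuuming table %d (size cap %ld)\n",
                   t, scheduler.max_vacuum_size_now());
        return 0;
    }

An external workflow engine or job scheduler would then supply its own
three functions instead of the defaults.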


-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-19 Thread Darcy Buskermolen
On Friday 19 January 2007 01:47, Simon Riggs wrote:
 On Tue, 2007-01-16 at 07:16 -0800, Darcy Buskermolen wrote:
  On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote:
   elein wrote:
Have you made any consideration of providing feedback on autovacuum
to users? Right now we don't even know what tables were vacuumed when
and what was reaped.  This might actually be another topic.
  
   I'd like to hear other people's opinions on Darcy Buskermolen's proposal
   to have a log table, on which we'd register what we ran, at what
   time, how long it lasted, how many tuples it cleaned, etc.  I feel
   having it on the regular text log is useful but it's not good enough.
   Keep in mind that in the future we may want to peek at that collected
   information to be able to take better scheduling decisions (or at least
   inform the DBA that he sucks).
  
   Now, I'd like this to be a VACUUM thing, not autovacuum.  That means
   that manually-run vacuums would be logged as well.
 
  Yes I did intend this thought for vacuum, not strictly autovacuum.

 I agree, for all VACUUMs: we need a log table.

 The only way we can get a feedback loop on what has come before is by
 remembering what happened. Simply logging it is interesting, but not
 enough.

Correct, I think we are all saying the same thing: this log table is
purely inserts, so that we can see trends over time.


 There is some complexity there, because with many applications a small
 table gets VACUUMed every few minutes, so the log table would become a
 frequently updated table itself. I'd also suggest that we might want to
 take account of the number of tuples removed by btree pre-split VACUUMs
 also.

Thinking on this a bit more, I suppose that this table really should allow for
user defined triggers on it, so that a DBA can create partitioning for it, not
to mention being able to move it off into its own tablespace.



 I also like the idea of a single scheduler and multiple child workers.

 The basic architecture is clear and obviously beneficial. What worries
 me is how the scheduler will work; there seem to be as many ideas as we
 have hackers. I'm wondering if we should provide the facility of a
 pluggable scheduler? That way you'd be able to fine-tune the schedule to
 both the application and the business requirements. That would allow
 integration with external workflow engines and job schedulers, for when
 VACUUMs must not conflict with external events.

 If no scheduler has been defined, just use a fairly simple default.

 The three main questions are
 - what is the maximum size of VACUUM that can start *now*

How can we determine this, given that we have no real knowledge of the
upcoming adverse I/O conditions?

 - can *this* VACUUM start now?
 - which is the next VACUUM to run?

 If we have an API that allows those 3 questions to be asked, then a
 scheduler plug-in could supply the answers. That way any complex
  application rules (table A is available for VACUUM now for the next 60
  mins, table B is in constant use so we must use vacuum_delay), external
  events (long-running reports have now finished, OK to VACUUM),
  time-based rules (e.g. first Sunday of the month 00:00 - 04:00 is
  scheduled downtime, the first 3 days of each month are the financial
  accounting close) can be specified.

Another thought: is it at all possible to do a partial vacuum? I.e., spend the
next 30 minutes vacuuming the foo table, and update the FSM with what we have
learned over those 30 mins, even if we have not done a full table scan?


-- 


Darcy Buskermolen
The PostgreSQL company, Command Prompt Inc.



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-19 Thread Bruce Momjian

Added to TODO:

   o Allow multiple vacuums so large tables do not starve small
 tables

 http://archives.postgresql.org/pgsql-general/2007-01/msg00031.php

   o Improve control of auto-vacuum

 http://archives.postgresql.org/pgsql-hackers/2006-12/msg00876.php


---

Darcy Buskermolen wrote:
 On Friday 19 January 2007 01:47, Simon Riggs wrote:
  On Tue, 2007-01-16 at 07:16 -0800, Darcy Buskermolen wrote:
   On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote:
elein wrote:
 Have you given any consideration to providing feedback on autovacuum
 to users? Right now we don't even know what tables were vacuumed when
 and what was reaped.  This might actually be another topic.
   
I'd like to hear other people's opinions on Darcy Buskermolen's proposal
to have a log table, on which we'd register what we ran, at what time,
how long it lasted, how many tuples it cleaned, etc.  I feel having it
in the regular text log is useful but not good enough.  Keep in mind
that in the future we may want to peek at that collected information to
be able to make better scheduling decisions (or at least inform the DBA
that he sucks).
   
Now, I'd like this to be a VACUUM thing, not autovacuum.  That means
that manually-run vacuums would be logged as well.
  
   Yes I did intend this thought for vacuum, not strictly autovacuum.
 
  I agree, for all VACUUMs: we need a log table.
 
  The only way we can get a feedback loop on what has come before is by
  remembering what happened. Simply logging it is interesting, but not
  enough.
 
 Correct, I think we are all saying the same thing: this log table is
 purely inserts, so that we can see trends over time.
 
 
  There is some complexity there, because with many applications a small
  table gets VACUUMed every few minutes, so the log table would become a
  frequently updated table itself. I'd also suggest that we might want to
  take account of the number of tuples removed by btree pre-split VACUUMs
  also.
 
 Thinking on this a bit more, I suppose that this table really should allow
 for user-defined triggers on it, so that a DBA can create partitioning for
 it, not to mention being able to move it off into its own tablespace.
 
 
 
  I also like the idea of a single scheduler and multiple child workers.
 
  The basic architecture is clear and obviously beneficial. What worries
  me is how the scheduler will work; there seem to be as many ideas as we
  have hackers. I'm wondering if we should provide the facility of a
  pluggable scheduler? That way you'd be able to fine-tune the schedule to
  both the application and the business requirements. That would allow
  integration with external workflow engines and job schedulers, for when
  VACUUMs must not conflict with external events.
 
  If no scheduler has been defined, just use a fairly simple default.
 
  The three main questions are
  - what is the maximum size of VACUUM that can start *now*
 
 How can we determine this, given that we have no real knowledge of the
 upcoming adverse I/O conditions?
 
  - can *this* VACUUM start now?
  - which is the next VACUUM to run?
 
  If we have an API that allows those 3 questions to be asked, then a
  scheduler plug-in could supply the answers. That way any complex
  application rules (table A is available for VACUUM now for the next 60
  mins, table B is in constant use so we must use vacuum_delay), external
  events (long-running reports have now finished, OK to VACUUM),
  time-based rules (e.g. first Sunday of the month 00:00 - 04:00 is
  scheduled downtime, the first 3 days of each month are the financial
  accounting close) can be specified.
 
 Another thought: is it at all possible to do a partial vacuum? I.e., spend
 the next 30 minutes vacuuming the foo table, and update the FSM with what we
 have learned over those 30 mins, even if we have not done a full table scan?
 
 
 -- 
 
 
 Darcy Buskermolen
 The PostgreSQL company, Command Prompt Inc.
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-19 Thread Russell Smith

Darcy Buskermolen wrote:

[snip]

Another thought: is it at all possible to do a partial vacuum? I.e., spend the
next 30 minutes vacuuming the foo table, and update the FSM with what we have
learned over those 30 mins, even if we have not done a full table scan?
  

There was a proposal for this, but it was dropped on two grounds.
1. A partial vacuum would mean that parts of the table are missed; the
user could never vacuum certain parts, and transaction wraparound would
get you.  You may also have other performance issues, as you have
forgotten certain parts of the table.


2. Index cleanup is the most expensive part of vacuum.  So doing a 
partial vacuum actually means more I/O as you have to do index cleanup 
more often.


If we are talking about autovacuum, point 1 is less of an issue, as you
can just make autovacuum remember which parts of the table it has
vacuumed.  This really has great power when you have a dead space map.
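
Purely as an illustration of the idea, a dead space map could be as
small as one bit per heap page; the structure and granularity below are
invented, since no such map exists in the server today:

  /*
   * Illustration only: a dead space map as one bit per heap page, so
   * a later (partial) vacuum can visit just the pages known to hold
   * dead tuples. No such structure exists in the server today.
   */
  #include <stdio.h>

  #define REL_PAGES 64

  static unsigned char dsm[(REL_PAGES + 7) / 8];

  static void mark_dead(int page) { dsm[page / 8] |= 1 << (page % 8); }
  static int  has_dead(int page)  { return dsm[page / 8] & (1 << (page % 8)); }

  int main(void)
  {
      mark_dead(3);
      mark_dead(40);

      /* partial vacuum: only pages with their bit set are visited */
      for (int page = 0; page < REL_PAGES; page++)
          if (has_dead(page))
          {
              printf("vacuuming page %d\n", page);
              dsm[page / 8] &= ~(1 << (page % 8));  /* clear after cleanup */
          }
      return 0;
  }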


Item 2 will still be an issue.  But if you define partial as either
fill maintenance_work_mem or finish the table, you are not increasing
I/O at all, since when maintenance_work_mem is full you have to clean up
all the indexes anyway.  This is really more like VACUUM SINGLE, but the
same principle applies.
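
A sketch of that argument, with a toy stand-in for maintenance_work_mem:
the index-cleanup pass that a partial vacuum would stop after is one
that lazy vacuum had to perform anyway once its dead-TID memory filled.
All names and numbers here are illustrative:

  /*
   * Sketch of why "partial = stop after one maintenance_work_mem fill"
   * adds no extra index I/O: lazy vacuum must run an index-cleanup pass
   * every time its dead-TID memory fills anyway. TID_BUDGET stands in
   * for maintenance_work_mem; everything here is illustrative.
   */
  #include <stdbool.h>
  #include <stdio.h>

  #define TID_BUDGET 4

  static bool scan_heap_block(int blkno, int *dead, int *ndead)
  {
      if (blkno % 2 == 0)          /* pretend even blocks hold a dead tuple */
          dead[(*ndead)++] = blkno;
      return *ndead == TID_BUDGET; /* is the memory budget full? */
  }

  static void cleanup_indexes(const int *dead, int ndead)
  {
      (void) dead;
      printf("index cleanup pass over %d dead TIDs\n", ndead);
  }

  int main(void)
  {
      int dead[TID_BUDGET], ndead = 0;

      for (int blkno = 0; blkno < 100; blkno++)
          if (scan_heap_block(blkno, dead, &ndead))
          {
              cleanup_indexes(dead, ndead);
              ndead = 0;
              /* a partial vacuum could legitimately stop right here:
               * the index pass it just did was needed anyway */
              break;
          }

      if (ndead > 0)
          cleanup_indexes(dead, ndead);  /* final pass at end of table */
      return 0;
  }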


I believe all planning really needs to think about how a dead space map
will affect what vacuum is going to be doing in the future.



A strange idea that I haven't researched: given that VACUUM can't be run
inside a transaction block, it should be possible at a certain point to
end the current transaction and start another one.  There has been much
chat, and now a TODO item, about allowing multiple vacuums so that small
tables are not starved.  But if a big table has a long-running vacuum,
the vacuum of the small table won't be fully effective anyway, will it,
since the big vacuum's long-lived transaction holds back the oldest
visible xid?  If the vacuum of a big table were done in multiple
transactions, you could reduce the effect of the long-running vacuum.
I'm not sure how this affects the rest of the system, though.


Russell Smith


  




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-17 Thread Ron Mayer
Matthew T. O'Connor wrote:
 Alvaro Herrera wrote:
 I'd like to hear other people's opinions on Darcy Buskermolen's proposal
 to have a log table, on which we'd register what we ran, at what
 time, how long it lasted, [...]
 
 I think most people would just be happy if we could get autovacuum to
 log its actions at a much higher log level.  I think that "autovacuum
 vacuumed table x" is important and shouldn't be all the way down at the
 debug level.

+1 here.  Even more than "autovacuum vacuumed table x", I'd like to see
"vacuum starting table x" and "vacuum done table x".  The reason I say
that is because the speculation "autovacuum might have been running"
is now a frequent phrase I hear when performance questions are asked.
If vacuum start and end times were logged at a much higher level,
that feature plus log_min_duration_statement could easily disprove
the "vacuum might have been running" hypothesis.



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-16 Thread Alvaro Herrera
Simon Riggs wrote:

 Perhaps we should focus on the issues that might result, so that we
 address those before we spend time on the details of the user interface.
 Can we deadlock or hang from running multiple autovacuums?

If you were to run multiple autovacuum processes the way they are today,
maybe.  But that's not my intention -- the launcher would be the only
one to read the catalogs; the workers would be started only to do a
single VACUUM job.  That reduces the risk of this kind of problem.
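
A toy model of that division of labour (real autovacuum workers would be
full backends started through the postmaster, not the bare fork()s used
in this sketch):

  /*
   * Toy model of the launcher/worker split: the launcher decides what
   * to vacuum; each worker does exactly one job and exits.
   */
  #include <stdio.h>
  #include <sys/wait.h>
  #include <unistd.h>

  static void do_one_vacuum(const char *table)
  {
      printf("[worker %d] VACUUM %s\n", (int) getpid(), table);
      _exit(0);                    /* one job per worker, then exit */
  }

  int main(void)
  {
      const char *jobs[] = {"small_tbl", "big_tbl"};

      for (int i = 0; i < 2; i++)  /* the launcher loop */
      {
          pid_t pid = fork();

          if (pid == 0)
              do_one_vacuum(jobs[i]);   /* child: a single VACUUM job */
          else if (pid < 0)
              perror("fork");
      }
      while (wait(NULL) > 0)       /* launcher reaps its workers */
          ;
      return 0;
  }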

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-16 Thread Alvaro Herrera
elein wrote:

 Have you given any consideration to providing feedback on autovacuum to
 users? Right now we don't even know what tables were vacuumed when and
 what was reaped.  This might actually be another topic.

I'd like to hear other people's opinions on Darcy Buskermolen's proposal
to have a log table, on which we'd register what we ran, at what time,
how long it lasted, how many tuples it cleaned, etc.  I feel having it
in the regular text log is useful but not good enough.  Keep in mind
that in the future we may want to peek at that collected information to
be able to make better scheduling decisions (or at least inform the DBA
that he sucks).

Now, I'd like this to be a VACUUM thing, not autovacuum.  That means
that manually-run vacuums would be logged as well.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-16 Thread Darcy Buskermolen
On Tuesday 16 January 2007 06:29, Alvaro Herrera wrote:
 elein wrote:
  Have you given any consideration to providing feedback on autovacuum to
  users? Right now we don't even know what tables were vacuumed when and
  what was reaped.  This might actually be another topic.

 I'd like to hear other people's opinions on Darcy Buskermolen's proposal
 to have a log table, on which we'd register what we ran, at what time,
 how long it lasted, how many tuples it cleaned, etc.  I feel having it
 in the regular text log is useful but not good enough.  Keep in mind
 that in the future we may want to peek at that collected information to
 be able to make better scheduling decisions (or at least inform the DBA
 that he sucks).

 Now, I'd like this to be a VACUUM thing, not autovacuum.  That means
 that manually-run vacuums would be logged as well.

Yes I did intend this thought for vacuum, not strictly autovacuum.



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-16 Thread Matthew T. O'Connor

Alvaro Herrera wrote:

I'd like to hear other people's opinions on Darcy Buskermolen's proposal
to have a log table, on which we'd register what we ran, at what time,
how long it lasted, how many tuples it cleaned, etc.  I feel having it
in the regular text log is useful but not good enough.  Keep in mind
that in the future we may want to peek at that collected information to
be able to make better scheduling decisions (or at least inform the DBA
that he sucks).


I'm not familiar with his proposal, but I'm not sure what I think of 
logging vacuum (and perhaps analyze) commands to a table.  We have never 
logged anything to tables inside PG.  I would be worried about this 
eating a lot of space in some situations.


I think most people would just be happy if we could get autovacuum to
log its actions at a much higher log level.  I think that "autovacuum
vacuumed table x" is important and shouldn't be all the way down at the
debug level.


The other (more involved) solution that was proposed is to create a
separate set of logging control parameters for autovacuum, so that you
can turn it up or down independently of the general server logging.



Now, I'd like this to be a VACUUM thing, not autovacuum.  That means
that manually-run vacuums would be logged as well.


+1




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-15 Thread Pavan Deolasee

Simon Riggs wrote:
 On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote:
 Christopher Browne wrote:

 Seems to me that you could get ~80% of the way by having the simplest
 2 queue implementation, where tables with size < some threshold get
 thrown at the little table queue, and tables above that size go to
 the big table queue.

 That should keep any small tables from getting vacuum-starved.


This is exactly what I am trying: a two-process autovacuum and a GUC to
separate small tables.

In this case, one process takes up vacuuming of the small tables, and
the other process vacuums the remaining tables as well as doing the
Xid-wraparound-avoidance vacuuming. The goal is to avoid starvation of
small tables when a large table is being vacuumed (which may take
several hours), without adding too much complexity to the code.
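
A toy illustration of that split, assuming a hypothetical GUC (called
autovacuum_small_table_pages here, purely for illustration) as the
dividing line:

  /*
   * Toy illustration of the two-process split. The GUC name is
   * hypothetical and the tables and sizes are made up.
   */
  #include <stdio.h>

  typedef struct
  {
      const char *name;
      long        relpages;
  } Rel;

  static long small_table_pages = 1000;   /* stand-in for the hypothetical GUC */

  int main(void)
  {
      Rel rels[] = {{"orders", 250000}, {"job_queue", 40}, {"audit", 90000}};

      for (int i = 0; i < 3; i++)
      {
          /* process 1 handles small tables; process 2 handles the rest,
           * including Xid-wraparound-driven vacuums */
          int process = rels[i].relpages < small_table_pages ? 1 : 2;

          printf("process %d vacuums %s (%ld pages)\n",
                 process, rels[i].name, rels[i].relpages);
      }
      return 0;
  }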


 Some feedback from initial testing is that 2 queues probably isn't
 enough. If you have tables with 100s of blocks and tables with millions
 of blocks, the tables in the mid-range still lose out. So I'm thinking
 that a design with 3 queues based upon size ranges, plus the idea that
 when a queue is empty it will scan for tables slightly above/below its
 normal range. That way we wouldn't need to specify the cut-offs with a
 difficult to understand new set of GUC parameters, define them exactly
 and then have them be wrong when databases grow.

 The largest queue would be the one reserved for Xid wraparound
 avoidance. No table would be eligible for more than one queue at a time,
 though it might change between queues as it grows.

 Alvaro, have you completed your design?

 Pavan, what are your thoughts?


IMO 2-queue is a good step forward, but in the long term we may need to
go for a multiprocess autovacuum where the number and tasks of the
processes are demand-based and/or user-configurable.

Another idea is to vacuum the tables in round-robin fashion, where the
quantum could be either time or a number of blocks. The autovacuum
process would vacuum 'x' blocks of one table and then schedule the next
table in the queue. This would avoid starvation of small tables, though
the cost of index cleanup might go up because of increased I/O. Any
thoughts on this approach?
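
A sketch of the round-robin idea with a block quantum; the numbers are
made up, and the comment marks where the extra index-cleanup I/O would
bite:

  /* Sketch of round-robin vacuuming with a block quantum. */
  #include <stdio.h>

  #define QUANTUM 8               /* blocks vacuumed per turn */
  #define NTABLES 3

  int main(void)
  {
      int remaining[NTABLES] = {5, 20, 100};   /* blocks left per table */
      int live = NTABLES;

      while (live > 0)
          for (int t = 0; t < NTABLES; t++)
          {
              if (remaining[t] == 0)
                  continue;

              int work = remaining[t] < QUANTUM ? remaining[t] : QUANTUM;

              remaining[t] -= work;
              printf("vacuumed %d blocks of table %d\n", work, t);
              /* each turn may force its own index-cleanup pass -- this
               * is where the extra I/O cost mentioned above comes from */
              if (remaining[t] == 0)
                  live--;
          }
      return 0;
  }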

Thanks,
Pavan






Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-15 Thread Alban Hertroys
Pavan Deolasee wrote:
 Simon Riggs wrote:
 On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote:
 Christopher Browne wrote:

 Seems to me that you could get ~80% of the way by having the simplest
 2 queue implementation, where tables with size < some threshold get
 thrown at the little table queue, and tables above that size go to
 the big table queue.

 That should keep any small tables from getting vacuum-starved.

 
 This is exactly what I am trying: a two-process autovacuum and a GUC to
 separate small tables.
 
 In this case, one process takes up vacuuming of the small tables, and
 the other process vacuums the remaining tables as well as doing the
 Xid-wraparound-avoidance vacuuming. The goal is to avoid starvation of
 small tables when a large table is being vacuumed (which may take
 several hours), without adding too much complexity to the code.

Would it work to make the queues push the threshold in the direction of
the still-running queue if the other queue finishes before the
still-running one? This would achieve some kind of auto-tuning, but that
is usually tricky.

For example, what if one of the queues got stuck on a lock?
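
Concretely, the nudging itself would be simple; it is failure modes like
that stuck lock that make the auto-tuning tricky. A sketch with invented
numbers:

  /*
   * Sketch of the auto-tuning idea: nudge the size cutoff toward the
   * queue that is still busy when the other has drained. All numbers
   * are invented for illustration.
   */
  #include <stdio.h>

  static long
  adjust_cutoff(long cutoff, int small_idle, int large_idle)
  {
      const double step = 0.10;    /* 10% nudge per adjustment */

      if (small_idle && !large_idle)
          return cutoff + (long) (cutoff * step);  /* small queue takes bigger tables */
      if (large_idle && !small_idle)
          return cutoff - (long) (cutoff * step);  /* shrink the small queue's range */
      return cutoff;               /* both busy or both idle: leave it */
  }

  int main(void)
  {
      long cutoff = 1000;          /* initial small/large split, in blocks */

      cutoff = adjust_cutoff(cutoff, 1, 0);
      printf("new cutoff: %ld blocks\n", cutoff);
      return 0;
  }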

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //




Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-13 Thread elein
On Fri, Jan 12, 2007 at 07:33:05PM -0300, Alvaro Herrera wrote:
 Simon Riggs wrote:
 
  Some feedback from initial testing is that 2 queues probably isn't
  enough. If you have tables with 100s of blocks and tables with millions
  of blocks, the tables in the mid-range still lose out. So I'm thinking
  that a design with 3 queues based upon size ranges, plus the idea that
  when a queue is empty it will scan for tables slightly above/below its
  normal range.
 
 Yeah, eventually it occurred to me that as soon as you have 2 queues,
 you may as well want 3, or in fact any number, which in my proposal is
 very easily achieved.
 
 
  Alvaro, have you completed your design?
 
 No, I haven't, and the part that's missing is precisely the queues
 stuff.  I think I've been delaying posting it for too long, and that is
 harmful because it makes other people waste time thinking on issues that
 I may already have resolved, and delays the bashing that yet others will
 surely inflict on my proposal, which is never a good thing ;-)  So maybe
 I'll put in a stub about the queues stuff and see how people like the
 whole thing.

Have you given any consideration to providing feedback on autovacuum to users?
Right now we don't even know what tables were vacuumed when and what was
reaped.  This might actually be another topic.

---elein
[EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-12 Thread Simon Riggs
On Fri, 2006-12-29 at 20:25 -0300, Alvaro Herrera wrote:
 Christopher Browne wrote:
 
  Seems to me that you could get ~80% of the way by having the simplest
  2 queue implementation, where tables with size < some threshold get
  thrown at the little table queue, and tables above that size go to
  the big table queue.
  
  That should keep any small tables from getting vacuum-starved.
 
 Hmm, would it make sense to keep 2 queues, one that goes through the
 tables in smaller-to-larger order, and the other one in the reverse
 direction?
 
 I am currently writing a design on how to create vacuum queues but I'm
 thinking that maybe it's getting too complex to handle, and a simple
 idea like yours is enough (given sufficient polish).

Sounds good to me. My colleague Pavan has just suggested multiple
autovacuums and then prototyped something almost as a side issue while
trying to solve other problems. I'll show him this entry, maybe he saw
it already? I wasn't following this discussion until now.

The 2-queue implementation seemed to me to be the most straightforward,
mirroring Chris' suggestion. A few aspects that haven't been mentioned
are:
- if you have more than one VACUUM running, we'll need to watch memory
management. Having different queues based upon table size is a good way
of doing that, since the smaller queues have a naturally limited memory
consumption.
- with different size-based queues, the larger VACUUMs can be delayed so
they take much longer, while the small tables can go straight through.

Some feedback from initial testing is that 2 queues probably isn't
enough. If you have tables with 100s of blocks and tables with millions
of blocks, the tables in the mid-range still lose out. So I'm thinking
that a design with 3 queues based upon size ranges, plus the idea that
when a queue is empty it will scan for tables slightly above/below its
normal range. That way we wouldn't need to specify the cut-offs with a
difficult to understand new set of GUC parameters, define them exactly
and then have them be wrong when databases grow.

The largest queue would be the one reserved for Xid wraparound
avoidance. No table would be eligible for more than one queue at a time,
though it might change between queues as it grows.
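
A sketch of that queue-selection rule, including the "widen when idle"
behaviour; the cutoffs and the slack factor are invented purely for
illustration:

  /*
   * Sketch of size-range queue selection with "widen when idle".
   * The cutoffs and slack factor are invented for illustration.
   */
  #include <stdbool.h>
  #include <stdio.h>

  /* cutoffs between the small, medium and large queues, in blocks */
  static const long cutoff[] = {1000, 100000};

  static int
  queue_for(long relpages)
  {
      if (relpages < cutoff[0]) return 0;      /* small  */
      if (relpages < cutoff[1]) return 1;      /* medium */
      return 2;                                /* large; also Xid wraparound */
  }

  /* When queue q is idle, let it take tables slightly outside its range. */
  static bool
  in_idle_range(long relpages, int q, double slack)
  {
      long lo = (q > 0) ? (long) (cutoff[q - 1] * (1.0 - slack)) : 0;
      long hi = (q < 2) ? (long) (cutoff[q] * (1.0 + slack)) : -1;

      return relpages >= lo && (hi < 0 || relpages < hi);
  }

  int main(void)
  {
      printf("500-page table goes to queue %d\n", queue_for(500));
      printf("idle medium queue may take a 900-page table: %d\n",
             in_idle_range(900, 1, 0.2));
      return 0;
  }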

Alvaro, have you completed your design?

Pavan, what are your thoughts?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-12 Thread Alvaro Herrera
Simon Riggs wrote:

 Some feedback from initial testing is that 2 queues probably isn't
 enough. If you have tables with 100s of blocks and tables with millions
 of blocks, the tables in the mid-range still lose out. So I'm thinking
 that a design with 3 queues based upon size ranges, plus the idea that
 when a queue is empty it will scan for tables slightly above/below its
 normal range.

Yeah, eventually it occurred to me that as soon as you have 2 queues,
you may as well want 3, or in fact any number, which in my proposal is
very easily achieved.


 Alvaro, have you completed your design?

No, I haven't, and the part that's missing is precisely the queues
stuff.  I think I've been delaying posting it for too long, and that is
harmful because it makes other people waste time thinking on issues that
I may already have resolved, and delays the bashing that yet others will
surely inflict on my proposal, which is never a good thing ;-)  So maybe
I'll put in a stub about the queues stuff and see how people like the
whole thing.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-12 Thread Simon Riggs
On Fri, 2007-01-12 at 19:33 -0300, Alvaro Herrera wrote:

  Alvaro, have you completed your design?
 
 No, I haven't, and the part that's missing is precisely the queues
 stuff.  I think I've been delaying posting it for too long, and that is
 harmful because it makes other people waste time thinking on issues that
 I may already have resolved, and delays the bashing that yet others will
 surely inflict on my proposal, which is never a good thing ;-)  So maybe
 I'll put in a stub about the queues stuff and see how people like the
 whole thing.

I've not read a word spoken against the general idea, so I think we
should pursue this actively for 8.3. It should be straightforward to
harvest the good ideas, though there will definitely be many.

Perhaps we should focus on the issues that might result, so that we
address those before we spend time on the details of the user interface.
Can we deadlock or hang from running multiple autovacuums?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Autovacuum Improvements

2007-01-12 Thread Christopher Browne
In an attempt to throw the authorities off his trail, [EMAIL PROTECTED] (Alvaro 
Herrera) transmitted:
 Simon Riggs wrote:

 Some feedback from initial testing is that 2 queues probably isn't
 enough. If you have tables with 100s of blocks and tables with
 millions of blocks, the tables in the mid-range still lose out. So
 I'm thinking that a design with 3 queues based upon size ranges,
 plus the idea that when a queue is empty it will scan for tables
 slightly above/below its normal range.

 Yeah, eventually it occurred to me that as soon as you have 2 queues,
 you may as well want 3, or in fact any number, which in my proposal is
 very easily achieved.

Adding an extra attribute to reflect a different ordering or a
different policy allows having as many queues in one queue table as
you might need.
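
A sketch of that: one physical queue with a policy attribute, and each
worker filtering on its own value. All names are illustrative:

  /*
   * Sketch of one physical queue holding many logical queues,
   * distinguished by an extra policy attribute.
   */
  #include <stdio.h>

  typedef struct
  {
      const char *table;
      int         queue;   /* the extra attribute: which policy owns this job */
  } VacJob;

  int main(void)
  {
      VacJob jobs[] = {{"t_small", 0}, {"t_big", 2}, {"t_mid", 1}};
      int my_queue = 1;    /* this worker's policy */

      for (int i = 0; i < 3; i++)
          if (jobs[i].queue == my_queue)
              printf("worker(queue=%d) vacuums %s\n", my_queue, jobs[i].table);
      return 0;
  }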

 Alvaro, have you completed your design?

 No, I haven't, and the part that's missing is precisely the queues
 stuff.  I think I've been delaying posting it for too long, and that
 is harmful because it makes other people waste time thinking on
 issues that I may already have resolved, and delays the bashing that
 yet others will surely inflict on my proposal, which is never a good
 thing ;-) So maybe I'll put in a stub about the queues stuff and
 see how people like the whole thing.

Seems like a good idea to me.

Implementing multiple queues amounts to having different worker
processes/threads that operate on the queue table using varying
policies.
-- 
output = reverse(gro.mca @ enworbbc)
http://linuxdatabases.info/info/lisp.html
Rules of the  Evil Overlord #60. My five-year-old  child advisor will
also  be asked to  decipher any  code I  am thinking  of using.  If he
breaks the code  in under 30 seconds, it will not  be used. Note: this
also applies to passwords. http://www.eviloverlord.com/
