Re: [HACKERS] Automatic free space map filling
On a fine day, Fri, 2006-03-03 at 11:37, Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. This is perfectly doable, it only needs enough motivation from a knowledgeable person. Bruce and I were discussing this the other day; it'd be pretty easy to make plain VACUUM start a fresh transaction immediately after it finishes a scan heap/clean indexes/clean heap cycle. Do you mean the full (scan heap/clean indexes/clean heap) cycle or some smaller cycles inside each step? If you mean the full cycle, then it is probably not worth it, as even a single 'clean index' pass can take hours on larger tables. The infrastructure for this (in particular, session-level locks that won't be lost by closing the xact) is all there. You'd have to figure out how often to start a new xact ... every cycle is probably too often, at least for smaller maintenance_work_mem settings ... but it'd not be hard or involve any strange changes in system semantics. --- Hannu
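The batching Tom describes (accumulate dead TIDs until maintenance_work_mem is exhausted, run a clean-indexes/clean-heap cycle, then, under the proposal, commit and open a fresh transaction) can be sketched as a toy model. This is illustrative Python with invented names, not the actual backend code, which is C:

```python
def batched_vacuum(heap_pages, mem_limit_tids, begin_txn, commit_txn):
    """Toy model: heap_pages is a list of pages, each a list of
    (tid, is_dead) pairs; mem_limit_tids stands in for how many dead
    TIDs fit in maintenance_work_mem."""
    begin_txn()
    dead_tids = []
    cycles = 0
    for page in heap_pages:
        # "scan heap": remember the dead tuples on this page
        dead_tids.extend(tid for tid, is_dead in page if is_dead)
        if len(dead_tids) >= mem_limit_tids:
            # "clean indexes" then "clean heap" for this batch (elided);
            # then, per the proposal, commit (releasing locks and the
            # snapshot) and continue in a fresh transaction
            dead_tids.clear()
            cycles += 1
            commit_txn()
            begin_txn()
    if dead_tids:  # final partial cycle
        dead_tids.clear()
        cycles += 1
    commit_txn()
    return cycles
```

For a small table whose dead tuples fit in one batch, this degenerates to a single transaction, which is the existing behavior.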
Re: [HACKERS] Automatic free space map filling
Hannu Krosing [EMAIL PROTECTED] writes: If you mean the full cycle, then it is probably not worth it, as even a single 'clean index' pass can take hours on larger tables. The patch Heikki is working on will probably alleviate that problem, because it will allow vacuum to scan the indexes in physical rather than logical order. regards, tom lane
Re: [HACKERS] Automatic free space map filling
On Fri, 2006-04-28 at 15:58 -0400, Bruce Momjian wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. This is perfectly doable, it only needs enough motivation from a knowledgeable person. Bruce and I were discussing this the other day; it'd be pretty easy to make plain VACUUM start a fresh transaction immediately after it finishes a scan heap/clean indexes/clean heap cycle. The infrastructure for this (in particular, session-level locks that won't be lost by closing the xact) is all there. You'd have to figure out how often to start a new xact ... every cycle is probably too often, at least for smaller maintenance_work_mem settings ... but it'd not be hard or involve any strange changes in system semantics. Should this be a TODO? One item of discussion was that people should just increase their workmem so the job can be done faster in larger batches. Yes, I think it should be a todo item. Csaba's point was that it was the duration a VACUUM transaction was held open that caused problems. Increasing maintenance_work_mem won't help with that problem. This would then allow a VACUUM to progress with a high vacuum_cost_delay without any ill effects elsewhere in the system. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Automatic free space map filling
On Mon, May 01, 2006 at 10:24:50PM +0200, Dawid Kuroczko wrote: VACUUM table WHERE some_col > now()-'1 hour'::interval; I.e. let vacuum run piggyback on some index. This would allow for a quick vacuum of a fraction of a large table, especially when the table is large and only some data (new data) are being modified. The vacuum for such a table would: 1. scan the index according to the where criteria and create a bitmap of blocks to look at. 2. go through these blocks and vacuum them. Hmm, another perhaps silly idea -- a special index kind for tracking tuple deaths. I.e. -- something like whenever a tuple is updated/deleted, insert an entry into such an index, using the last session the tuple is visible for as a key. Then, perhaps, vacuum could scan such an index and find tuples which are candidates for removal. I lack the knowledge of PostgreSQL's internals, so forgive me if I am writing something completely insane. :) There is a TODO to create a 'dead space map' which would cover #2 and probably eliminate any use for #1. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
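The two-phase scheme in Dawid's idea (build a bitmap of heap blocks from an index scan, then vacuum only those blocks) could be sketched like this. Hypothetical Python, with the index modeled as (key, (block, offset)) entries rather than any real access method:

```python
def partial_vacuum(index, predicate, heap):
    """index: iterable of (key, (blkno, offset)); heap: dict mapping block
    number to a list of tuples, each {'dead': bool}."""
    # Phase 1: scan the index according to the WHERE criteria and build
    # a bitmap (here just a set) of heap blocks worth visiting.
    blocks = {blkno for key, (blkno, offset) in index if predicate(key)}
    # Phase 2: go through only those blocks and reclaim dead tuples.
    removed = 0
    for blkno in sorted(blocks):
        page = heap[blkno]
        removed += sum(1 for tup in page if tup['dead'])
        page[:] = [tup for tup in page if not tup['dead']]
    return removed
```

The payoff is that blocks the predicate never selects are never read, at the price of possibly leaving dead tuples behind in them.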
Re: [HACKERS] Automatic free space map filling
On Fri, Apr 28, 2006 at 03:58:16PM -0400, Bruce Momjian wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. This is perfectly doable, it only needs enough motivation from a knowledgeable person. Bruce and I were discussing this the other day; it'd be pretty easy to make plain VACUUM start a fresh transaction immediately after it finishes a scan heap/clean indexes/clean heap cycle. The infrastructure for this (in particular, session-level locks that won't be lost by closing the xact) is all there. You'd have to figure out how often to start a new xact ... every cycle is probably too often, at least for smaller maintenance_work_mem settings ... but it'd not be hard or involve any strange changes in system semantics. Should this be a TODO? One item of discussion was that people should just increase their workmem so the job can be done faster in larger batches. Except that wouldn't help when vacuuming a lot of small tables; each one would get its own transaction. ISTM that tying this directly to maintenance_work_mem is a bit confusing, since the idea is to keep vacuum transaction duration down so that it isn't causing dead tuples to build up itself. It seems like it would be better to have vacuum start a fresh transaction after a certain number of tuples have died. But since there's no way to actually measure that without having row-level stats turned on, maybe number of transactions or length of time would be good surrogates. Since it sounds like we'd want the transaction to start only at the start of a clean cycle, it could just check the limits at the start of each cycle. That would prevent it from wrapping the vacuum of each small table with a (rather pointless) new transaction. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
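Jim's surrogate limits (elapsed time or transaction count, checked only at the start of each clean cycle) might look like the predicate below. The name and the thresholds are invented for illustration:

```python
def should_start_new_xact(elapsed_secs, xacts_since_commit,
                          max_secs=60.0, max_xacts=1000):
    """Checked at the start of each index-clean cycle: commit and start a
    fresh transaction only once a surrogate limit has been exceeded, so a
    small table vacuumed in one cycle never pays for an extra transaction."""
    return elapsed_secs >= max_secs or xacts_since_commit >= max_xacts
```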
Re: [HACKERS] Automatic free space map filling
Jim C. Nasby [EMAIL PROTECTED] writes: Alvaro Herrera [EMAIL PROTECTED] writes: So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. Bruce and I were discussing this the other day; it'd be pretty easy to make plain VACUUM start a fresh transaction immediately after it finishes a scan heap/clean indexes/clean heap cycle. Except that wouldn't help when vacuuming a lot of small tables; each one would get its own transaction. What's your point? There's only a problem for big tables, and VACUUM already does use a new transaction for each table. regards, tom lane
Re: [HACKERS] Automatic free space map filling
On Mon, May 01, 2006 at 01:19:30PM -0500, Jim C. Nasby wrote: ISTM that tying this directly to maintenance_work_mem is a bit confusing, since the idea is to keep vacuum transaction duration down so that it isn't causing dead tuples to build up itself. It seems like it would be better to have vacuum start a fresh transaction after a certain number of tuples have died. But since there's no way to actually measure that without having row-level stats turned on, maybe number of transactions or length of time would be good surrogates. AIUI, vacuum starts a fresh cycle because it's accumulated a certain number of dead tuples to clean up. Isn't that what you're asking for? maintenance_work_mem is the limit on the amount of deleted tuple information that can be stored (amongst other things I'm sure)... Since it sounds like we'd want the transaction to start only at the start of a clean cycle, it could just check the limits at the start of each cycle. That would prevent it from wrapping the vacuum of each small table with a (rather pointless) new transaction. Every table has to be in its own transaction since that's the duration of the locks. Vacuum handling multiple tables in one transaction leaves you open to deadlocks. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
Re: [HACKERS] Automatic free space map filling
On 5/1/06, Martijn van Oosterhout kleptog@svana.org wrote: On Mon, May 01, 2006 at 01:19:30PM -0500, Jim C. Nasby wrote: ISTM that tying this directly to maintenance_work_mem is a bit confusing, since the idea is to keep vacuum transaction duration down so that it isn't causing dead tuples to build up itself. It seems like it would be better to have vacuum start a fresh transaction after a certain number of tuples have died. But since there's no way to actually measure that without having row-level stats turned on, maybe number of transactions or length of time would be good surrogates. AIUI, vacuum starts a fresh cycle because it's accumulated a certain number of dead tuples to clean up. Isn't that what you're asking for? maintenance_work_mem is the limit on the amount of deleted tuple information that can be stored (amongst other things I'm sure)... Hmm, one idea, which may (or may not) be interesting for large table vacuum, is allowing a syntax similar to: VACUUM table WHERE some_col > now()-'1 hour'::interval; I.e. let vacuum run piggyback on some index. This would allow for a quick vacuum of a fraction of a large table, especially when the table is large and only some data (new data) are being modified. The vacuum for such a table would: 1. scan the index according to the where criteria and create a bitmap of blocks to look at. 2. go through these blocks and vacuum them. Hmm, another perhaps silly idea -- a special index kind for tracking tuple deaths. I.e. -- something like whenever a tuple is updated/deleted, insert an entry into such an index, using the last session the tuple is visible for as a key. Then, perhaps, vacuum could scan such an index and find tuples which are candidates for removal. I lack the knowledge of PostgreSQL's internals, so forgive me if I am writing something completely insane. :) Regards, Dawid
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. This is perfectly doable, it only needs enough motivation from a knowledgeable person. Bruce and I were discussing this the other day; it'd be pretty easy to make plain VACUUM start a fresh transaction immediately after it finishes a scan heap/clean indexes/clean heap cycle. The infrastructure for this (in particular, session-level locks that won't be lost by closing the xact) is all there. You'd have to figure out how often to start a new xact ... every cycle is probably too often, at least for smaller maintenance_work_mem settings ... but it'd not be hard or involve any strange changes in system semantics. Should this be a TODO? One item of discussion was that people should just increase their workmem so the job can be done faster in larger batches. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [PATCHES] [HACKERS] Automatic free space map filling
On Mon, 2006-03-13 at 17:38 +0900, ITAGAKI Takahiro wrote: Simon Riggs [EMAIL PROTECTED] wrote: Zeugswetter Andreas DCP SD [EMAIL PROTECTED] wrote: Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead tuple by reducing the tuple to its header info. Attached patch realizes the concept of his idea. The dead tuples will be reduced to their headers by bgwriter. I'm interested in this patch but you need to say more about it. I get the general idea but it would be useful if you could give a full description of what this patch is trying to do and why. OK, I try to explain the patch. Excuse me for a long writing. OK. I'll take a look at this, thanks. Best Regards, Simon Riggs
Re: [PATCHES] [HACKERS] Automatic free space map filling
Simon Riggs [EMAIL PROTECTED] wrote: Zeugswetter Andreas DCP SD [EMAIL PROTECTED] wrote: Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead tuple by reducing the tuple to its header info. Attached patch realizes the concept of his idea. The dead tuples will be reduced to their headers by bgwriter. I'm interested in this patch but you need to say more about it. I get the general idea but it would be useful if you could give a full description of what this patch is trying to do and why. OK, I'll try to explain the patch. Excuse me for a long writing.

* Purpose
The basic idea is just reducing the dead tuple to its header info, suggested by Andreas. This is a lightweight per-page sweeping to reduce the consumption of the free space map and the necessity of VACUUM; i.e., normal VACUUM is still needed occasionally. I think it is useful on heavy-update workloads. It showed a 5-10% performance improvement on DBT-2 after 9 hours of running *without* vacuum. I don't know whether it is still effective with well-scheduled vacuum.

* Why does bgwriter do vacuum?
Sweeping has a cost, so a non-backend process should do it. Also, the pages worth vacuuming are almost always dirty, because tuples on the page have just been updated or deleted. Bgwriter handles dirty pages, so I think it is a good place for sweeping.

* Locking
We must take a super-exclusive lock on the pages before vacuum. In the patch, bgwriter tries to take an exclusive lock before it writes a page, and does vacuum only if the lock is super-exclusive. Otherwise, it gives up and writes the pages normally. This is an optimistic way, but I assume the probability of success is high because most pages written by bgwriter are least recently used (LRU).

* Keep the headers
We cannot remove dead tuples completely in a per-page sweep, because references to the tuples from indexes still remain. We might keep only line pointers (4 bytes), but that might lead to line-pointer-bloat problems (http://archives.postgresql.org/pgsql-hackers/2006-03/msg00116.php), so the headers (4+32 bytes) should be left.

* Other twists and GUC variables in the patch
- Bgwriter cannot access the catalogs, so I added a BM_RELATION hint bit to BufferDesc. Only relation pages will be swept. This is enabled by the GUC variable 'bgvacuum_relation'.
- I changed bgwriter_lru_maxpages to be adjusted automatically. Backends won't do vacuum, to avoid disturbing their own processing, so bgwriter should write most of the dirty pages. ('bgvacuum_autotune')
- After sweeping, the page will be added to the free space map. I made a simple replacement algorithm for the free space map, which replaces the page with the least space near the added one. ('bgvacuum_fsm')

* Issues
- If WAL is produced by sweeping a page, writing the page should be deferred for a while, because flushing the WAL is needed before writing the page.
- Bgwriter writes pages in 4 contexts: background writes for LRU, ALL, checkpoint and shutdown. In the current patch, pages are swept in the 3 contexts other than shutdown, but it may be better to do it only on LRU.

* Related discussions
- Real-Time Vacuum Possibility (Rod Taylor) http://archives.postgresql.org/pgsql-hackers/2005-03/msg00518.php | have the bgwriter take a look at the pages it has, and see if it can do | any vacuum work based on pages it is about to send to disk
- Pre-allocated free space for row updating (like PCTFREE) (Satoshi Nagayasu) http://archives.postgresql.org/pgsql-hackers/2005-08/msg01135.php | light-weight repairing on a single page is needed to maintain free space
- Dead Space Map (Heikki Linnakangas) http://archives.postgresql.org/pgsql-hackers/2006-02/msg01125.php | vacuuming pages one by one as they're written by bgwriter

Thank you for reading to the end. I'd like to hear your comments.
--- ITAGAKI Takahiro NTT Cyber Space Laboratories
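The locking rule in the patch description (sweep only when the exclusive lock turns out to be super-exclusive, i.e. no other backend holds a pin; otherwise just write the page) can be modeled roughly as follows. The function and field names are invented; the real code works on PostgreSQL buffer descriptors in C:

```python
HEADER_BYTES = 36  # 4-byte line pointer + 32-byte tuple header left behind

def write_page(page, pin_count, got_exclusive_lock):
    """page: list of tuples, each {'dead': bool, 'size': int}.
    pin_count == 1 means only our own pin is held, so the exclusive
    lock is in fact super-exclusive and sweeping is safe."""
    if got_exclusive_lock and pin_count == 1:
        for tup in page:
            if tup['dead']:
                tup['size'] = HEADER_BYTES  # truncate dead tuple to header
        return 'swept-and-written'
    return 'written'  # optimistic: give up quietly, write the page as-is
```

The optimism is the key design choice: contention simply skips the sweep rather than blocking either party.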
Re: [HACKERS] Automatic free space map filling
Zeugswetter Andreas DCP SD [EMAIL PROTECTED] wrote: Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead tuple by reducing the tuple to its header info. I was just working on your idea. In my work, bgwriter truncates dead tuples and leaves only their headers. I'll send a concept patch to PATCHES. We must take a super-exclusive lock on pages before vacuum. Bgwriter tries to take an exclusive lock before it writes a page, and does vacuum only if the lock is super-exclusive. Otherwise, it gives up and writes normally. This is an optimistic way, but I assume the probability of success is high because most pages written by bgwriter are least recently used (LRU). Also, I changed bgwriter_lru_maxpages to be adjusted automatically, because backends won't do vacuum, to avoid disturbing main transaction processing, and so bgwriter should write most of the dirty pages. There is much room for discussion on this idea. Comments are welcome. --- ITAGAKI Takahiro NTT Cyber Space Laboratories
Re: [HACKERS] Automatic free space map filling
But you could do the indexes first and remember how far you can vacuum the heap later. But the indexes _can't_ be done first; you _first_ need to know which tuples are dead, which requires looking at the table itself. If we already had the all tuples visible bitmap I think we could first scan the bitmap and decide whether we can afford to look at the visibility info for each entry in the index. We only collect the ctids before so we don't have the inefficient lookups, but if we can avoid the lookup in most cases it would again be attractive. Andreas
Re: [HACKERS] Automatic free space map filling
Zeugswetter Andreas DCP SD wrote: But you could do the indexes first and remember how far you can vacuum the heap later. But the indexes _can't_ be done first; you _first_ need to know which tuples are dead, which requires looking at the table itself. If we already had the all tuples visible bitmap I think we could first scan the bitmap and decide whether we can afford to look at the visibility info for each entry in the index. We only collect the ctids before so we don't have the inefficient lookups, but if we can avoid the lookup in most cases it would again be attractive. The problem is that index to heap lookups are very slow. -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Automatic free space map filling
Jim C. Nasby wrote: ... how many pages per bit ... Are we trying to set up a complex solution to a problem that'll be mostly moot once partitioning is easier and partitioned tables are common? In many cases I can think of, the bulk of the data would be in old partitions that are practically never written to (so would need no vacuuming and could always use index-only lookups); while the hot parts of large tables would be on partitions that would need frequent vacuuming and wouldn't benefit from index-only lookups. In these cases, 1 bit per partition would work well, and seems a lot easier to keep track of than bits-per-page.
Re: [HACKERS] Automatic free space map filling
Are you running 8.1? If so, you can use autovacuum and set per-table thresholds (read: vacuum aggressively) and per-table cost delay settings so that the performance impact is minimal. If you have tried 8.1 autovacuum and found it unhelpful, I would be curious to find out why. Yes, I'm running 8.1, and I've set up per-table autovacuum settings :-) And I lowered the general thresholds too. Generally autovacuum is very useful from my POV, and in particular the per-table settings are so. But the problem I have is not the performance impact of the vacuum itself, but the impact of the long-running transaction of vacuuming big tables. I do have big tables which are frequently updated, and small tables which are basically queue tables, so each inserted row will be updated a few times and then deleted. Those queue tables tend to get huge unvacuumable dead space during any long-running transaction, and vacuum on the big tables is such a long-running transaction. And I have a few of them, and one in particular is very busy (a task table; all activities go through that one). Now when the queue tables get 1000 times their normal size in dead space, I get performance problems. So tweaking vacuum cost delay doesn't buy me anything, as vacuum per se is not the performance problem; its long run time on big tables is. Cheers, Csaba.
Re: [HACKERS] Automatic free space map filling
Csaba Nagy wrote: Now when the queue tables get 1000 times their normal size in dead space, I get performance problems. So tweaking vacuum cost delay doesn't buy me anything, as vacuum per se is not the performance problem; its long run time on big tables is. So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. This is perfectly doable, it only needs enough motivation from a knowledgeable person. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Automatic free space map filling
Alvaro Herrera wrote: Csaba Nagy wrote: Now when the queue tables get 1000 times their normal size in dead space, I get performance problems. So tweaking vacuum cost delay doesn't buy me anything, as vacuum per se is not the performance problem; its long run time on big tables is. So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. But what about index clearing? When do you scan each index? -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Automatic free space map filling
Bruce Momjian wrote: Alvaro Herrera wrote: Csaba Nagy wrote: Now when the queue tables get 1000 times their normal size in dead space, I get performance problems. So tweaking vacuum cost delay doesn't buy me anything, as vacuum per se is not the performance problem; its long run time on big tables is. So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. But what about index clearing? When do you scan each index? At the end of each iteration (or earlier, depending on maintenance_work_mem). So for each iteration you would need to scan the indexes. Maybe we could make maintenance_work_mem be the deciding factor; after scanning the indexes, do the release/reacquire locks cycle. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Automatic free space map filling
On Fri, Mar 03, 2006 at 11:40:40AM -0300, Alvaro Herrera wrote: Csaba Nagy wrote: Now when the queue tables get 1000 times their normal size in dead space, I get performance problems. So tweaking vacuum cost delay doesn't buy me anything, as vacuum per se is not the performance problem; its long run time on big tables is. So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. I think the issue is that even for that small section, you still need to scan all the indexes to delete the tuples there. So you actually cause more work because you have to scan the indexes for each portion of the table rather than just at the end. However, if this were combined with some optimistic index deletion code, where the tuple was used to find the entry directly rather than via bulkdelete, maybe it'd be doable. More overall I/O due to the index lookups, but the transactions become shorter. I say optimistic because if you don't find the tuple the quick way you can always queue it for a bulkdelete later. Hopefully it will be the uncommon case. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
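Martijn's optimistic index deletion could be sketched as: probe the index directly with the dead tuple's key, delete on a hit, and queue misses for a later bulkdelete pass. This is a hypothetical structure, not an actual PostgreSQL API:

```python
def optimistic_index_delete(index, dead_tuples):
    """index: dict mapping key -> list of TIDs; dead_tuples: (key, tid) pairs."""
    bulkdelete_queue = []
    for key, tid in dead_tuples:
        entries = index.get(key, [])
        if tid in entries:
            entries.remove(tid)           # quick path: direct index probe
        else:
            bulkdelete_queue.append(tid)  # slow path: full index scan later
    return bulkdelete_queue
```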
Re: [HACKERS] Automatic free space map filling
Alvaro Herrera wrote: Bruce Momjian wrote: Alvaro Herrera wrote: Csaba Nagy wrote: Now when the queue tables get 1000 times their normal size in dead space, I get performance problems. So tweaking vacuum cost delay doesn't buy me anything, as vacuum per se is not the performance problem; its long run time on big tables is. So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. But what about index clearing? When do you scan each index? At the end of each iteration (or earlier, depending on maintenance_work_mem). So for each iteration you would need to scan the indexes. Maybe we could make maintenance_work_mem be the deciding factor; after scanning the indexes, do the release/reacquire locks cycle. Ewe. How expensive is scanning an index compared to the heap? Does anyone have figures on that in terms of I/O and time? -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Automatic free space map filling
Ewe. How expensive is scanning an index compared to the heap? Does anyone have figures on that in terms of I/O and time? See this post for an example: http://archives.postgresql.org/pgsql-performance/2006-02/msg00416.php For my 200-million-row table, scanning the pk index took ~4 hours. And then there are some more indexes... So if the index has to be scanned completely, that's still too much. Cheers, Csaba.
Re: [HACKERS] Automatic free space map filling
But what about index clearing? When do you scan each index? At the end of each iteration (or earlier, depending on maintenance_work_mem). So for each iteration you would need to scan the indexes. Maybe we could make maintenance_work_mem be the deciding factor; after scanning the indexes, do the release/reacquire locks cycle. But you could do the indexes first and remember how far you can vacuum the heap later. So you might as well do each index separately first and remember how far you can go with the heap for each one. Then do the heap with a special restriction that comes from what you remembered from the indexes. You can now separate the heap vacuum into arbitrarily large transactions, since the indexes are already taken care of. (You only vacuum to the point of the eldest vacuumed index.) Andreas
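Andreas's ordering can be modeled as: clean each index separately, record per index how far it got (a heap block horizon), then vacuum the heap only up to the eldest horizon, split across transactions of any size. A minimal sketch with invented names:

```python
def heap_vacuum_horizon(index_horizons):
    """Per index: the heap block up to which its dead entries are gone.
    The heap may only be vacuumed up to the smallest (eldest) horizon."""
    return min(index_horizons)

def heap_batches(horizon, batch_pages):
    """Split the heap vacuum into independent transactions, each covering
    up to batch_pages pages, never going past the horizon."""
    start = 0
    while start < horizon:
        end = min(start + batch_pages, horizon)
        yield (start, end)
        start = end
```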
Re: [HACKERS] Automatic free space map filling
Alvaro Herrera [EMAIL PROTECTED] writes: So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. This is perfectly doable, it only needs enough motivation from a knowledgeable person. Bruce and I were discussing this the other day; it'd be pretty easy to make plain VACUUM start a fresh transaction immediately after it finishes a scan heap/clean indexes/clean heap cycle. The infrastructure for this (in particular, session-level locks that won't be lost by closing the xact) is all there. You'd have to figure out how often to start a new xact ... every cycle is probably too often, at least for smaller maintenance_work_mem settings ... but it'd not be hard or involve any strange changes in system semantics. regards, tom lane
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: So for you it would certainly help a lot to be able to vacuum the first X pages of the big table, stop, release locks, create new transaction, continue with the next X pages, lather, rinse, repeat. This is perfectly doable, it only needs enough motivation from a knowledgeable person. Bruce and I were discussing this the other day; it'd be pretty easy to make plain VACUUM start a fresh transaction immediately after it finishes a scan heap/clean indexes/clean heap cycle. The infrastructure for this (in particular, session-level locks that won't be lost by closing the xact) is all there. You'd have to figure out how often to start a new xact ... every cycle is probably too often, at least for smaller maintenance_work_mem settings ... but it'd not be hard or involve any strange changes in system semantics. Oh, reading the original posting, these are cases where maintenance_work_mem is full and we are going to rescan the indexes multiple times anyway for this table. -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Automatic free space map filling
Alvaro Herrera wrote:
> Csaba Nagy wrote:
>> Now when the queue tables get 1000 times dead space compared to their
>> normal size, I get performance problems. So tweaking vacuum cost delay
>> doesn't buy me anything, as vacuum per se is not the performance
>> problem; its long run time on big tables is.
>
> So for you it would certainly help a lot to be able to vacuum the first
> X pages of the big table, stop, release locks, create new transaction,
> continue with the next X pages, lather, rinse, repeat.

I got the impression that Csaba is looking for multiple simultaneous vacuums more than for partial vacuum. I'm not sure of the best way to set this up, but perhaps a flag in the pg_autovacuum table that says "vacuum this table even if there is another vacuum running"; that way you can control things and not have autovacuum firing off lots of vacuums at the same time. It sounds to me like these frequently updated queue tables need to be monitored closely and not ignored for a long period of time just because we are vacuuming another table.

Has anyone looked more closely at the multiple-vacuum patch that was submitted to the patches list a while ago?

Matt
Re: [HACKERS] Automatic free space map filling
Matthew T. O'Connor wrote:
> Alvaro Herrera wrote:
>> Csaba Nagy wrote:
>>> Now when the queue tables get 1000 times dead space compared to their
>>> normal size, I get performance problems. So tweaking vacuum cost
>>> delay doesn't buy me anything, as vacuum per se is not the
>>> performance problem; its long run time on big tables is.
>>
>> So for you it would certainly help a lot to be able to vacuum the
>> first X pages of the big table, stop, release locks, create new
>> transaction, continue with the next X pages, lather, rinse, repeat.
>
> I got the impression that Csaba is looking for multiple simultaneous
> vacuums more than for partial vacuum.

So he rather needs Hannu Krosing's patch for simultaneous vacuum ...

-- Alvaro Herrera   http://www.CommandPrompt.com/
   The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Automatic free space map filling
>> I got the impression that Csaba is looking for multiple simultaneous
>> vacuums more than for partial vacuum.
>
> So he rather needs Hannu Krosing's patch for simultaneous vacuum ...

Well, I guess that would be a good solution to the queue table problem. The problem is that I can't deploy that patch on our production systems without being fairly sure it won't corrupt any data... and I can't rely on non-production testing either. Basically I'm waiting to see Tom saying it will fly :-)

Cheers, Csaba.
Re: [HACKERS] Automatic free space map filling
Csaba Nagy wrote:
>> So he rather needs Hannu Krosing's patch for simultaneous vacuum ...
>
> Well, I guess that would be a good solution to the queue table problem.
> The problem is that I can't deploy that patch on our production systems
> without being fairly sure it won't corrupt any data... and I can't rely
> on non-production testing either. Basically I'm waiting to see Tom
> saying it will fly :-)

That patch is a step forward if it's deemed OK by the powers that be. However, autovacuum would still need to be taught to handle simultaneous vacuums. I suppose that in the interim, you could disable autovacuum for the problematic queue table and have cron issue a manual VACUUM command for that table at the required frequency.

Anyone up for working on / testing / improving Hannu's patch? I think it's beyond my skill set.

Matt
Re: [HACKERS] Automatic free space map filling
Matthew T. O'Connor matthew@zeut.net writes:
> That patch is a step forward if it's deemed OK by the powers that be.
> However, autovacuum would still need to be taught to handle
> simultaneous vacuums. I suppose that in the interim, you could disable
> autovacuum for the problematic queue table and have cron issue a manual
> VACUUM command for that table at the required frequency.

I'm not sure you should think of that as an interim solution. I don't really like the idea of multiple autovacuums running concurrently. ISTM autovac is intended to be something that lurks in the background and doesn't take up an unreasonable percentage of your system bandwidth ... but if there's more than one of them, it's going to be mighty hard to control the overall load penalty. Plus you have to worry about keeping them off each others' backs, ie, not all trying to vac the same table at once. And in a scenario like Csaba's, I think the hotspot tables are just exactly what they'd all try to vacuum. For small hotspot tables I think a scheduled vacuum process is just the thing, whereas autovac is more of a free-lance thing to keep the rest of your DB in line.

regards, tom lane
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote:
> I'm not sure you should think of that as an interim solution. I don't
> really like the idea of multiple autovacuums running concurrently. ISTM
> autovac is intended to be something that lurks in the background and
> doesn't take up an unreasonable percentage of your system bandwidth ...
> but if there's more than one of them, it's going to be mighty hard to
> control the overall load penalty. Plus you have to worry about keeping
> them off each others' backs, ie, not all trying to vac the same table
> at once. And in a scenario like Csaba's, I think the hotspot tables are
> just exactly what they'd all try to vacuum. For small hotspot tables I
> think a scheduled vacuum process is just the thing, whereas autovac is
> more of a free-lance thing to keep the rest of your DB in line.

While I agree that given the current state of affairs the cron solution is elegant, I personally want autovacuum to solve all of our vacuuming needs; I really dislike the idea of requiring a cron-based solution for a fairly typical problem. Besides, the cron solution is sloppy: it blindly vacuums whether it's needed or not, resulting in a net increase of cycles spent vacuuming.

Anyway, I don't know the best way to implement it, but I wasn't thinking of just firing off multiple autovac processes. I was envisioning something like an autovacuum master process that launches (forks?) VACUUM commands and has some smarts about how many processes to fire off, or that would only fire off simultaneous VACUUMs for tables that have been flagged as hot-spot tables.

I recognize that teaching autovac to handle simultaneous VACUUMs in a sane way will require a quantum leap in complexity, but it still seems a better long-term solution. I would agree that using cron makes sense if we were seeing lots of different scenarios that we couldn't possibly anticipate, but I don't think that is where we are.

BTW, this discussion is only relevant if we allow simultaneous vacuums. Is that something you see as inevitable, whether or not you think Hannu's implementation is acceptable?

Matt
Re: [HACKERS] Automatic free space map filling
On Thu, Mar 02, 2006 at 10:05:28AM -0500, Tom Lane wrote:
> Hannu Krosing [EMAIL PROTECTED] writes:
>> On Thu, 2006-03-02 at 09:53, Zeugswetter Andreas DCP SD wrote:
>>> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a
>>> dead tuple by reducing the tuple to its header info.
>
> Andreas' idea is possibly doable but I am not sure that I see the
> point. It does not reduce the need for vacuum nor the I/O load imposed
> by vacuum. What it does do is bias the system in the direction of
> allocating an unreasonably large number of tuple line pointers on a
> page (ie, more than are useful when the page is fully packed with
> normal tuples). Since we never reclaim such pointers, over time all the
> pages in a table would tend to develop line-pointer-bloat. I don't know
> what the net overhead would be, but it'd definitely impose some
> aggregate inefficiency.

What would be involved in reclaiming item pointer space? Is there any reason it's not done today? (I know I've been bit once by this...)

-- Jim C. Nasby, Sr. Engineering Consultant   [EMAIL PROTECTED]
   Pervasive Software   http://pervasive.com   work: 512-231-6117
   vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
Re: [HACKERS] Automatic free space map filling
On Thu, Mar 02, 2006 at 03:19:46PM +0100, Martijn van Oosterhout wrote:
> Note, for this purpose you don't need to keep a bit per page. The OS
> I/O system will load 64k+ (8+ pages) in one go so one bit per 8 pages
> would be sufficient.

AFAIK that's entirely dependent on the filesystem and how it's created (and possibly the OS as well), so arbitrarily deciding each bit is 8 pages is a bad idea. I could see allowing for a setting that determines how many pages per bit, though, but I think we're also getting ahead of ourselves.
Re: [HACKERS] Automatic free space map filling
On Fri, Mar 03, 2006 at 04:14:41PM +0100, Csaba Nagy wrote:
>> Eww. How expensive is scanning an index compared to the heap? Does
>> anyone have figures on that in terms of I/O and time?
>
> See this post for an example:
> http://archives.postgresql.org/pgsql-performance/2006-02/msg00416.php
> For my 200 million row table, scanning the pk index took ~ 4 hours. And
> then there are some more indexes... So if the index has to be scanned
> completely, that's still too much.

But how does a scan of the index compare to a scan of the table? For example, if indexes are 1/5th the size of the table, you can (theoretically) scan 5 indexes in the same amount of time it takes to scan the heap. That indicates to me that even if we did have to scan all indexes, a dirty page bitmap would still be a win over the current situation.

But it appears that it should be safe to do index lookups on indexes that aren't expressions. And I believe that we could take steps down the road to allow for index lookups on indexes that only used functions that were known to be safe.
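The comparison Jim sketches can be made concrete with back-of-envelope I/O arithmetic. All the numbers below are illustrative assumptions (a 1M-page heap, three indexes each 1/5 of the heap's size, 1% of heap pages dirty), not measurements:

```python
# Back-of-envelope page-read arithmetic for plain VACUUM vs. a
# dirty-page-bitmap VACUUM. All inputs are invented for illustration.

def pages_read_plain_vacuum(heap_pages, index_fraction, n_indexes):
    """Today's VACUUM: full heap scan plus a full scan of every index."""
    return heap_pages + heap_pages * index_fraction * n_indexes

def pages_read_bitmap_vacuum(heap_pages, index_fraction, n_indexes,
                             dirty_fraction):
    """Dirty-page-bitmap VACUUM: read only the dirty heap pages, but
    still scan every index sequentially in full."""
    return (heap_pages * dirty_fraction
            + heap_pages * index_fraction * n_indexes)

heap = 1_000_000                                     # pages (~8 GB at 8 kB/page)
plain = pages_read_plain_vacuum(heap, 0.2, 3)
bitmap = pages_read_bitmap_vacuum(heap, 0.2, 3, dirty_fraction=0.01)
```

In this toy arithmetic, skipping clean heap pages cuts the total from 1.6M to 0.61M page reads even though all indexes are still scanned in full; but the index scans now dominate, which is exactly the concern raised later in the thread.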
Re: [HACKERS] Automatic free space map filling
On Fri, Mar 03, 2006 at 11:37:00AM -0500, Tom Lane wrote:
> Bruce and I were discussing this the other day; it'd be pretty easy to
> make plain VACUUM start a fresh transaction immediately after it
> finishes a scan heap/clean indexes/clean heap cycle. The infrastructure
> for this (in particular, session-level locks that won't be lost by
> closing the xact) is all there. You'd have to figure out how often to
> start a new xact ... every cycle is probably too often, at least for
> smaller maintenance_work_mem settings ... but it'd not be hard or
> involve any strange changes in system semantics.

If maintenance_work_mem is small you're likely to have poor performance anyway; I'm suspicious that the overhead of starting a new xact would be all that important. If you care about performance, you'll probably have increased maintenance_work_mem anyway.
Re: [HACKERS] Automatic free space map filling
Centuries ago, Nostradamus foresaw when [EMAIL PROTECTED] (Zeugswetter Andreas DCP SD) would write:
>>> But what about index clearing? When do you scan each index?
>>
>> At the end of each iteration (or earlier, depending on
>> maintenance_work_mem). So for each iteration you would need to scan
>> the indexes. Maybe we could make maintenance_work_mem be the deciding
>> factor; after scanning the indexes, do the release/reacquire locks
>> cycle.
>
> But you could do the indexes first and remember how far you can vacuum
> the heap later.

But the indexes _can't_ be done first; you _first_ need to know which tuples are dead, which requires looking at the table itself.

-- select 'cbbrowne' || '@' || 'gmail.com';
   http://linuxdatabases.info/info/languages.html
Pound for pound, the amoeba is the most vicious animal on earth.
Re: [HACKERS] Automatic free space map filling
> I thought we had sufficiently destroyed that "reuse a tuple" meme
> yesterday. You can't do that: there are too many aspects of the system
> design that are predicated on the assumption that dead tuples do not
> come back to life. You have to do the full vacuuming bit (index entry
> removal, super-exclusive page locking, etc) before you can remove a
> dead tuple.

One more idea I would like to throw in. Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead tuple by reducing the tuple to its header info. (If you still wanted to be able to locate index entries fast, you would need to keep indexed columns, but I think we agreed that there is no real use.)

I think that would be achievable at reasonable cost (since you can avoid one page IO) on the page of the currently active tuple (the first page that is considered). On this page:

    if freespace available
        -- use it
    elsif freespace available after reducing all dead rows
        -- use the freespace with a new slot
    else

Of course this only works when we still have free slots, but I think that might not really be an issue.

Andreas
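Andreas' reuse-the-space idea can be sketched as a toy page model: dead tuples keep their line pointer and header, but their data area is given back. The page layout, header size, and method names below are invented for illustration and do not match PostgreSQL's actual page format:

```python
# Toy model of reclaiming a dead tuple's data area by truncating the
# tuple to its header, keeping the slot (line pointer) in place.
# HEADER_SIZE and the Page layout are illustrative assumptions.

HEADER_SIZE = 23   # bytes kept per dead tuple (header only)

class Page:
    def __init__(self, size=8192):
        self.free = size
        self.tuples = []          # each: {"len": bytes, "dead": bool}

    def insert(self, length):
        if self.free >= length:
            self.tuples.append({"len": length, "dead": False})
            self.free -= length
            return True
        # Not enough room: try reducing dead tuples to headers first.
        reclaimable = sum(t["len"] - HEADER_SIZE
                          for t in self.tuples if t["dead"])
        if self.free + reclaimable >= length:
            for t in self.tuples:
                if t["dead"] and t["len"] > HEADER_SIZE:
                    self.free += t["len"] - HEADER_SIZE
                    t["len"] = HEADER_SIZE   # reduced to header only
            self.tuples.append({"len": length, "dead": False})
            self.free -= length
            return True
        return False
```

Note that the dead tuples' slots are never removed here, only shrunk; that is precisely the line-pointer accumulation Tom objects to in his reply.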
Re: [HACKERS] Automatic free space map filling
On Thu, 2006-03-02 at 09:53, Zeugswetter Andreas DCP SD wrote:
>> I thought we had sufficiently destroyed that "reuse a tuple" meme
>> yesterday. You can't do that: there are too many aspects of the system
>> design that are predicated on the assumption that dead tuples do not
>> come back to life. You have to do the full vacuuming bit (index entry
>> removal, super-exclusive page locking, etc) before you can remove a
>> dead tuple.
>
> One more idea I would like to throw in. Ok, we cannot reuse a dead
> tuple. Maybe we can reuse the space of a dead tuple by reducing the
> tuple to its header info. (If you still wanted to be able to locate
> index entries fast, you would need to keep indexed columns, but I think
> we agreed that there is no real use.)

I don't even think you need the header; just truncate the slot to be 0-size (make the next pointer the same as this one, or make the pointer point to an unaligned byte or something) and detect this condition when accessing tuples. This would add one compare to all accesses to the tuple, but I suspect that mostly it is a no-op performance-wise, as all the data needed is already available in the L1 cache.

This would decouple declaring a tuple dead / reusing its data space from the final cleanup / freeing of index space.

Hannu
Re: [HACKERS] Automatic free space map filling
[sorry to everyone if that mail arrives multiple times, but i had some odd problems with my mail gateway yesterday...]

On Wed, Mar 01, 2006 at 12:41:01PM -0500, Tom Lane wrote:
> Peter Eisentraut [EMAIL PROTECTED] writes:
>> Tom Lane wrote:
>>> How does an optimistic FSM entry avoid the need to run vacuum?
>>
>> It ensures that all freed tuples are already in the FSM.
>
> That has nothing to do with it, because the space isn't actually free
> for re-use until vacuum deletes the tuple.

But couldn't such an opportunistic approach be used for another lightweight VACUUM mode, in such a way that VACUUM could look at a special "Hot Spot" queue which represents potential candidates for freeing? Let's call it a 2-phase VACUUM. This would avoid a constant long-running VACUUM on big tables, e.g. when tuples get updated (or deleted) frequently.

Just an idea...

Bernd
Re: [HACKERS] Automatic free space map filling
Centuries ago, Nostradamus foresaw when [EMAIL PROTECTED] (Tom Lane) would write:
> I thought we had sufficiently destroyed that "reuse a tuple" meme
> yesterday. You can't do that: there are too many aspects of the system
> design that are predicated on the assumption that dead tuples do not
> come back to life.

This discussion needs to come up again in October when the zombie movies come out :-).

> That's the other problem: it's not apparent why pushing work from
> vacuum back into foreground processing is a good idea. Especially not
> why retail vacuuming of individual tuples will be better than
> wholesale.

What is unclear to me in the discussion is whether or not this is invalidating the item on the TODO list...

    Create a bitmap of pages that need vacuuming

    Instead of sequentially scanning the entire table, have the
    background writer or some other process record pages that have
    expired rows, then VACUUM can look at just those pages rather than
    the entire table. In the event of a system crash, the bitmap would
    probably be invalidated. One complexity is that index entries still
    have to be vacuumed, and doing this without an index scan (by using
    the heap values to find the index entry) might be slow and
    unreliable, especially for user-defined index functions.

It strikes me as a non-starter to draw vacuum work directly into the foreground; there is a *clear* loss in that the death of the tuple can't actually take place at that point, due to MVCC and the fact that it is likely that other transactions will be present, keeping the tuple from being destroyed.

But it would *seem* attractive to do what is in the TODO, above. Alas, the user-defined index functions make cleanout of indexes much more troublesome :-(. But what's in the TODO is still wholesale, albeit involving more targeted selling than the usual Kirby VACUUM :-).
Re: [HACKERS] Automatic free space map filling
On Thu, Mar 02, 2006 at 08:33:46AM -0500, Christopher Browne wrote:
> What is unclear to me in the discussion is whether or not this is
> invalidating the item on the TODO list...
>
>     Create a bitmap of pages that need vacuuming
> [snip]

I think this is doable, and not invalidated by anything said so far. All this is changing is whether to scan the whole table or just the bits changed. Unfortunately I don't think you can avoid scanning the indexes :(.

Note, for this purpose you don't need to keep a bit per page. The OS I/O system will load 64k+ (8+ pages) in one go, so one bit per 8 pages would be sufficient.

The inverse is to keep a list of pages where we know all tuples are visible to everyone. I'm not sure if this can be done race-condition-free. ISTM it would be possible to get the new Bitmap Index Scans to avoid checking visibility straight away but wait until it has been AND/OR'd with other bitmaps, and only at the end check visibility. But maybe that already happens...

Have a nice day,

-- Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
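The coarse bitmap Martijn suggests is easy to sketch: one bit covers a run of pages, on the theory that the OS reads roughly 64 kB (8 x 8 kB pages) per I/O anyway. Everything below is an illustrative Python model; the chunk size of 8 is an assumption, and (as Jim notes in his reply) would arguably need to be configurable:

```python
# Coarse "needs vacuum" bitmap: one bit per CHUNK consecutive heap
# pages. Illustrative sketch only; not PostgreSQL's on-disk format.

CHUNK = 8  # heap pages covered by one bit (assumed I/O read-ahead unit)

class DirtyBitmap:
    def __init__(self, n_pages, chunk=CHUNK):
        self.chunk = chunk
        # one byte holds 8 bits, each bit covering `chunk` pages
        self.bits = bytearray((n_pages + chunk * 8 - 1) // (chunk * 8))

    def mark_dirty(self, page):
        bit = page // self.chunk
        self.bits[bit // 8] |= 1 << (bit % 8)

    def chunks_to_vacuum(self):
        """Yield (first_page, last_page) ranges whose bit is set."""
        for bit in range(len(self.bits) * 8):
            if self.bits[bit // 8] & (1 << (bit % 8)):
                yield bit * self.chunk, (bit + 1) * self.chunk - 1

bm = DirtyBitmap(n_pages=1024)
bm.mark_dirty(5)      # pages 0-7 share one bit
bm.mark_dirty(100)    # pages 96-103 share one bit
```

The space saving is the point: at one bit per 8 pages, a million-page table needs only about 16 kB of bitmap.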
Re: [HACKERS] Automatic free space map filling
Hannu Krosing [EMAIL PROTECTED] writes:
> On Thu, 2006-03-02 at 09:53, Zeugswetter Andreas DCP SD wrote:
>> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a
>> dead tuple by reducing the tuple to its header info.
>
> I don't even think you need the header, just truncate the slot to be
> 0-size

I think you must keep the header because the tuple might be part of an update chain (cf vacuuming bugs we repaired just a few months ago). t_ctid is potentially interesting data even in a certainly-dead tuple.

Andreas' idea is possibly doable but I am not sure that I see the point. It does not reduce the need for vacuum nor the I/O load imposed by vacuum. What it does do is bias the system in the direction of allocating an unreasonably large number of tuple line pointers on a page (ie, more than are useful when the page is fully packed with normal tuples). Since we never reclaim such pointers, over time all the pages in a table would tend to develop line-pointer-bloat. I don't know what the net overhead would be, but it'd definitely impose some aggregate inefficiency.

regards, tom lane
Re: [HACKERS] Automatic free space map filling
Bernd Helmle [EMAIL PROTECTED] writes:
> But couldn't such an opportunistic approach be used for another
> lightweight VACUUM mode, in such a way that VACUUM could look at a
> special "Hot Spot" queue which represents potential candidates for
> freeing?

The proposed dirty-page bitmap seems a superior solution to that.

regards, tom lane
Re: [HACKERS] Automatic free space map filling
Christopher Browne [EMAIL PROTECTED] writes:
> What is unclear to me in the discussion is whether or not this is
> invalidating the item on the TODO list...

No, I don't think any of this is an argument against the dirty-page-bitmap idea. The amount of foreground effort needed to set a dirty-page bit is minimal (maybe even zero, if we can make the bgwriter do it, though I'm pretty suspicious of that idea because I think it needs to be done immediately when the page is dirtied). I don't see the dirty-page bitmap as changing the way that VACUUM works in any fundamental respect --- it will just allow the vacuum process to skip reading pages that certainly don't need to change.

One point that does need to be considered though is what about anti-wraparound processing (ie, replacing old XIDs with FrozenXID before they wrap around)? VACUUM currently is a safe way to handle that, but if its normal mode of operation stops looking at every tuple then we're going to have an issue there.

regards, tom lane
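The wraparound hazard Tom raises comes from transaction IDs being 32-bit counters compared circularly: an XID more than about 2 billion transactions in the past suddenly looks like it is in the future. A simplified Python model (the comparison rule is modeled on PostgreSQL's TransactionIdPrecedes, and FROZEN_XID mimics FrozenTransactionId; the values are toy):

```python
# Simplified model of 32-bit circular XID comparison and why VACUUM
# must eventually freeze old tuples' XIDs.

XID_MOD = 2 ** 32
FROZEN_XID = 2          # sentinel: permanently "in the past"

def xid_precedes(a, b):
    """Circular comparison: does XID a precede XID b?"""
    if a == FROZEN_XID:
        return True
    diff = (a - b) % XID_MOD
    return diff >= XID_MOD // 2    # i.e. (int32)(a - b) < 0

tuple_xmin = 1000
current = (1000 + 2 ** 31 + 10) % XID_MOD   # over 2 billion xacts later

# Without freezing, the old xmin no longer appears to be in the past:
looks_committed_in_past = xid_precedes(tuple_xmin, current)
# After VACUUM replaces xmin with FROZEN_XID, it is safely old forever:
frozen_ok = xid_precedes(FROZEN_XID, current)
```

This is why a bitmap-driven VACUUM that skips clean pages cannot be the *only* mode: pages it never visits still hold unfrozen XIDs that age toward wraparound, so a full-table pass is still needed occasionally.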
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote:
> No, I don't think any of this is an argument against the
> dirty-page-bitmap idea. The amount of foreground effort needed to set a
> dirty-page bit is minimal (maybe even zero, if we can make the bgwriter
> do it, though I'm pretty suspicious of that idea because I think it
> needs to be done immediately when the page is dirtied). I don't see the
> dirty-page bitmap as changing the way that VACUUM works in any
> fundamental respect --- it will just allow the vacuum process to skip
> reading pages that certainly don't need to change.

See the email I just posted. I am questioning how big a win it is to skip heap pages if we have to sequentially scan all indexes.

> One point that does need to be considered though is what about
> anti-wraparound processing (ie, replacing old XIDs with FrozenXID
> before they wrap around)? VACUUM currently is a safe way to handle
> that, but if its normal mode of operation stops looking at every tuple
> then we're going to have an issue there.

We would need to do a sequential scan occasionally and somehow track that.
Re: [HACKERS] Automatic free space map filling
Christopher Browne wrote:
> What is unclear to me in the discussion is whether or not this is
> invalidating the item on the TODO list...
>
>     Create a bitmap of pages that need vacuuming
>
>     Instead of sequentially scanning the entire table, have the
>     background writer or some other process record pages that have
>     expired rows, then VACUUM can look at just those pages rather than
>     the entire table. In the event of a system crash, the bitmap would
>     probably be invalidated. One complexity is that index entries still
>     have to be vacuumed, and doing this without an index scan (by using
>     the heap values to find the index entry) might be slow and
>     unreliable, especially for user-defined index functions.
>
> It strikes me as a non-starter to draw vacuum work directly into the
> foreground; there is a *clear* loss in that the death of the tuple
> can't actually take place at that point, due to MVCC and the fact that
> it is likely that other transactions will be present, keeping the tuple
> from being destroyed. But it would *seem* attractive to do what is in
> the TODO, above. Alas, the user-defined index functions make cleanout
> of indexes much more troublesome :-(.

What bothers me about the TODO item is that if we have to sequentially scan indexes, are we really gaining much by not having to sequentially scan the heap? If the heap is large enough to gain from a bitmap, the index is going to be large too. Is disabling per-index cleanout for expression indexes the answer? The entire expression-index problem is outlined in this thread:

    http://archives.postgresql.org/pgsql-hackers/2006-02/msg01127.php

I don't think it is a show-stopper, because if we fail to find the index entry that matches the heap, we know we have a problem and can report it and fall back to an index scan.

Anyway, as I remember, if you have a 20-gigabyte table, a vacuum sequential scan is painful, but if we have to sequentially scan all the indexes, that is probably just as painful. If we can't make headway there, and we can't clean out indexes without a sequential index scan, I think we should just remove the TODO item and give up on improving vacuum performance.

For the bitmaps, index-only scans require a bit that says "all page tuples are visible", while vacuum wants "some tuples are expired". DELETE would clear the first bit and set the second, INSERT would clear just the first, and UPDATE is a mix of INSERT and DELETE, though perhaps on different pages.
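Bruce's two per-page bits can be sketched as a small state model. This is illustrative only; note that where the posting talks of DELETE "clearing" bits, the needs-vacuum flag here is modeled as being *set* by DELETE, which is the equivalent bookkeeping for a bit defined as "some tuples are expired":

```python
# Toy model of the two page-level flags discussed: "all tuples visible"
# (what index-only scans need) and "some tuples expired" (what vacuum
# needs). Operation effects follow the posting; names are invented.

class PageBits:
    def __init__(self):
        self.all_visible = True    # safe for index-only scans
        self.has_expired = False   # page needs a vacuum visit

    def on_insert(self):
        self.all_visible = False   # new tuple not yet visible to all

    def on_delete(self):
        self.all_visible = False
        self.has_expired = True    # a dead tuple now exists here

    def on_update(self):
        # An UPDATE is an INSERT plus a DELETE, possibly on two pages;
        # modeled here as both effects landing on one page.
        self.on_insert()
        self.on_delete()

    def on_vacuum(self):
        self.has_expired = False   # dead tuples reclaimed
```

The asymmetry is the interesting part: only vacuum can clear `has_expired`, and only a vacuum that also proves every remaining tuple visible could restore `all_visible`.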
Re: [HACKERS] Automatic free space map filling
> What bothers me about the TODO item is that if we have to sequentially
> scan indexes, are we really gaining much by not having to sequentially
> scan the heap? If the heap is large enough to gain from a bitmap, the
> index is going to be large too. Is disabling per-index cleanout for
> expression indexes the answer?

I guess you're saying that a full index scan should only be done when the index is a functional one, and index lookups used for safe indexes? That would be a huge win for most of my vacuum-problematic tables, as I don't have any functional indexes. But I guess a full index scan would still be faster if the percentage of pages changed is more than some threshold. On the other hand, it would allow very frequent vacuuming even for huge tables, so that situation should not occur. Autovacuum thresholds could be lowered drastically in that case...

> Anyway, as I remember, if you have a 20-gigabyte table, a vacuum
> sequential scan is painful, but if we have to sequentially scan all the
> indexes, that is probably just as painful. If we can't make headway
> there, and we can't clean out indexes without a sequential index scan,
> I think we should just remove the TODO item and give up on improving
> vacuum performance.

From my POV, there must be a way to speed up vacuums on huge tables with a small percentage of to-be-vacuumed tuples... a 200 million row table with frequent updates of the _same_ record is causing me some pain right now. I would like to have that table vacuumed as often as possible, but right now it only works to do it once per week, due to load problems on long-running transactions preventing vacuuming other tables.

Cheers, Csaba.
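The threshold Csaba alludes to is a simple cost crossover: per-tuple index probes win while there are few dead tuples, and a full sequential index scan wins past some fraction. A hedged sketch of that decision rule, with an entirely assumed per-probe cost (and applicable only to non-expression indexes, per the safety caveat above):

```python
# Illustrative decision rule: clean an index via per-tuple lookups or
# via one full sequential scan? The per-probe page cost is an invented
# assumption (roughly a btree descent), not a measured figure.

def index_clean_strategy(dead_tuples, index_pages, lookup_cost_pages=3):
    """Return 'lookup' if per-tuple index probes are estimated cheaper
    than reading the whole index sequentially, else 'full-scan'."""
    probe_io = dead_tuples * lookup_cost_pages   # one descent per TID
    scan_io = index_pages                        # read every index page
    return "lookup" if probe_io < scan_io else "full-scan"
```

With this toy costing, a 10,000-page index is worth probing tuple-by-tuple only while fewer than about 3,333 dead tuples have accumulated; frequent small vacuums keep the workload on the cheap side of the crossover, which is Csaba's point about lowering autovacuum thresholds.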
Re: [HACKERS] Automatic free space map filling
Csaba Nagy wrote: What bothers me about the TODO item is that if we have to sequentially scan indexes, are we really gaining much by not having to sequentially scan the heap? If the heap is large enough to gain from a bitmap, the index is going to be large too. Is disabling per-index cleanout for expression indexes the answer? I guess you're saying that full index scan should only be done when the index is a functional one, and use index lookup for safe indexes ? That would be a huge win for most of my vacuum-problematic tables, as I don't have any functional indexes. But I guess full index scan would still be faster if the percentage of pages changed is more than some threshold. On the other hand it would allow very frequent vacuuming even for huge tables so that situation should not occur. Autovacuum thresholds could be lowered drastically in that case... Right. Another idea would be to remove the heap space held by expired rows, but to keep the tid slot in place because it is pointed to by an index. The index entry could be recycled by a later vacuum index scan, or if an index lookup finds such an entry. Because of multiple indexes, I don't think the tid slot can be removed except by sequential index scans of all indexes. There is also the concern that updating the single-page bitmap will cause contention by multiple sessions modifing a table. I am thinking as long as we have to sequential-scan every index, we aren't going to improve vacuum performance dramatically. If the bitmap adds contention, and it is only a marginal improvement, it might not be a win. The bitmap can be a win, but I think we have to think more boldly to ensure it is a win. -- Bruce Momjian http://candle.pha.pa.us SRA OSS, Inc. http://www.sraoss.com + If your life is a hard drive, Christ can be your backup. 
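Bruce's "keep the tid slot" idea can be sketched as a toy page model. This is not PostgreSQL's page code; the class and state names are invented to illustrate why the slot must outlive the tuple it once pointed to.

```python
# Sketch of the idea: reclaim the space of an expired row immediately,
# but leave its line pointer (TID slot) behind as a stub, because index
# entries still reference (page, slot).  Toy model, invented names.

class Page:
    def __init__(self):
        self.slots = []    # line pointers: "live", "stub", or "unused"
        self.tuples = {}   # slot number -> tuple payload

    def insert(self, payload):
        self.slots.append("live")
        slot = len(self.slots) - 1
        self.tuples[slot] = payload
        return slot

    def expire(self, slot):
        """Free the heap space but keep the slot: index TIDs stay valid."""
        del self.tuples[slot]       # space is reusable right away ...
        self.slots[slot] = "stub"   # ... but the TID cannot be reassigned

    def recycle(self, slot):
        """Only after every index entry for this TID is gone (which, as
        Bruce notes, needs a scan of all indexes) may the slot be reused."""
        self.slots[slot] = "unused"

page = Page()
s = page.insert("row v1")
page.expire(s)
assert s not in page.tuples       # heap bytes reclaimed immediately
assert page.slots[s] == "stub"    # TID still reserved for stale index entries
```

The cost Tom raises downthread falls out of this model directly: stubs accumulate, so the line-pointer array only ever grows.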
Re: [HACKERS] Automatic free space map filling
> I think you must keep the header because the tuple might be part of an
> update chain (cf vacuuming bugs we repaired just a few months ago).
> t_ctid is potentially interesting data even in a certainly-dead tuple.

Yes, I'd still want to keep the full header.

> Andreas' idea is possibly doable but I am not sure that I see the
> point. It does not reduce the need for vacuum nor the I/O load imposed
> by vacuum. What it does do is bias the system in the direction of
> allocating an unreasonably large number of tuple line pointers on a
> page (ie, more than are useful when the page is fully packed with
> normal tuples). Since we never reclaim such pointers, over time all the
> pages in a table would tend to develop line-pointer bloat. I don't know
> what the net overhead would be, but it'd definitely impose some
> aggregate inefficiency.

Ok, for vacuum the slot would look like any other dead row and thus be a target for removal. Why do we not truncate the line pointer array? Is it that vacuum (not the full version) does not move rows to other pages or slots? Of course vacuum full could do it, but I see your point. Maybe we could impose an upper limit on the number of slots to allow, after which the optimization is turned off. But this starts to sound not so good :-(

Andreas
Re: [HACKERS] Automatic free space map filling
Zeugswetter Andreas DCP SD [EMAIL PROTECTED] writes:
> Why do we not truncate the line pointer array? Is it that vacuum (not
> the full version) does not move rows to other pages or slots? Of course
> vacuum full could do it, but I see your point.

We can't reassign tuple TIDs safely except in vacuum full. It's possible that a plain vacuum could safely truncate off unused line pointers at the end of the array, but in the absence of a forcing function to make those pointers become unused, I'm not sure it'd help much.

regards, tom lane
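Tom's point can be shown in miniature. Since plain VACUUM cannot renumber TIDs, the only safe shrink is chopping unused pointers off the *end* of the line-pointer array; interior holes must stay. A toy sketch (invented function, not the real page code):

```python
# Plain VACUUM cannot renumber TIDs, so only trailing "unused" line
# pointers can be dropped: removing an interior one would shift every
# later slot number, i.e. change TIDs that indexes may still hold.

def truncate_line_pointers(slots):
    """Drop trailing 'unused' entries; interior ones must stay put."""
    while slots and slots[-1] == "unused":
        slots.pop()
    return slots

# Interior holes survive; only the tail shrinks.
assert truncate_line_pointers(
    ["live", "unused", "live", "unused", "unused"]
) == ["live", "unused", "live"]
```

This also makes Tom's caveat concrete: without something that actively marks trailing pointers unused, the loop above rarely finds anything to pop.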
Re: [HACKERS] Automatic free space map filling
Csaba Nagy wrote:
> From my POV, there must be a way to speed up vacuums on huge tables
> with a small percentage of to-be-vacuumed tuples... a 200-million-row
> table with frequent updates of the _same_ record is causing me some
> pain right now. I would like to have that table vacuumed as often as
> possible, but right now it only works to do it once per week, due to
> load problems on long-running transactions preventing vacuuming other
> tables.

Are you running 8.1? If so, you can use autovacuum and set per-table thresholds (read: vacuum aggressively) and per-table cost delay settings so that the performance impact is minimal. If you have tried 8.1 autovacuum and found it unhelpful, I would be curious to find out why.

Matt
Re: [HACKERS] Automatic free space map filling
On Monday, 27 February 2006 19:42, Tom Lane wrote:
> The free-space map is not the hard part of the problem. You still have
> to VACUUM --- that is, wait until the dead tuple is not only committed
> dead but is certainly dead to all onlooker transactions, and then
> remove its index entries as well as the tuple itself. The first part of
> this makes it impossible for a transaction to be responsible for
> vacuuming its own detritus.

I'm not sure if I made myself clear. The idea is that you fill the free-space map early with opportunistic entries, in the hope that most updates and deletes go through soon. That is, these entries will be invalid for a short time, but hopefully by the time another write looks at them, the entries will have become valid. That way you don't actually have to run vacuum on these deleted rows.

-- Peter Eisentraut http://developer.postgresql.org/~petere/
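Peter's proposal can be sketched as a map whose entries are tagged "maybe" and verified at use time. Everything here is invented for illustration (including the visibility callback); it is a model of the idea, not of PostgreSQL's FSM:

```python
# Toy "optimistic FSM": the deleting transaction registers the page
# immediately, flagged as unverified; a later writer checks the entry
# before trusting it.  Names and the visibility check are invented.

class OptimisticFSM:
    def __init__(self):
        self.entries = []            # (page, free_bytes, maybe_flag)

    def note_delete(self, page, free_bytes):
        """Record free space at delete time, before it is safely dead."""
        self.entries.append((page, free_bytes, True))

    def get_page(self, needed, space_is_really_free):
        for i, (page, free, maybe) in enumerate(self.entries):
            if free < needed:
                continue
            # A "maybe" entry must be verified: is the old tuple actually
            # dead to all transactions yet?
            if maybe and not space_is_really_free(page):
                continue             # not reusable yet; keep entry for later
            del self.entries[i]
            return page
        return None                  # caller extends the relation instead

fsm = OptimisticFSM()
fsm.note_delete(page=7, free_bytes=400)
# While the deleting xact is still visible to someone, the entry is skipped:
assert fsm.get_page(100, lambda p: False) is None
# Once the row is dead to everyone, the same entry becomes usable:
assert fsm.get_page(100, lambda p: True) == 7
```

Tom's objection downthread lands on the verification step: deciding `space_is_really_free`, and actually removing the tuple and its index entries, is exactly the work vacuum does today.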
Re: [HACKERS] Automatic free space map filling
Peter Eisentraut [EMAIL PROTECTED] writes:
> I'm not sure if I made myself clear. The idea is that you fill the
> free-space map early with opportunistic entries in the hope that most
> updates and deletes go through soon. That is, these entries will be
> invalid for a short time but hopefully by the time another write looks
> at them, the entries will have become valid. That way you don't
> actually have to run vacuum on these deleted rows.

How does an optimistic FSM entry avoid the need to run vacuum? All that will happen is that some backend will visit the page and not find usable free space.

regards, tom lane
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote:
> How does an optimistic FSM entry avoid the need to run vacuum? All
> that will happen is that some backend will visit the page and not find
> usable free space.

Because the index isn't removed, right? That index thing is what usually kills us.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  SRA OSS, Inc.   http://www.sraoss.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote:
> How does an optimistic FSM entry avoid the need to run vacuum?

It ensures that all freed tuples are already in the FSM.

-- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] Automatic free space map filling
Peter Eisentraut [EMAIL PROTECTED] writes:
> Tom Lane wrote:
> > How does an optimistic FSM entry avoid the need to run vacuum?
>
> It ensures that all freed tuples are already in the FSM.

That has nothing to do with it, because the space isn't actually free for re-use until vacuum deletes the tuple.

regards, tom lane
Re: [HACKERS] Automatic free space map filling
Tom Lane wrote:
> That has nothing to do with it, because the space isn't actually free
> for re-use until vacuum deletes the tuple.

I think the idea is a different free space map of sorts, whereby a transaction that obsoletes a tuple puts its block number in that map. A transaction that inserts a new tuple goes to the FSM. If nothing is found, it then goes to the new map. A block returned from that map is then scanned, and any tuple that's no longer visible to anyone is reused.

The problem with this idea is scanning the block and, for each tuple, determining whether it's alive. Essentially, we would be folding the "find dead tuples" and "compress page" logic, which is currently in vacuum, back into insert. IMHO this is unacceptable from a performance PoV.

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Automatic free space map filling
Alvaro Herrera [EMAIL PROTECTED] writes:
> I think the idea is a different free space map of sorts, whereby a
> transaction that obsoletes a tuple puts its block number in that map.
> A block returned from that map is then scanned and any tuple that's no
> longer visible to anyone is reused.

I thought we had sufficiently destroyed that "reuse a tuple" meme yesterday. You can't do that: there are too many aspects of the system design that are predicated on the assumption that dead tuples do not come back to life. You have to do the full vacuuming bit (index entry removal, super-exclusive page locking, etc) before you can remove a dead tuple.

> Essentially, we would be folding the "find dead tuples" and "compress
> page" logic, which is currently in vacuum, back into insert. IMHO this
> is unacceptable from a performance PoV.

That's the other problem: it's not apparent why pushing work from vacuum back into foreground processing is a good idea. Especially not why retail vacuuming of individual tuples will be better than wholesale.

regards, tom lane
Re: [HACKERS] Automatic free space map filling
On Thu, Mar 02, 2006 at 01:01:21AM -0500, Tom Lane wrote:
> That's the other problem: it's not apparent why pushing work from
> vacuum back into foreground processing is a good idea. Especially not
> why retail vacuuming of individual tuples will be better than
> wholesale.

The problem is that even with vacuum_cost_delay, vacuum is still very slow and problematic in situations such as large tables in a heavy transaction environment. Anything that could help reduce the need for 'traditional' vacuuming could well be a win.

Even so, I think the most productive path to pursue at this time is a dead-space-map / known-clean-map. Either one is almost guaranteed to provide benefits. Once we know what good they do, we can move forward from there with further improvements.

-- 
Jim C. Nasby, Sr. Engineering Consultant    [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com
work: 512-231-6117    cell: 512-569-9461
vcard: http://jim.nasby.net/pervasive.vcf
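The dead-space-map Jim mentions can be sketched in a few lines: track which heap pages might contain dead tuples, so vacuum visits only those instead of scanning the whole heap. This is a toy model with invented names, not a design:

```python
# Toy dead-space map: writers mark pages that may hold dead tuples;
# VACUUM then visits only the marked pages rather than the entire heap.

class DeadSpaceMap:
    def __init__(self, n_pages):
        self.n_pages = n_pages       # heap size, for context only
        self.dirty = set()           # pages that may hold dead tuples

    def on_update_or_delete(self, page):
        self.dirty.add(page)

    def pages_for_vacuum(self):
        """VACUUM visits only the marked pages, then clears the map."""
        pages, self.dirty = sorted(self.dirty), set()
        return pages

dsm = DeadSpaceMap(n_pages=1_000_000)
dsm.on_update_or_delete(42)
dsm.on_update_or_delete(42)          # a hot row updated repeatedly
dsm.on_update_or_delete(99_999)
# A million-page heap needs only two pages visited:
assert dsm.pages_for_vacuum() == [42, 99_999]
assert dsm.pages_for_vacuum() == []  # map cleared after the pass
```

This addresses the heap side of Csaba's "huge table, few dead tuples" case; the index side still needs the separate treatment discussed upthread.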
Re: [HACKERS] Automatic free space map filling
On Monday, 2006-02-27 at 19:20, Peter Eisentraut wrote:
> Something came to my mind today, I'm not sure if it's feasible but I
> would like to know opinions on it. We've seen database applications
> that PostgreSQL simply could not manage because one would have to
> vacuum continuously.

What's wrong with vacuuming continuously? I am running an application that in fact does vacuum continuously, without any ill effects.

A case where things become complicated is when you have one huge table (say 50,000,000 rows) that is updated at a moderate rate and needs an occasional vacuum, plus a fast-update table which needs continuous vacuum. Due to the current implementation of vacuum, you have to abandon continuous vacuuming during the vacuum of the big table. I have written and submitted to the patches list a patch which allows vacuums not to block each other out; this is stalled due to Tom's uneasiness about its possible hidden effects, but it should be available from the patches list to anyone in distress :p

> Perhaps in those situations one could arrange it that an update (or
> delete) of a row registers the space in the free space map right away,
> on the assumption that by the time it is up for reuse, the transaction
> will likely have committed. Naturally, this would need to be secured in
> some way, for example a "maybe" bit in the FSM itself or simply
> checking that the supposed free space is really free before using it,
> perhaps combined with a timeout (don't consider until 5 seconds from
> now).

Unfortunately transactions have no knowledge about wallclock time :(

Hannu
Re: [HACKERS] Automatic free space map filling
Hannu Krosing wrote:
> Due to current implementation of vacuum, you have to abandon continuous
> vacuuming during vacuum of bigtable, but i have written and submitted
> to patches list a patch which allows vacuums not to block each other
> out, this is stalled due to Tom's uneasiness about its possible hidden
> effects, but it should be available from patches list to anyone in
> distress :p

Do you use it in production? Have you noticed any ill effects?

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
[HACKERS] Automatic free space map filling
Something came to my mind today; I'm not sure if it's feasible, but I would like to know opinions on it. We've seen database applications that PostgreSQL simply could not manage because one would have to vacuum continuously.

Perhaps in those situations one could arrange it that an update (or delete) of a row registers the space in the free space map right away, on the assumption that by the time it is up for reuse, the transaction will likely have committed. Naturally, this would need to be secured in some way, for example a "maybe" bit in the FSM itself, or simply checking that the supposed free space is really free before using it, perhaps combined with a timeout (don't consider until 5 seconds from now).

I think with applications that have a more or less constant data volume but update that data a lot, this could assure constant disk space usage (even if it's only a constant factor above the ideal usage) without any vacuuming.

Comments?

-- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] Automatic free space map filling
Peter Eisentraut [EMAIL PROTECTED] writes:
> We've seen database applications that PostgreSQL simply could not
> manage because one would have to vacuum continuously. Perhaps in those
> situations one could arrange it that an update (or delete) of a row
> registers the space in the free space map right away, on the assumption
> that by the time it is up for reuse, the transaction will likely have
> committed.

The free-space map is not the hard part of the problem. You still have to VACUUM --- that is, wait until the dead tuple is not only committed dead but is certainly dead to all onlooker transactions, and then remove its index entries as well as the tuple itself. The first part of this makes it impossible for a transaction to be responsible for vacuuming its own detritus.

> Naturally, this would need to be secured in some way,

The FSM is only a hint anyway --- if it points someone to a page that in reality does not have adequate free space, nothing bad happens except for the wasted cycles to visit the page and find that out. See the loop in RelationGetBufferForTuple().

regards, tom lane
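The "FSM is only a hint" loop Tom refers to can be sketched like this. The real logic lives in RelationGetBufferForTuple(); the function and parameter names here are invented for illustration:

```python
# Toy version of the hint-checking loop: try pages the FSM suggests,
# verify the free space on the page itself, and extend the relation if
# every hint turns out stale.  A stale hint costs only a wasted visit.

def place_tuple(fsm_hints, actual_free, needed, n_pages):
    """fsm_hints: candidate page numbers from the map;
    actual_free: page -> real free bytes found on the page itself."""
    for page in fsm_hints:
        if actual_free.get(page, 0) >= needed:
            return page              # the hint was good
        # Stale hint: nothing bad happens, we just keep looking.
    return n_pages                   # no usable hint: extend the relation

actual = {3: 50, 8: 500}             # page 3's advertised space is stale
assert place_tuple([3, 8], actual, needed=200, n_pages=100) == 8
assert place_tuple([3], actual, needed=200, n_pages=100) == 100
```

This is why Peter's optimistic entries are safe in principle: a wrong entry degrades into wasted cycles, not corruption. Tom's objection is that it still does not make the space reusable any sooner.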