Hi folks,

as I'm using venti as the storage backend for a media archive, where
content can be deleted (and probably will be, often enough),
I'm currently thinking about how garbage collection could be
achieved.

Let's assume the following premises:

* only a few well-known apps are writing to venti (e.g. only
  vac and vtstore).
* we know all the root scores and can iterate through the
  metadata from time to time.
* venti's storage is divided into several logs of moderate size
  (e.g. 2GB).

Now we introduce a "deprecated" mode for a volume: no more
writes go to that volume, and requested blocks are automatically
moved to another volume (and cleared from the deprecated one).
From time to time a compaction process might run which
removes the holes in the volume.

Well, that's not yet any form of GC - just a smooth migration of
data from one volume to another - which is also useful if you intend
to take some disk offline in the near future, w/o serious interruption.
(The deprecated volume gets emptier and emptier, and no new
data is added.)
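
To make the "deprecated" mode concrete, here is a minimal in-memory
sketch. It is not venti code: the Volume and Store classes, the
dict-based block storage and the string scores are all assumptions
made for illustration (real scores are hashes of the block data).

```python
class Volume:
    """Toy stand-in for one venti log/volume (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}         # score -> block data
        self.deprecated = False  # True: no more writes, drain on read


class Store:
    """Toy store: one active volume plus any number of older ones."""

    def __init__(self, active, others=()):
        self.active = active
        self.volumes = [active, *others]

    def write(self, score, data):
        # New data only ever lands in the active volume.
        self.active.blocks.setdefault(score, data)

    def read(self, score):
        for vol in self.volumes:
            if score in vol.blocks:
                data = vol.blocks[score]
                if vol.deprecated:
                    # Requested block: move it to the active volume
                    # and clear it from the deprecated one.
                    self.active.blocks[score] = data
                    del vol.blocks[score]
                return data
        raise KeyError(score)
```

Every read that hits the deprecated volume shrinks it by one block,
so normal traffic alone drains the volume over time.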

GC is the next step:

Assuming each block worth keeping is accessed at least once within
some given time, we know that whatever data remains on the volume
after that time is trash. So all we have to do is iterate
through all archives and access all their blocks (*1). Once this
is completely done, the deprecated volume contains only trash and
can be safely deleted.
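
The sweep described above could be sketched roughly like this. It is
only an illustration under assumptions: volumes are plain dicts
(score -> data), and the "children" map (score -> child scores) is a
hypothetical stand-in for parsing the pointer blocks, which in
practice depends on knowing the formats the writing apps use.

```python
def sweep(roots, children, deprecated, active):
    """Walk every block reachable from the known root scores.

    Live blocks found in the deprecated volume are moved to the
    active one. Returns the scores left behind, i.e. the trash.
    """
    seen = set()
    stack = list(roots)
    while stack:
        score = stack.pop()
        if score in seen:
            continue  # shared blocks are visited only once
        seen.add(score)
        if score in deprecated:
            active[score] = deprecated.pop(score)
        stack.extend(children.get(score, ()))
    return set(deprecated)
```

Once sweep() has run over all root scores, anything still in the
deprecated dict was not reachable from any archive.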


What do you think about that approach?

cu

*1) we could introduce a new "touch" RPC call, which simply tells
venti that some list of blocks is still required, but does not
send back their data.
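
A rough sketch of what the server side of such a "touch" call might
do. No such call exists in the venti protocol as far as this mail is
concerned; the function name, the dict-based volumes and the idea of
returning the list of unknown scores are all my assumptions.

```python
def touch(scores, deprecated, active):
    """Mark a batch of scores as still required.

    Blocks found in the deprecated volume are moved to the active
    one. No block data is returned, only the scores that were not
    found in either volume.
    """
    missing = []
    for score in scores:
        if score in deprecated:
            active[score] = deprecated.pop(score)
        elif score not in active:
            missing.append(score)
    return missing
```

The reply carries only scores, so a sweep over a large archive does
not have to ship every block back to the client.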

-- 
----------------------------------------------------------------------
 Enrico Weigelt, metux IT service -- http://www.metux.de/

 cellphone: +49 174 7066481   email: [EMAIL PROTECTED]   skype: nekrad666
----------------------------------------------------------------------
 Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
----------------------------------------------------------------------
