Re: [BackupPC-devel] [BackupPC-users] improving the deduplication ratio

Tino Schwarze Wed, 16 Apr 2008 05:30:22 -0700

On Wed, Apr 16, 2008 at 02:20:36PM +0200, Ludovic Drolez wrote:

> > > Introducing file chunking would introduce a new abstraction layer - a
> > > file would need to be split into chunks and recreated for restore. You
> > 
> > Tino -- thanks for posting this. These issues are exactly what I had  
> > in mind when I posted about adding sub-file deduplication. There's a  
> > lot more work to do and definitely a bunch more housekeeping. Right  
> > now, BackupPC gets off "easy" by utilizing hardlinks to do the  
> > dedupe. Once we delve below the file, a brand new data structure/ 
> > mechanism needs to be designed and built to efficiently link all of  
> > these blocks together.
> 
> And what about a mix of the two?
> - keep hard links for files less than the chunk size (filenames begin
> with an 'f' as before)
> - for files bigger than the chunk size, create a regular file which
> contains references to the chunks in the cpool (the files could begin
> with an 'r' for example).


The main problem is that you can no longer tell whether a file (chunk)
in the pool is still needed. With hardlinks, the link count tells you;
a reference stored inside a regular file does not change it. You'd need
to look into every backup and check whether it still references that
chunk.

Tino.

-- 
„What we resist, persists.” (Zen saying)

www.craniosacralzentrum.de
www.forteego.de

