On 08/07/2012 07:07 AM, Tobias Stroh wrote:
> Hi,
> 
> a few observations of my own, not to be taken seriously:
> 
> 1) Block level deduplication
> 
> There are already a lot of filesystem/filesystem layers in fuse (such as
> ZFS, lessfs, ...) which do this.

True.

> This is often more efficient than
> rolling your own solution and is well abstracted.

Dunno, I've not seen particularly impressive numbers from existing
solutions.  Sure, ZFS does it... but you need a fair bit of RAM relative
to the amount of storage.  It also results in much more random I/O.  In
general I'm not sold on somewhat better storage efficiency at the cost
of more random I/O.  After all, another TB of disk is cheap.  Adding
significantly more random I/O is not.

Additionally, making BackupPC depend on fuse + zfs/lessfs/whatever seems
worrisome.  Sure, having a "do not do dedupe" flag makes sense.  But
pushing dedupe off onto the filesystem for all users sounds like a
nightmare of support/complaining users/strange issues.  After all,
dedupe is (IMO) a leading BackupPC feature; I likely would have used
something else if I had to do dedupe myself.

> In my opinion it does not make sense to do block level deduplication in
> the application layer, except if you do it on the client side to save
> bandwidth.

I like file-level dedupe: fast, easy, huge storage win, simple.
Imagine a disaster recovery situation.  With file-level dedupe you
would just need the checksum of the blob you're interested in, which
could potentially come from a sqlite query against an ISO or similar.
Something like:
cp /pool/<sha512> ~/myImportantData
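
To make that concrete, here is a minimal sketch of a file-level pool,
assuming a flat directory where each file is stored under its SHA-512
hex digest.  The store() and restore() helpers and the flat layout are
made up for illustration; this is not BackupPC's actual pool format.

```python
import hashlib
import shutil
from pathlib import Path


def store(pool: Path, src: Path) -> str:
    """Copy src into the pool under its SHA-512 hex digest; return the digest."""
    h = hashlib.sha512()
    with src.open("rb") as f:
        # Hash in 1 MiB chunks so large files are not read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()
    target = pool / digest
    if not target.exists():  # identical content is only stored once
        shutil.copyfile(src, target)
    return digest


def restore(pool: Path, digest: str, dest: Path) -> None:
    """Same effect as `cp /pool/<sha512> dest`: one lookup, one copy."""
    shutil.copyfile(pool / digest, dest)
```

Restoring is a single copy keyed by the digest, which is the whole
appeal in a disaster recovery scenario.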

Or, with block-level dedupe, you'd have a much more complex time:
figuring out the checksum of each block, reading the blocks, and
concatenating them together.
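
A rough sketch of what that block-level path involves, assuming
fixed-size blocks keyed by SHA-512 and a per-file manifest listing the
block digests in order.  BLOCK_SIZE, store_blocks() and restore_blocks()
are hypothetical names, not taken from ZFS, lessfs or BackupPC.

```python
import hashlib
from pathlib import Path

BLOCK_SIZE = 4096  # assumed block size, purely illustrative


def store_blocks(pool: Path, data: bytes) -> list[str]:
    """Split data into fixed-size blocks, store each unique block once,
    and return the ordered list of block digests (the manifest)."""
    manifest = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha512(block).hexdigest()
        path = pool / digest
        if not path.exists():  # duplicate blocks are stored only once
            path.write_bytes(block)
        manifest.append(digest)
    return manifest


def restore_blocks(pool: Path, manifest: list[str]) -> bytes:
    """Look up every block in manifest order and concatenate them."""
    return b"".join((pool / digest).read_bytes() for digest in manifest)
```

Note the restore now needs the manifest as well as the pool, and does
one read per block instead of one sequential copy.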

Seems like a fair bit of unnecessary complexity that in most cases is
not going to be a huge storage efficiency win and is going to be slower.


_______________________________________________
BackupPC-devel mailing list
[email protected]
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-devel
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
