On 08/07/2012 07:07 AM, Tobias Stroh wrote:
> Hi,
>
> a few observations of my own, not to be taken seriously:
>
> 1) Block level deduplication
>
> There are already a lot of filesystem/filesystem layers in fuse (such as
> ZFS, lessfs, ...) which do this.

True.

> This is often more efficient than
> rolling an own solution and is well abstracted.

Dunno, I've not seen particularly impressive numbers from existing
solutions. Sure, ZFS does it, but you need a fair bit of RAM relative to
the amount of storage. It also results in much more random I/O.

In general I'm not sold on somewhat better storage efficiency at the
cost of more random I/O. After all, another TB of disk is cheap. Adding
significantly more random I/O is not.

Additionally, making BackupPC depend on fuse + zfs/lessfs/whatever seems
worrisome. Sure, having a "do not do dedupe" flag makes sense. Pushing
dedupe off onto the filesystem for all users sounds like a nightmare of
support, complaining users, and strange issues. After all, dedupe is
(IMO) a leading BackupPC feature; I likely would have used something
else if I had to do dedupe myself.

> In my opinion it does not make sense to do block level deduplication in
> the application layer, except if you do it on the client side to save
> bandwidth.

I like file level dedupe: fast, easy, simple, and a huge storage win.

Imagine a disaster recovery situation. With file level dedupe you would
just need the checksum of the blob you're interested in, which could
potentially come from a sqlite query against an ISO or similar.
Something like

    cp /pool/<sha512> ~/myImportantData

Or, with block level dedupe, you would have a much more complex time
figuring out the checksum of each block, reading the blocks, and
concatenating them together. That seems like a fair bit of unnecessary
complexity that in most cases is not going to be a huge storage
efficiency win and is going to be slower.
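
To make the file level recovery path above concrete, here is a minimal Python sketch. It is not BackupPC's actual pool layout or code: the `files` table, its `path` and `sha512` columns, and the `store`, `restore`, and `demo` helpers are all hypothetical names made up for illustration. It assumes a flat pool directory where each blob is named after its SHA-512 digest, plus a sqlite index mapping original paths to digests.

```python
import hashlib
import shutil
import sqlite3
import tempfile
from pathlib import Path


def sha512_of(path):
    """Return the hex SHA-512 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store(pool, index, src):
    """Add src to the pool; identical content is stored only once."""
    checksum = sha512_of(src)
    blob = pool / checksum
    if not blob.exists():  # file level dedupe: skip content already pooled
        shutil.copyfile(src, blob)
    # Hypothetical schema: one row per backed-up path.
    index.execute(
        "INSERT OR REPLACE INTO files (path, sha512) VALUES (?, ?)",
        (str(src), checksum),
    )
    index.commit()
    return checksum


def restore(pool, index, original, dest):
    """Look up the checksum for a path and copy that blob out of the pool."""
    row = index.execute(
        "SELECT sha512 FROM files WHERE path = ?", (original,)
    ).fetchone()
    if row is None:
        raise KeyError(original)
    # Equivalent of: cp /pool/<sha512> dest
    shutil.copyfile(pool / row[0], dest)


def demo():
    """Back up two identical files, restore one; return (blob count, bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pool = root / "pool"
        pool.mkdir()
        index = sqlite3.connect(str(root / "index.sqlite"))
        index.execute("CREATE TABLE files (path TEXT PRIMARY KEY, sha512 TEXT)")
        a, b = root / "a.txt", root / "b.txt"
        a.write_bytes(b"important data\n")
        b.write_bytes(b"important data\n")
        store(pool, index, a)
        store(pool, index, b)
        restored = root / "restored.txt"
        restore(pool, index, str(b), restored)
        blobs = len(list(pool.iterdir()))
        data = restored.read_bytes()
        index.close()
        return blobs, data
```

Calling `demo()` backs up two files with the same content and restores one of them. Restore is a single index lookup followed by a single file copy; a block level scheme would instead need an ordered list of block checksums per file and a reassembly step.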
_______________________________________________
BackupPC-devel mailing list
[email protected]
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-devel
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
