Hello,

The following is probably an unusual problem for most BackupPC users;
nevertheless, I think there is a conceptual weakness in how BackupPC
manages the pool (or I'm missing something).

Here is some information from the status page:

* Pool is 8156.05GB comprising 10065918 files and 4369 directories
* Pool hashing gives 22309 repeated files with longest chain 3295,
* Nightly cleanup removed 206296 files of size 38.21GB


We may use BackupPC at a larger scale than most users, but that is
not the problem; it works just fine. The problem is the "longest
chain 3295" part. BackupPC_link takes forever to compare these files
against each other. That is no surprise: every new file hashing to
this digest has to be compared byte for byte against each existing
member of the chain, so the running time is O(n^2); for n = 3295
that is on the order of 3295*3294/2, about 5.4 million full-file
comparisons.
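
For illustration, here is a rough sketch of why the linking step
degenerates (this is not BackupPC's actual code; the _0/_1 chain
suffixes are from memory and pool compression is ignored):

    use strict;
    use warnings;
    use File::Compare qw(compare);

    # Walk the chain <digest>, <digest>_0, <digest>_1, ... doing a
    # full byte-for-byte compare at each step.  With n chain members,
    # every new file costs up to n compares, so filling the chain is
    # O(n^2) overall.
    sub find_in_chain {
        my ($poolDir, $digest, $newFile) = @_;
        for ( my $i = -1; ; $i++ ) {
            my $cand = $i < 0 ? "$poolDir/$digest"
                              : "$poolDir/${digest}_$i";
            return undef if ( !-f $cand );              # end of chain
            return $cand if ( compare($cand, $newFile) == 0 );
        }
    }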

The culprit is the speedup done by Buffer2MD5, which uses only parts
of the first megabyte of a file (plus its size) to calculate the
pool file name. The responsible files all have the same size and are
largely identical within the first megabyte (at least in the parts
that matter).
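
To make the problem concrete, here is a sketch of the partial
hashing idea as I understand it from Lib.pm (the exact chunk sizes
and offsets are assumptions on my part and may differ by version):

    use strict;
    use warnings;
    use Digest::MD5;

    # Pool-name sketch: for files over 256KB, only the size, the
    # first 128KB, and the 128KB ending at min(size, 1MB) feed the
    # MD5, so large files that agree in those regions and in size
    # all land in one pool chain.
    sub partial_md5 {
        my ($path) = @_;
        my $size = -s $path;
        open(my $fh, '<', $path) or die "open $path: $!";
        binmode($fh);
        my $md5 = Digest::MD5->new;
        if ( $size > 262144 ) {
            my ($head, $tail);
            read($fh, $head, 131072);       # first 128KB
            my $seek = ($size > 1048576 ? 1048576 : $size) - 131072;
            seek($fh, $seek, 0);
            read($fh, $tail, 131072);       # 128KB up to the 1MB mark
            $md5->add($size, $head, $tail);
        } else {
            $md5->addfile($fh);             # small files: hash it all
        }
        close($fh);
        return $md5->hexdigest;
    }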

Is there a way to circumvent this? Can I change Buffer2MD5 and take
the speedup out, even if it means that I have to start with a fresh
pool?
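
For what it is worth, the change I have in mind would look roughly
like the sketch below (untested, just the idea): hash the entire
file regardless of size. Existing pool files would keep their old,
sampled names, which is why a fresh pool would be needed.

    use strict;
    use warnings;
    use Digest::MD5;

    # Hash the whole file instead of sampling it, so the pool name
    # is unique per content and chains only occur on true MD5
    # collisions.
    sub full_md5 {
        my ($path) = @_;
        open(my $fh, '<', $path) or die "open $path: $!";
        binmode($fh);
        my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
        close($fh);
        return $digest;
    }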

Regards,
Stefan Bender (Max Planck Institute for Computer Science)

