On 4/17/12 12:19 AM, "Martin v. Löwis" wrote:
On 17.04.2012 00:09, Tarek Ziadé wrote:
On 4/16/12 11:57 PM, "Martin v. Löwis" wrote:
Maybe a better checksum would be a global hash, calculated differently?
Define a protocol, and I will present you with an implementation that
conforms to the protocol and still has inconsistent data: not in a
malicious manner, but due to bugs, race conditions, or unexpected
events. It's pointless.
If you calculate a checksum over all the mirrored files, you can
guarantee that the bits are the same on both sides, no?
How exactly would you calculate that checksum?
By calculating a grand hash of each file's hash.
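Roughly something like this (just a sketch to illustrate, with md5sum
from coreutils standing in for whatever digest we pick, and
mirror/packages as a hypothetical layout):

$ md5sum mirror/packages/*            # one "digest  path" line per file
$ md5sum mirror/packages/* | md5sum   # the grand hash: a digest of those lines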
Would you really require
concatenation of all files?
I did not say that. You are claiming it in a rhetorical question.
That could take a few hours per change.
Why would it? You don't recalculate the checksum of a file you already
have.
Even if you do, it's very fast to call md5. Try it:

$ find mirror -type f | xargs md5

This takes a few seconds at most on the whole mirror.
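If you want to check that claim on your own copy (assuming BSD md5 as
above; on Linux substitute md5sum), time the whole pipeline:

$ time find mirror -type f | xargs md5 > /dev/null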
It would also raise the question of the order in which the files ought
to be concatenated.
Anything reproducible, e.g. a sorted list. In bash I *suspect* the
calculation of the grand hash of the mirror is a one-liner that takes
less than a minute.
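Something along these lines, for example (only a sketch, assuming GNU
md5sum; use md5 on BSD/macOS, and LC_ALL=C so the sort order is
identical on both machines):

$ cd mirror && find . -type f -exec md5sum {} + | LC_ALL=C sort | md5sum

Run it from the mirror root on both sides: the relative paths and the
sorted "digest  path" lines then match line for line, and if the final
digest matches too, the bits are the same.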
I am going to stop here anyway, because I don't see the point of
discussing implementation details at this stage: we were barely
starting to talk about the idea of a checksum, and that seems to be
going nowhere.
Cheers
Tarek
_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig