* Would it be a problem to use CRC32 instead of SHA? (Security is not a
concern here, and CRC32 is faster.)
What happens if you get a collision?

That is, you have two different long identifiers:

a.b.c.d...something
a.b.c.d...anotherthing

which by bad luck both hash to the same value:

a.b.c.d.$AABB99
a.b.c.d.$AABB99

(or whatever).
Yes, that was the question. How do I avoid that? (Of course, using a full SHA-256 hash value would make a collision practically impossible.)
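
One way to deal with collisions, whichever hash is used, is to remember every shortened name that has been handed out and to fail loudly when two different identifiers map to the same one. The following is only a minimal sketch of that idea. The names `Shortener`, `crc32_tag`, `sha256_tag` and `max_len`, and the `$` separator before the hash, are assumptions made for illustration and are not taken from the code under discussion.

```python
import hashlib
import zlib


def crc32_tag(text):
    """Return the CRC32 of text as 8 uppercase hex digits."""
    # The mask is a no-op on Python 3 and keeps the value unsigned on Python 2.
    return format(zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF, "08X")


def sha256_tag(text, digits=8):
    """Return the first `digits` hex digits of the SHA-256 of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:digits].upper()


class Shortener(object):
    """Shorten long identifiers and detect collisions (hypothetical helper)."""

    def __init__(self, max_len=30, tag=crc32_tag):
        self.max_len = max_len
        self.tag = tag
        self.seen = {}  # shortened name -> original identifier

    def shorten(self, identifier):
        if len(identifier) <= self.max_len:
            short = identifier
        else:
            # Keep the front of the identifier and append "$" + hash.
            suffix = "$" + self.tag(identifier)
            short = identifier[:self.max_len - len(suffix)] + suffix
        # Register the mapping; a different original under the same
        # shortened name is a collision.
        previous = self.seen.setdefault(short, identifier)
        if previous != identifier:
            raise ValueError("collision: %r and %r both shorten to %r"
                             % (previous, identifier, short))
        return short
```

With this in place the choice between CRC32 and a truncated SHA digest only changes how often the error can occur, not whether it goes unnoticed. Passing `tag=sha256_tag` switches the hash without touching the rest.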
* Can somebody think of a better algorithm, one that would give a better
chance of recognizing the original identifier from the modified one?
Rather than truncating the most significant part of the identifier, the
field name, you should truncate the least important part, the middle.

a.b.c.d.e.f.g.something

goes to:

a.b...g.something

or similar.
Yes, this is a good idea. Thank you.
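
For illustration, the middle-truncation idea could be sketched roughly as below. The function name `truncate_middle`, its parameters, and the rule of alternately keeping components from the front and the back are assumptions of this sketch, not something specified in the thread.

```python
def truncate_middle(identifier, max_len, sep=".", marker="..."):
    """Drop components from the middle of a dotted identifier.

    The last component (the field name) is always kept, so the result
    can still exceed max_len if that component alone is too long.
    """
    if len(identifier) <= max_len:
        return identifier
    parts = identifier.split(sep)
    if len(parts) <= 2:
        return identifier  # nothing in the middle to drop
    head = []              # components kept from the front
    tail = [parts.pop()]   # components kept from the back
    take_front = True
    while parts:
        # Tentatively keep one more component, alternating sides.
        if take_front:
            new_head, new_tail = head + [parts[0]], tail
        else:
            new_head, new_tail = head, [parts[-1]] + tail
        candidate = sep.join(new_head) + marker + sep.join(new_tail)
        if len(candidate) > max_len:
            break
        head, tail = new_head, new_tail
        if take_front:
            parts.pop(0)
        else:
            parts.pop()
        take_front = not take_front
    return sep.join(head) + marker + sep.join(tail)


print(truncate_middle("a.b.c.d.e.f.g.something", 17))  # a.b...g.something
```

Two identifiers that differ only in the dropped middle part still end up with the same shortened name, so some form of collision check is needed with this scheme as well.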


--
http://mail.python.org/mailman/listinfo/python-list