Hey guys.  I know that this is way off-topic, but Google is failing me on this one and I could use some help.

So: is anyone here familiar with the Unicode spec and the various UTFs? (I don't mean University Teaching Fellows)

I'm in the middle of a project pertaining to Unicode strings, and I'd like to be able to do some very fast comparisons between arbitrary strings in different UTF encodings.  Most importantly right now, I'm trying to find a hashing algorithm that will give the same hash for the same sequence of code points represented in different UTFs, without doing a full transcoding.
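
To make the goal concrete, here is a rough sketch of one possible approach, not a recommendation: decode code points one at a time from each encoding and feed them straight into a hash, so no transcoded copy of the string is ever built. Everything here is an assumption made for illustration: the function names are made up, 32-bit FNV-1a is just a convenient stand-in for a real hash, and the decoders assume well-formed input (no validation, no normalization).

```python
FNV_OFFSET = 0x811C9DC5  # 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193   # 32-bit FNV prime


def utf8_code_points(data):
    """Yield code points from UTF-8 bytes (assumes well-formed input)."""
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            cp, extra = b, 0          # 1-byte sequence (ASCII)
        elif b < 0xE0:
            cp, extra = b & 0x1F, 1   # 2-byte sequence
        elif b < 0xF0:
            cp, extra = b & 0x0F, 2   # 3-byte sequence
        else:
            cp, extra = b & 0x07, 3   # 4-byte sequence
        for k in range(1, extra + 1):
            cp = (cp << 6) | (data[i + k] & 0x3F)
        yield cp
        i += extra + 1


def utf16_code_points(units):
    """Yield code points from UTF-16 code units (assumes paired surrogates)."""
    it = iter(units)
    for u in it:
        if 0xD800 <= u <= 0xDBFF:
            low = next(it)  # trailing surrogate; assumed to be present
            yield 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00)
        else:
            yield u


def hash_code_points(code_points):
    """FNV-1a over each code point taken as 3 little-endian bytes."""
    h = FNV_OFFSET
    for cp in code_points:
        for shift in (0, 8, 16):  # code points fit in 21 bits
            h ^= (cp >> shift) & 0xFF
            h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h
```

A UTF-32 string needs no decoder at all, since its code units already are code points and can be passed to hash_code_points directly. The obvious cost is that the UTF-8 and UTF-16 paths still do the decoding work per character, which is exactly the overhead I'd like to avoid if something cleverer exists.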

If anyone here happens to have worked with Unicode in any depth, I'd appreciate the (off-list) help.

Thanks all,
John Demme
