On Thu, Sep 5, 2013 at 6:27 AM, David Given <d...@cowlark.com> wrote:
> playing with the encoder I see that:
>
> 12345678 = CRASH CHAPTER CASINO
> 123456 = ROVER TRIBAL EGO
> 1234 = JAMES ACTIVE
> 12 = ALCOHOL
>

I think the above doesn't really work for an application like Fossil. I think that a prefix of the SHA1 hash should encode to a prefix of the mnemonic, and the other way around too: a prefix of the mnemonic should decode back to a prefix of the original hash.

To do this, some changes need to be made to the encoding mechanism.

(1) You have to abandon the seven 3-letter words that are used for 24-bit values, since if the 24-bit value is a prefix of a longer value, those 3-letter words will not be used when encoding the longer value.

(2) The round-trip from hex to mnemonic and back to hex might truncate half-bytes off the end of the hex value. The hex hash is encoded into mnemonic words in 4-byte (8-hex-character) chunks:

    HHH      -> WORD           -> HH
    HHHHHH   -> WORD-WORD      -> HHHHH
    HHHHHHHH -> WORD-WORD-WORD -> HHHHHHHH

It takes a minimum of 3 hex characters (12 bits) to encode a single mnemonic word if the prefix property is to be preserved. But a single mnemonic word will only convert back to 2 hex characters. Similarly, 6 hex characters are needed to generate two mnemonic words, but 2 mnemonic words will only give 5 hex characters of output. 8 hex characters convert to 3 mnemonic words and then back to 8 hex characters, without truncation.

The encoding mechanism currently used never truncates. If you encode N hex characters (N must be even) then you will get back N hex characters when decoding. However, a prefix of the mnemonic encoding does not yield a prefix of the original hex hash.

I think that for this application, prefix preservation is more important than avoidance of encode/decode truncation.

--
D. Richard Hipp
d...@sqlite.org
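
The 3/6/8 and 2/5/8 figures above follow from a simple counting argument, which can be checked with a rough Python sketch. It assumes the base word list has 1626 entries; that number is not stated in this message, and the function names are made up for illustration. It only counts how many values fit, and does not implement any actual encoder.

```python
# Assumption: the base word list has 1626 entries (not stated in the message).
WORDLIST_SIZE = 1626
HEX_PER_CHUNK = 8  # one 4-byte chunk is 8 hex characters


def hex_digits_recovered(words: int) -> int:
    """Largest h such that every h-digit hex value fits in `words` words."""
    capacity = WORDLIST_SIZE ** words
    h = 0
    while 16 ** (h + 1) <= capacity:
        h += 1
    return h


def hex_digits_needed(words: int) -> int:
    """Fewest hex digits that offer at least as many values as `words` words.

    This is only a lower bound from counting. It is capped at the chunk
    size, because a whole 8-character chunk always fits in 3 words.
    """
    capacity = WORDLIST_SIZE ** words
    h = 0
    while 16 ** h < capacity and h < HEX_PER_CHUNK:
        h += 1
    return h


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, "word(s): need", hex_digits_needed(k),
              "hex chars, get back", hex_digits_recovered(k))
```

Under that assumption, 3 words hold 1626^3 = 4298942376 values, just above 2^32 = 4294967296, which is why a full 8-character chunk round-trips without truncation while 1 and 2 words do not.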
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users