On 05/09/13 22:41, David Given wrote:
[...]
> I think, without a mathematical proof, that maintaining the ability to
> take prefixes of an encoded name will require us to use a dictionary
> that fits into a precise number of bits. Truncating the dictionary to
> 2^10 entries would be the simplest approach, but this means that our
> three words no longer encode exactly 32 bits --- we only get 30 bits,
> which is 7 hex characters. Four words gets us 40 bits, which is 10 hex
> characters. We don't get anything in between.

I've done this. It's now prefix-friendly:

123456789a -> BINARY CABARET LUNAR COBRA
12345678 ->   BINARY CABARET LUNAR
12345 ->      BINARY CABARET
123 ->        BINARY
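
For anyone who wants to play with the idea, here is a minimal sketch of
a prefix-friendly encoder, assuming a 2^10-entry dictionary and whole
10-bit groups taken from the front of the hex string. The WORDS list is
a hypothetical stand-in (W0000..W1023), not the real dictionary, and
encode() is an illustrative name, not the actual implementation.

```python
BITS_PER_WORD = 10  # assumes a dictionary of exactly 2**10 entries

# Hypothetical placeholder dictionary; the real word list is not shown here.
WORDS = [f"W{i:04d}" for i in range(1 << BITS_PER_WORD)]


def encode(hex_id, words=WORDS):
    """Encode a hex string as words, using only whole 10-bit groups.

    Bits are consumed from the most significant end, so the encoding of
    a prefix of hex_id is a prefix of the encoding of hex_id.
    """
    if not hex_id:
        return []
    nbits = 4 * len(hex_id)
    nwords = nbits // BITS_PER_WORD
    # Drop the trailing bits that do not fill a complete word.
    value = int(hex_id, 16) >> (nbits - nwords * BITS_PER_WORD)
    mask = (1 << BITS_PER_WORD) - 1
    result = []
    for i in range(nwords):
        shift = BITS_PER_WORD * (nwords - 1 - i)
        result.append(words[(value >> shift) & mask])
    return result


full = encode("123456789a")   # 40 bits -> 4 words
short = encode("12345")       # 20 bits -> 2 words
print(full, short, full[:len(short)] == short)
```

Under these assumptions, 3, 5, 8 and 10 hex characters give 1, 2, 3 and
4 words respectively, which matches the word counts in the examples
above.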

With the new encoding scheme our old friend TOAST MOZART TULIP now
becomes MISTER SHERIFF JAVA TOKYO, which I reckon is pretty memorable.
It also now accepts lowercase, and - and _ as delimiters for
command-line friendliness.
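
The input handling could look something like the sketch below. It is
only an assumption about the behaviour (split on spaces, - or _, then
upper-case each word); normalise_name() is a hypothetical helper name.

```python
import re


def normalise_name(name):
    """Split a bug name on whitespace, '-' or '_' and upper-case the words."""
    parts = re.split(r"[\s_-]+", name.strip())
    return [p.upper() for p in parts if p]


print(normalise_name("mister-sheriff_java tokyo"))
```

This way something like mister-sheriff-java-tokyo can be typed as a
single shell argument without quoting.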

I still haven't given up on encoding three words into 32 bits, but it'll
require more thought.

Here is our new bug list:

74a95e62cf      MISTER SHERIFF JAVA TOKYO
0c657fd35f      ASPIRIN PRISM SALSA DIVIDE
93c266d3ee      POSTAL APOLLO MASTER RIVER
263b45306c      CLOCK NEXT HAWAII CARAVAN
04a259be40      ALCOHOL PATRON RELAX PLASTIC
7636b10ddf      MODULAR EDUCATE BELGIUM MONTANA
2a34de01fc      CONCEPT CHAPTER FREEDOM OCEAN

...

(Replies to miscellaneous people summarised here)

I did look at the Diceware dictionary; it's large, but I don't think
it's very good quality. Not only does it lack a lot of the features I
like in the Mnemonic Encoding dictionary --- e.g. Diceware has
CLEAN/CLEAR/CLEAT, where ME words are all unique in the first five
letters and are fairly distinct from each other anyway --- but it also
contains a lot of nonwords like HJ, HK, HL, HM, A, AA, AAA etc. It
may be worth some work with awk scripts to try and add to the ME
dictionary, though. All I need is another 422 words and then I get 11
bits per word...
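
A rough sketch of that kind of filtering, in Python rather than awk.
Everything here is an assumption for illustration: pick_candidates() is
a hypothetical helper, the minimum length of 4 is a crude way to drop
nonwords, and the 0.8 similarity cutoff (via difflib) is an arbitrary
guess at what "fairly distinct" might mean.

```python
from difflib import SequenceMatcher


def pick_candidates(existing, candidates, needed,
                    prefix_len=5, min_len=4, max_similarity=0.8):
    """Pick up to `needed` new words that stay distinct from those kept.

    A candidate is rejected if it is too short, is not purely alphabetic,
    shares its first `prefix_len` letters with a kept word, or looks too
    similar to a kept word (similarity ratio >= max_similarity).
    """
    kept = [w.upper() for w in existing]
    prefixes = {w[:prefix_len] for w in kept}
    picked = []
    for raw in candidates:
        if len(picked) >= needed:
            break
        word = raw.strip().upper()
        if len(word) < min_len or not word.isalpha():
            continue
        if word[:prefix_len] in prefixes:
            continue
        if any(SequenceMatcher(None, word, k).ratio() >= max_similarity
               for k in kept):
            continue
        kept.append(word)
        prefixes.add(word[:prefix_len])
        picked.append(word)
    return picked


print(pick_candidates(["CLEAN", "TOAST"],
                      ["clear", "aa", "hj", "toaster", "mozart", "tulip"],
                      needed=2))
```

Note that the five-letter prefix test alone would not reject CLEAR when
CLEAN is already present, since they differ in the fifth letter; that is
why the sketch adds the similarity check.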

Abstract names (such as random syllables) are very dense, but I don't
think they're very memorable --- they lack the flavour of the English words.

OTOH, thinking about the actual use case for these, I'm not sure there
are going to be that many situations where someone needs to remember an
entire unambiguous hash. The main use case I'm thinking of here is
where someone from work comes over to my desk and says 'So, about bug
7188...' and I say 'huh?'. I want memorable bug names so that *I* can
disambiguate them in my rather faulty memory. So someone saying 'So,
about bug we-choo-fa...' is much more likely to get a coherent response
from me than 7188.

OTGH I'd still rather they referred to it as MISTER SHERIFF simply for
cool value!

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Ripley's Law: Never go further for the cat than the cat would go for
│ you." --- Vexxarr Bleen (trans. Hunter Cressall)

_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users