On 05/09/13 22:41, David Given wrote:
[...]
> I think, without a mathematical proof, that maintaining the ability to
> take prefixes of an encoded name will require us to use a dictionary
> that fits into a precise number of bits. Truncating the dictionary to
> 2^10 entries would be the simplest approach, but this means that our
> three words no longer encode exactly 32 bits --- we only get 30 bits,
> which is 7 hex characters. Four words gets us 40 bits, which is 10 hex
> characters. We don't get anything in between.
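For anyone following along, the bit arithmetic in that paragraph can be sketched roughly as below. This is only an illustration, not the actual implementation: the WORDS list is a placeholder of 1024 made-up tokens (not the real dictionary), the encode() function name is hypothetical, and dropping any leftover bits that don't fill a whole word is an assumption.

```python
BITS_PER_HEX = 4    # one hex character carries 4 bits
BITS_PER_WORD = 10  # dictionary truncated to 2**10 entries

# Placeholder dictionary (assumption): 1024 distinct tokens standing in
# for the real word list, which is not reproduced here.
WORDS = ["W%04d" % i for i in range(2 ** BITS_PER_WORD)]


def encode(hex_id):
    """Encode a hex string as words, taking 10 bits per word from the
    most significant end. Leftover bits that do not fill a whole word
    are dropped (assumption)."""
    if not hex_id:
        return []
    nbits = len(hex_id) * BITS_PER_HEX
    value = int(hex_id, 16)
    words = []
    for i in range(nbits // BITS_PER_WORD):
        # Shift the i-th 10-bit group down to the bottom and mask it off.
        shift = nbits - (i + 1) * BITS_PER_WORD
        index = (value >> shift) & (2 ** BITS_PER_WORD - 1)
        words.append(WORDS[index])
    return words


full = encode("123456789a")  # 40 bits -> 4 words
for prefix in ("12345678", "12345", "123"):
    # Each shorter hex prefix encodes to a prefix of the full word list.
    assert encode(prefix) == full[:len(encode(prefix))]
```

Because each word covers a fixed 10-bit slice counted from the front, chopping hex characters off the end can only remove words from the end, which is what makes the scheme prefix-friendly.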
I've done this. It's now prefix-friendly:

  123456789a -> BINARY CABARET LUNAR COBRA
  12345678   -> BINARY CABARET LUNAR
  12345      -> BINARY CABARET
  123        -> BINARY

With the new encoding scheme our old friend TOAST MOZART TULIP now
becomes MISTER SHERIFF JAVA TOKYO, which I reckon is pretty memorable.
It also now accepts lowercase, and - and _ as delimiters for
command-line friendliness.

I still haven't given up on encoding three words into 32 bits, but it'll
require more thought.

Here is our new bug list:

  74a95e62cf MISTER SHERIFF JAVA TOKYO
  0c657fd35f ASPIRIN PRISM SALSA DIVIDE
  93c266d3ee POSTAL APOLLO MASTER RIVER
  263b45306c CLOCK NEXT HAWAII CARAVAN
  04a259be40 ALCOHOL PATRON RELAX PLASTIC
  7636b10ddf MODULAR EDUCATE BELGIUM MONTANA
  2a34de01fc CONCEPT CHAPTER FREEDOM OCEAN
  ...

(Replies to miscellaneous people summarised here)

I did look at the Diceware dictionary; it's large, but I don't think
it's very good quality. Not only does it lack a lot of the features I
like in the Mnemonic Encoding dictionary --- e.g. Diceware has
CLEAN/CLEAR/CLEAT, where ME words are all unique in the first five
letters and are fairly distinct from each other anyway --- but it also
contains a lot of nonwords like HJ, HK, HL, HM, A, AA, AAA, AAAA etc.

It may be worth some work with awk scripts to try and add to the ME
dictionary, though. All I need is another 422 words and then I get 11
bits per word...

Abstract names (such as random syllables) are very dense, but I don't
think they're very memorable --- they lack the flavour of the English
words.

OTOH, thinking about the actual use case for these, I'm not sure there
are going to be that many situations where someone needs to remember an
entire unambiguous hash. The main use case I'm thinking of here is where
someone from work comes over to my desk and says 'So, about bug
7188...' and I say 'huh?'. I want memorable bug names so that *I* can
disambiguate them in my rather faulty memory. So someone saying 'So,
about bug we-choo-fa...' is much more likely to get a coherent response
from me than 7188.

OTGH I'd still rather they referred to it as MISTER SHERIFF simply for
cool value!

--
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "Ripley's Law: Never go further for the cat than the cat would go for
│ you." --- Vexxarr Bleen (trans. Hunter Cressall)
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users