-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Sorry for the slow reply - I was hoping to work on this last weekend but didn't get a chance.
On 23/06/14 22:34, Tom Ritter wrote: > I implemented this on a branch > (https://github.com/tomrittervg/crypto-usability-study/commit/9df0e72f15391128b6b067e891323363780cb451 > > ), and ran into three issues: > > 1) I also am not sure if, when we flip the bits, they should be > flipped at random, or just negated. My gut says negated... I'd look at it the other way - we don't flip bits, we match bits. The attacker can match 80 bits and the rest are random (because if a real attacker was generating fingerprints by hashing keys and looking for partial matches, the non-matching bits would be random). > 2) The 850 word corpus does not translate directly into an even > number of bits. I wound making it 14 words, each representing 9 > bits (using 512 of the words) Makes sense. I did something similar for the Basic English dictionary IIRC - all the word lists have lengths that are powers of two. > 3) The more I thought about it, and then verified, the > fingerprints barely match at all. That's to be expected, isn't it? This is why I think we should start by looking at small random differences (e.g. flipping a single bit) - if we start by testing fingerprints with 80 matching and 48 non-matching bits, I'd expect the differences to be noticeable regardless of the encoding, so we may not get a meaningful comparison between encodings. > While I agree this is a more perfect modeling of a 2^80 attacker, > the fact remains that if I were an attacker I would not waste my > computation trying to match the correct number of _bits_ - I would > spend it trying to match the encoding as well as I could - even if > that means the actual number of bits I match isn't as close. I > know this is the same problem as trying to model an attacker who > puts more emphasis on a fingerprint match at the beginning or end > of the string though. I totally agree that an attacker wouldn't try this. Rather than spending time on modelling an unrealistic attacker, I think we should start from the most basic questions - how does each encoding perform in terms of comparison speed, false negative rate and false positive rate - then use the data to test our intuitions about which differences are most noticeable for each encoding, and then, if the data hasn't led us off down some other interesting path, proceed to model an attacker who has a 2^80 budget to make least-detectable partial matches for each encoding. I'm not saying this because the problem of a 2^80 attacker isn't interesting, but because even the first step towards the problem is interesting enough to keep us occupied for a while. Cheers, Michael -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAEBCAAGBQJTq/vEAAoJEBEET9GfxSfMYasIAKXRtQLTdPIS/RvD8cxD2ML1 Lws2QDDchPDNh+4D3r8BVM6PsGZaG+v+MvYxgNopw9h8kcLwNHBgXxamdW/19I6X 5T9sVasobg6ckK0XmhjBcyomsyDiZLNcXQ3Gw3rzSIIZvd/IFCZ2eubpVmhuVfAX FHN0FNZAUu8xVHc/4pGh0yDYgLmDwOFuEt48N1rWNLWyXVN+VABDwJAXJ7M+WYlB wJpB04L1ViVv638FE68JrfgUmYGnWkdnZv+Xl2avjvT48AgShsIXLW3DgGUMvtsL zQWDPmiEaZ2v4eU/Lb7MXYkGAhzc9mQ9usg2d4J9AKDf2Abv5nTxvupE+iZnbyU= =sddd -----END PGP SIGNATURE----- _______________________________________________ Messaging mailing list [email protected] https://moderncrypto.org/mailman/listinfo/messaging
