On 04/09/14 02:54, Joseph Bonneau wrote:
> On Wed, Sep 3, 2014 at 2:26 PM, Trevor Perrin <[email protected]> wrote:
>
>> People would probably reverse most of the addresses,
>> so this means the difference between publishing, I dunno, 90% of email
>> addresses versus 100%? (though for targeted users - political
>> candidates, celebrities, etc, people would tune the searches and have
>> a higher success rate.)
>>
>
> A bit more formally stated, after hashing an attacker willing to check X
> trial hashes will get Y% of email addresses. By "strengthening" the hash
> (multiple iterations, memory-hard functions, etc.) you can try to limit the
> value of X for a given attacker.
>
> We have no hard numbers on what the X/Y curve would look like for email
> addresses, but based on the distributions of passwords human names which I
> studied extensively in my thesis [1], it's probably safe to say that for X
> < 2^30 you would get at least 50% of the email addresses and for 2^40 or
> 2^50 you'd hit the 90% range.
I tried to think of a way to allow the full log of hash-to-key mappings to be published while providing a rate-limitable way of obtaining the salts needed to check email addresses, without allowing a bad provider to issue multiple salts for the same account. I thought that perhaps the salt could be the signature of the hash of the email address (hence if two different salts were produced, the provider could be proved to be misbehaving).

However, if the attacker has 100 million hosts, then the rate limiter needs to be able to block a host after much less than 10 malicious requests ever (at the 2^30 level, since 2^30 trial hashes spread across 10^8 hosts is only about 10 requests per host), while at the same time not blocking large providers that legitimately send thousands of invalid requests a day because of mistyped email addresses. No amount of proof of work or multi-request protocols is going to solve that.

The only real advantage of storing the hash rather than the email address in the log is the fixed size of the hash output.

Any solution putting all the email addresses in the world in transparency logs probably also needs to solve spam at the same time. That is more plausible than it might seem, as I think that a lot of spam filtering is done based on the reputation of the sender. Senders using an authenticated encryption system could have their reputation determined more tightly than is possible at present. However, discriminating between new legitimate users and new spam accounts would remain difficult. Unfortunately, deployment is difficult: early adopters get more spam, and only when most people are using the system does it become possible to penalise people who don't.

However, even without a transparency log, if the provider holding the public keys acts as a user existence oracle then this problem remains, as client machines would need to make those requests for public keys. Client machines are indistinguishable from bots, which makes blocking them difficult. Hence a provider would need to always return keys for any guessed email address (in constant time).
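
To make the "always return keys for any guessed email address" idea a bit more concrete, here is a rough sketch of what such a lookup might look like. Everything in it is an assumption for illustration only: the names `DECOY_SECRET`, `DIRECTORY` and `lookup_key` are made up, keys are treated as opaque 32-byte strings, and the decoy is derived with HMAC-SHA256 so that repeated queries for the same unregistered address get the same answer. Python gives no real constant-time guarantees, so this only shows the shape of the idea.

```python
import hashlib
import hmac
import os

# Hypothetical provider state: a secret used only for deriving decoy keys,
# and a directory mapping registered addresses to 32-byte public keys.
DECOY_SECRET = os.urandom(32)
DIRECTORY = {"alice@example.org": hashlib.sha256(b"alice's real key").digest()}


def lookup_key(address: str) -> bytes:
    """Return a 32-byte key for any address, whether registered or not."""
    normalized = address.strip().lower()
    # Always derive the decoy, so both outcomes perform the same hashing work.
    decoy = hmac.new(DECOY_SECRET, normalized.encode("utf-8"),
                     hashlib.sha256).digest()
    real = DIRECTORY.get(normalized)
    # An unregistered address gets a stable decoy instead of an error.
    return real if real is not None else decoy
```

A guessed address then yields something that looks like a key and does not change between requests, so the response alone does not say whether the account exists. Whether a decoy is distinguishable from a real key in practice depends on the key format and on everything else the provider publishes about the account.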
I guess email providers currently do spam filtering before returning 'Mail delivery failed - no such address' messages so that attackers don't know if they guessed correctly.

Daniel
_______________________________________________
Messaging mailing list
[email protected]
https://moderncrypto.org/mailman/listinfo/messaging
