On Mon, Feb 25, 2013 at 11:48 PM, Tobias Markmann <[email protected]> wrote:
>
> On Fri, Feb 8, 2013 at 9:32 PM, Dave Cridland <[email protected]> wrote:
>>
>> Well there's also a harvesting problem to solve, I think. You need to make
>> it generally hard for a spammer to try all email addresses they have to
>> convert that list into a list of jids.
>>
>> Obviously a bad server holding a copy of the table is the worst case here.
>
> Right. That's why our idea is to protect the key, what you're searching with
> (being the phone number), with hashing + salts and the response would be one
> to a few nodes in the system holding the actual data. Then you could go to
> one of the nodes and ask for the actual JID and other attributes for your
> query result. This way the few nodes can easily apply rate limiting on the
> full from-attribute on stanzas or just the host part.
Just as an aside, even with salted hashing, compromising phone numbers is
/relatively/ easy. You can rate limit queries over the wire, but this doesn't
help in the case of a compromised or malicious server, which can check
candidate numbers against its own copy of the table offline.

I think there is enough information publicly available about me to narrow my
home phone number down to 10,000 hash checks (which seems easily achievable),
and getting it down to 1,000,000 checks would probably take less than
5 minutes of thought.

Now, *my* phone number may not be interesting to anyone because I'm not
interesting to anyone, but...

Just some food for thought.

/K
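
To make the 10,000-check figure concrete, here is a minimal sketch of the
offline search a malicious server could run. Everything in it is an
assumption for illustration: the thread does not specify a hash function or
salt scheme, so SHA-256 over salt + number is assumed, the salt is assumed to
be known to the attacker (a server holding the table would hold it too), and
the function names and the fictional 555 number are hypothetical.

```python
import hashlib


def salted_hash(salt: bytes, phone: str) -> str:
    # Assumed scheme: SHA-256 over salt followed by the number.
    return hashlib.sha256(salt + phone.encode("ascii")).hexdigest()


def recover_phone(target: str, salt: bytes, prefix: str, unknown_digits: int):
    # Try every completion of the known prefix; return the match and the
    # number of hash checks spent finding it.
    total = 10 ** unknown_digits
    for n in range(total):
        candidate = f"{prefix}{n:0{unknown_digits}d}"
        if salted_hash(salt, candidate) == target:
            return candidate, n + 1
    return None, total


# Hypothetical data: the attacker knows the salt, the leaked hash, and
# everything but the last four digits (country, area code, exchange).
salt = b"example-salt"
secret = "+15555550142"
leaked = salted_hash(salt, secret)

found, checks = recover_phone(leaked, salt, "+1555555", 4)
print(found, checks)
```

With four unknown digits the loop runs at most 10,000 times, and none of it
touches the network, so rate limiting on stanzas never comes into play.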
