On Mon, Sep 9, 2013 at 2:58 PM, Chris Peterson <cpeter...@mozilla.com> wrote:
> Google's Location Service prevents people from tracking individual access
> points by requiring requests to include at least 2-3 access points that
> Google knows are near each other. This "proves" the requester is near the
> access points.

I assume "prevents people from tracking individual access points" means the
following: Some people have a personal access point on them (e.g. in their
phone). If somebody knows the SSID and MAC of this personal access point,
then they could track this person's location by polling the database for
that (SSID, MAC) pair. Google tries to limit this type of abuse as much as
practical while still providing a location service based on such
crowdsourced data.

> Unlike Google's Location Service, our server does not store MAC addresses or
> SSIDs. We identify access points by hash IDs, specifically SHA1(MAC+SSID).
> To query the location of an access point in the database, you must know both
> its MAC address and current SSID.

MAC addresses are 48 bits. SSIDs are often guessable or predictable.
Therefore, using H(MAC+SSID) instead of just the plain MAC+SSID is not
buying you much in terms of privacy, IMO. Basically, if you are really
trying to use this as a privacy mechanism then you should store the MAC+SSID
according to best practices for storing passwords. For example, use PBKDF2
with a large number of iterations. Regardless of whether you use SHA1, SHA2,
PBKDF2, or something else, I will still call whatever function you use H(x).
But I am not sure that switching to PBKDF2 even buys you much improved
privacy protection.

> H1 = Hash(AP1.MAC + AP1.SSID)
> H2 = Hash(AP2.MAC + AP2.SSID)
>
> Our private database's schema looks something like:
>
> Hash(AP1.MAC + AP1.SSID) ==> AP1.latitude, AP1.longitude, ...
> Hash(AP2.MAC + AP2.SSID) ==> AP2.latitude, AP2.longitude, ...
>
> Our published database would include two tables. The first table would map a
> random row id to metadata about an anonymous access point:
>
> Random1 ==> AP1.latitude, AP1.longitude, ...
> Random2 ==> AP2.latitude, AP2.longitude, ...
>
> The second table's primary key would be a hash of hashes.
> It would map a hash of two neighboring access points' hash IDs to a row id
> of the first table. Something like:
>
> Hash(H1 + H2) ==> Random1
> Hash(H2 + H1) ==> Random2
>
> Someone querying the published database would need to know the MAC addresses
> and current SSIDs of two neighboring access points to look up either's
> location.

If you know the MAC+SSID of person X's personal access point and the
MAC+SSID of person Y's personal access point, then you can use this database
to ask the question "are person X and person Y in the same location?" This
seems bad. I see that you attempt to address this below.

> btw, should we use SHA-2 instead of SHA-1?

There is no reason to use SHA-1 when you have SHA-2 available. However, as I
indicated above, it isn't clear it is a good idea to be using any plain hash
function as H(x).

> Other layers of privacy protection include filtering out ad-hoc Wi-Fi
> networks; MAC addresses with vendor prefixes from mobile device manufacturers
> (e.g. Apple and HTC); SSIDs commonly associated with mobile devices (e.g.
> "XXX's iPhone" and Google's "_nomap" opt-out); and APs reported in multiple
> locations.

I think that these things are much more important than the protection
offered by H(x). My concern is that if you store the data on the server as
H(x) then you will not be able to do the above filtering on the server
unless H(x) is ineffective. That seems bad, because the server will be much
easier to update to improve the filtering than the clients will be, AFAICT.
Also, you will not be able to measure the effectiveness of the privacy
protections on the server, which is also very bad. Therefore, I'd suggest
that you avoid using any protection at all, and just use x instead of H(x)
until we are very confident there is no way we can further improve the
filtering.
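
To make the two concerns above concrete, here is a minimal Python sketch, not
the actual server code. It assumes H is SHA-1 over the UTF-8 bytes of MAC
followed by SSID, that MACs are written as lowercase colon-separated hex, and
that the pair table is a plain dictionary. The names ap_hash, recover_mac,
pair_key and same_location are hypothetical. The first half shows that a
guessable SSID plus a known vendor prefix leaves only 2^24 (about 16.8
million) candidate MACs to try; the second half shows the "are X and Y in the
same location?" query against the hash-of-hashes table.

```python
import hashlib
from typing import Dict, Optional, Tuple

AccessPoint = Tuple[str, str]  # (MAC, SSID); this representation is an assumption


def ap_hash(mac: str, ssid: str) -> str:
    """H(MAC+SSID) with SHA-1 as H; the byte encoding is an assumption."""
    return hashlib.sha1((mac + ssid).encode("utf-8")).hexdigest()


def recover_mac(target: str, ssid: str, oui: str,
                max_suffix: int = 1 << 24) -> Optional[str]:
    """Brute-force the 24-bit device part of a MAC under a known vendor prefix.

    `oui` is the first three octets, e.g. "00:11:22". With a guessed SSID,
    at most 2**24 hashes are needed to invert H for that vendor.
    """
    for n in range(max_suffix):
        suffix = ":".join(f"{b:02x}" for b in n.to_bytes(3, "big"))
        mac = f"{oui}:{suffix}"
        if ap_hash(mac, ssid) == target:
            return mac
    return None


def pair_key(h_a: str, h_b: str) -> str:
    """Hash(H1 + H2): the primary key of the second published table."""
    return hashlib.sha1((h_a + h_b).encode("ascii")).hexdigest()


def same_location(pair_table: Dict[str, str],
                  x: AccessPoint, y: AccessPoint) -> bool:
    """True if the published table lists x and y as neighbors (either order)."""
    hx, hy = ap_hash(*x), ap_hash(*y)
    return pair_key(hx, hy) in pair_table or pair_key(hy, hx) in pair_table
```

Calling recover_mac with a small max_suffix keeps a demonstration fast; the
full 2^24 loop is slow in pure Python but is a small search for dedicated
hashing hardware. A slower H such as PBKDF2 raises the cost per guess without
shrinking the search space, and it does nothing to stop the same_location
query when both (MAC, SSID) pairs are already known.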

Cheers,
Brian Smith
--
Mozilla Networking/Crypto/Security (Necko/NSS/PSM), NSA plant
_______________________________________________
dev-security mailing list
dev-security@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security