I haven't done a full analysis, but I do have a few questions.


On 9/9/2013 5:58 PM, Chris Peterson wrote:
Our private database maps access point hash IDs to locations (and other metadata). Assuming:

    H1 = Hash(AP1.MAC + AP1.SSID)
    H2 = Hash(AP2.MAC + AP2.SSID)

I assume + means concatenate. I might suggest XORing the values. SSID names are usually human-readable, are not meant to be secure, and thus follow predictable patterns. I also hope you're not using the patterned MAC notation but rather the 48-bit address space representation.
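To make the point about representation concrete, here is a minimal sketch of the concatenation reading. It assumes SHA-256 and UTF-8 encoded SSIDs, and the names mac_to_bytes and ap_hash are hypothetical, not anything from the proposal:

```python
import hashlib


def mac_to_bytes(mac: str) -> bytes:
    """Convert printed MAC notation (aa:bb:..., AA-BB-...) to the raw 6 bytes."""
    hex_digits = mac.replace(":", "").replace("-", "").replace(".", "")
    raw = bytes.fromhex(hex_digits)
    if len(raw) != 6:
        raise ValueError("expected a 48-bit MAC address")
    return raw


def ap_hash(mac: str, ssid: str) -> str:
    """Hash(MAC + SSID), read as concatenation (an assumption)."""
    # The MAC is fixed-length, so putting it first keeps the concatenation unambiguous.
    return hashlib.sha256(mac_to_bytes(mac) + ssid.encode("utf-8")).hexdigest()
```

Hashing the raw bytes means the same access point gets the same ID regardless of how a client formats the MAC string.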



Our private database's schema looks something like:

    Hash(AP1.MAC + AP1.SSID) ==> AP1.latitude, AP1.longitude, ...
    Hash(AP2.MAC + AP2.SSID) ==> AP2.latitude, AP2.longitude, ...

Is the data aged? What happens if I move? Does this give Mozilla the ability to historically track me if I move my device? Is that a problem? (I'm not saying it is, just an observation.) You mention below filtering out APs seen in multiple locations, but clearly they can move as people relocate.
What is the granularity of the lat/long?


Our published database would include two tables. The first table would map a random row id to metadata about an anonymous access point:

    Random1 ==> AP1.latitude, AP1.longitude, ...
    Random2 ==> AP2.latitude, AP2.longitude, ...

I would be hesitant to use the word anonymous here. Lat/long is easily combined with other publicly available databases that could identify individual addresses and thus individuals. Again, it comes down to the granularity of the data.
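As a rough illustration of why granularity matters, the sketch below estimates how large a cell a rounded coordinate covers. It assumes roughly 111 km per degree of latitude (an approximation), and cell_size_m is a hypothetical helper, not part of the proposal:

```python
import math

METERS_PER_DEGREE = 111_320  # approximate length of one degree of latitude


def cell_size_m(lat_deg: float, decimals: int) -> tuple:
    """Approximate (north-south, east-west) size in meters of a rounded lat/long cell."""
    step = 10 ** -decimals
    north_south = METERS_PER_DEGREE * step
    # Longitude lines converge toward the poles, so scale by cos(latitude).
    east_west = METERS_PER_DEGREE * math.cos(math.radians(lat_deg)) * step
    return north_south, east_west
```

Under that approximation, three decimal places is on the order of a hundred meters, while five decimal places is on the order of a meter, which is easily enough to pick out a single house.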


The second table's primary key would be a hash of hashes. It would map a hash of two neighboring access points' hash IDs to a row id of the first table. Something like:

    Hash(H1 + H2) ==> Random1
    Hash(H2 + H1) ==> Random2

Someone querying the published database would need to know the MAC addresses and current SSIDs of two neighboring access points to look up either's location.

When you say published, do you mean that the entire DB is published for use by "researchers", or that it just has a publicly exposed API that responds to queries? I'm assuming that if AP3 through AP10 were all also in the vicinity, then Hash(H1+Hx) ==> Random1 where x is in {2,..,10}, correct? If so, does whichever Hy is the prefix in the concatenation determine that the result maps to APy's Random id?
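Here is how I am reading the second table, as a minimal sketch. It assumes SHA-256, that + is concatenation, and that the first hash in each ordered pair selects the row id; build_pair_table and the row id strings are hypothetical:

```python
import hashlib
from itertools import permutations


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_pair_table(ap_hashes: dict, row_ids: dict) -> dict:
    """Map Hash(Hx + Hy) to the row id of APx for every ordered pair of neighbors."""
    table = {}
    for first, second in permutations(ap_hashes, 2):
        key = h(ap_hashes[first] + ap_hashes[second]).hex()
        # Assumption: the prefix hash decides which AP's row is returned.
        table[key] = row_ids[first]
    return table
```

Under this reading, N mutually visible access points produce N * (N - 1) rows in the second table, so ten neighbors would yield ninety entries.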




btw, should we use SHA-2 instead of SHA-1? In 2009, NIST recommended that "Federal agencies should stop using SHA-1 for applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010."

Yes


*R. Jason Cronk, Esq., CIPP/US*
/Privacy Engineering Consultant/, *Enterprivacy Consulting Group* <enterprivacy.com>

 * phone: (828) 4RJCESQ
 * twitter: @privacymaverick.com
 * blog: http://blog.privacymaverick.com

_______________________________________________
dev-security mailing list
dev-security@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security
