This item did NOT make my day, but it is probably important to know about and to keep in mind.
Amalyah Keshet

----- Original Message -----
> Recently, Netflix released some anonymized usage data in order
> to seed a technical challenge (on recommendation algorithms).
>
> Bruce Schneier reports that a team of Univ. of Texas researchers
> de-anonymized a subset of the data through correlation with public
> IMDb (Internet Movie Database) entries.
>
> Bruce extends this by analogy to point out how easy this really is,
> and he notes the obvious analogy to book purchasing habits:
>
> http://www.schneier.com/blog/archives/2007/12/anonymity_and_t_2.html
>
> "Someone with access to an anonymous dataset of telephone records,
> for example, might partially de-anonymize it by correlating it
> with a catalog merchants' telephone order database. Or Amazon's
> online book reviews could be the key to partially de-anonymizing
> a public database of credit card purchases, or a larger database
> of anonymous book reviews.
>
> "Google, with its database of users' internet searches, could
> easily de-anonymize a public database of internet purchases, or
> zero in on searches of medical terms to de-anonymize a public
> health database. Merchants who maintain detailed customer and
> purchase information could use their data to partially de-anonymize
> any large search engine's data, if it were released in an
> anonymized form. A data broker holding databases of several
> companies might be able to de-anonymize most of the records in
> those databases.
>
> "What the University of Texas researchers demonstrate is that this
> process isn't hard, and doesn't require a lot of data. It turns out
> that if you eliminate the top 100 movies everyone watches, our
> movie-watching habits are all pretty individual. This would
> certainly hold true for our book reading habits, our internet
> shopping habits, our telephone habits and our web searching habits."
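
For anyone wondering what "correlation" means in practice, here is a rough Python sketch of the general idea in the quoted post: drop the titles nearly everyone has watched, then pair each anonymous record with the public profile that shares the most of the remaining, less common titles. This is only an illustration under simplifying assumptions and is not the algorithm the University of Texas researchers used. The function name `link_records`, its parameters, and all of the data are made up for the example.

```python
from collections import Counter

def link_records(anonymous, public, top_n_to_drop=1, min_overlap=2):
    """Pair anonymous IDs with public names that share the most uncommon titles.

    Hypothetical helper for illustration only; both arguments map an
    identifier to a set of titles.
    """
    # Count how many anonymous records contain each title.
    popularity = Counter(t for titles in anonymous.values() for t in titles)
    # Ignore the most popular titles: almost everyone has them,
    # so they say little about who a record belongs to.
    common = {title for title, _ in popularity.most_common(top_n_to_drop)}

    matches = {}
    for anon_id, titles in anonymous.items():
        rare = set(titles) - common
        best_name, best_overlap = None, 0
        for name, reviewed in public.items():
            overlap = len(rare & set(reviewed))
            if overlap > best_overlap:
                best_name, best_overlap = name, overlap
        # Only report a match when enough uncommon titles coincide.
        if best_overlap >= min_overlap:
            matches[anon_id] = best_name
    return matches

# Made-up toy data: an "anonymized" release and a set of public reviews.
anonymous = {
    "user_001": {"Blockbuster A", "Obscure Film X", "Obscure Film Y"},
    "user_002": {"Blockbuster A", "Obscure Film Z", "Obscure Film W"},
    "user_003": {"Blockbuster A"},
}
public = {
    "alice": {"Obscure Film X", "Obscure Film Y", "Blockbuster A"},
    "bob": {"Obscure Film Z", "Obscure Film W"},
}

matches = link_records(anonymous, public)
print(matches)
```

In this toy data, the record that contains only the popular title is left unmatched, while the two records with uncommon titles are each paired with a single public name. That is the point of the quoted passage: the less common choices are what make a record individual.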