> Eckert's first task with the data was to find out if her browsing data
> was included in the dataset. To do this, she queried the data for the
> URL linked with her company's login page, which generates a unique ID
> for each employee. Germany has a population of about 82 million, so
> the odds that Eckert herself was in browser data collected from 3
> million Germans were small. Although it turned out her browser history
> wasn't in the dataset, by querying the data for her company's login
> page Eckert discovered that a number of her colleagues were in the
> data by matching the unique login IDs from the company's page to the
> individuals.
>
> With this information, Eckert would've been able to see her
> colleagues' entire browsing history for the last month. One of the
> colleagues included in the dataset was a close friend of hers, and she
> reached out to him to let him know that she had his browsing history.
> The question she had was which browser plugin was collecting and
> selling this data.
>
> To answer this question, Eckert had her colleague delete one browser
> plugin every hour until he disappeared from the live data. On the
> seventh plugin, he disappeared. This suggested that the plugin
> collecting and selling his browser data was, ironically enough, called
> Web of Trust, which offers "free tools for safe search and web browsing."
>
> The troubling thing about Eckert and Dewes' de-anonymization technique
> is that it can be used on anyone who has a public social media
> presence. For their report, Eckert and Dewes focused on Twitter and
> the German LinkedIn equivalent, Xing, to see if they could use these
> public profiles to de-anonymize public figures in the data.
>
> When you click on your analytics page on Twitter, this brings you to a
> URL that includes your public Twitter handle; Xing has a similar
> feature. This means that Eckert and Dewes were able to query the
> database for these publicly available Twitter URLs for German politicians.
>
> If the politicians were included in the dataset, the next step was to
> visit the Twitter profile of the politician and collect a few of the
> links they had recently posted. By using these links, coupled with the
> public Twitter URL, Eckert and Dewes were able to pull an individual's
> entire month-long browsing history from the anonymous dataset.
>
> As Dewes pointed out when he and I spoke at Def Con, it requires an
> astonishingly small amount of browsing information to identify an
> individual out of an anonymous dataset of 3 million people. Since
> everyone's browsing habits are unique, it only takes about 10 website
> visits to create a "fingerprint" for an individual based on which
> websites they are visiting and when.

https://motherboard.vice.com/en_us/article/gygx7y/your-anonymous-browsing-data-isnt-actually-anonymous
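
The matching step described in the excerpt (look up a URL that is publicly tied to one person, then narrow the results with a few links that person has posted) boils down to a set intersection. Here is a rough Python sketch of that idea. It is only an illustration, not the researchers' actual code: the `(user_id, url)` record layout, the function name `candidates_for`, and every URL in the toy data are assumptions made up for this example.

```python
from collections import defaultdict


def candidates_for(visits, known_urls):
    """Return the pseudonymous IDs whose history contains every known URL.

    visits: iterable of (user_id, url) pairs (a hypothetical layout).
    known_urls: URLs publicly tied to one person, e.g. links they posted.
    """
    # Index the data: which pseudonymous IDs visited each URL?
    visitors = defaultdict(set)
    for user_id, url in visits:
        visitors[url].add(user_id)

    # Intersect the visitor sets; each extra URL can only shrink the pool.
    remaining = None
    for url in known_urls:
        seen = visitors.get(url, set())
        remaining = set(seen) if remaining is None else remaining & seen
    return remaining if remaining is not None else set()


# Made-up toy data, not taken from the real dataset.
visits = [
    ("u1", "https://analytics.example/user/example_mp/home"),
    ("u1", "https://example.org/story-a"),
    ("u1", "https://example.org/story-b"),
    ("u2", "https://example.org/story-a"),
    ("u3", "https://example.org/story-b"),
]
known = [
    "https://analytics.example/user/example_mp/home",
    "https://example.org/story-a",
]
print(candidates_for(visits, known))  # {'u1'}
```

Once the pool is down to a single pseudonymous ID, every other record under that ID belongs to the same person, which is how a handful of public links can expose a whole month of history.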