Hey Nuria,

So, the goal is to have a UUID that is _distinct_ from the IP and user
agent (that is, generated independently of both) so that it can serve
as a baseline for measuring accuracy. Think of the UUID in the
ModuleStorage test datasets from way back. So the question isn't "can
any individual user be de-aggregated" so much as "does a user_agent/IP
hash make a good UUID, generally". If I'm understanding that page
correctly, it's aimed more at the former problem.
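
To make that concrete, here is a rough Python sketch of the comparison
I have in mind. Everything in it is an assumption for illustration: the
function names, the choice of SHA-1, the "|" separator and the toy rows
are made up, and it doesn't reflect any existing table or schema. It
just scores a user_agent/IP hash against an independent UUID in both
directions.

```python
import hashlib
from collections import defaultdict


def fingerprint(ip, user_agent):
    """Hypothetical fingerprint: SHA-1 over IP and user agent.

    The "|" separator keeps ("1.2", "3x") and ("1.23", "x") from colliding.
    """
    return hashlib.sha1(f"{ip}|{user_agent}".encode("utf-8")).hexdigest()


def fingerprint_accuracy(rows):
    """Score the fingerprint against the baseline UUID.

    rows: iterable of (ip, user_agent, uuid) tuples.
    Returns a pair:
      - share of fingerprints that map to exactly one UUID
      - share of UUIDs that map to exactly one fingerprint
    """
    uuids_per_fp = defaultdict(set)
    fps_per_uuid = defaultdict(set)
    for ip, user_agent, uuid in rows:
        fp = fingerprint(ip, user_agent)
        uuids_per_fp[fp].add(uuid)
        fps_per_uuid[uuid].add(fp)
    if not uuids_per_fp:
        return 0.0, 0.0
    unique_fp = sum(len(u) == 1 for u in uuids_per_fp.values()) / len(uuids_per_fp)
    stable_uuid = sum(len(f) == 1 for f in fps_per_uuid.values()) / len(fps_per_uuid)
    return unique_fp, stable_uuid


# Toy data, invented purely for illustration.
rows = [
    ("10.0.0.1", "Firefox/43.0", "a"),
    ("10.0.0.1", "Firefox/43.0", "b"),  # two users behind one fingerprint
    ("10.0.0.2", "Chrome/47.0", "c"),
    ("10.0.0.3", "Chrome/47.0", "c"),   # one user seen from two IPs
    ("10.0.0.4", "Safari/9.0", "d"),
]

unique_fp, stable_uuid = fingerprint_accuracy(rows)
```

The first number drops when several users share a fingerprint, the
second when a single user drifts across fingerprints. Bucketing rows by
time since first sighting, or by platform or location, before scoring
would be one way to get at the decay and variation questions.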

On 3 January 2016 at 11:29, Nuria <[email protected]> wrote:
> Oliver,
>
> You might want to check our documentation in wikitech regarding identity
> reconstruction. I think it covers your point #1.
>
>
> https://wikitech.wikimedia.org/wiki/Analytics/Data/Preventing_identity_reconstruction
>
> Nuria
>
>
>
> On Jan 2, 2016, at 10:00 AM, Oliver Keyes <[email protected]> wrote:
>
> Hey y'all
>
> I'm working on a piece of research (largely recreational) on the old
> problem of fingerprinting users with minimal information - namely the
> combination of a user agent and an IP address. Basically I'm looking
> to put together a piece of work showing:
>
> 1. How sub-standard it is;
> 2. How fast it decays;
> 3. How the sub-standardness varies by (platform|location)
>
> This would be pretty doable with internal data; basically I'd need a
> schema with IP, user agent and a per-user UUID that's got a decent
> (>=24 hours) expiry time. My question: does anyone know of a table
> with recent data that meets these requirements? And, if not, anyone
> with EventLogging experience interested in working on the problem with
> me?
>
> --
> Oliver Keyes
> Count Logula
> Wikimedia Foundation
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>



-- 
Oliver Keyes
Count Logula
Wikimedia Foundation

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
