Hi Pine,

On Thu, Mar 27, 2014 at 11:45:59PM -0700, ENWP Pine wrote:
> No UA data is recorded from any platform for non-edit actions like
> pageviews and watchlisting, even if an editor is logged in, right?

No, that is not correct.

Currently, for each request [1] to the text and mobile caches, udp2log
records [2] the User-Agent (column 14) and the URL (column 9). This
data gets stored in files (some parts sampled, some parts unsampled).
I and some others have access to this data.
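
To make the two columns concrete, here is a minimal Python sketch of
pulling them out of a single log line. Only the column numbers (9 for
the URL, 14 for the User-Agent) come from the text above. The tab
delimiter, the function name, and the sample line are assumptions made
for illustration; see [2] for the actual format.

```python
# Sketch only: the tab delimiter and the sample line are assumptions.
# Only the column numbers (9 = URL, 14 = User-Agent) come from the mail.
URL_COLUMN = 9
USER_AGENT_COLUMN = 14


def extract_url_and_user_agent(line, delimiter="\t"):
    """Return (url, user_agent) from one log line, or None if too short."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) < USER_AGENT_COLUMN:
        return None
    # Column numbers are 1-based, Python list indices are 0-based.
    return fields[URL_COLUMN - 1], fields[USER_AGENT_COLUMN - 1]


# Made-up line: every field is a placeholder except columns 9 and 14.
sample = "\t".join(
    ["f%d" % i for i in range(1, 9)]
    + ["http://en.wikipedia.org/wiki/Main_Page"]
    + ["f%d" % i for i in range(10, 14)]
    + ["ExampleBrowser/1.0"]
)

print(extract_url_and_user_agent(sample))
```

Once URL and User-Agent sit side by side for every request like this,
grouping requests by User-Agent is a one-liner, which is why access to
the raw files matters.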

The same holds for Kafka and the mobile caches. There, the data is
always unsampled.

You could do all kinds of bad™ things with those data sets, if you
wanted.

I do see that Ops might need that data. They should have it.

But I hope that my access to this /raw/ data, and the access of other
Analytics team members to it, gets killed in the foreseeable future,
and that we effectively make it impossible to fingerprint/track
people.

Best regards,
Christian

[1] Regardless of whether it is edit or non-edit.
Regardless of the action.
Regardless of the platform.

[2] https://wikitech.wikimedia.org/wiki/Cache_log_format



-- 
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
                           Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a        Email:  [email protected]
4040 Linz, Austria           Phone:          +43 732 / 26 95 63
                             Fax:            +43 732 / 26 95 63
                             Homepage: http://quelltextlich.at/
---------------------------------------------------------------

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
