All done. Hashed client IPs are no longer being collected in Eventlogging: Varnishkafka is no longer picking them up from Varnish, and no IPs flow through to the MySQL/Hadoop end of things.
Thanks all.

On Tue, Mar 8, 2016 at 9:01 AM, Madhumitha Viswanathan <[email protected]> wrote:

> Thanks Toby!
>
> This change will be deployed today.
>
> On Wed, Mar 2, 2016 at 11:53 AM, Toby Negrin <[email protected]>
> wrote:
>
>> Thanks Madhu -- it's great to see the analytics team working proactively
>> on things like this.
>>
>> -Toby
>>
>> On Wed, Mar 2, 2016 at 10:18 AM, Madhumitha Viswanathan <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> The analytics team, in an effort to reduce the collection of sensitive
>>> data, plans to drop the clientIP field from the EventCapsule (
>>> https://meta.wikimedia.org/wiki/Schema:EventCapsule), which is the
>>> wrapper for all events flowing into Eventlogging (currently, IPs and User
>>> Agents get purged after the 90-day mark). The field was originally meant
>>> only for debugging, but has served some research use cases. Most of these
>>> cases have been wrapped up at this point. It has also been used as a proxy
>>> to count the number of devices visiting sites like our blog, and since IPs
>>> are not a good measure of that anyway, we plan to move such cases to use
>>> Piwik.
>>>
>>> The rollout of the change will happen in stages (drop clientIPs first on
>>> the EL end, then in the EventCapsule on meta, and finally on the
>>> VarnishKafka end). It should be a clean deployment and there's no
>>> scheduled downtime - EL will keep working as is. What does change?
>>> ClientIPs will start being set as NULL in your mysql tables. If you update
>>> the Eventlogging schema you maintain, causing new tables to be created,
>>> the new tables will not have the clientIp field in them. The change is
>>> planned to be rolled out the week of 11th or 18th March '16, pending the
>>> completion of data collection for the ongoing QuickSurveys-based research
>>> work.
>>>
>>> Let us know if you have any questions/concerns on the list or on
>>> #wikimedia-analytics. The related phab ticket is here -
>>> https://phabricator.wikimedia.org/T128407.
>>>
>>> Thanks,
>>> Madhu Viswanathan
>>> Software Engineer, Analytics
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>
> --
> --Madhu :)

--
--Madhu :)
