All done. Hashed client IPs are no longer being collected on
Eventlogging: Varnishkafka is not picking them up from Varnish, and no IPs
make it through to the mysql/hadoop end of things.

Thanks all.

On Tue, Mar 8, 2016 at 9:01 AM, Madhumitha Viswanathan <
[email protected]> wrote:

> Thanks Toby!
>
> This change will be deployed today.
>
> On Wed, Mar 2, 2016 at 11:53 AM, Toby Negrin <[email protected]>
> wrote:
>
>> Thanks Madhu -- it's great to see the analytics team working proactively
>> on things like this.
>>
>> -Toby
>>
>> On Wed, Mar 2, 2016 at 10:18 AM, Madhumitha Viswanathan <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> The analytics team, in an effort to collect less sensitive data, plans
>>> to drop the clientIP field from the EventCapsule
>>> (https://meta.wikimedia.org/wiki/Schema:EventCapsule), which is the
>>> wrapper for all events flowing into Eventlogging. (Currently, IPs and
>>> User Agents get purged after the 90-day mark.) The field was originally
>>> meant only for debugging, but has served some research use cases. Most of
>>> these cases have been wrapped up at this point. It has also been used as
>>> a proxy to count the number of devices visiting sites like our blog, and
>>> since IPs are not a good measure of that anyway, we plan to move such
>>> cases to Piwik.
>>>
>>> The rollout of the change will happen in stages (drop clientIPs first on
>>> the EL end, then the EventCapsule in meta, and finally on the
>>> VarnishKafka end). It should be a clean deployment and there's no
>>> scheduled downtime; EL will keep working as is. What does change?
>>> clientIP values will start being set to NULL in your mysql tables. If you
>>> update an Eventlogging schema you maintain, causing new tables to be
>>> created, the new tables will not have the clientIp field in them. The
>>> change is planned to be rolled out the week of 11 or 18 March 2016,
>>> pending the completion of data collection for the ongoing
>>> QuickSurveys-based research work.
>>>
>>> Let us know if you have any questions/concerns on the list or on
>>> #wikimedia-analytics. The related phab ticket is here -
>>> https://phabricator.wikimedia.org/T128407.
>>>
>>> Thanks,
>>> Madhu Viswanathan
>>> Software Engineer, Analytics
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>>
>>
>
>
> --
> --Madhu :)
>



-- 
--Madhu :)
