Re: [Analytics] ensuring reader anonymity

2016-11-14 Thread Wikimedia Legal
Hi Pine, Are the data retention guidelines what you're looking for? They're linked in the footer on the privacy policy page, but were done as a separate policy after the privacy policy was already in place. Best, Jacob Legal Counsel

Re: [Analytics] High number of pageviews on page with single hyphen as title

2016-11-14 Thread Nuria Ruiz
This is documented now here: https://wikitech.wikimedia.org/wiki/Analytics/PageviewAPI#Gotchas On Tue, Nov 8, 2016 at 7:25 AM, Vipul Naik wrote: > Hi Joseph, > > Thanks for the clarification. > > Any ideas why this number is much higher for some months? In particular, >

Re: [Analytics] [Ops] Statsv

2016-11-14 Thread Nuria Ruiz
> I agree it makes sense to co-own statsv. I think it also needs to have a primary maintainer/point of contact if issues beside basic operational monitoring come up (e.g. general maintenance and updates). What do you think? Sounds fine. Also, probably performance team doesn't mind to contribute to

Re: [Analytics] [Ops] Statsv

2016-11-14 Thread Filippo Giunchedi
I agree it makes sense to co-own statsv. I think it also needs to have a primary maintainer/point of contact if issues beside basic operational monitoring come up (e.g. general maintenance and updates). What do you think? thanks, Filippo On Mon, Nov 14, 2016 at 8:21 AM, Andrew Otto

Re: [Analytics] Statsv

2016-11-14 Thread Andrew Otto
+ops Analytics (Otto & Luca) probably have the most experience with Python Kafka clients, and are also the most likely to cause statsv problems (due to analytics Kafka broker restarts, etc.). So it makes sense for us to be at least partially responsible. On the other hand, statsv is for

Re: [Analytics] 9 am UTC maintenance for dataset1001 (dumps.wikimedia.org)

2016-11-14 Thread Ariel Glenn WMF
That should be Tuesday, Nov 15. It's been a long week. A. On Mon, Nov 14, 2016 at 2:27 PM, Ariel Glenn WMF wrote: > On Tuesday Nov 13, at 9 am UTC, the web server for the dumps and other > datasets will > be unavailable due to maintenance. This should take no longer than

[Analytics] 9 am UTC maintenance for dataset1001 (dumps.wikimedia.org)

2016-11-14 Thread Ariel Glenn WMF
On Tuesday Nov 13, at 9 am UTC, the web server for the dumps and other datasets will be unavailable due to maintenance. This should take no longer than 10 minutes. Thanks for your understanding. Ariel

[Analytics] Wikimedia datasets collection on the Internet Archive has surpassed 1 million items

2016-11-14 Thread Hydriz Scholz
Dear all, The Wikimedia Foundation datasets collection on the Internet Archive [1] has now surpassed 1 million items (and about 50,000 full database dumps)! This marks a major milestone in our archiving efforts of Wikimedia's vast amount of data and ensures that the vital content submitted by

Re: [Analytics] Statsv

2016-11-14 Thread Addshore
Would it not make more sense for the team / teams that maintain statsd & graphite to maintain statsv? As far as I see statsv is part of that ecosystem and also as far as I know the analytics team doesn't really have much to do with it. On Mon, 14 Nov 2016 at 10:40 Gilles Dubuc