cc-ing our friends in research and wikitech (sorry I forgot initially)

We're happy to announce a few improvements to Analytics data releases on
> dumps.wikimedia.org:
>
> * We are releasing a new dataset, an estimate of Unique Devices accessing
> our projects [1]
> * We are officially making available a better Pageviews dataset [2]
> * We are deprecating two older pageview statistics datasets
> * We moved Analytics data from /other to /analytics [3]
>
> Details follow:
>
>
> *Unique Devices:* Since 2009, the Wikimedia Foundation had used comScore to
> report data about unique web visitors.  In January 2016, however, we
> decided to stop reporting comScore numbers [4] because of certain
> limitations in the methodology; these limitations translated into
> misreported mobile usage. We are now ready to replace comScore numbers with
> the Unique Devices dataset [5][1]. While the number of unique devices does
> not equal the number of unique visitors, it is a good proxy for that metric,
> meaning that a major increase in the number of unique devices is likely to
> come from an increase in distinct users. We understand that counting uniques
> raises fairly big privacy concerns, so we count unique devices in a
> privacy-conscious way: the method does not include any cookie by which your
> browser history can be tracked [6].
>
> We invite you to explore this new dataset and hope it’s helpful for the
> Wikimedia community in better understanding our projects. This data can
> help measure the reach of Wikimedia projects on the web.
>
> *Pageviews:* This [2] is the best quality data available for counting the
> number of pageviews our projects receive at the article and project level.
> We've upgraded from pagecounts-raw to pagecounts-all-sites, and now to
> pageviews, in order to filter out more spider traffic and measure something
> closer to what we think is a real user viewing content.  A short history
> might be useful:
>
>     * pagecounts-raw: was maintained by Domas Mituzas originally and taken
> over by the analytics team.  It was and still is the most used dataset,
> though it has some major problems.  It does not count access to the mobile
> site, it does not filter out spider or bot traffic, and it suffers from
> unknown loss due to logging infrastructure limitations.
>     * pagecounts-all-sites: uses the same pageview definition as
> pagecounts-raw, and so also does not filter out spider or bot traffic.  But
> it does include access to mobile and zero sites, and is built on a more
> reliable logging infrastructure.
>     * pagecounts-ez: is derived from the best data available at the time.
> So until December 2015, it was based on pagecounts-raw and
> pagecounts-all-sites, and now it's based on pageviews.  This dataset is
> great because it compresses very large files without losing any
> information, still providing hourly page and project level statistics.
>
> So the new dataset, pageviews, is what's behind our pageview API and is
> now available in static files for bulk download back to May 2015.  But
> having multiple ways to download pageview data is confusing for consumers, so
> we're keeping only pageviews and pagecounts-ez and deprecating the other
> two.  If you'd like to read more about the current pageview definition,
> details are on the research page [7].
>
> *Deprecating:* We are deprecating the pagecounts-raw and
> pagecounts-all-sites datasets in May 2016 (discussion here:
> https://phabricator.wikimedia.org/T130656 ).  This data suffers from many
> artifacts, lack of mobile data, and/or infrastructure problems, and so is
> not comparable to the new way we track pageviews.  It will remain here
> because we have historical data that may be useful, but it will not be
> maintained or updated beyond May 2016.
>
> *Clean-up:* Analytics data on dumps was crammed into /other with
> unrelated datasets.  We made a new page to receive current and future
> datasets [3] and linked to it from /other and /.  Please let us know if
> anything there looks confusing or opaque and I'll be happy to clarify.
>
>
> [1] http://dumps.wikimedia.org/other/unique_devices
> [2] http://dumps.wikimedia.org/other/pageviews
> [3] http://dumps.wikimedia.org/analytics/
> [4] https://meta.wikimedia.org/wiki/ComScore/Announcement
> [5] https://meta.wikimedia.org/wiki/Research:Unique_Devices
> [6] https://meta.wikimedia.org/wiki/Research:Unique_Devices#How_do_we_count_unique_devices.3F
> [7] https://meta.wikimedia.org/wiki/Research:Page_view
>
