Jorg, the project abbreviations are explained in depth here: https://wikitech.wikimedia.org/wiki/Analytics/Data/Pageviews
On Mon, Mar 6, 2017 at 11:15 AM, Jörg Jung <[email protected]> wrote:

> Yeah, Dan, that will work, thanks.
>
> Just out of curiosity: why are there three projects for "de", and what is
> the difference between them? /de/, /de.m/ and /de.zero/
>
> Cheers, JJ
>
> On 06.03.2017 at 15:45, Dan Andreescu wrote:
>> Jorg, take a look at https://dumps.wikimedia.org/other/pagecounts-ez/,
>> which has compressed data without losing granularity. You can get
>> monthly files there and download a lot less data.
>>
>> On Mon, Mar 6, 2017 at 5:40 AM, Jörg Jung <[email protected]> wrote:
>>> Marcel,
>>>
>>> thanks for your quick answer.
>>> My main issue with the dumps (unless I'm missing something) is:
>>>
>>> I need to download them first to be able to aggregate and filter.
>>> For the year 2016 that would be about 40 MB (average) * 24 h * 30 d * 12 m,
>>> i.e. roughly 350 GB.
>>>
>>> As I am not sitting directly at DE-CIX but in my private office, I will
>>> face a pretty hard time with that :-)
>>>
>>> So my idea is that somebody "closer" to the raw data would basically do
>>> the aggregation and filtering for me...
>>>
>>> Will somebody (please)?
>>>
>>> Thanks, JJ
>>>
>>> On 06.03.2017 at 11:14, Marcel Ruiz Forns wrote:
>>>> Hi Jörg, :]
>>>>
>>>> Do you mean the top 250K most viewed *articles* on de.wikipedia.org?
>>>>
>>>> If so, I think you can get that from the dumps indeed. You can find 2016
>>>> hourly pageview stats by article for all wikis here:
>>>> https://dumps.wikimedia.org/other/pageviews/2016/
>>>>
>>>> Note that the wiki codes (first column) you're interested in are /de/,
>>>> /de.m/ and /de.zero/. The third column holds the number of pageviews
>>>> you're after.
>>>> Also, this data set does not include bot traffic as recognized by the
>>>> pageview definition: https://meta.wikimedia.org/wiki/Research:Page_view
>>>> As files are hourly and contain data for all wikis, you'll need some
>>>> aggregation and filtering.
>>>>
>>>> Cheers!
>>>>
>>>> On Mon, Mar 6, 2017 at 2:59 AM, Jörg Jung <[email protected]> wrote:
>>>>> Ladies, gents,
>>>>>
>>>>> for a project I am planning I'd need the following data:
>>>>>
>>>>> the top 250K pages for 2016 in the project de.wikipedia.org, user access.
>>>>>
>>>>> I only need the name of the page and the corresponding number of
>>>>> user accesses (all channels) for 2016 (summed over the year).
>>>>>
>>>>> As far as I can see, I can't get that data via REST or by aggregating
>>>>> dumps.
>>>>>
>>>>> So I'd like to ask here if someone would like to help out.
>>>>>
>>>>> Thanks, cheers, JJ
>>>>>
>>>>> --
>>>>> Jörg Jung, Dipl. Inf. (FH)
>>>>> Hasendriesch 2
>>>>> D-53639 Königswinter
>>>>> E-Mail: [email protected]
>>>>> Web: www.retevastum.de
>>>>>      www.datengraphie.de
>>>>>      www.digitaletat.de
>>>>>      www.olfaktum.de
>>>>
>>>> --
>>>> *Marcel Ruiz Forns*
>>>> Analytics Developer
>>>> Wikimedia Foundation
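For reference, the aggregation and filtering Marcel describes (keep rows whose first column is de, de.m, or de.zero, then sum the third column per article across the hourly files) could be sketched in Python along these lines. This is an illustrative sketch, not an official tool: it assumes the space-separated "wiki_code article_title views bytes" layout of the hourly pageview dumps, and the function names and top-N cutoff are my own.

```python
from collections import Counter
import gzip
import heapq

# Desktop, mobile web, and Wikipedia Zero traffic for de.wikipedia.org
WIKI_CODES = {"de", "de.m", "de.zero"}

def aggregate(paths):
    """Sum pageviews per article title over the given hourly dump files."""
    totals = Counter()
    for path in paths:
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", encoding="utf-8", errors="replace") as f:
            for line in f:
                parts = line.split()
                # Expected layout: wiki_code article_title views bytes
                if len(parts) < 3 or parts[0] not in WIKI_CODES:
                    continue
                try:
                    totals[parts[1]] += int(parts[2])
                except ValueError:
                    continue  # skip malformed count fields
    return totals

def top_n(totals, n=250_000):
    """Return the n most viewed (article, views) pairs, highest first."""
    return heapq.nlargest(n, totals.items(), key=lambda kv: kv[1])
```

Because each hourly file is processed as a stream and only the per-article counters are kept in memory, this avoids holding a year's worth of raw dumps at once, though the full download is still needed unless the filtering runs closer to the data.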
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
