OK, guys, thanks a lot!

On 06.03.2017 at 17:33, Dan Andreescu wrote:
> Jorg, the project abbreviations are explained in depth
> here: https://wikitech.wikimedia.org/wiki/Analytics/Data/Pageviews
> 
> On Mon, Mar 6, 2017 at 11:15 AM, Jörg Jung <[email protected]> wrote:
> 
>     Yeah, Dan, that will work, thanks.
> 
>     Just out of curiosity: why are there three projects for "de", and what is
>     the difference between them: /de/, /de.m/ and /de.zero/?
> 
>     Cheers, JJ
> 
>     On 06.03.2017 at 15:45, Dan Andreescu wrote:
>     > Jorg, take a look at https://dumps.wikimedia.org/other/pagecounts-ez/
>     > which has compressed data without losing granularity.  You can get
>     > monthly files here and download a lot less data.
>     >
>     > On Mon, Mar 6, 2017 at 5:40 AM, Jörg Jung <[email protected]> wrote:
>     >
>     >     Marcel,
>     >
>     >     Thanks for your quick answer.
>     >     My main issue with the dumps (unless I'm missing something) is:
>     >
>     >     I need to download them first to be able to aggregate and filter.
>     >     For the year 2016 that would be: 40 MB (average) * 24 h * 30 d * 12 m =
>     >     about 350 GB
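
Multiplying the estimate out is a quick sanity check on the order of magnitude. A minimal sketch, assuming the 40 MB average per hourly file quoted above and decimal units (1 GB = 1000 MB); the function name is hypothetical:

```python
def estimated_download_mb(avg_file_mb: float, hours_per_day: int = 24,
                          days_per_month: int = 30, months: int = 12) -> float:
    """Total size in MB when one file of avg_file_mb is published per hour."""
    return avg_file_mb * hours_per_day * days_per_month * months


total_mb = estimated_download_mb(40)
total_gb = total_mb / 1000  # assumption: decimal units, 1 GB = 1000 MB
print(f"{total_mb:.0f} MB = {total_gb:.1f} GB")
```

With these inputs the yearly total comes to roughly 346 GB.
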
>     >
>     >     As I am not sitting directly at DE-CIX but in my private office, I will
>     >     have a pretty hard time with that :-)
>     >
>     >     So my idea is that somebody "closer" to the raw data would basically do
>     >     the aggregation and filtering for me...
>     >
>     >     Will somebody (please)?
>     >
>     >     Thanks, JJ
>     >
>     >     On 06.03.2017 at 11:14, Marcel Ruiz Forns wrote:
>     >     > Hi Jörg, :]
>     >     >
>     >     > Do you mean top 250K most viewed *articles* in de.wikipedia.org?
>     >     >
>     >     > If so, I think you can get that from the dumps indeed. You can find 2016
>     >     > hourly pageview stats by article for all wikis
>     >     > here: https://dumps.wikimedia.org/other/pageviews/2016/
>     >     >
>     >     > Note that the wiki codes (first column) you're interested in are: /de/,
>     >     > /de.m/ and /de.zero/.
>     >     > The third column holds the number of pageviews you're after.
>     >     > Also, this data set does not include bot traffic as recognized by the
>     >     > pageview definition
>     >     > <https://meta.wikimedia.org/wiki/Research:Page_view>.
>     >     > As files are hourly and contain data for all wikis, you'll need some
>     >     > aggregation and filtering.
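
The filtering and aggregation described above could look roughly like the following. This is a minimal sketch, assuming each hourly file is a gzip-compressed, space-separated text file with the wiki code in the first column, the page title in the second and the view count in the third (only the first and third columns are confirmed above). The function names and the local file pattern are hypothetical.

```python
import glob
import gzip
from collections import Counter
from typing import Iterable

# Wiki codes named in the thread for de.wikipedia.org.
DE_CODES = {"de", "de.m", "de.zero"}


def add_counts(lines: Iterable[str], totals: Counter) -> None:
    """Add the view counts of matching lines to the running totals."""
    for line in lines:
        fields = line.split(" ")
        if len(fields) < 3 or fields[0] not in DE_CODES:
            continue  # other wiki or malformed line
        try:
            totals[fields[1]] += int(fields[2])
        except ValueError:
            continue  # count column is not a number


def top_pages(paths: Iterable[str], n: int = 250_000) -> list[tuple[str, int]]:
    """Sum views per title over all given hourly files and return the top n."""
    totals: Counter = Counter()
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            add_counts(handle, totals)
    return totals.most_common(n)


if __name__ == "__main__":
    # Assumption: the hourly files were downloaded into the current directory.
    for title, views in top_pages(sorted(glob.glob("pageviews-2016*.gz")))[:10]:
        print(views, title)
```

Files are streamed one at a time and only one counter entry per distinct title is kept, so memory use depends on the number of titles rather than on the total download size.
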
>     >     >
>     >     > Cheers!
>     >     >
>     >     > On Mon, Mar 6, 2017 at 2:59 AM, Jörg Jung <[email protected]> wrote:
>     >     >
>     >     >     Ladies, gents,
>     >     >
>     >     >     for a project I am planning, I need the following data:
>     >     >
>     >     >     Top 250K sites for 2016 in project de.wikipedia.org, user access.
>     >     >
>     >     >     I only need the name of each site and the corresponding number of
>     >     >     user accesses (all channels) for 2016 (sum over the year).
>     >     >
>     >     >     As far as I can see, I can't get that data via REST or by aggregating
>     >     >     dumps.
>     >     >
>     >     >     So I'd like to ask here whether someone would like to help out.
>     >     >
>     >     >     Thanks, cheers, JJ
>     >     >
>     >     >     --
>     >     >     Jörg Jung, Dipl. Inf. (FH)
>     >     >     Hasendriesch 2
>     >     >     D-53639 Königswinter
>     >     >     E-Mail:     [email protected]
>     >     >     Web:        www.retevastum.de
>     >     >                 www.datengraphie.de
>     >     >                 www.digitaletat.de
>     >     >                 www.olfaktum.de
>     >     >
>     >     >     _______________________________________________
>     >     >     Analytics mailing list
>     >     >     [email protected]
>     >     >     https://lists.wikimedia.org/mailman/listinfo/analytics
>     >     >
>     >     >
>     >     >
>     >     >
>     >     > --
>     >     > *Marcel Ruiz Forns*
>     >     > Analytics Developer
>     >     > Wikimedia Foundation
>     >     >
>     >     >
>     >     >
>     >
> 

-- 
Jörg Jung, Dipl. Inf. (FH)
Hasendriesch 2
D-53639 Königswinter
E-Mail:     [email protected]
Web:        www.retevastum.de
            www.datengraphie.de
            www.digitaletat.de
            www.olfaktum.de

