>The good source for recent pageview data is Hadoop; for older data, the
well-loved webstatscollector files provide that info:

Sorry, I meant to send two links:

http://dumps.wikimedia.org/other/pagecounts-all-sites/ -> this is data from
hadoop

http://dumps.wikimedia.org/other/pagecounts-raw/ -> this is data from
webstatscollector
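
On the sampling question quoted below, here is a minimal sketch of the
"count and multiply by 1000" arithmetic, with a rough idea of how noisy the
result is. The event count is a made-up number, and the error estimate
assumes simple random sampling with Poisson-like counting noise. Note that it
only estimates how many events NavigationTiming would have logged without
sampling. It does not turn those events into pageviews as the definition
describes them.

```python
import math

SAMPLING_RATE = 1000  # 1:1000 random sampling, as stated in the thread


def estimate_total(sampled_events, sampling_rate=SAMPLING_RATE):
    """Scale a sampled event count up to an estimated unsampled total."""
    return sampled_events * sampling_rate


def approx_relative_error(sampled_events):
    """Rough relative standard error of the estimate.

    Assumes each event is sampled independently, so the sampled count has
    roughly Poisson noise: standard deviation ~ sqrt(count).
    """
    if sampled_events <= 0:
        raise ValueError("need at least one sampled event")
    return 1.0 / math.sqrt(sampled_events)


if __name__ == "__main__":
    sampled = 2500  # hypothetical number of logged events for one namespace
    total = estimate_total(sampled)
    rel_err = approx_relative_error(sampled)
    print(f"estimated events: {total} (+/- about {rel_err:.0%})")
```

With only a few sampled events per namespace the relative error grows
quickly, which is one reason a heavily sampled dataset is a poor basis for
per-namespace estimates.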

On Thu, Jan 8, 2015 at 7:34 AM, Nuria Ruiz <[email protected]> wrote:

> >It uses 1:1000 random sampling, so I have to count the log events and
> multiply by 1000 to get a good estimation. Am I missing something?
> Quite a bit, actually. Mostly that reporting is only available in "some"
> browsers (the majority, but not all). Also, only the main document is
> counted, and a pageview is more than the request for the main document. For
> example, you will not get all 301s/302s or images, and there are many, many
> other details.
>
> See pageview definition:
> https://meta.wikimedia.org/wiki/Research:Page_view
>
> The good source for recent pageview data is Hadoop; for older data, the
> well-loved webstatscollector files provide that info:
> http://dumps.wikimedia.org/other/pagecounts-all-sites/
>
>
> On Wed, Jan 7, 2015 at 11:12 PM, Gergo Tisza <[email protected]> wrote:
>
>> On Wed, Jan 7, 2015 at 5:59 PM, Nuria Ruiz <[email protected]> wrote:
>>
>>> >Back when MediaViewer was launched, I added a namespace parameter to
>>> NavigationTiming to be able to track per-namespace pageviews,
>>> Navigation timing is heavily sampled, so I am not sure you could estimate
>>> pageviews with the scarce dataset it provides. I would say it is not
>>> possible.
>>>
>>
>> It uses 1:1000 random sampling, so I have to count the log events and
>> multiply by 1000 to get a good estimation. Am I missing something?
>>
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
>