The difference is very small, but you're right to point it out. I've opened
a task to look into it: https://phabricator.wikimedia.org/T205457
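
To put a number on "very small", here is a minimal sketch that computes the gap from the figures in the quoted message below. The names `API_TOTALS`, `DUMP_TOTALS`, and `discrepancy` are made up for illustration; only the four totals come from the message.

```python
# Totals copied from the quoted message (October 1, 2015, "user" agents).
API_TOTALS = {"en.wikipedia": 238_845_634, "pt.wikipedia": 11_390_043}
DUMP_TOTALS = {"en.wikipedia": 238_840_836, "pt.wikipedia": 11_389_979}


def discrepancy(api_total, dump_total):
    """Return (absolute difference, difference relative to the API total)."""
    diff = api_total - dump_total
    return diff, diff / api_total


for project in API_TOTALS:
    diff, rel = discrepancy(API_TOTALS[project], DUMP_TOTALS[project])
    # Print the absolute gap and the gap as a percentage of the API total.
    print(f"{project}: {diff} views ({rel:.4%})")
```

In both cases the API total is the larger one, and the gap is well under a hundredth of a percent of the daily total.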


On Wed, Sep 19, 2018 at 5:10 PM Felix J. Scholz <[email protected]>
wrote:

> Hey,
>
> I've been looking through the documentation on the pageview API in recent
> days, and I have a question that I have not been able to answer so far.
>
> Per my understanding, the data accessible through the "aggregated by
> project" pageview api [1], when filtered to just query "user" agents,
> should return the same results as can be found in the hourly pageview dumps
> data [2 / 3].
>
> However, while the data is close, in two of my brief tests (for the data
> of October 1, 2015) the values did not match up.
>
> Data from "aggregate" API:
> en.wikipedia & excluding spiders [4]: 238.845.634
> pt.wikipedia & excluding spiders [5]: 11.390.043
>
> Data from pageview dumps [3]:
> en & en.zero & en.m: 238.840.836
> pt & pt.zero & pt.m: 11.389.979
>
> As you can see, while the values are close, they do not match.
>
> What am I missing here? Am I mistaken in assuming that the two datasets
> are derived from the same source and should therefore match?
>
> Felix
>
> [1] https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews
> [2]
> https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageviews
> [3] https://dumps.wikimedia.org/other/pageviews/
> [4]
> https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/en.wikipedia/all-access/user/daily/2015100100/2015100100
> [5]
> https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/pt.wikipedia/all-access/user/daily/2015100100/2015100100
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
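
For anyone who wants to reproduce the dump-side totals (for example "en & en.zero & en.m"), the following is a rough sketch of the summation. It assumes each line of an hourly dump file has four space-separated fields: domain code, page title, view count, and response size. That layout is an assumption here and should be checked against the dumps documentation [2]. The function name `sum_views` and the sample lines are invented for illustration and are not real data.

```python
def sum_views(lines, domain_codes):
    """Sum the view count over lines whose domain code is in domain_codes."""
    total = 0
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        # Skip malformed lines instead of failing on them.
        if len(parts) != 4 or not parts[2].isdigit():
            continue
        if parts[0] in domain_codes:
            total += int(parts[2])
    return total


# Made-up sample lines, for illustration only (not real dump data).
sample = [
    "en Main_Page 10 0",
    "en.m Main_Page 7 0",
    "en.zero Main_Page 1 0",
    "pt Brasil 5 0",
]

print(sum_views(sample, {"en", "en.m", "en.zero"}))
```

The hourly files are gzip-compressed, so in practice each one would be opened with something like `gzip.open(path, "rt")` and the results added up over all 24 hours of the day before comparing against the daily API figure.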