@Vipul: thanks for flagging this.  We accidentally merged a change that
caused pages with a + in their title to be ignored during the period that
Marcel mentioned: April 24th to June 6th, 2019.  The relevant commits in
our history are these:

accident:
https://phabricator.wikimedia.org/rANRSd7e2b6bc1d69eeef2907df7b42bca62936149353
fix:
https://phabricator.wikimedia.org/rANRS561868c68415fba92f05d78bb322be8a58bce79c

The raw data is purged regularly, so we couldn't rebuild the affected
period.  We have also generally chosen to annotate our data rather than
rebuild it.  I have now added this incident to the relevant page:

https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FData_Lake%2FTraffic%2FPageview_hourly&type=revision&diff=1849390&oldid=1838199
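
For anyone who wants to check other titles containing a + for the same gap,
here is a rough sketch in Python.  It only builds the per-article URL (with
the + percent-encoded as %2B, as in the links quoted below) and lists the
days in a range that have no entry.  The helper names per_article_url and
missing_days are made up for illustration, and I am assuming each returned
item carries a "timestamp" string of the form YYYYMMDDHH; fetching the URL
is left out.

```python
from datetime import date, timedelta
from urllib.parse import quote

API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, access, agent, title, start, end):
    """Build a daily per-article pageviews URL (hypothetical helper).

    start and end are datetime.date objects.
    """
    # safe="" makes quote() escape every reserved character,
    # so "+" becomes "%2B" instead of being passed through.
    encoded = quote(title.replace(" ", "_"), safe="")
    return "/".join([
        API_ROOT, project, access, agent, encoded, "daily",
        start.strftime("%Y%m%d"), end.strftime("%Y%m%d"),
    ])


def missing_days(timestamps, start, end):
    """Return the dates in [start, end] that have no entry in timestamps.

    Assumption: timestamps are strings such as "2019040100" (YYYYMMDDHH).
    """
    seen = {ts[:8] for ts in timestamps}
    gaps = []
    day = start
    while day <= end:
        if day.strftime("%Y%m%d") not in seen:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps
```

Passing the timestamps from an April 2019 response for an affected title to
missing_days should list the days from April 25 onward, matching what Vipul
saw.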

On Thu, Jan 2, 2020 at 9:58 AM Marcel Ruiz Forns <[email protected]>
wrote:

> Hi Vipul!
> Thanks for letting us know about this.
> This is indeed a problem. And I think it's related to the + special
> character in the title of the page.
> I checked general traffic for English Wikipedia, and it looks OK to me.
> But then I checked other pages with the same + character in them, and they
> show the same pattern.
> They stop somewhere in the middle of April 24th and come back in the
> middle of June 6th.
> I created a task for this; we'll be prioritizing it soon.
> See: https://phabricator.wikimedia.org/T241734
> Thanks a lot!
>
> On Wed, Jan 1, 2020 at 6:39 PM Vipul Naik <[email protected]> wrote:
>
>> I was trying to get pageviews data for the Travel + Leisure Wikipedia
>> page https://en.wikipedia.org/wiki/Travel_%2B_Leisure
>>
>> It seems like the data is missing for the month of May on desktop. In
>> particular, this link returns a Not found error:
>>
>>
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Travel_%2B_Leisure/daily/20190501/20190531?purge1328419450
>>
>> The corresponding links for April and June return data, but the last few
>> days of April and the first few days of June are missing:
>>
>>
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Travel_%2B_Leisure/daily/20190601/20190630?purge1595833545
>> (data is missing for June 1 to 5 but present June 6 onward)
>>
>>
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Travel_%2B_Leisure/daily/20190401/20190430?purge1328419450
>> (data is missing for April 25 onward)
>>
>> The same is true on mobile-web.
>>
>> I thought it's possible the article was deleted and then reinstated, but
>> the revision history doesn't suggest any changes during the time period,
>> and there is no update on the talk page and nothing in the deletion log.
>>
>> Any ideas?
>>
>> I've also noticed the pageviews API occasionally omitting data for a few
>> days for other queries, though a re-query usually works to fill in the
>> missing data. For instance,
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Alcohol_and_cancer/daily/20191101/20191130
>> originally returned no results for me, but on a re-query I was able to
>> get results.
>> I'll share more information on this in a separate email if I'm able to
>> reproduce.
>>
>> Thank you,
>>
>> Vipul
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>
>
> --
> *Marcel Ruiz Forns* (he/him)
> Analytics Developer @ Wikimedia Foundation
>