@Vipul: thanks for flagging this. We accidentally merged a change that ignored pages with a + in their title for the time period that Marcel mentioned: April 24th to June 6th. The relevant commits in our history are these:
accident: https://phabricator.wikimedia.org/rANRSd7e2b6bc1d69eeef2907df7b42bca62936149353
fix: https://phabricator.wikimedia.org/rANRS561868c68415fba92f05d78bb322be8a58bce79c

The raw data is purged regularly so we couldn't rebuild. We also have generally chosen to annotate our data instead of rebuilding. I have now added this incident to the relevant page:
https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FData_Lake%2FTraffic%2FPageview_hourly&type=revision&diff=1849390&oldid=1838199

On Thu, Jan 2, 2020 at 9:58 AM Marcel Ruiz Forns <[email protected]> wrote:

> Hi Vipul!
> Thanks for letting us know about this.
> This is indeed a problem. And I think it's related to the + special
> character in the title of the page.
> I checked general traffic for English Wikipedia, and it looks OK to me.
> But then I checked other pages with the same + character in them, and they
> show the same pattern.
> They stop somewhere in the middle of April 24th and come back in the
> middle of June 6th.
> I created a task for this, we'll be prioritizing it soon.
> See: https://phabricator.wikimedia.org/T241734
> Thanks a lot!
>
> On Wed, Jan 1, 2020 at 6:39 PM Vipul Naik <[email protected]> wrote:
>
>> I was trying to get pageviews data for the Travel + Leisure Wikipedia
>> page https://en.wikipedia.org/wiki/Travel_%2B_Leisure
>>
>> It seems like the data is missing for the month of May on desktop. In
>> particular, this link returns a Not found error:
>>
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Travel_%2B_Leisure/daily/20190501/20190531?purge1328419450
>>
>> The corresponding links for April and June return data, but the last few
>> days of April and the first few days of June are missing:
>>
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Travel_%2B_Leisure/daily/20190601/20190630?purge1595833545
>> (data is missing for June 1 to 5 but present June 6 onward)
>>
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Travel_%2B_Leisure/daily/20190401/20190430?purge1328419450
>> (data is missing for April 25 onward)
>>
>> The same is true on mobile-web.
>>
>> I thought it's possible the article was deleted and then reinstated, but
>> the revision history doesn't suggest any changes during the time period,
>> and there is no update on the talk page and nothing in the deletion log.
>>
>> Any ideas?
>>
>> I've also noticed the pageviews API occasionally omitting data for a few
>> days for other queries, though a re-query usually works to fill in the
>> missing data. For instance,
>> https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/Alcohol_and_cancer/daily/20191101/20191130
>> originally returned no results for me but on a re-query I was able to get
>> results. I'll share more information on this in a separate email if I'm
>> able to reproduce.
>>
>> Thank you,
>>
>> Vipul
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>
>
> --
> *Marcel Ruiz Forns** (he/him)*
> Analytics Developer @ Wikimedia Foundation
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
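For anyone who wants to check other titles for the same gap, here is a minimal sketch of the two steps discussed in this thread: building a per-article URL with the title percent-encoded (so `+` becomes `%2B`, as in the links above), and listing which days in a requested range are absent from a response. The helper names `per_article_url` and `missing_days` are made up for illustration, and the sketch assumes each returned item carries a `timestamp` string of the form `YYYYMMDDHH`; it makes no network request itself.

```python
from datetime import date, timedelta
from urllib.parse import quote

# Endpoint prefix copied from the links quoted in this thread.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, access, agent, title, start, end):
    """Build a daily per-article URL (hypothetical helper).

    Spaces become underscores, then every reserved character,
    including '+', is percent-encoded.
    """
    article = quote(title.replace(" ", "_"), safe="")
    return (
        f"{BASE}/{project}/{access}/{agent}/{article}"
        f"/daily/{start:%Y%m%d}/{end:%Y%m%d}"
    )


def missing_days(items, start, end):
    """Return the dates in [start, end] that have no item (hypothetical helper).

    Assumes each item is a dict whose "timestamp" starts with YYYYMMDD.
    """
    seen = {item["timestamp"][:8] for item in items}
    gaps = []
    day = start
    while day <= end:
        if day.strftime("%Y%m%d") not in seen:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps
```

Passing the `items` list from a parsed response for June 2019 to `missing_days` should, given the gap described above, list June 1 to June 5 for titles containing `+`.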
