Thank you very much, Neil! This is very helpful. :)

On Mon, 21 Dec 2020, 4:54 pm Neil Shah-Quinn, <[email protected]> wrote:
> Unfortunately, I can't tell you anything more than what you already know!
>
> I think that huge, temporary spikes in edits or pageviews that don't match
> expected patterns of human use (like the death of a celebrity or a big
> editing campaign) are most likely caused by bots. With editing spikes, I
> can usually confirm this belief by examining the edits. With pageview
> spikes, it's much harder. If the spike was in the last 90 days, I could
> investigate more by looking at the confidential raw traffic data
> <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest>,
> but after 90 days, that data is deleted to protect user privacy.
>
> The case you mentioned fits all my criteria. First, it is a huge,
> temporary spike. Second, it doesn't match expected patterns of human use:
> there is no matching spike in mobile pageviews and the pages involved are
> not pages humans would want to read. So, it's for these reasons only that I
> am confident that it was caused by bots.
>
> Now, *why* would someone use a bot to access millions of Bangla Wikipedia
> articles for a single month? I have no idea. It could just be a programmer
> somewhere doing an experiment. Your guess is as good as mine 😊
>
> On Fri, 18 Dec 2020 at 21:52, Ankan Ghosh Dastider <[email protected]> wrote:
>
>> Hi Neil,
>>
>> Thank you very much for responding so fast.
>>
>> That could be the answer! Can you please share any definite (or
>> relevant) information regarding the error at that time, if possible? Can
>> you give me any idea why bot views increase so much in a certain
>> year (and on certain dates)? If possible, an example would be really
>> helpful.
>>
>> Ankan
>>
>> On Fri, Dec 18, 2020 at 10:01 PM Neil Shah-Quinn <[email protected]> wrote:
>>
>>> That's a good question! I think the most likely explanation is that a
>>> bot automatically viewed those pages.
>>> I see that you have already removed
>>> "spider" and "automated" traffic in your Wikistats graphs, but those
>>> classifications are not perfect. Before March 2020, they only detected bots
>>> that explicitly marked themselves as bots. Now, our methods are more
>>> sophisticated
>>> <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/BotDetection>,
>>> but I am sure they still miss some things.
>>>
>>> On Fri, 18 Dec 2020 at 18:48, Ankan Ghosh Dastider <[email protected]> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> I am Ankan, a Wikimedian from Bangladesh. Recently, I was searching
>>>> the Wikimedia stats website for research purposes. I got a bit curious
>>>> about the Bengali Wikipedia total page view section
>>>> <https://stats.wikimedia.org/#/bn.wikipedia.org/reading/total-page-views/normal%7Cbar%7Call%7Caccess~desktop*mobile-app*mobile-web+(agent)~user%7Cmonthly>,
>>>> as the traffic in January 2018 didn't match the normal flow and showed a
>>>> sudden surge of desktop access by users. It is unprecedented and the
>>>> highest to date. If you check the normal rate of desktop access, you will
>>>> see that it is about 450% compared with the second highest.
>>>>
>>>> The pageview results suggest that the top-visited pages are
>>>> category-related and date-related pages (the most visited one is
>>>> 'Category:Stubs', see here
>>>> <https://pageviews.toolforge.org/?project=bn.wikipedia.org&platform=desktop&agent=user&redirects=0&start=2018-01-01&end=2018-01-31&pages=%E0%A6%AC%E0%A6%BF%E0%A6%B7%E0%A6%AF%E0%A6%BC%E0%A6%B6%E0%A7%8D%E0%A6%B0%E0%A7%87%E0%A6%A3%E0%A7%80:%E0%A6%85%E0%A6%B8%E0%A6%AE%E0%A7%8D%E0%A6%AA%E0%A7%82%E0%A6%B0%E0%A7%8D%E0%A6%A3>),
>>>> which is quite puzzling, as these pages are hardly ever viewed by general
>>>> readers. The results for certain dates in January 2018 are completely
>>>> exceptional.
>>>>
>>>> Note that I have checked some other languages, and the rate is normal
>>>> there.
>>>>
>>>> I am seeking your assistance to analyze the probable reason behind this
>>>> surge. Thanks in advance!
>>>>
>>>> Best regards,
>>>> Ankan
>>>>
>>>> --
>>>> Ankan Ghosh Dastider (he/him)
>>>> User:ANKAN <https://meta.wikimedia.org/wiki/User:ANKAN> || All Wikimedia
>>>> Foundation <https://meta.wikimedia.org/wiki/Wikimedia_Foundation>'s
>>>> public Wiki
>>>> Executive Member || Wikimedia Bangladesh <http://wikimedia.org.bd/>
>>>> Twitter <https://twitter.com/Iagdastider> | LinkedIn
>>>> <https://www.linkedin.com/in/ankan-ghosh-dastider/> | ResearchGate
>>>> <https://www.researchgate.net/profile/Ankan_Ghosh_Dastider>
>>>> _______________________________________________
>>>> Analytics mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>> --
>>> Neil Shah-Quinn
>>> senior data scientist, Product Analytics
>>> <https://www.mediawiki.org/wiki/Product_Analytics>
>>> Wikimedia Foundation <https://wikimediafoundation.org/>
_______________________________________________ Analytics mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/analytics
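
The check described in the quoted reply (a large, short-lived desktop spike with no matching spike in mobile pageviews) could be sketched roughly as follows. This is only an illustrative sketch, not the Wikimedia Foundation's bot-detection method: the function name `flag_suspect_months`, the `ratio` threshold, and the monthly counts are all hypothetical and are not real Wikistats data.

```python
from statistics import median


def flag_suspect_months(desktop, mobile, ratio=3.0):
    """Return months where desktop views spike but mobile views do not.

    desktop, mobile: dicts mapping "YYYY-MM" to monthly pageview counts.
    ratio: hypothetical threshold; a month counts as a spike when its
    count exceeds `ratio` times the median of its own series.
    """
    desktop_base = median(desktop.values())
    mobile_base = median(mobile.values())
    suspects = []
    for month in sorted(desktop):
        desktop_spike = desktop[month] > ratio * desktop_base
        mobile_spike = mobile.get(month, 0) > ratio * mobile_base
        # A desktop-only spike does not look like normal human reading.
        if desktop_spike and not mobile_spike:
            suspects.append(month)
    return suspects


# Made-up counts for illustration only (not real Wikistats data).
desktop = {
    "2017-11": 4_000_000,
    "2017-12": 4_200_000,
    "2018-01": 19_000_000,
    "2018-02": 4_100_000,
}
mobile = {
    "2017-11": 6_000_000,
    "2017-12": 6_300_000,
    "2018-01": 6_500_000,
    "2018-02": 6_200_000,
}

print(flag_suspect_months(desktop, mobile))  # prints ['2018-01']
```

A flagged month is only a hint that automated traffic may be involved. As the reply notes, confirming it would require the raw traffic data, which is deleted after 90 days.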
