> "At about 0.5% of our total human views currently, they start to matter for overall traffic trend analysis etc."

Our bot traffic that is not reported as such is a lot higher than 0.5%, probably closer to an order of magnitude higher, and at least 2%. See: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageviews/Bots_Research
This matters a lot for computations like top pageviews, which are so distorted by bot traffic that they become almost useless. While these are overall numbers, the effect is felt mostly on English Wikipedia, as smaller Wikipedias have a much smaller percentage of unreported bots.

On Thu, Aug 3, 2017 at 8:59 AM, Tilman Bayer <[email protected]> wrote:

> For those with NDA access, see also the more detailed investigation at
> https://phabricator.wikimedia.org/T157404 (nothing super secret about the
> topic per se, it's just that some partial IP data was examined in the
> process, so the task was set to non-public to avoid privacy concerns)
>
> When filing that task half a year ago, I wrote that "At about 0.5% of our
> total human views currently, they start to matter for overall traffic trend
> analysis etc." They have since increased and, as can be gleaned from
> Kaldari's remarks, do indeed affect our global stats markedly now. I have
> started to remove them in the pageviews stats and trends I'm preparing,
> will follow up with more detail on Phabricator.
>
> On Fri, Jul 21, 2017 at 9:24 PM, Nuria Ruiz <[email protected]> wrote:
>
>> >Surely this can't be accurate though as most other sites on the
>> >internet report virtually non-existent usage of IE7 (less than 1%
>> >everywhere I've checked). Can someone double-check this?
>>
>> This is likely bot traffic with IE7 user-agent. See:
>> https://phabricator.wikimedia.org/T148461
>>
>> We will hopefully be able to tackle distortion of stats by non-labelled
>> bot traffic in the next year: https://phabricator.wikimedia.org/T138207
>>
>> Issue for dataset noted here:
>> https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Browser_general#Changes_and_known_problems_since_2016-03-21
>>
>> On Fri, Jul 21, 2017 at 4:36 PM, Ryan Kaldari <[email protected]> wrote:
>>
>>> According to...
>>> https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser/browser-family-and-major-hierarchical-view
>>> ... IE7 accounts for 2.5% of all pageviews in the last month.
>>>
>>> According to...
>>> https://analytics.wikimedia.org/dashboards/browsers/#desktop-site-by-browser/browser-family-and-major-hierarchical-view
>>> ... IE7 accounts for 5.1% of all desktop pageviews in the last month.
>>>
>>> If that's true, IE7 (which came out 10 years ago) is more popular than
>>> all versions of Safari combined. It also means that we need to roll back a
>>> whole slew of features in MediaWiki that aren't supported in IE7.
>>>
>>> Surely this can't be accurate though as most other sites on the internet
>>> report virtually non-existent usage of IE7 (less than 1% everywhere I've
>>> checked). Can someone double-check this?
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>
> --
> Tilman Bayer
> Senior Analyst
> Wikimedia Foundation
> IRC (Freenode): HaeB
