As Billinghurst was kind enough to point out, I got that the wrong way round. Serves me right for replying to emails at 6am!
The weekly effect is definitely seasonal, which supports the idea that it's not artificial. As to what causes it, either (1) humans are fallible and "regular" automata are less so (irregular automata, who knows?), or (2) as Nemo suggests, people are searching for different things, which would be fascinating to analyse if we could :/

On 5 January 2016 at 06:04, Oliver Keyes <[email protected]> wrote:
> The other way around, I think; only bots work Sundays. We know a lot
> of the search queries that don't work /shouldn't/ work: they're
> producing no results because they're nonsense, or spam, or someone
> being silly through the API. Normal human traffic rises on a Monday to
> peak on a Tuesday, and begins to drop down again towards the end of
> the week and weekend. What this means is that the proportion of
> traffic coming from non-humans is greater on the weekends (because
> fewer people are browsing) and that increases the impact of automata
> on the zero results rate for those days.
>
> On 4 January 2016 at 23:28, billinghurst <[email protected]> wrote:
>> What is with the issue that we have a weekly cycle (exactly?) where there is a
>> 4% difference in the success in half a week, EVERY WEEK!
>>
>> With the number of searches done on the site, that seems like an aberration
>> that each Sunday is a more accurate search day!?! Analytical gremlins of
>> data capture, or not even bots work Sundays?
>>
>>
>> On Tue, 5 Jan 2016 06:54 Oliver Keyes <[email protected]> wrote:
>>>
>>> (Links: the dashboards live at http://discovery.wmflabs.org/ and an
>>> example of automata filtering can be seen at
>>> http://discovery.wmflabs.org/metrics/#failure_rate !)
>>>
>>> That is, 2% and 5% lower? You're looking at percentages so where the
>>> lines vary between checkbox options it'll be different proportions.
>>> Unless there's a graph I'm missing :D
>>>
>>> On 4 January 2016 at 13:45, Trey Jones <[email protected]> wrote:
>>> > This is awesome.
>>> > Roughly, by eye, it looks like automata are about 2% of ZRR
>>> > overall and 5% of ZRR for fulltext search, which was around 15% before
>>> > the holidays (and lower over the holidays, during The Time of Unreliable
>>> > User Behavior).
>>> >
>>> > Is there a write up for this project? I know it had to be a ton of work,
>>> > and I'm curious about the details (possibly more so than most).
>>> >
>>> > Do you think you got most of them? Or was the result high-precision but
>>> > not exhaustive?
>>> >
>>> > Thanks for working on this!
>>> >
>>> > -Trey
>>> >
>>> > Trey Jones
>>> > Software Engineer, Discovery
>>> > Wikimedia Foundation
>>> >
>>> > On Mon, Jan 4, 2016 at 1:29 PM, Oliver Keyes <[email protected]>
>>> > wrote:
>>> >>
>>> >> Hey all,
>>> >>
>>> >> After several weeks of work to switch all the scripts over and
>>> >> backfill, all the Discovery dashboards now have the ability to filter
>>> >> crawlers and automated software out from graphs where that is
>>> >> relevant. You should notice a simple checkbox on, for example, the
>>> >> Zero Results Rate data or Wikidata Query Service traffic.
>>> >>
>>> >> While a bit of backfilling is still waiting on the servers syncing up,
>>> >> this work is essentially complete, and provides another way to look at
>>> >> data on how people are using search (and who those people are). It was
>>> >> a heck of a lot of work, by both myself and Mikhail, but it's
>>> >> hopefully valuable :).
>>> >>
>>> >> For Discovery Analytics,
>>> >>
>>> >> --
>>> >> Oliver Keyes
>>> >> Count Logula
>>> >> Wikimedia Foundation
>>> >>
>>> >> _______________________________________________
>>> >> discovery mailing list
>>> >> [email protected]
>>> >> https://lists.wikimedia.org/mailman/listinfo/discovery

--
Oliver Keyes
Count Logula
Wikimedia Foundation
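
P.S. The weekend explanation quoted earlier in the thread (automated traffic stays roughly steady while human traffic dips, so automata weigh more heavily on the zero results rate) can be sketched with a few lines of arithmetic. Every number below is invented for illustration: the daily volumes, the constant bot volume, and both per-group zero-result rates are assumptions, not figures from the dashboards, and the function names are hypothetical.

```python
# Toy model of a weekly cycle in the zero results rate (ZRR).
# All figures are made up for illustration; none come from the dashboards.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
HUMAN_SEARCHES = [1000, 1100, 1050, 1000, 950, 700, 650]  # assumed weekly shape
BOT_SEARCHES = 200       # assumed to be the same every day
HUMAN_ZERO_RATE = 0.10   # assumed share of human queries returning nothing
BOT_ZERO_RATE = 0.60     # assumed share of automated queries returning nothing


def zero_results_rate(human, bot):
    """Overall ZRR for one day, given human and automated query counts."""
    zero_results = human * HUMAN_ZERO_RATE + bot * BOT_ZERO_RATE
    return zero_results / (human + bot)


def weekly_rates(bot=BOT_SEARCHES):
    """Map each day to its overall ZRR for the assumed traffic mix."""
    return {day: zero_results_rate(h, bot) for day, h in zip(DAYS, HUMAN_SEARCHES)}


if __name__ == "__main__":
    for day, rate in weekly_rates().items():
        print(f"{day}: {rate:.1%}")
```

With these invented inputs the overall rate is highest on Sunday and lowest on Tuesday, even though neither group's own zero-result rate changes during the week; only the mix does. Calling `weekly_rates(bot=0)`, which mimics ticking the filter checkbox, gives the same rate every day. That only covers explanation (1); it says nothing about whether people also search for different things at weekends.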
