The other way around, I think; only bots work Sundays. We know that a lot of the search queries that don't work /shouldn't/ work: they produce no results because they're nonsense, or spam, or someone being silly through the API. Normal human traffic rises on Monday, peaks on Tuesday, and begins to drop again towards the end of the week and over the weekend. This means the proportion of traffic coming from non-humans is greater on weekends (because fewer people are browsing), which increases the impact of automata on the zero results rate for those days.
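To make the arithmetic concrete, here is a rough sketch of the effect. Every number in it is made up for illustration (the query volumes and the per-group zero results rates are assumptions, not figures from the dashboards), and the function name is hypothetical. It only shows that a constant volume of automated traffic drags the overall rate up when human traffic falls.

```python
def zero_results_rate(human_queries, bot_queries, human_zrr, bot_zrr):
    """Overall zero results rate for a day, as a weighted average of
    the human and automated rates (all inputs are hypothetical)."""
    total = human_queries + bot_queries
    zero_results = human_queries * human_zrr + bot_queries * bot_zrr
    return zero_results / total

# Assumed values, chosen only to illustrate the mechanism.
BOT_QUERIES = 200_000   # automata send the same volume every day
HUMAN_ZRR = 0.10        # share of human queries returning nothing
BOT_ZRR = 0.50          # share of automated queries returning nothing

# Humans search more mid-week than at the weekend; bots do not care.
weekday = zero_results_rate(1_000_000, BOT_QUERIES, HUMAN_ZRR, BOT_ZRR)
weekend = zero_results_rate(600_000, BOT_QUERIES, HUMAN_ZRR, BOT_ZRR)

print(f"weekday ZRR: {weekday:.1%}")
print(f"weekend ZRR: {weekend:.1%}")
```

With these invented inputs nothing about human or bot behaviour changes between the two days; only the mix changes, and that alone moves the overall rate by a few percentage points.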
On 4 January 2016 at 23:28, billinghurst <[email protected]> wrote:
> What is with issue that we have a weekly cycle (exactly?) where there is a
> 4% difference in the success in half a week, EVERY WEEK!
>
> With the number of searches done on the site, that seems like an aberration
> that a each Sunday is a more accurate search day!?! Analytical gremlins of
> data capture, or not even bots work Sundays?
>
>
> On Tue, 5 Jan 2016 06:54 Oliver Keyes <[email protected]> wrote:
>>
>> (Links: the dashboards live at http://discovery.wmflabs.org/ and an
>> example of automata filtering can be seen at
>> http://discovery.wmflabs.org/metrics/#failure_rate !)
>>
>> That is, 2% and 5% lower? You're looking at percentages so where the
>> lines vary between checkbox options it'll be different proportions.
>> Unless there's a graph I'm missing :D
>>
>> On 4 January 2016 at 13:45, Trey Jones <[email protected]> wrote:
>> > This is awesome. Roughly, by eye, it looks like automata are about 2% of
>> > ZRR overall and 5% of ZRR for fulltext search, which was around 15%
>> > before the holidays (and lower over the holidays—during The Time of
>> > Unreliable User Behavior).
>> >
>> > Is there a write up for this project? I know it had to be a ton of work,
>> > and I'm curious about the details (possibly more so than most).
>> >
>> > Do you think you got most of them? Or was the result high-precision but
>> > not exhaustive?
>> >
>> > Thanks for working on this!
>> >
>> > —Trey
>> >
>> > Trey Jones
>> > Software Engineer, Discovery
>> > Wikimedia Foundation
>> >
>> > On Mon, Jan 4, 2016 at 1:29 PM, Oliver Keyes <[email protected]> wrote:
>> >>
>> >> Hey all,
>> >>
>> >> After several weeks of work to switch all the scripts over and
>> >> backfill, all the Discovery dashboards now have the ability to filter
>> >> crawlers and automated software out from graphs where that is
>> >> relevant. You should notice a simple checkbox on, for example, the
>> >> Zero Results Rate data or Wikidata Query Service traffic.
>> >>
>> >> While a bit of backfilling is still waiting on the servers syncing up,
>> >> this work is essentially complete, and provides another way to look at
>> >> data on how people are using search (and who those people are). It was
>> >> a heck of a lot of work, by both myself and Mikhail, but it's
>> >> hopefully valuable :).
>> >>
>> >> For Discovery Analytics,
>> >>
>> >> --
>> >> Oliver Keyes
>> >> Count Logula
>> >> Wikimedia Foundation
>> >>
>> >> _______________________________________________
>> >> discovery mailing list
>> >> [email protected]
>> >> https://lists.wikimedia.org/mailman/listinfo/discovery
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation

--
Oliver Keyes
Count Logula
Wikimedia Foundation
