>1. A UDF for ua-parser or whatever we decide to use (this will possibly be
>necessary for pageviews, but not necessarily - it depends on our
>spider/automaton detection strategy)

We got this one ready today: https://gerrit.wikimedia.org/r/#/c/166142/

On Fri, Oct 10, 2014 at 3:55 PM, Oliver Keyes <[email protected]> wrote:

>
>
> On 10 October 2014 16:02, Nuria Ruiz <[email protected]> wrote:
>
>> >At some point I believe we hope to just, you know. Have a regularly
>> updated browser matrix somewhere.
>> I REALLY think this should make it into our goals, if it cannot be done
>> this quarter it should for sure be done this quarter.
>>
>>
> I agree it would be nice. It's one of those things that will either come
> as a side-effect of other stuff, OR require substantially more work, and
> nothing in-between. Things we need for it:
>
> 1. A UDF for ua-parser or whatever we decide to use (this will possibly be
> necessary for pageviews, but not necessarily - it depends on our
> spider/automaton detection strategy)
> 2. Pageviews data
> 3. A table somewhere.
>
> Take 1, apply to 2, stick in 3. Maybe grab the same data for text/html
> requests overall (depends on query runtime), maybe don't.
>
> The *ideal* implementation, obviously, is to pair this up with a site
> that automatically parses the results into HTML. That should be the end
> goal, but in terms of engineering support we can get most of the way there
> simply by ensuring we always have a recent snapshot to hand. I can probably
> put something together over the sampled logs and throw it in SQL if there
> are urgent needs.
>
>
>> Do we not have more recent data than May?
>>
>
> We don't, but thanks to the utilities library I built, the code for
> generating it would literally run:
>
> library(WMUtils)
> uas <-
> as.data.table(ua_parse(data_sieve(do.call("rbind",lapply(seq(20140901,20140930,1),sampled_logs)))$user_agent))
>
> uas <- uas[, j = list(requests = .N), by = c("os","browser")]
>
> write.table(uas, file = "uas_for_jon.tsv", sep = "\t", row.names = FALSE,
> quote = TRUE)
>
> ...assuming we didn't care about readability.
>
> Point is, in the time until we have the new parser built into Hadoop and
> that setup, we can totally generate interim data from the sampled logs
> using the same parser at a tiny cost in research/programming time, iff (the
> mathematical if) we need it enough that we're cool with the sampling, and
> people can convince [[Dario|Our Great Leader]] to authorise me to spend 15
> minutes of my time on it.
>
>
>>
>> On Fri, Oct 10, 2014 at 12:45 PM, Oliver Keyes <[email protected]>
>> wrote:
>>
>>> Email Dario and I, if he prioritises it I'll run a check on more recent
>>> data.
>>>
>>> At some point I believe we hope to just, you know. Have a regularly
>>> updated browser matrix somewhere. This comes some time after pageviews
>>> though.
>>>
>>> On 10 October 2014 14:38, Toby Negrin <[email protected]> wrote:
>>>
>>>> Hi Jon -- I'm sure other folks will have more information but here's a
>>>> link to a slide with some data from May[1]. We don't see a lot of Windows
>>>> phone traffic.
>>>>
>>>> -Toby
>>>>
>>>> [1]
>>>> https://docs.google.com/a/wikimedia.org/presentation/d/19tZgTi6VUG04wfGWVzcaZKY26oQiXhPaHI9g2tBmMKE/edit#slide=id.g382406373_08
>>>>
>>>> On Fri, Oct 10, 2014 at 11:17 AM, Jon Robson <[email protected]>
>>>> wrote:
>>>>
>>>>> I was going through our backlog again today, and I noticed a bug about
>>>>> supporting editing on Windows Phones with IE9 [1]
>>>>>
>>>>> Yet again, I wondered 'how many of our users are using IE9' as I
>>>>> wondered if because of this lack of support we are losing out on lots
>>>>> of potential editors.
>>>>>
>>>>> What's the easiest way to get this information now? Is it available?
>>>>>
>>>>> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=55599
>>>>>
>>>>> _______________________________________________
>>>>> Analytics mailing list
>>>>> [email protected]
>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Oliver Keyes
>>> Research Analyst
>>> Wikimedia Foundation
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> [email protected]
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>>
>>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
