> 1. A UDF for ua-parser or whatever we decide to use (this will possibly be
> necessary for pageviews, but not necessarily - it depends on our
> spider/automaton detection strategy)
We got this one ready today: https://gerrit.wikimedia.org/r/#/c/166142/




On Fri, Oct 10, 2014 at 3:55 PM, Oliver Keyes <[email protected]> wrote:

>
>
> On 10 October 2014 16:02, Nuria Ruiz <[email protected]> wrote:
>
>> > At some point I believe we hope to just, you know. Have a regularly
>> > updated browser matrix somewhere.
>> I REALLY think this should make it into our goals; if it cannot be done
>> this quarter, it should for sure be done next quarter.
>>
>>
> I agree it would be nice. It's one of those things that will either come
> as a side-effect of other stuff, OR require substantially more work, and
> nothing in-between. Things we need for it:
>
> 1. A UDF for ua-parser or whatever we decide to use (this will possibly be
> necessary for pageviews, but not necessarily - it depends on our
> spider/automaton detection strategy)
> 2. Pageviews data
> 3. A table somewhere.
>
> Take 1, apply to 2, stick in 3. Maybe grab the same data for text/html
> requests overall (depends on query runtime), maybe don't.
>
> The *ideal* implementation, obviously, is to pair this up with a site
> that automatically parses the results into HTML. That should be the end
> goal, but in terms of engineering support we can get most of the way there
> simply by ensuring we always have a recent snapshot to hand. I can probably
> put something together over the sampled logs and throw it in SQL if there
> are urgent needs.
>
>
>> Do we not have more recent data than May?
>>
>
> We don't, but thanks to the utilities library I built, the code for
> generating it would literally run:
>
> library(WMUtils)
> uas <-
> as.data.table(ua_parse(data_sieve(do.call("rbind",lapply(seq(20140901,20140930,1),sampled_logs)))$user_agent))
>
> uas <- uas[, j = list(requests = .N), by = c("os","browser")]
>
> write.table(uas, file = "uas_for_jon.tsv", sep = "\t", row.names = FALSE,
> quote = TRUE)
>
> ...assuming we didn't care about readability.
>
> Point is, in the time until we have the new parser built into Hadoop and
> that setup, we can totally generate interim data from the sampled logs
> using the same parser at a tiny cost in research/programming time, iff (the
> mathematical if) we need it enough that we're cool with the sampling, and
> people can convince [[Dario|Our Great Leader]] to authorise me to spend 15
> minutes of my time on it.
>
>
>>
>> On Fri, Oct 10, 2014 at 12:45 PM, Oliver Keyes <[email protected]>
>> wrote:
>>
>>> Email Dario and me; if he prioritises it I'll run a check on more recent
>>> data.
>>>
>>> At some point I believe we hope to just, you know. Have a regularly
>>> updated browser matrix somewhere. This comes some time after pageviews
>>> though.
>>>
>>> On 10 October 2014 14:38, Toby Negrin <[email protected]> wrote:
>>>
>>>> Hi Jon -- I'm sure other folks will have more information but here's a
>>>> link to a slide with some data from May[1]. We don't see a lot of Windows
>>>> Phone traffic.
>>>>
>>>> -Toby
>>>>
>>>> [1]
>>>> https://docs.google.com/a/wikimedia.org/presentation/d/19tZgTi6VUG04wfGWVzcaZKY26oQiXhPaHI9g2tBmMKE/edit#slide=id.g382406373_08
>>>>
>>>> On Fri, Oct 10, 2014 at 11:17 AM, Jon Robson <[email protected]>
>>>> wrote:
>>>>
>>>>> I was going through our backlog again today, and I noticed a bug about
>>>>> supporting editing on Windows Phones with IE9 [1]
>>>>>
>>>>> Yet again, I wondered 'how many of our users are using IE9?' and
>>>>> whether this lack of support is costing us lots of potential editors.
>>>>>
>>>>> What's the easiest way to get this information now? Is it available?
>>>>>
>>>>> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=55599
>>>>>
>>>>> _______________________________________________
>>>>> Analytics mailing list
>>>>> [email protected]
>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Oliver Keyes
>>> Research Analyst
>>> Wikimedia Foundation
>>>
>>>
>>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
