Woah! Nice :D How are definition updates handled?

On 10 October 2014 18:58, Nuria Ruiz <[email protected]> wrote:

> >1. A UDF for ua-parser or whatever we decide to use (this will possibly
> be necessary for pageviews, but not necessarily - it depends on our
> >spider/automaton detection strategy)
> We got this one ready today: https://gerrit.wikimedia.org/r/#/c/166142/
>
>
>
>
> On Fri, Oct 10, 2014 at 3:55 PM, Oliver Keyes <[email protected]>
> wrote:
>
>>
>>
>> On 10 October 2014 16:02, Nuria Ruiz <[email protected]> wrote:
>>
>>> >At some point I believe we hope to just, you know. Have a regularly
>>> updated browser matrix somewhere.
>>> I REALLY think this should make it into our goals; if it cannot be done
>>> this quarter it should for sure be done next quarter.
>>>
>>>
>> I agree it would be nice. It's one of those things that will either come
>> as a side-effect of other stuff, OR require substantially more work, and
>> nothing in-between. Things we need for it:
>>
>> 1. A UDF for ua-parser or whatever we decide to use (this will possibly
>> be necessary for pageviews, but not necessarily - it depends on our
>> spider/automaton detection strategy)
>> 2. Pageviews data
>> 3. A table somewhere.
>>
>> Take 1, apply to 2, stick in 3. Maybe grab the same data for text/html
>> requests overall (depends on query runtime), maybe don't.
>>
>> The *ideal* implementation, obviously, is to pair this up with a site
>> that automatically parses the results into HTML. That should be the end
>> goal. But in terms of engineering support we can get most of the way there
>> simply by ensuring we always have a recent snapshot to hand. I can probably
>> put something together over the sampled logs and throw it in SQL if there
>> are urgent needs.
>>
>>
>>> Do we not have more recent data than May?
>>>
>>
>> We don't, but thanks to the utilities library I built, the code for
>> generating it would literally run:
>>
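>> library(WMUtils)
>> library(data.table) # for as.data.table() and the [, j, by] syntax
>>
>> # Read the September 2014 sampled logs, filter them, parse the user agents
>> uas <-
>> as.data.table(ua_parse(data_sieve(do.call("rbind",lapply(seq(20140901,20140930,1),sampled_logs)))$user_agent))
>>
>> # Count requests per OS/browser pair
>> uas <- uas[, j = list(requests = .N), by = c("os","browser")]
>>
>> write.table(uas, file = "uas_for_jon.tsv", sep = "\t", row.names = FALSE,
>> quote = TRUE)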
>>
>> ...assuming we didn't care about readability.
>>
>> Point is, in the time until we have the new parser built into Hadoop and
>> that setup, we can totally generate interim data from the sampled logs
>> using the same parser at a tiny cost in research/programming time, iff (the
>> mathematical if) we need it enough that we're cool with the sampling, and
>> people can convince [[Dario|Our Great Leader]] to authorise me to spend 15
>> minutes of my time on it.
>>
>>
>>>
>>> On Fri, Oct 10, 2014 at 12:45 PM, Oliver Keyes <[email protected]>
>>> wrote:
>>>
>>>> Email Dario and me, if he prioritises it I'll run a check on more recent
>>>> data.
>>>>
>>>> At some point I believe we hope to just, you know. Have a regularly
>>>> updated browser matrix somewhere. This comes some time after pageviews
>>>> though.
>>>>
>>>> On 10 October 2014 14:38, Toby Negrin <[email protected]> wrote:
>>>>
>>>>> Hi Jon -- I'm sure other folks will have more information but here's a
>>>>> link to a slide with some data from May[1]. We don't see a lot of Windows
>>>>> phone traffic.
>>>>>
>>>>> -Toby
>>>>>
>>>>> [1]
>>>>> https://docs.google.com/a/wikimedia.org/presentation/d/19tZgTi6VUG04wfGWVzcaZKY26oQiXhPaHI9g2tBmMKE/edit#slide=id.g382406373_08
>>>>>
>>>>> On Fri, Oct 10, 2014 at 11:17 AM, Jon Robson <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I was going through our backlog again today, and I noticed a bug about
>>>>>> supporting editing on Windows Phones with IE9 [1]
>>>>>>
>>>>>> Yet again, I wondered 'how many of our users are using IE9' as I
>>>>>> wondered if because of this lack of support we are losing out on lots
>>>>>> of potential editors.
>>>>>>
>>>>>> What's the easiest way to get this information now? Is it available?
>>>>>>
>>>>>> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=55599
>>>>>>
>>>>>> _______________________________________________
>>>>>> Analytics mailing list
>>>>>> [email protected]
>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Oliver Keyes
>>>> Research Analyst
>>>> Wikimedia Foundation
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>>
>
>


-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation
