Cool! I guess I had it in my mind from a previous, incomplete conversation
that the goal was a specific cutoff. Thanks for the clarification.

Trey Jones
Software Engineer, Discovery
Wikimedia Foundation

On Fri, Jan 22, 2016 at 4:05 PM, Mikhail Popov <[email protected]> wrote:

> That's actually our goal with quick surveys :P We want to ask users for
> their satisfaction with our search and then build a predictive model with
> satisfaction as the response variable and dwell time + other data as the
> predictor variables.
>
> Right now we're stuck at the "get training data" step. Once that's
> resolved, we can do precisely what you described :D Then we'll have a daily
> estimate of user satisfaction (unobservable without direct user feedback)
> using data we can observe (browsing behavior).
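
A minimal sketch of the kind of model described above, assuming a plain logistic regression with log dwell time as the only predictor. The survey responses, the dwell times, and the helper names (fit_logistic, predict) are invented for illustration; they are not Discovery data or code, and a real model would include the "other data" mentioned above.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(satisfied) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log loss
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical survey results: (dwell time in seconds, 1 = satisfied, 0 = not).
survey = [
    (2, 0), (3, 0), (4, 1), (6, 0), (8, 0), (9, 1),
    (12, 1), (20, 0), (30, 1), (45, 1), (120, 1), (300, 1),
]
xs = [math.log(seconds) for seconds, _ in survey]
ys = [satisfied for _, satisfied in survey]
w, b = fit_logistic(xs, ys)


def predict(dwell_seconds):
    """Estimated probability that a session with this dwell time was satisfying."""
    return sigmoid(w * math.log(dwell_seconds) + b)


# Hypothetical dwell times observed on one day, with no survey attached.
todays_dwell_times = [3, 15, 60, 200]
daily_estimate = sum(predict(t) for t in todays_dwell_times) / len(todays_dwell_times)
print(f"Estimated share of satisfied users: {daily_estimate:.1%}")
```

The daily estimate is the mean of the per-session predicted probabilities, so it needs only observable browsing data once the model has been fit on survey responses.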
>
> Thanks,
> Mikhail
>
> On Fri, Jan 22, 2016 at 11:19 AM, Trey Jones <[email protected]> wrote:
>
>> Yesterday in the quarterly review Dan mentioned that our current user
>> satisfaction metric uses the somewhat arbitrary 10s dwell time cutoff for a
>> successful search, and that we want to use a survey to correlate
>> qualitative and quantitative values to pin down a better cutoff for our
>> users. I don't remember whether Dan mentioned it, or I was just rehashing
>> the notion on my own, but it may be difficult to pin down a specific cutoff.
>>
>> A wild thought appears! Why do we have to pin down a specific cutoff?
>> Why can't we have a probabilistic user satisfaction metric? (Other than
>> complexity and computational speed, which may be relevant.)
>>
>> We have the ability to gather so much data that we could easily compute
>> something like this: 20% of users are satisfied when dwell time is <5s, 35%
>> for 5-10s, 75% for 10-60s, 98% for 1m-5m, 85% for 5m-20m, and 80% for >20m.
>>
>> Determining the cutoffs might be tricky, and computation is more complex
>> than counting, but not ridiculously complicated, and potentially much more
>> accurate for large samples. Presenting the results is still easy: "54.7% of
>> our users are happy with their search results based on our dwell-time
>> model".
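
The bucketed estimate above could be sketched roughly as follows. The per-bucket rates are the illustrative figures from the message, not measured values; the half-open bucket boundaries, the sample dwell times, and the function names are assumptions made for this example.

```python
# (label, exclusive upper bound in seconds, assumed P(satisfied) in bucket).
# The probabilities are the illustrative numbers from the message above.
BUCKETS = [
    ("<5s", 5, 0.20),
    ("5-10s", 10, 0.35),
    ("10-60s", 60, 0.75),
    ("1m-5m", 300, 0.98),
    ("5m-20m", 1200, 0.85),
    (">20m", float("inf"), 0.80),
]


def satisfaction_probability(dwell_seconds):
    """Return the assumed P(satisfied) for a single dwell time."""
    for _label, upper, prob in BUCKETS:
        if dwell_seconds < upper:
            return prob
    raise ValueError("dwell time must be a finite number")


def expected_satisfaction(dwell_times):
    """Average the per-session probabilities into one overall estimate."""
    if not dwell_times:
        raise ValueError("need at least one dwell time")
    total = sum(satisfaction_probability(t) for t in dwell_times)
    return total / len(dwell_times)


# Hypothetical dwell times (seconds) for a handful of sessions.
sample = [2, 7, 15, 45, 90, 240, 600, 1500]
print(f"{expected_satisfaction(sample):.1%} of users estimated satisfied")
```

Compared with a single 10s cutoff, each session contributes a probability rather than a 0 or 1, so the result is still one headline percentage.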
>>
>> I tried to do a quick search for papers on this topic, but I didn't find
>> anything. I'm not familiar with the literature, so that may not mean much.
>>
>> Okay, back to the TextCat mines....
>>
>> —Trey
>>
>> Trey Jones
>> Software Engineer, Discovery
>> Wikimedia Foundation
>>
>> _______________________________________________
>> discovery mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/discovery
>>
>>
>
>
> --
> *Mikhail Popov* // Data Analyst, Discovery
> <https://www.mediawiki.org/wiki/Wikimedia_Discovery>
> https://wikimediafoundation.org/
>
> *Imagine a world in which every single human being can freely share in the
> **sum of all knowledge. That's our commitment.* Donate
> <https://donate.wikimedia.org/>.
