Re: [discovery] Relevance Thoughts

Oliver Keyes Mon, 08 Feb 2016 03:31:35 -0800

Hey Justin,

Could you summarise your ideas/proposals? I imagine reading the raw
logs might be a bit of a time investment for readers :).


On 5 February 2016 at 17:26, Justin Ormont <[email protected]> wrote:
> Greetings,
>
> Moving discussion from irc to email for added transparency and visibility...
>
> Previously on irc:
>
> tfinc
>
> 13:45 Deskana: so much really interesting talk about search on
> https://meta.wikimedia.org/wiki/Talk:2016_Strategy/Reach#NaBUru38
>
> 13:46 https://meta.wikimedia.org/wiki/Talk:2016_Strategy/Reach
>
> 13:46 less about that specific post and more about the conversations in
> general
>
> 13:46 i see lot of people who could help us test and move with next steps
>
>
> JustinO
>
> 13:49 that talk is actually what reminded me to check in with you folks and
> see if you wanted assistance in the relevance area
>
>
> tfinc
>
> 13:51 JustinO: greetings. we can always use wise guidance and help to make
> our users and donors proud. what do you have in mind ?
>
>
> JustinO
>
> 13:52 last year I was talking with a couple of folks after elasticon
>
> 13:53 and we were going thru the first steps like which metrics are useful
> to track
>
>
> jgirault
>
> 13:54 debt: OuKB: jan_drewniak: besides a varnish issue with images, the
> page with separate JS file is on beta http://www.wikipedia.beta.wmflabs.org/
>
>
> tfinc
>
> 13:55 JustinO: ebernhardson and i will be at this year's elasticon
>
> 13:56 JustinO: we've been looking at a number of interesting metrics to
> validate user satisfaction for our search relevance. bearloga can tell you
> plenty about it
>
>
> JustinO
>
> 13:57 awesome. i looked thru some of your docs. tracking dwell time is great
> as it opens up a whole host of useful metrics
>
>
> ebernhardson
>
> 13:57 JustinO: we almost certainly need help in relevance :) we are currently
> hitting some very high level things, but we need to do a lot more in terms
> of collecting and measuring relevance (both from users, and in back testing
> for new features) to do well moving forward
>
>
> bearloga
>
> 13:58 JustinO: we're tracking dwell time and clickthrough rate. we hope to
> get some qualitative user feedback to correlate that with the quantitative
> data we're tracking
>
>
> JustinO
>
> 13:58 with that you can infer good clicks vs. bad clicks. which leads to a
> session success rate, time to success, etc. and in the long run gives you a
> training set to do offline evaluations and in the long term, machine learned
> rankers
>
>
> jgirault
>
> 13:59 the deploy-to-prod patch would be:
> https://gerrit.wikimedia.org/r/268804
>
>
> tfinc
>
> 13:59 JustinO: Trey314159 has worked a bit on creating a baseline relevance
> lab to do offline evaluations between different ranking/sorting/etc
> algorithms
>
>
> JustinO
>
> 14:00 @bearloga: one simple way of getting qualitative feedback is the simple
> "how was your search today?" message
>
>
> jan_drewniak
>
> 14:01 jgirault: like someone once said, the hardest things in programming
> are cache invalidation and naming things
>
>
> JustinO
>
> 14:01 @tfinc: offline evals are very useful. creating a hand generated
> judgment set with clean labels takes time but pays off
>
>
> ebernhardson
>
> 14:01 we also do track which position the user clicked, in addition to dwell
> time. But i don't think we are doing anything with that information yet
>
>
> bearloga
>
> 14:02 JustinO: the question we're going to ask is basically that but we're
> working on rolling out that feedback system
>
>
> jgirault
>
> 14:02 jan_drewniak: and choosing between spaces and tabs
>
>
> JustinO
>
> 14:04 ebernhardson: i think i was suggesting tracking {query, all results,
> position clicked, dwell time on the clicked page, userid, time from
> pageload to click}
>
>
> jgirault
>
> 14:04 alright, so I'm gonna head to the office now. Once I get there, I'll
> try to find someone to push that to prod. Meanwhile, if you have time
> jan_drewniak you can sanity check the latest master
>
>
> Trey314159
>
> 14:04 JustinO: Hey! Sorry Dan (Deskana) and I haven't gotten back to your
> email yet. It's been a busy week, and there's a lot of stuff but not a lot
> of context to that email thread.
>
>
> JustinO
>
> 14:04 @Trey314159: no worries
>
>
> Trey314159
>
> 14:04 Fortunately, James outlined your conversation:
> https://meta.wikimedia.org/wiki/Schema_talk:Search#Useful_metrics_to_track
>
> 14:05 (For anyone else who wants to take a look)
>
>
> ebernhardson
>
> 14:05 JustinO: interesting, i think we are collecting most of those, but not
> the all results or the user id. We do collect a token that is a short-term
> proxy for the user id though
>
>
> JustinO
>
> 14:05 an anonymous token for the id is great
>
>
> ebernhardson
>
> 14:05 JustinO: i'm curious, by all results you mean (in our case) a list of
> page titles or id's?
>
>
> Ironholds
>
> 14:05 JustinO, can I ask you move this to the mailing list or email myself
> or bearloga? We can explain what we're already tracking, what we're planning
> on tracking, and you can chip in feedback
>
>
> ebernhardson
>
> 14:05 i hadn't thought of that, but it makes sense
>
>
> JustinO
>
> 14:05 @ebernhardson : pageids i suppose, i'm not sure what's best for
> wikimedia
>
>
> Ironholds
>
> 14:06 at the moment this is kind of duplicative because you don't know what
> we're tracking in advance of suggesting we track it ;p
>
>
> ebernhardson
>
> 14:06 the current schema is here:
> https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2
>
> 14:06 the descriptions could be better, but give a general idea
>
>
> JustinO
>
> 14:07 ebernhardson: session id is prob fine for a userid unless you want to
> get towards personalization in the long run. eg: give coders more pages
> related to tech
>
>
> Trey314159
>
> 14:07 Ironholds: to be fair, JustinO suggested we track it long before we
> actually did (early last year).. but I agree this might be a better
> conversation on the mailing list, definitely including Ironholds and
> bearloga, and not late on a Friday afternoon (local time for me, at least)
>
>
> Ironholds
>
> 14:07 JustinO, yep, we've tested session IDs. We know these things ;p
>
>
> Ironholds
>
> 14:08 let's chat on the mailing lists where conversations can be seen by
> other users/helpers for transparency purposes, and we can be async to avoid
> time drains
>
>
> JustinO
>
> 14:08 yeah, i'm assuming you've put lots of thought into the topics
>
>
> Ironholds
>
> 14:09 https://lists.wikimedia.org/mailman/listinfo/discovery for reference
>
>
> JustinO
>
> 14:09 yep
>
>
> Ironholds
>
> 14:10 (our mailing list infrastructure makes it a nightmare to find
> anything. I just use google ;p)
>
>
> JustinO
>
> 14:10 i may be on there
>
>
> Ironholds
>
> 14:10 (...appropriate for the discovery team I guess)
>
>
> bearloga
>
> 14:10 chuckles
>
>
> ebernhardson
>
> 14:10 Ironholds: while i don't expect it will make it into prod (change is
> hard) there is a test instance of discourse that could plausibly replace
> mailing lists and be more discoverable
>
> 14:11 https://discourse.wmflabs.org/
>
>
> Ironholds
>
> 14:11 cool!
>
>
> --justin
>
>
> _______________________________________________
> discovery mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/discovery
>
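
For readers skimming the quoted log: the following is a minimal sketch of the
click metrics JustinO describes at 13:58 and 14:04 (good vs. bad clicks from
dwell time, session success rate, time to success). It is not the Discovery
team's schema or code. The record fields mirror the list in the log, but the
names `SearchEvent`, `GOOD_DWELL_SECONDS`, and the 10-second threshold are
assumptions made purely for illustration, as is the choice to define "time to
success" as the mean time from pageload to a good click.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical threshold: a click is "good" if the reader stays at least this long.
GOOD_DWELL_SECONDS = 10.0


@dataclass
class SearchEvent:
    """One search results page view (field names are illustrative)."""
    session_token: str                 # anonymous short-term token, not a user id
    query: str
    result_page_ids: List[int]         # all results shown, in rank order
    clicked_position: Optional[int]    # 0-based rank clicked, None if no click
    dwell_seconds: Optional[float]     # time spent on the clicked page
    seconds_to_click: Optional[float]  # time from pageload to click


def is_good_click(event: SearchEvent) -> bool:
    """A click followed by a long enough dwell is treated as a good click."""
    return (
        event.clicked_position is not None
        and event.dwell_seconds is not None
        and event.dwell_seconds >= GOOD_DWELL_SECONDS
    )


def session_success_rate(events: List[SearchEvent]) -> float:
    """Fraction of sessions that contain at least one good click."""
    succeeded: Dict[str, bool] = {}
    for event in events:
        previous = succeeded.get(event.session_token, False)
        succeeded[event.session_token] = previous or is_good_click(event)
    if not succeeded:
        return 0.0
    return sum(succeeded.values()) / len(succeeded)


def mean_time_to_success(events: List[SearchEvent]) -> Optional[float]:
    """Mean seconds from pageload to click, over good clicks only."""
    times = [
        event.seconds_to_click
        for event in events
        if is_good_click(event) and event.seconds_to_click is not None
    ]
    return sum(times) / len(times) if times else None
```

Events labelled this way could also serve as the starting point for the
offline evaluation set mentioned in the log, though the right dwell threshold
would have to be validated against real user feedback.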



-- 
Oliver Keyes
Count Logula
Wikimedia Foundation

