[discovery] TextCat and Confidence

Trey Jones Tue, 12 Jul 2016 12:59:46 -0700

Hey everyone,

Mikhail has written up his report on our recent TextCat A/B tests and
should release it soon. The results look good: language identification and
cross-wiki searching definitely improve outcomes (in terms of results
shown and results clicked) for otherwise poorly performing queries (those
that get fewer than 3 results).

Mikhail's report also suggests looking at some measure of confidence for
the language identification to see whether it has any effect on the quality
(in terms of number of results, but more importantly clicks) of the
cross-wiki (also called "interwiki") results. This sounds like a good idea,
but TextCat doesn't make it easy to do. I have some ideas, though, and I
would love suggestions from anyone else who has some.

The details are kind of technical, so if that kind of thing makes your eyes
glaze over, you should avert your gaze now.

Otherwise, check out my write-up on TextCat and confidence
<https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/TextCat_and_Confidence>
and share your ideas here, or on the talk page.

Thanks!

—Trey

Trey Jones
Software Engineer, Discovery
Wikimedia Foundation

_______________________________________________
discovery mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/discovery
