The short version is that one epic (showing results from more than one wiki) seemed to be quick and easy to implement, so we'll take that directly to A/B testing. The other two will be run through the relevance lab to see if they are promising enough to take to A/B testing.
Kevin Smith
Agile Coach, Wikimedia Foundation

On Tue, Nov 10, 2015 at 10:16 AM, Dan Garry <[email protected]> wrote:

> There are now a whole bunch of tasks in the Cirrus board (and also some accompanying tasks in the Analysis board). They were thrown together quite quickly; if there's any uncertainty about their scope, or it's unclear what you need to do to move on them, please reach out or comment on the task and we can add extra definition.
>
> Thanks,
> Dan
>
> On 10 November 2015 at 09:06, Dan Garry <[email protected]> wrote:
>
>> I filed epics for each of these:
>>
>> - T118278 <https://phabricator.wikimedia.org/T118278> - EPIC: Run additional A/B tests for the language switching functionality, with different libraries to detect the query's language, and evaluate whether the other libraries are better
>> - T118280 <https://phabricator.wikimedia.org/T118280> - EPIC: Run an A/B test to determine whether using the user's accept-language header to switch query languages is an improvement
>> - T118281 <https://phabricator.wikimedia.org/T118281> - EPIC: Run the original language switching A/B test, but switch languages if the user has fewer than n results (for some n), and determine whether that's better or worse
>>
>> We'll be breaking these epics down into specific actionables in the sprint planning meeting in 25 minutes.
>>
>> Thanks,
>> Dan
>>
>> On 8 November 2015 at 21:51, Dan Garry <[email protected]> wrote:
>>
>>> Summarising this discussion, it seems like the path forward that would reap the most rewards is as follows:
>>>
>>> 1. Finish the MVP of the relevance lab; right now we can only test the zero results rate for any given experiment, and the lab will also let us test result relevance.
>>> 2. Start writing tests that switch out the language detector used in the first test with alternative ones, to see if they're better.
>>>    - This should affect the zero results rate, so the lack of the relevance lab does not block it.
>>>    - This should also affect relevance (at least conceptually), so it can be tested using the relevance lab as well.
>>> 3. Write a test that uses the accept-language header as a heuristic to do language switching (rather than language detection).
>>>    - This should affect the zero results rate, so the lack of the relevance lab does not block it.
>>>    - This should also affect relevance (at least conceptually), so it can be tested using the relevance lab as well.
>>> 4. Expand the original language switching test to also switch if there are "few" results (let's say "few" = 3 or fewer).
>>>    - This does not really affect the zero results rate, so it is dependent on the relevance lab.
>>>
>>> Any objections to this course of action? I plan to file tasks for these mid-Monday morning.
>>>
>>> Thanks,
>>> Dan
>>>
>>> On 2 November 2015 at 16:58, Erik Bernhardson <[email protected]> wrote:
>>>
>>>> Now that we have the feature deployed (behind a feature flag), have an initial "does it do anything?" test going out today, and have an upcoming integration with our satisfaction metrics, we need to come up with how we will try to further move the needle forward.
>>>>
>>>> For reference, these are our Q2 goals:
>>>>
>>>> - Run an A/B test for a feature that:
>>>>   - Uses a library to detect the language of a user's search query.
>>>>   - Adjusts results to match that language.
>>>> - Determine from the A/B test results whether this feature is fit to push to production, with the aim to:
>>>>   - Improve search user satisfaction by 10% (from 15% to 16.5%).
>>>>   - Reduce the zero results rate for non-automata search queries by 10%.
>>>>
>>>> We brainstormed a number of possibilities here:
>>>>
>>>> https://etherpad.wikimedia.org/p/LanguageSupportBrainstorming
>>>>
>>>> We now need to decide which of these ideas we should prioritize. We might want to take into consideration which of these can be pre-tested with our relevance lab work, so that we can prefer to work on the things we think will move the needle the most. I'm really not sure which of these to push forward on, so let us know which you think can have the most impact, or where the expected impact could be measured with the relevance lab with minimal work.
>
> --
> Dan Garry
> Lead Product Manager, Discovery
> Wikimedia Foundation
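For concreteness, here is a rough sketch of the switching heuristic described in points 2-4 of the 8 November plan quoted above: query the user's home wiki first, and only when "few" results come back, pick a fallback language from a pluggable query-language detector or from the request's Accept-Language header and query that language's wiki as well. This is illustrative only, not the CirrusSearch implementation under discussion, and the names in it (search_with_language_fallback, detect_language, RESULT_THRESHOLD) are hypothetical.

# Illustrative sketch only -- not the CirrusSearch code discussed in this thread.
# Shape of the heuristic in points 2-4 above: query the home wiki, and if fewer
# than RESULT_THRESHOLD results come back, choose a fallback language (from a
# pluggable detector or the Accept-Language header) and query that wiki too.
from typing import Callable, Optional
import requests

RESULT_THRESHOLD = 3  # "few" results, per point 4 ("let's say 'few' = 3 or fewer")

def mediawiki_search(lang: str, query: str, limit: int = 10) -> list:
    """Full-text search against <lang>.wikipedia.org via the public search API."""
    resp = requests.get(
        f"https://{lang}.wikipedia.org/w/api.php",
        params={"action": "query", "list": "search", "srsearch": query,
                "srlimit": limit, "format": "json"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["query"]["search"]

def language_from_accept_header(accept_language: str) -> Optional[str]:
    """Primary subtag of the first entry, e.g. 'fr-CA,fr;q=0.9,en;q=0.7' -> 'fr'."""
    first = accept_language.split(",")[0].strip()
    return first.split(";")[0].split("-")[0].lower() or None

def search_with_language_fallback(query: str, home_lang: str, accept_language: str = "",
                                  detect_language: Optional[Callable] = None) -> dict:
    """Return home-wiki results, plus fallback-wiki results when the home wiki is sparse."""
    results = {"home": mediawiki_search(home_lang, query)}
    if len(results["home"]) >= RESULT_THRESHOLD:
        return results

    # Point 2: whichever detection library is under test gets first say;
    # point 3: the Accept-Language header is the cheaper heuristic fallback.
    fallback = detect_language(query) if detect_language else None
    if not fallback and accept_language:
        fallback = language_from_accept_header(accept_language)

    if fallback and fallback != home_lang:
        results[fallback] = mediawiki_search(fallback, query)
    return results

An A/B test along the lines of T118278, T118280, and T118281 would then vary which detect_language implementation is plugged in, whether the Accept-Language fallback is enabled, and the value of RESULT_THRESHOLD, comparing the zero results rate and satisfaction metrics between buckets.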
_______________________________________________
discovery mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/discovery
