So cool. It's always pleasing to see positive results from tests like these.
Seems like WikiGrok Version B wins round 1.

On Mon, Jan 5, 2015 at 11:47 AM, Joaquin Oltra Hernandez <
[email protected]> wrote:

> Very cool results. Seems like showing the same question to a bunch of
> users and grabbing the most popular answer as the correct one will work in
> most cases.
> On Jan 5, 2015 8:08 PM, "Florian Schmidt" <
> [email protected]> wrote:
>
>>  Awesome! Can’t wait for it to be „always-on“ :)
>>
>> *From:* [email protected] [
>> mailto:[email protected]
>> <[email protected]>] *On behalf of* Maryana Pinchuk
>> *Sent:* Monday, January 5, 2015 19:48
>> *To:* Leila Zia; Dario Taraborelli; mobile-l
>> *Subject:* [WikimediaMobile] Preliminary WikiGrok response quality in
>> stable
>>
>> If you're like me, you've probably been breathlessly awaiting the results
>> of the first WikiGrok stable A/B test to see if the responses we're getting
>> are good, bad, or ugly :) Well, good news! I did some hand-coding of the
>> results (a sample of about 300 responses from the ~1,200 we got during the
>> test) and have some interesting preliminary findings to share. Caveat: this
>> is not science, just a quick check of WikiGrok's pulse. Leila from
>> Analytics is helping us analyze this and other WikiGrok test data and will
>> have a more thorough write-up of the results soon :)
>>
>> As a reminder, this test ran for a week in December in stable for logged
>> in users only on English Wikipedia. We tested two versions of the UX (a
>> simple "yes/no/maybe" interface and a slightly more complex tagging one),
>> and we asked questions about biographies (actors and writers) and music
>> albums (live or studio albums). The responses were not yet sent to
>> Wikidata; the infrastructure to do that is currently in development.
>>
>> *The tl;dr is that the quality of the responses is pretty high!* The
>> overall rate of correct responses for the sample I looked at was *80%*.
>>
>> Also, *users with no edits and users with 1 or more edits had similar
>> quality responses* (in fact, the 0 edit count users gave slightly higher
>> quality responses). So even total newbs are capable of grokking :)
>>
>> Lastly, while we didn't see any differences in engagement or conversion
>> (the rate at which users started and finished the WikiGrok process) between
>> the two versions, there was a difference in quality – *Version B
>> (tagging) produced a noticeably higher quality response rate (95%)*.
>>
>> More detailed breakdown of quality below, including by individual answer
>> (fun fact that is sure to make Sam Smith sad: nobody seems to have any clue
>> what a live album is!). Now let's see if these trends hold for logged out
>> users, too :) Our first test for all users (logged in and logged out) is
>> slated for later this month.
>>
>> * * *
>>
>> *User classes*
>>
>> Users with 0 edits – 85%
>>
>> Users with 1 or more edits – 80%
>>
>> *Versions*
>>
>> Version A – 68%
>>
>> Version B – 95%
>>
>> *Question types*
>>
>> "Is this person an author?" – 72%
>>
>> "Is this a film actor?" – 90%
>>
>> "Is this a television actor?" – 85%
>>
>> "Is this a live album?" – 50% :(
>>
>> "Is this a studio album?" – 64%
>>
>> --
>>
>> Maryana Pinchuk
>> Product Manager, Wikimedia Foundation
>> *wikimediafoundation.org* <http://wikimediafoundation.org>
>>
>> _______________________________________________
>> Mobile-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
>>
>>


-- 
Rob Moen
Wikimedia Foundation
[email protected]