So cool. It's always pleasing to see positive results from tests like these. Seems like WikiGrok Version B wins round 1.
On Mon, Jan 5, 2015 at 11:47 AM, Joaquin Oltra Hernandez <[email protected]> wrote:

> Very cool results. Seems like showing the same question to a bunch of
> users and grabbing the most popular answer as the correct one will work
> in most cases.
> On Jan 5, 2015 8:08 PM, "Florian Schmidt" <[email protected]> wrote:
>
>> Awesome! Can't wait for it to be "always-on" :)
>>
>> *From:* [email protected] [mailto:[email protected]]
>> *On behalf of* Maryana Pinchuk
>> *Sent:* Monday, January 5, 2015 19:48
>> *To:* Leila Zia; Dario Taraborelli; mobile-l
>> *Subject:* [WikimediaMobile] Preliminary WikiGrok response quality in
>> stable
>>
>> If you're like me, you've probably been breathlessly awaiting the results
>> of the first WikiGrok stable A/B test to see if the responses we're getting
>> are good, bad, or ugly :) Well, good news! I did some hand-coding of the
>> results (a sample of about 300 responses from the ~1,200 we got during the
>> test) and have some interesting preliminary findings to share. Caveat: this
>> is not science, just a quick check of WikiGrok's pulse. Leila from
>> Analytics is helping us analyze this and other WikiGrok test data and will
>> have a more thorough write-up of the results soon :)
>>
>> As a reminder, this test ran for a week in December in stable for
>> logged-in users only on English Wikipedia. We tested two versions of the UX
>> (a simple "yes/no/maybe" interface and a slightly more complex tagging
>> one), and we asked questions about biographies (actors and writers) and
>> music albums (live or studio albums). The responses were not yet sent to
>> Wikidata; the infrastructure to do that is currently in development.
>>
>> * *The tl;dr is that the quality of the responses is pretty high!* The
>> overall rate of correct responses for the sample I looked at was *80%*.
>>
>> * Also, *users with no edits and users with 1 or more edits had similar
>> quality responses* (in fact, the 0 edit count users gave slightly higher
>> quality responses). So even total newbs are capable of grokking :)
>>
>> * Lastly, while we didn't see any differences in engagement or conversion
>> (the rate at which users started and finished the WikiGrok process) between
>> the two versions, there was a difference in quality – *Version B
>> (tagging) produced a noticeably higher quality response rate (95%)*.
>>
>> More detailed breakdown of quality below, including by individual answer
>> (fun fact that is sure to make Sam Smith sad: nobody seems to have any clue
>> what a live album is!). Now let's see if these trends hold for logged-out
>> users, too :) Our first test for all users (logged in and logged out) is
>> slated for later this month.
>>
>> * * *
>>
>> *User classes*
>>
>> Users with 0 edits – 85%
>> Users with 1 or more edits – 80%
>>
>> *Versions*
>>
>> Version A – 68%
>> Version B – 95%
>>
>> *Question types*
>>
>> "Is this person an author?" – 72%
>> "Is this a film actor?" – 90%
>> "Is this a television actor?" – 85%
>> "Is this a live album?" – 50% :(
>> "Is this a studio album?" – 64%
>>
>> --
>> Maryana Pinchuk
>> Product Manager, Wikimedia Foundation
>> *wikimediafoundation.org* <http://wikimediafoundation.org>
>>
>> _______________________________________________
>> Mobile-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/mobile-l
>

--
Rob Moen
Wikimedia Foundation
[email protected]
