I am usually in favor of A/B experiments where the user is not aware that
they are taking place. They tend to give you the most honest results.
That said, I have recently done some side-by-side checking for Danish
queries on another product, which worked quite well, so I'm wondering if we
could have, say, "t
Before launch, we asked a team of international Googlers to assess quality.
We could reach out to that group again. Anders Sandholm coordinated that
effort and would be a good person to contact if we want to repeat it.
2009/10/28 Hironori Bono (坊野 博典)
Hi Evan,
Thank you for your feedback.
2009/10/28 Evan Martin:
> It still might be worth soliciting feedback from users directly. For
> example, if the new dictionary is missing a common word the above
> measure would get a high count of "Add to Dictionary", and maybe users
> could tell us abo
Will we have any chance to ship both dictionaries and randomly select (at
startup time??) between the two? Alternatively, could we ship a series of
dev builds and alternate between the old and new dictionaries?
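A minimal sketch of the startup-selection idea above (not Chromium code; the client-ID source, the dictionary filenames, and the function names are all assumptions). Hashing a stable per-install identifier, rather than flipping a fresh coin each launch, keeps each user in the same arm across restarts, so one user's spell-check history isn't a mix of both dictionaries:

```cpp
#include <cstdint>
#include <string>

// Stable 64-bit FNV-1a hash, used instead of std::hash because std::hash
// makes no guarantee of identical results across runs or builds.
uint64_t Fnv1a(const std::string& s) {
  uint64_t h = 1469598103934665603ull;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ull;  // FNV prime
  }
  return h;
}

// Hypothetical helper: pick one of two dictionary arms from a stable
// per-install ID, giving a roughly 50/50 split across the population
// while keeping each install in the same arm on every startup.
std::string PickDictionary(const std::string& client_id) {
  return (Fnv1a(client_id) % 2 == 0) ? "en-US-old.bdic" : "en-US-new.bdic";
}
```

With an assignment like this, the "Add to Dictionary" counts (and any user feedback) can be compared between the two arms without the within-user mixing that per-launch randomization would cause.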
The bottom line, IMO, is that when running experiments you need the closest
to an apples-to-apples comparison you can get.
2009/10/28 Hironori Bono (坊野 博典):
> Even though this is still a random thought, I would personally like to
> use Chromium to evaluate the new dictionaries: i.e., uploading the new
> dictionaries to our dictionary server, changing the Chromium code to
> use the updated ones, and asking users to compare