Steve,

The real issue is that the example I gave is overly simplistic. It depends
on sterile lab conditions, on the user population being the same in the lab
and in the real world, and on there being only one issue, affecting 10% of
the user population. One of the great beauties of the world is the
complexity and diversity of people. In the sterile lab, people are tested
on the same machine (we have found that machine configuration, such as
screen size, has a bearing on behaviour), and they don't have the
distractions that normally affect users in the real world.

> Actually, that's not true. You'd be fairly likely to discover it with only
> 5-10 users - in the 65%+ range of 'likely'.

For 5 users that is only 41% (1-(1-0.1)^5), and for 10 it is 65%
(1-(1-0.1)^10). That is far off Nielsen's number that 5 users will find 84%
of the issues (1-(1-0.31)^5).
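
To make the arithmetic easy to check, here is a quick Python sketch of the
standard 1-(1-p)^n model (assuming independent users and a uniform chance p
of any one user hitting the issue):

    # Chance that at least one of n test users hits an issue that
    # affects a proportion p of the user population.
    def p_detect(p, n):
        return 1 - (1 - p) ** n

    print(p_detect(0.10, 5))   # ~0.41: 5 users, 10% issue
    print(p_detect(0.10, 10))  # ~0.65: 10 users, 10% issue
    print(p_detect(0.31, 5))   # ~0.84: Nielsen's average of 31%, 5 users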

If I were in manufacturing and my quality testing gave only a 41% chance of
catching a fault that affects 10% of the cars leaving the production line,
there is a high chance that consumers would stop buying my product, the
company would go bust, and I would be out of a job. From my experience of
production lines, a sample size of 10 for a run of one million units would
be considered extremely low.

We have moved a long way since 1993, when Nielsen and Landauer's paper was
published. The web was not around, and the profile of users was very
different; the web has changed that. We will need to test with more people
as website traffic increases and as we get better at web design. For
example, suppose the designers of a site have been following good design
principles, so that an issue affects only 2.5% of users. Then 10 users in a
test will discover that issue only 22% of the time (1-(1-0.025)^10). But on
our one-million-visitors-a-year example, that issue still means 25,000
people will experience problems.
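
Turning the same model around shows how fast the required sample grows as
issues get rarer. A small Python sketch (the 85% detection target is my own
illustrative choice, not a standard):

    import math

    # Users needed for at least a `target` chance of seeing, at least
    # once, an issue that affects a proportion p of users:
    # solve 1-(1-p)^n >= target for n.
    def users_needed(p, target=0.85):
        return math.ceil(math.log(1 - target) / math.log(1 - p))

    print(users_needed(0.10))    # 19 users for a 10% issue
    print(users_needed(0.025))   # 75 users for a 2.5% issue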

But we do agree that each population needs its own test. And I totally
agree that testing iteratively is a good idea.

@William -- Woolrych and Cockton's 2001 argument applies to simple
task-based tests. See
http://osiris.sunderland.ac.uk/~cs0awo/hci%202001%20short.pdf

All the best

James
blog.feralabs.com

PS (*Disclaimer*) Because I believe usability testing needs not just to be
more statistically sound but also to cover a wide range of users from
different cultures, I co-founded www.webnographer.com, a remote usability
testing tool. So I am an advocate for testing with more geographically
diverse users than normal lab tests reach.

2009/10/2 Steve Baty <[email protected]>

> "If your client website has 1 million visitors a year, a usability issue
> that
> effects 10% of the users would be unlikely to be discovered on a test of
> only 5 to 10 users, but would give 100,000 people a bad experience when
> they
> visit the site."
>
> Actually, that's not true. You'd be fairly likely to discover it with only
> 5-10 users - in the 65%+ range of 'likely'. Manufacturing quality control
> systems and product quality testing have been using such statistical methods
> since the 1920s, and they went through heavy refinement and sophistication
> in the 1960s, '70s and '80s.
>
> It's also worth repeating the message both Jakob & Jared Spool are
> constantly talking about: test iteratively with a group of 5-10
> participants. You'll find that 65%+ figure above rises to 99%+ in that case.
>
> Again, doesn't change your basic points about cultural diversity and
> behaviour affecting the test parameters, but your above point is not
> entirely accurate.
>
> Cheers
> Steve
>
> 2009/10/2 James Page <[email protected]>
>
>> It is dependent on how many issues there are, the cultural variance of
>> your user base, and the margin of error you are happy with. Five users or
>> even 10 is not enough on a modern, well designed web site.
>>
>> The easy way to think of a usability test is as a treasure hunt. If the
>> treasure is very obvious then you will need fewer people; if less
>> obvious, then you will need more. If you increase the area of the hunt
>> then you will need more people. Most of the advocates of testing only 5
>> to 10 users draw their experience from one country. Behaviour changes
>> significantly country by country, even in Western Europe. See my blog
>> post here:
>> http://blog.feralabs.com/2009/01/does-culture-effect-online-behaviour/
>>
>> If your client website has 1 million visitors a year, a usability issue
>> that affects 10% of the users would be unlikely to be discovered on a
>> test of only 5 to 10 users, but would give 100,000 people a bad
>> experience when they visit the site.
>>
>> Can you find treasure with only five or ten users? Of course you can.
>> But how sure can you be that you have found even the significant issues?
>>
>> A very good argument for why 10 is not enough is Woolrych and Cockton
>> (2001). They point out an issue with Nielsen's formula: it does not take
>> into account the visibility of an issue. They show that using only 5
>> users can significantly undercount even significant usability issues.
>>
>> The following PowerPoint from an eyetracking study demonstrates the
>> problem with using only a few users:
>> http://docs.realeyes.it/why50.ppt
>>
>> You may also want to look at the margin of error for the test that you are
>> doing.
>>
>> All the best
>>
>> James
>> blog.feralabs.com
>>
>> 2009/10/1 Will Hacker <[email protected]>
>>
>> > Chris,
>> >
>> > There is no statistical formula or method that will tell you the
>> > correct number of people to test. In my experience it depends on the
>> > functions you are testing, how many test scenarios you want to run
>> > and how many of those can be done by one participant in one session,
>> > and how many different levels of expertise you need (e.g. novice,
>> > intermediate, and/or expert) to really exercise your application.
>> >
>> > I have gotten valuable insight from testing 6-10 people for
>> > e-commerce sites with fairly common functionality that people are
>> > generally familiar with, but have used more participants for more
>> > complex applications where there are different levels of features
>> > that some users rely on heavily and others never use.
>> >
>> > I do believe that any testing is better than none, and realize you
>> > are likely limited by time and budget. I think you can usually get
>> > fairly effective results with 10 or fewer people.
>> >
>> > Will
>> >
>> >
>>
>
>
>
> --
> Steve 'Doc' Baty | Principal | Meld Consulting | P: +61 417 061 292 | E:
> [email protected] | Twitter: docbaty | Skype: steve_baty | LinkedIn:
> www.linkedin.com/in/stevebaty
>
