Jared, I enjoyed your post, and it is interesting how there was a paradigm shift from large to small studies. Surely the web's advent in the late '90s means that the techniques developed in the late '80s and early '90s need updating, to leverage the technology change that has happened since then. Do we need a new paradigm shift?
Of course, the number of participants depends on what you want to learn. If you are interested in just one user, one user is enough. But does that hold true for a website aimed at a diverse selection of users? Doesn't the number of users boil down to a cost issue?

You argue that techniques like usability testing are used today to see the design through the eyes of the user. For that we do need more than 5 or 10 users, because the diversity, background, frequency of use, and experience of users have changed significantly since the days of the mainframe computers that the discount methods were designed around. When Jakob Nielsen, and others like yourself, came up with the discount methods in the 1980s, most use of a computer was by people who were trained to do a task, which they did frequently. (I wonder if your own website has more users in a year, from more countries, than ever used one of the systems you originally worked on, like Digital Equipment Corporation's PDP-10.) An example: before internet flight-booking systems came along in the late '90s, it took months of training to be able to book a flight on a computer. Now nobody trains to book a flight, or a hotel. Times have changed significantly since the discount methods were developed.

The issue I have with testing with just a few users is that it can exclude a significant issue. Nielsen claims that his useit site might look awful but is readable, which is not the case for me. I am dyslexic, and I find Nielsen's useit website hard going, because he uses very wide column widths. (I can read a narrow column twice as fast as I can read a wide one.) The chance that one of the only eight people he tested the site with was dyslexic would be low. But there are still many millions of us. If he had either used the heuristics from magazine or newspaper design, or tested the site with a decent sample size, he would know that he had an issue.
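To make the sample-size point concrete: the standard model behind the "five users is enough" claim says that the probability a study of n participants surfaces a problem is 1 - (1 - L)^n, where L is the fraction of users the problem affects. Here is a small sketch of that arithmetic (the 0.33 figure is Nielsen's published constant; the 5-10% figures are my own guess for a low-frequency issue like column width, not measured data):

```python
def p_found(L: float, n: int) -> float:
    """Probability that at least one of n independent test participants
    experiences a problem affecting a fraction L of users:
    P = 1 - (1 - L)^n."""
    return 1 - (1 - L) ** n

# With Nielsen's constant L = 0.33, five users almost certainly
# surface the problem:
print(round(p_found(0.33, 5), 2))  # ~0.86

# But for a problem affecting only 5-10% of users (my guess for the
# column-width issue), a five-user test will usually miss it:
print(round(p_found(0.05, 5), 2))  # ~0.23
print(round(p_found(0.10, 5), 2))  # ~0.41
```

So a five-user study is a reasonable bet only when the issues you care about are common ones; rarer issues need a much larger sample before you can say anything either way.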
Or maybe the reason he did not discover the issue is that when he built the site in 1995, screen sizes were smaller and therefore so were the columns; but times have changed and he needs to retest. Nielsen's constant of 0.33, used to show that 5 users is enough, assumes that 33% of the test participants will experience the issue. My guess is that between 5% and 10% would experience the column-width issue, but I may be wrong, and that is why testing is important. If Nielsen only tests with 5 or 10 people, he has no way of knowing whether this is an issue he needs to fix. Does it only affect me in the United Kingdom, or are there many more people who have an issue with it?

I am sure that Nielsen is a very busy person, and is it worth his effort to fix the issue? I have heard he built the site himself. If he solves the problem, how will that affect the users who are used to the original design? Having tested with only eight people, it is hard to construct an argument. By only using a few people for user research in one location, are you not excluding a significant number of your site's audience?

All the best,
James
blog.feralabs.com

My disclaimer is that I co-founded www.webnographer.com, an online usability testing tool. The reason my partner and I have sweated hours developing the tool and remote methods is that we believe usability testing needs to become cheaper, and to test a more diverse selection of users than current methods do.

2009/10/3 Jared Spool <[email protected]>

> [Ok. I started to write a simple post about how you need to talk about what
> you want to learn from your study before you can ask about number of
> participants, but then it evolved into this 1200+ word history lesson. I
> left that part in, but you can skip to the very end to see my point. -
> Jared]
>
> We're talking about usability testing as if it's a single procedure that
> everyone performs exactly the same way.
> (This is not the only problem with this thread, but it's the one that I
> won't be all snarky about.)
>
> As a field, we're not very good about making sure everyone knows about our
> history. In the case of usability testing methods, history is pretty
> important.
>
> -> BACKGROUND - Skip to the next section if you just want to get to the
> IMPORTANT RELEVANT STUFF
>
> The first usability tests (as we know them today) were conducted by
> cognitive psychologists in the 1970s. (You can trace usability testing
> back to time-and-motion studies in the '20s and '30s, but I don't think we
> need to go back that far for this conversation.)
>
> When the cog. psychs. were testing, they were using the test methodology
> as a technique to understand human behavior and cognition: how did people
> react to stimuli (physical and digital)? They were looking at reaction
> times, memory, motor response, and other basics. A lot of this work was
> being done at universities and corporate research labs, like Bell Labs and
> Xerox PARC. NASA, DARPA, and the DOD were also involved. (Interestingly,
> they all discovered a lot of stuff that we take for granted today in
> design -- back then it was all new and controversial, like Fitts's Law.)
>
> In the late '70s and early '80s, we started applying usability testing to
> engineering processes. I was part of one of the first teams (at Digital
> Equipment Corporation) to use usability tests in the process of developing
> products. Engineering teams at IBM, HP, WANG, Boeing, Siemens, GTE, and
> Nortel were doing similar things. (I'm sure there were others that I've
> forgotten or didn't know about.)
>
> At DEC, the first engineering uses of usability testing were for either
> research-based prototype evaluation or very late-stage product defect
> determination. Meanwhile, John Gould and his team at IBM published a
> seminal paper about using an iterative process for designing a messaging
> system at the 1984 Summer Olympics.
> Jim Carroll's team was using testing methods for understanding
> documentation needs in office systems. Ron Perkins & co. at WANG were
> doing similar things. Industrial design groups at many companies were
> using usability testing for studying behavioral responses and ergonomic
> constraints for interactive system devices.
>
> It was still a few years until we saw labs at companies like Microsoft,
> WordPerfect, and Apple. By the time they'd gotten involved, we'd evolved
> many of the methods and protocols to look at the design at a variety of
> points throughout the development process. But the early testing methods
> were too expensive and too time-consuming to use effectively within the
> engineering practice. It was always a special case, reserved for the most
> important projects.
>
> All of these studies involved laboratory-based protocols. In the very late
> '80s and early '90s, many of us pushed for laboratory-less testing
> techniques, to lower the costs and time constraints. We also started
> experimenting with techniques, such as paper prototypes, which reduced the
> up-front cost of building the design to test it.
>
> Others, such as those behind the participatory design movement in
> Scandinavia and the ethnographic/contextual design methods emerging in the
> US and central Europe, were looking at other methods for gleaning
> information. (This is when Jakob started popularizing Discount Usability
> Engineering, which had a huge impact on the adoption of the techniques
> within the design process.)
>
> Today, we see that the cost of conducting a usability test has dropped
> tremendously. When I started in the '70s, a typical study would easily
> cost $250,000 in today's dollars. Today, a team can perform an
> eight-participant in-person study for much less than $5,000, and remote
> methods are even cheaper.
>
> -> IMPORTANT RELEVANT STUFF (in case you decided to skip the BACKGROUND)
>
> All this is relevant to the conversation, because usability testing has
> morphed and changed in its history. When we used it for scientific
> behavioral and cognitive studies, we needed to pay close attention to all
> the details. Number of users was critical, as was the recruiting method,
> the moderation protocols, and the analysis methods. You couldn't report
> results of a study without describing, in high detail, every aspect of how
> you put the study together and came to your conclusion. (You still see
> remnants of this today in the way CHI accepts papers.)
>
> When we were using it for defect detection, we needed to understand the
> number-of-users problem better. That's when Nielsen & Landauer, Jim Lewis,
> Bob Virzi, and Will Schroeder & I started looking at the variables.
>
> But we've moved past defect detection for common usage. And in that way,
> usability testing has morphed into a slew of different techniques. As a
> result, the parameters of using the method change based on how you're
> using it.
>
> Today, the primary use is for gleaning insights about who our users are
> and how they see our designs. It's not about finding problems in the
> design (though that's always a benefit). Instead, it's a tool that helps
> us make decisions in those thousands of moments during the design process
> when we don't have access to our users.
>
> Sitting next to a single user, watching them use a design, can be, by
> itself, an enlightening process. When we work with teams who are watching
> their users for the first time (an occurrence that happens way too often
> still), they come out of the first session completely energized and
> excited about what they've just learned. And that's just after seeing 1
> definitely-not-statistically-significant user.
>
> Techniques like usability testing are used today to see the design through
> the eyes of the user.
> Because a lot of hard work has been done through the years to bring the
> costs of testing down significantly, we can use it in this way, which was
> never possible back when I started in this business.
>
> But there are uses of usability testing that still need to take sample
> size into account. For example, when we conduct our Compelled Shopping
> Analysis, we typically have 50 or more participants in the study. (The
> largest so far had 72 participants in the main study, with 12
> pilot/rehearsal participants to work the bugs out of the protocols.) These
> studies are very rigorous comparisons of multiple aspects of live
> e-commerce sites, and we need to ensure we're capturing all the data
> accurately. Interestingly, we regularly find show-stopping design problems
> in the last 5 participants that weren't seen before in the study.
>
> -> MY POINT (finally)
>
> So, usability testing has evolved into a multi-purpose tool. You can't
> really talk about the minimum number of participants without talking about
> how you want to use the tool. And you can't talk about how you want to use
> the tool without talking about what you want to learn.
>
> If you just want to gain insights about who your users are and how they'll
> react to your design ideas, you only need a small number (1-5) to get
> really interesting, great insights. Other techniques (such as 5-second
> tests, defect detection, Compelled Shopping, Inherent Value studies)
> require different numbers of participants.
>
> And the different techniques also require different recruiting protocols,
> different moderating protocols, and different data analysis protocols. So,
> if we're talking about number of participants, we also need to talk about
> those differences too.
>
> Hopefully, that will clear all this up. If you want to ask about the
> number of participants, tell us first what you hope to learn.
>
> Jared
>
> ________________________________________________________________
> Welcome to the Interaction Design Association (IxDA)!
> To post to this list ....... [email protected]
> Unsubscribe ................ http://www.ixda.org/unsubscribe
> List Guidelines ............ http://www.ixda.org/guidelines
> List Help .................. http://www.ixda.org/help
> ________________________________________________________________
