RE: studies of naming?

Steven Clarke Wed, 28 Mar 2012 07:41:26 -0700

Richard,

Thanks for laying out the different design options. Unfortunately we're not 
always in a position to be able to study every design option since we're 
constrained by our shipping schedule. Sometimes we need to trade off what we 
know our engineering teams can build in the time we have available with what we 
would like to do in an ideal world. In many ways the studies we do in the 
Visual Studio team aim to tip the balance in favor of the user experience. It's 
one of the differences between the research that we do in the product teams at 
Microsoft and the research that the teams in Microsoft Research do.

Anyway, the more interesting debate is how the opportunistic programmers would 
deal with the design options you laid out. On the factory design pattern, that 
has been studied also. This will likely make you even more terrified of using 
software built by other software engineers :), but take a look: 
http://www.cs.cmu.edu/~NatProg/papers/Ellis2007FactoryUsability.pdf

On the fluent-like initializer object style you demonstrated below, I don't 
believe that will solve all the usability issues we observe opportunistic 
programmers experiencing with required parameters. These developers are very 
exploratory in the way that they write code and are looking for shortcuts 
everywhere they go. They are task oriented, not API oriented. They think about 
the task they are writing code for, not the way that the APIs work. The success 
of the fluent style of API would depend on how many 'dots' the user would need 
to write before they were able to call the method that helped them with their 
task. If it is too deep, they might even give up exploring the class before 
they get that far.

We don't have the luxury of dismissing these types of programmers. While it 
might strike you with terror that these programmers exist, they are 
successfully building applications in many different domains. They may work 
differently to you and many other programmers but that doesn't necessarily mean 
that the code they create is worthless. Within the Visual Studio team at 
Microsoft we've devoted efforts to attempting to make them successful by 
adapting to their workstyles when appropriate.

There are a few blog posts and papers that describe these personas in more 
detail that might be worth reading if you're interested in how we use them.

http://p.einarsen.no/programmer-personality-types-and-why-it-matters-at-all/

http://drops.dagstuhl.de/opus/volltexte/2007/1080/pdf/07081.ClarkeSteven.Paper.1080.pdf

http://blogs.msdn.com/b/stevencl/archive/2010/08/19/making-effective-use-of-personas-in-design.aspx

http://blogs.msdn.com/b/stevencl/archive/2010/12/22/climate-change-and-developer-personas.aspx

From: john.m.daugh...@gmail.com [mailto:john.m.daugh...@gmail.com] On Behalf Of 
John Daughtry
Sent: 28 March 2012 13:51
To: Richard O'Keefe
Cc: Steven Clarke; Brad Myers; Raoul Duke; Ppig-Discuss-List
Subject: Re: studies of naming?

You introduced a third option. However, it is a deviation of the implementation 
behind the interface as opposed to an alternative interface. Are you suggesting 
that every API study should consider all of the limitless alternative 
implementations? Perhaps I misunderstood the design alternative you suggest as 
the third option.

Your further discussion (e.g., initializer objects) supports the notion that 
any attempt to achieve both usability and robustness is tedious and laborious 
in dominant languages.

John

On Tue, Mar 27, 2012 at 11:10 PM, Richard O'Keefe 
<o...@cs.otago.ac.nz> wrote:

On 27/03/2012, at 10:14 PM, Steven Clarke wrote:

> Yes, you're right Richard. Our study was focused on languages like Java and 
> C# (and importantly, as they existed around 2006/2007). So as you say, we 
> wouldn't generalize the results to languages that have named parameters. We 
> described the two design choices we evaluated at the start of the paper:
>
> "There are two common design choices: provide only
> constructors that require certain objects (a "required
> constructor"). This option has the benefit of enforcing
> certain invariants at the expense of flexibility. An
> alternative design, "create-set-call," allows objects to
> be created and then initialized."

There are actually three design choices, and I've seen the third
one too often for comfort.  Please note that this is a *different*
choice from the one you endorsed:

Good Smalltalk design:

  Provide a variety of factory methods with keyword arguments,
  all of which provide fully initialised objects satisfying
  the class invariant; such objects often need little or no
  mutation afterwards.

  Good Eiffel design agrees in every respect except 'keyword arguments'.

Good create-set-call design:

  Ensure that the default 'new' constructor returns a fully
  initialised object satisfying the class invariant and
  offering meaningful default behaviour; such objects almost always
  require adjustment to get them into the state you really want but
  all states are meaningful.

  IN ADDITION make sure that all 'initialisation phase' methods
  can safely be called AT ALL TIMES.

  Ensure that every public method is *tested* with a default-
  initialised object.

Bad create-set-call design:

  Don't think about class invariants.
  Rely on default constructors that leave fields with default
  values (0, nil, &c) that satisfy types but not invariants.
  Allow objects to be named outside their class in partially
  initialised states that require 'initialisation phase' methods
  to be called before 'work phase' methods, but do not check
  that this has been done.

  Allow 'initialisation phase' methods to be called at any time
  without checking if it makes sense, allowing even well-initialised
  objects to subsequently be put into inconsistent states.
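
To make the contrast concrete, here is a minimal sketch of the bad and good
create-set-call styles.  It is in Python rather than Smalltalk or C#, and the
class names (BadTimer, GoodTimer) and the invariant "interval > 0" are
hypothetical, chosen only for illustration.

```python
class BadTimer:
    """Bad create-set-call: the default state satisfies types, not invariants."""

    def __init__(self):
        # The invariant "interval > 0" does not hold yet.
        self.interval = None

    def ticks_in(self, seconds):
        # 'Work phase' method that never checks the object was set up.
        return seconds // self.interval


class GoodTimer:
    """Good create-set-call: every reachable state satisfies the invariant."""

    def __init__(self):
        # Meaningful default, so a freshly created object is usable.
        self._interval = 1

    @property
    def interval(self):
        return self._interval

    @interval.setter
    def interval(self, value):
        # Safe to call at any time: a bad value is rejected and the
        # previous (valid) state is left untouched.
        if value <= 0:
            raise ValueError("interval must be positive")
        self._interval = value

    def ticks_in(self, seconds):
        return seconds // self._interval
```

A default-initialised GoodTimer can be used straight away, which is what
"tested with a default-initialised object" asks for; a default-initialised
BadTimer fails as soon as ticks_in is called.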

Much as I love Smalltalk, a lot of the code that I see using the
create-set-call pattern is actually doing the >bad< version.
Here's an example that took only 2 minutes to find.
In Pharo 1.1 (which is not the current version)
       Url new<cmd-P>
which is the equivalent of System.out.println(new Url())
raises an exception.  Url new created an uninitialised object.
That's _almost_ fair enough:  this is supposed to be an abstract
class, but it should have been caught in #new.  Go to a concrete
subclass:
       FileUrl new<cmd-P>
also raises an exception, trying to print the elements of a nil
String.

> However, I don't think our result is unsurprising,

All I can say is that it didn't surprise _me_.
Compare for example
       f = fopen(x, y);              /*C*/
       s := FileStream read: x.      "ST"
       s = new FileStream(y, x);     //C#

Smalltalk: obvious what it does.
C: which argument is which?  Compiler can't help.
C#: three different things it could be; the compiler
can tell them apart, but it's not so easy for people.
And when you get to

       new FileStream(String, FileMode, FileSystemRights,
               FileShare, Int32, FileOptions)

this is obviously going to be a lot harder for people to
read than
       FileStream read: string rights: rights share: share
               bufferSize: int32 options: options

If you *could* do
       s = new FileStream()
               .FileName(string)
               .Rights(rights)
               .Share(share)
               .BufferSize(int32)
               .Options(options)
               .Open();
that would be a lot clearer.

And that introduces a fourth design pattern, which the paper did
not investigate, call it "initializer object".

The general scheme for Initializer Object is
       class X has a static Maker() method
       returning an instance of X_Maker().

       X_Maker() has methods like
               Facet(value)
       returning the same X_Maker() object
       and a completion method called something
       like    Open()
       or      Create()
       that returns a fully initialised instance of X.

This requires a creation style like

       s = FileStream.Maker()
               .FileName(string) &c as before
               .Open();
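
The scheme above can be sketched as follows.  This is a minimal illustration
in Python, not C#, and not any real FileStream API: FileSpec, FileSpecMaker,
the facet names and the defaults are all hypothetical.  No file is actually
opened; the point is only that the finished object never exists in a
partially initialised state.

```python
class FileSpec:
    """The finished product: only ever constructed fully initialised."""

    def __init__(self, file_name, share, buffer_size):
        # All invariants are checked at the single point of construction.
        if not file_name:
            raise ValueError("file_name is required")
        if buffer_size <= 0:
            raise ValueError("buffer_size must be positive")
        self.file_name = file_name
        self.share = share
        self.buffer_size = buffer_size

    @staticmethod
    def maker():
        return FileSpecMaker()


class FileSpecMaker:
    """The 'order form': facets can be set and revised until create()."""

    def __init__(self):
        self._file_name = None
        self._share = "none"        # hypothetical default
        self._buffer_size = 4096    # hypothetical default

    def file_name(self, value):
        self._file_name = value
        return self                 # same maker object, so calls chain

    def share(self, value):
        self._share = value
        return self

    def buffer_size(self, value):
        self._buffer_size = value
        return self

    def create(self):
        # Completion method: validation happens here, all at once.
        return FileSpec(self._file_name, self._share, self._buffer_size)


spec = (FileSpec.maker()
        .file_name("notes.txt")
        .buffer_size(512)
        .create())
```

Forgetting a required facet is reported by create(), before any FileSpec
exists, rather than by some later method call on a half-built object.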

> You've highlighted the core of the debate when, referring to the people who 
> prefer the create-set-call pattern, you said "the idea of them writing any 
> code that might affect my life or the life of anyone known to me is not one 
> that's going to help me sleep at night". This was also the initial reaction 
> of many people inside Microsoft when they heard the results of our study.
>
> Our response in this debate has always been that different programmers 
> require different APIs. That's the message we tried to communicate in the 
> paper when we described the different personas. These personas represent 
> different workstyles of the developers we have observed using the .Net 
> framework. They are a crucial tool in our ability to successfully design a 
> framework that is broadly usable by millions of developers.

The paper didn't just say that these programmers didn't LIKE or weren't
COMFORTABLE using full-initialisation constructors, but that they didn't
really get the idea.
Is it really a good idea to design a framework that is (ab)usable by people who
probably shouldn't be programming in the first place?

Less dismissively:  would these people have got the idea of Initializer Object?

> One of the hardest things to do in order to use these workstyles successfully 
> to design an API is for the API designer to let go of their own biases 
> towards how things should be done, and instead, design the API based on a 
> deep understanding of how the user expects things to be done.

You can call it a bias if you want.  I call it sheer terror based on seeing it
done wrong (as in: programs crash) far too often.

The paper did not suggest to me that a deep understanding of what those users
thought was happening had been reached, or even sought.  In particular, may I
offer the McDonald's analogy?  You go to a McDonald's, and you tell them what
you want.  As you say
       - I want a hamburger
       - I want the mighty angus
       - no onion or pickles
do you see this?
       * a hamburger bun materialises in front of you
       * it's filled with the beef and trimmings
       * the onion and pickles are taken away
No.  They *take your order*, and then deliver a complete hamburger not entirely
unlike the way you want it, and then you eat it.  You can't eat an incompletely
assembled hamburger, because they don't give you one.

In the Initializer Object pattern, the initializer object is like the order the
person behind the counter is filling out.  When you say "that's it" and pay,
_that's_ when they select or assemble the hamburger and deliver it to you.  Up
to that point, you can revise the order if you want to.

Isn't it at least possible that the people who use new _() and then
fill out their order might prefer Initializer Object?  Surely there's no
shortage of Windows developers who have bought a hamburger...

A fifth design pattern could be called "Lazy Biphasic Object".

A Biphasic object is one with (at least) two distinct states:
initialisation phase, where various facets can be set up, and
operational phase(s), where the object does whatever you really
wanted it to do.  Methods may be classified as
 - initialisation only
 - operation only
 - multiphase

Lazy Biphasic Object is where the object changes from initialisation
phase to operational phase the first time an operation only method is
called.  The best known instance of this is C FILE objects, where
setbuf() and setvbuf() are initialisation only methods, and getc()
and putc() are operation only methods, and the first time you call
getc() or putc() the initialisation process is only then completed.

Are the programmers who are only comfortable with create-set-call
really thinking in terms of lazy biphasic objects?  Would the API
be better if designed that way?  Would they mind at all if the
transition were explicit rather than implicit?
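
A minimal sketch of a lazy biphasic object, loosely modelled on the
setvbuf()/putc() relationship but written in Python.  LazyBuffer, PhaseError
and the method names are hypothetical.  Unlike C, where a late setvbuf() is
simply undefined behaviour, this sketch assumes out-of-phase calls should
raise an exception.

```python
class PhaseError(RuntimeError):
    """Raised when an initialisation-only method is called too late."""


class LazyBuffer:
    def __init__(self):
        self._size = 16      # default, adjustable during initialisation
        self._data = None    # None means "still in initialisation phase"

    @property
    def operational(self):
        # Multiphase: may be queried at any time.
        return self._data is not None

    def set_size(self, size):
        # Initialisation only.
        if self.operational:
            raise PhaseError("set_size called in operational phase")
        if size <= 0:
            raise ValueError("size must be positive")
        self._size = size

    def put(self, item):
        # Operation only: the first call completes initialisation.
        if not self.operational:
            self._data = []
        if len(self._data) >= self._size:
            raise OverflowError("buffer full")
        self._data.append(item)
```

Making the transition explicit would mean replacing the implicit switch in
put() with a separate method that the caller must invoke first.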

> That doesn't mean that we always do everything the way that users expect them 
> to be done but it does mean that when we decide to do something that differs 
> from their expectations we do it consciously and deliberately.

It also means that you need to know what it REALLY is that they are expecting.
Maybe you found that out, but the paper didn't _say_.  There is a big
difference between create-set-call where the object is *always* ready for use
and lazy biphasic object where it is an error to call an initialisation-only
method in operational phase.

> In this case, the understanding we had gained from the study reported in this 
> paper and from many other studies we had run internally (at one point we were 
> running API user experience studies on the .Net framework monthly) indicated 
> that we needed to design the API to accommodate the users' preference for 
> initializing objects since the alternative was that many developers would 
> have a very difficult time using APIs that were designed differently to their 
> expectations.

You offer users an *extremely* limited choice, and then talk about what
they did as their *preference*?  The paper didn't mention the Initialiser Object
approach:  how do you know they would not have preferred that?
The paper didn't mention explicit biphasic object:  how do you know they would
not have preferred that?  It didn't mention lazy biphasic object (with exceptions
raised for out-of-phase invocations).  How do you know they would not have
preferred that?  The paper did not mention single-point-of-construction with
keyword arguments (or even passing a dictionary, as you might do in Python).
How do you know people would not have preferred that?
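
For comparison, single-point-of-construction with keyword arguments, or with
a dictionary, might look like this in Python.  It is only a sketch:
StreamSettings and its fields are hypothetical names, not part of any real
library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamSettings:
    """Built in one step; frozen, so it cannot be mutated afterwards."""

    file_name: str
    rights: str = "read"
    share: str = "none"
    buffer_size: int = 4096

    def __post_init__(self):
        # The class invariant is checked once, at construction.
        if not self.file_name:
            raise ValueError("file_name is required")
        if self.buffer_size <= 0:
            raise ValueError("buffer_size must be positive")


# Keyword arguments: each value is labelled at the call site.
spec = StreamSettings(file_name="notes.txt", buffer_size=512)

# Passing a dictionary: the "order" is assembled first, then handed over.
order = {"file_name": "log.txt", "share": "read"}
from_dict = StreamSettings(**order)
```

The dictionary plays much the same role as the order form in the hamburger
analogy: it can be filled in and revised in any order, and nothing is
constructed until it is handed over.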

Oh, it's great that you did the experiment, and really great that you wrote it
up, but the range of alternatives considered was _far_ too small to base any
far-reaching decisions on.

Again, I repeat: whatever you actually found out, the *paper* does not tell me
what those users actually thought was happening or wanted to happen, only which
notation fitted those thoughts less badly.
>
> You're right again in saying that we knew these workstyles existed before the 
> study. Our paper describes the different ways that developers exhibiting 
> these workstyles prefer to initialize objects. This was useful information 
> for us at Microsoft in determining at that time, how best to accommodate the 
> preferences of many of our customers.
>
> It's interesting to note that the opportunistic workstyle has since been 
> observed and studied by others, most notably by Scott Klemmer's group at 
> Stanford: http://hci.stanford.edu/research/opportunistic/

Let me quote that page:

       Opportunistic Programming is a method of software development
       that emphasizes speed and ease of development over code robustness
       and maintainability.

Sheil had a "Power tools for programmers" paper in the early 1980s arguing for
"exploratory programming" and how to support it.  (Of course Stanford was in
touch with work on Lisp and Smalltalk at Xerox for a long time, so this old topic
should have been very well known at Stanford.)

As a programmer, I want to go at top speed.

As a user of other people's programs, I am heartily *sick* of
programs where insufficient attention was paid to robustness.
Just yesterday I was trying to help a student debug a program
in a language not entirely unlike C# where if he started n copies
of his program several seconds apart, all went well, but start
them in a shell loop and a random copy would get a completely
black window.  I honestly could not find anything in _his_ code
to justify this.  This was _not_ a good learning experience for him.

