[melbourne-pug] Data type assumptions in Python (was: Yesterday's MPUG session and going forward)

Ben Finney Wed, 04 Sep 2013 16:53:38 -0700

Lars Yencken <l...@yencken.org> writes:

> I haven't mastered it, but there seems to be an art to testing your type
> assumptions early in Python.


The art is: Don't test data type assumptions in the code.

Rather, use EAFP and duck typing, and only test type assumptions in unit
tests.


Duck typing means that you stop caring about the type of objects given
to you, and care only about how they behave. “If it looks like a duck,
and quacks like a duck, we may as well treat it like a duck until it
demonstrates otherwise”.

EAFP means it is Easier to Ask Forgiveness than Permission. (This is
contrasted with Look Before You Leap, which is discouraged in Python.)
Just use the object you're given as though it will behave the way you
want. If it does, great! If it doesn't an exception is raised and it's
up to the *caller* to deal with that exception since they didn't pass
your code an object with the correct behaviour.


Both these approaches allow polymorphism, a powerful language
capability: Your code will accept any object that behaves the right way,
regardless of where the object fits in the type hierarchy, without
needing to special-case each different type. If you need a file-like
object, you don't need the object to inherit from ‘file’. You only need
it to implement some subset of the expected behaviour of ‘file’ — and
you do that by just using the object like a file.

If, instead, you have a narrow set of types and check to see the object
fits in that set, then you thwart polymorphism: no-one can implement the
behaviour you expect on an object from elsewhere in the hierarchy (such
as an ‘io.StringIO’ object, which does not inherit from ‘file’ at all
but implements a large subset of the behaviour).

Your code that checks object types will be needlessly rejecting objects
which it could use, and your callers will not be able to do things they
could do in the absence of those needless restrictions. Use duck typing,
and make your callers responsible for passing objects that behave
correctly.

Likewise, if instead you LBYL in your code, you are attempting to handle
type problems at the wrong location. It should be the caller's
responsibility, not yours, to pass objects that implement the right
behaviour. Your LBYL code will grow more and more error checking,
obscuring the actual functionality that you should be writing. Use EAFP,
and your code will trigger exceptions when those objects don't behave
correctly: it's then the caller's responsibility to handle it.


How, then, do you know that your code will behave correctly? Write unit
tests that exercise your functions – which, if you're doing it right,
are small and loosely coupled with narrow interfaces – by throwing
objects of various types at them, making sure they use them correctly
and that exceptions are raised where the objects are unsuitable.

What about letting your callers know the expected behaviour? That's what
the docstring is for. Describe your parameters by their behaviour. Don't
say “foo is a list of widgets”, if you only need it to be iterable. “foo
is an iterable of widgets” tells the reader what they need to pass.

-- 
 \       “[Freedom of speech] isn't something somebody else gives you. |
  `\      That's something you give to yourself.” —_Hocus Pocus_, Kurt |
_o__)                                                         Vonnegut |
Ben Finney

_______________________________________________
melbourne-pug mailing list
melbourne-pug@python.org
https://mail.python.org/mailman/listinfo/melbourne-pug

[melbourne-pug] Data type assumptions in Python (was: Yesterday's MPUG session and going forward)

Reply via email to