Hi Michael,
| > OK, you fire up Hugs and type :t unzip and Hugs tells you that
|
| > unzip :: [(a,b)] -> ([a],[b])
|
| > Completely clear, unzip takes a list of pairs and returns a
| pair of lists.
|
| As a new user (and a complete newbie to FP), perhaps I can shed some light
| on something here....
| - The above "explanation" is worthless.
| - It is completely and absolutely worthless.
| ...
Erik's choice of words seems to have hit a nerve, but I think he has
a point. The original poster said:
"The only documentation for unzip is this:
unzip = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])"
To my mind, that's wrong. This *isn't* the only piece of documentation,
because it leaves out the type, and types play a very important role for
many Haskell programmers. When I'm writing my own programs in Haskell,
or trying to understand somebody else's code, I almost always start with
the types. In this case, you can either look in the prelude to find
the type of unzip, or you can type :i unzip at the Hugs prompt. What
you'll find in either case is as follows:
unzip :: [(a,b)] -> ([a],[b])
I do understand that, without explanation, this may well look like line
noise, and hence that it might be hard to appreciate how useful this
type information really can be. So it seems to me that we need to help
anyone who regards themselves as a new user or newbie to FP to
understand the notation. Here's a very quick summary:
(a,b) is the type of pairs, each of which contains a first component
of type a and a second component of type b. For example,
the expression ("hello", True) has type (String, Bool).
[a] is the type of a list, each of whose elements has type a.
For example, the expression ["hello", "Haskell", "world"] is
a list of strings, and hence has type [String].
a -> b is the type of a function, which takes arguments of type "a"
and returns results of type "b". For example, chr is a function
of type Int -> Char, which means that if you pass an integer
argument to it, then you'll get a character value back, such
as chr 65 = 'A'.
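If you'd like to try these three forms out for yourself, here is a small sketch that you could load into Hugs or GHCi. The names aPair, aList, toChar, and letterA are ones I've made up purely for illustration, and I'm assuming a system where chr has to be imported from Data.Char (in some setups it is already in scope).

```haskell
import Data.Char (chr)

-- A pair: the first component is a String, the second is a Bool.
aPair :: (String, Bool)
aPair = ("hello", True)

-- A list in which every element is a String.
aList :: [String]
aList = ["hello", "Haskell", "world"]

-- A function that takes an Int and returns a Char.
-- (toChar is just a hypothetical alias for chr.)
toChar :: Int -> Char
toChar = chr

-- Applying the function: chr 65 is the character 'A'.
letterA :: Char
letterA = toChar 65
```

Once the file is loaded, typing something like :t aPair at the prompt should report the type that is written in the corresponding signature.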
So now you have a better chance to start decoding the type for unzip.
Let me try to fill it in piece by piece, with explanations on the right:
    unzip :: ... -> ...              unzip is a function ...
    unzip :: [..] -> ...             ... that takes a list
    unzip :: [(a,b)] -> ...          ... of pairs as its argument
    unzip :: [(a,b)] -> (...,...)    ... and returns a pair
    unzip :: [(a,b)] -> ([a],[b])    ... of lists as its result.
This is exactly what you were wanting:
| Good docs, on the other hand, are very helpful. Even if it strikes an
| old-timer as redundant to explain "unzip = foldr (\(a,b) ~(as,bs) ->
| (a:as,b:bs)) ([],[])" as "this function takes a list of pairs and
| returns a pair of lists", believe it or not this actually helps newbies.
(And in Erik's defense, it was also the first thing that he said after
giving the type for unzip!)
Now what about the "a" and the "b"? These are type variables, which
represent unknown/arbitrary types. The fact is that it doesn't matter
what type of argument you pass to unzip, so long as it's a list of
pairs. And in each case, the result will always be a pair of lists.
For example, if the argument has type [(Int,Bool)], then the result
will have type ([Int], [Bool]); if the argument has type [([Int],[Int])],
then the result will have type ([[Int]], [[Int]]); and so on ...
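To make those two instances concrete, here is a short sketch that uses the standard unzip at each of the types just mentioned. The sample values and the names flags, splitFlags, ranges, and splitRanges are hypothetical; only unzip itself comes from the prelude.

```haskell
-- Hypothetical sample data: a list of (Int, Bool) pairs.
flags :: [(Int, Bool)]
flags = [(1, True), (2, False), (3, True)]

-- Here a is Int and b is Bool, so the result is ([Int], [Bool]).
splitFlags :: ([Int], [Bool])
splitFlags = unzip flags

-- Hypothetical sample data: each component of a pair is itself a list.
ranges :: [([Int], [Int])]
ranges = [([1, 2], [3]), ([4], [5, 6])]

-- Here a and b are both [Int], so the result is ([[Int]], [[Int]]).
splitRanges :: ([[Int]], [[Int]])
splitRanges = unzip ranges
```

The same unzip is used in both definitions; all that changes is the choice of types for a and b.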
Finally, consider the following problem. I'm going to give you a value
of type [(a,b)], for some types a and b, but I'm not going to tell
you what that value is, or what the types a and b represent. Instead,
I'd like you to tell me how you could use the value that I gave you
to produce a result of type ([a], [b]). In other words, what would
you do if you were a function of type [(a,b)] -> ([a], [b])? Think
about this for a moment ... all you know is something about the shape
of the argument value. It's going to look (roughly) something like:
[(a1, b1), (a2, b2), ..., (an, bn)]
where the first component of each pair has type a, and the second
component has type b. From this data, there are many different ways
that you could obtain a value of type ([a], [b]), but perhaps the most
natural one --- the one that uses all of the input data in the simplest
possible way --- is the function that returns:
([a1, a2, ..., an], [b1, b2, ..., bn])
This, in fact, is exactly what unzip does. To confirm that, you will
need to look at the actual definition, but the real point here is to
see how much we could learn about unzip, simply by thinking about its
type. This, at some level, is one of the amazing things about
polymorphic types in languages like Haskell, and the thing that Erik
was referring to when he talked about Phil Wadler's paper on "Theorems
for Free!" Types can capture a lot of information, so leaving
out the types when you talk about, think about, define, or simply refer
to a function is leaving out valuable documentation.
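For anyone who wants to compare the guess above with the real thing, here is a sketch that repeats the definition quoted at the start of this message (renamed unzip' so that it does not clash with the prelude), alongside two other functions of the same type. The names ignoreInput and reversed are hypothetical, and I include them only to illustrate the "many different ways" mentioned above, not because anybody would want to use them.

```haskell
-- The definition quoted earlier, under a different name.
unzip' :: [(a, b)] -> ([a], [b])
unzip' = foldr (\(a, b) ~(as, bs) -> (a : as, b : bs)) ([], [])

-- Same type, but it throws away all of the input data.
ignoreInput :: [(a, b)] -> ([a], [b])
ignoreInput _ = ([], [])

-- Same type again, but the components come back in reverse order.
reversed :: [(a, b)] -> ([a], [b])
reversed ps = (reverse (map fst ps), reverse (map snd ps))
```

Notice that none of these functions can invent new values of type a or b: whatever appears in the result must have come from the input list. That restriction is a large part of why the type tells you so much.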
I hope that my comments here will help you to understand Erik's
message more fully, and to begin to appreciate how useful types
can be in reading, understanding, and writing Haskell code. At
the same time, don't forget that we have folks at all kinds of
different levels on this list, from old-timers to newcomers. Some
of the messages that you'll see here are going to be over the heads
of newcomers, while others will be of no interest to old-timers.
(Erik's message was perhaps in the first category, while this one
here will be in the second: most old-timers probably stopped reading
this message some time ago, because they already knew everything that
I was saying!) It's important that we have folks from every part of
the spectrum reading and contributing to this list --- indeed, that's
what gives us all an opportunity to learn more and develop our
understanding of Haskell.
Hope this helps!
All the best,
Mark