Re: [Haskell] Implicit parameters redux

2004-01-29 Thread Ben Rudiak-Gould
Here's an example of implicit return values from a project I worked on
recently, followed by an example of the thread idea.

Suppose I've written a decompiler -- it takes binary object code and
produces an abstract syntax tree representing source code. A very
simplified version of the output type might be

  type StatementBlock = [Expr]

  data Expr
= Arith Expr String Expr-- e.g. Arith (Literal 5) + (Literal 8)
| Assign Expr Expr
| ProcCall Expr [Expr]
| Literal Int
| TheProcedure Int
| ...

The Int field of TheProcedure is the raw address of the beginning of the
procedure in the file. So code like foo(1,2,3) will be represented as
something like ProcCall (TheProcedure 51034) [Literal 1, ...]

I want to produce source code as output, so I write a function with type
StatementBlock - String:

  showStatement exprs = concat [ showExpr x ++ ;\n | x - exprs ]

  showExpr (Arith left op right) =
showExpr left ++ op ++ showExpr right

  showExpr (ProcCall proc args) =
showExpr proc ++ ( ++ join , (map showExpr args) ++ )

  showExpr (Literal n) = show n

  showExpr (TheProcedure addr) = procedure ++ show addr

The last line leaves something to be desired -- it chooses very unfriendly
names for the procedures. As a matter of fact I have various heuristics
for choosing more helpful names for procedures, and I also allow the user
to supply a configuration file with names. So I encapsulate all this in a
table of names and pass it to showExpr, and I get code like

  showExpr names (TheProcedure addr) =
lookupProcedureName names addr

But the rest of showExpr and showStatement get needlessly ugly, because
they have to pass a names parameter to every recursive call. This is
where ordinary implicit parameters become useful. I replace names with
?names and it gets passed around for me.

Now the decompiler may produce code which refers to procedures I don't
know about (haven't decompiled). I can indicate this in the source code
I produce:

  showExpr names (TheProcedure addr) =
case lookupProcedureName names addr of
  Just name - name
  Nothing   - (*** unknown procedure ***)

But I'd like to also collect these for later use -- say, to list as part
of a summary printed at the end.

There are various ways I could do this, but let me concentrate on this
one:

  showExpr (TheProcedure addr) =
case lookupProcedureName ?names addr of
  Just name - (name, [])
  Nothing   - ((*** unknown procedure ***), [addr])

  showExpr (Literal n) = (show n, [])

  showExpr (Arith left op right) =
(x++op++y, p++q)
where (x,p) = showExpr left
  (y,q) = showExpr right

This strategy lets us collect a list of unrecognized addresses at the top,
as a second return value. But the code gets very ugly -- much worse than
the implicit parameter case, in fact, since Haskell doesn't have a
convenient notation for multiple return values. I could hide this with a
modified ++ operator:

  (x,p) ++ (y,q) = (x++y, p++q)

Then I could write:

  showExpr (Arith left op right) =
showExpr left ++ (op, []) ++ showExpr right

Better, but not great.

Implicit return values provide a much cleaner solution: just write

  showExpr (TheProcedure addr) =
case lookupProcedureName ?names addr of
  Just name - name
  Nothing   - ((*** unknown procedure ***), %unknown = [addr])

and you're done. None of the other cases need to be modified (unless they
also produce unknown addresses).

This need to produce some form of statistical information on the side
comes up fairly frequently in my code.


Now state threading. Consider the following silly imperative program in C:

  char name[100];
  int i;

  puts(What is your name?);
  gets(name);
  for (i = 0; name[i]; ++i)
name[i] = toupper(name[i]);
  puts(Your name in uppercase is:);
  puts(name);

There's all kinds of mutation and I/O going on here. In imperative
programming there's a current state, which includes things like the
screen and the keyboard buffer and the array name, and you give a list
of commands which do something to that state, in a particular order.

A pure functional language doesn't have any implicit state. You can model
state by passing around a state variable, e.g.

  main :: World - World

  main theWorld =
let theWorld' = puts theWorld What is your name?
(name,theWorld'') = gets theWorld'
...
in theWorld'''

This isn't very convenient. Worse, theWorld can't really represent the
world, because you can reuse old values, and that isn't possible in
reality.

We can solve both problems by abstracting away from the world-passing. We
think of puts and gets and similar functions as world-transformers, and we
allow the programmer to attach the output of one to the input of another.
This is the IO monad model. There's no way to duplicate the world because
there's no transformer with one input and two outputs. (Well, there is,
actually: unsafeInterleaveIO.)

The program looks like 

Re: [Haskell] Implicit parameters redux

2004-01-28 Thread David Sankel
Ben,

  Could you explain in an extremely dumbed-down way what this is?  It would
be great if there were examples of

  1)  Some common, simple, and useful code in Haskell.
  2)  Same code using Implicit Parameters with a discussion of how it is
better.

Thanks,

David J. Sankel
___
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell


[Haskell] Implicit parameters redux

2004-01-27 Thread Ben Rudiak-Gould
This article attempts to describe in more detail the implicit-parameter-
based sequencing model that I was trying to develop in a recent thread. I
am getting more and more excited about this idea, and after reading this I
hope you'll understand why. Any comments are greatly appreciated.
(Especially if there's a fatal flaw -- you'd better tell me now and get it
over with.)


Review of implicit return values


In the syntax I proposed a few days ago, implicit values are attached to
ordinary values like this:

  (123, %x = foo, %y = 99)

(I now write %x instead of ^x for a reason described in the next section.)

The syntax resembles a tuple, and they are in fact tuples behind the
scenes (unlifted, and probably unboxed). But the semantics is not quite
the same: the example above is equivalent to the same example with %x and
%y reversed, and also to

  ((123, %y = 99), %x = foo)

These implicit return values propagate upward through expressions in the
same way that implicit parameters propagate downward. They can be caught
at the root of an expression by matching against a similar syntax:

  case expr of
(a, %x = b) - ...

Implicit return values are merged by name as they propagate upward, using
mappend. It is a compile-time error if two values to be merged have
different types, or a single type that is not an instance of Monoid.

Another extension I proposed is that the name of an implicit return
value can include type parameters: thus %foo Int and %foo Char would be
treated as though they had different names.


Threading and branching implicit values
~~~

Implicit values fall into two categories which I'll call threading and
branching.

Linear implicit parameters (%name) are branching if they have a type which
is an instance of the class Splittable, and threading otherwise.

Ordinary implicit parameters (?name) are always branching: they behave as
though their types were trivial instances of Splittable, where split x =
(x,x).

Implicit return values are like linear implicit parameters: they're
branching or threading depending on whether their types are or are not
instances of Monoid. There is no trivial instance of Monoid for arbitrary
types, so implicit return values have no variant analogous to ?name. For
this reason I will write them %name (abandoning the ^name convention I
used before).

The intuition behind the terminology is this: if you've programmed in both
procedural and functional languages (and who here hasn't?) you know that
procedural code tends to have a sequential structure, and functional code
tends to be hierarchical. Threading parameters mirror procedural
code: because they cannot be split or merged, they are passed along in
sequential fashion from a single point of production to a single point of
consumption. Branching parameters mirror functional code: they either
percolate down the expression tree, splitting as they go, or they
percolate up, merging as they go.


Branching implicit parameters render the Reader monad obsolete
~~

It has been noticed already that ordinary implicit parameters provide
essentially the same functionality as the Reader monad, but in a more
convenient and versatile way. The main advantages are:

  * Different pieces of state are distinguished by descriptive names,
instead of by depth in the monad stack.

  * Code need not be rewritten in monadic style.

These are big advantages. There's no point using Reader if you have
implicit parameters.


Branching implicit return values render the Writer monad obsolete
~

It's not hard to see that implicit return values provide a replacement for
Writer with the same advantages.


Threading implicit values render IO, ST, and State obsolete
~~~

I'll be arguing for the rest of this article that threading implicit
values can replace all state-threading monads, and provide advantages
similar to those above. In the case of IO and ST the advantage is even
greater. IO and (forall s . ST s) form a mutually exclusive infinite
family of monads: any piece of code can operate on at most one at a
time. My proposal has (or seems to have) no similar restriction. This
opens the door to fine-grained fragmentation of mutable state, something
people have wanted since the earliest days of monads in Haskell.


Input and output threads


Along with its other arguments and return value, any function takes zero
or more threading implicit arguments and returns zero or more threading
implicit return values. I'll call the former input threads and the
latter output threads.

These threads represent bits of mutable state. For example, corresponding
to the IO monad there is an implicit value %io :: RealWorld, and any
function which interacts with the world will have an input and