Hello all,

Here are some ideas on...

Extending Haskell With Attributes
---------------------------------

Below I describe a way in which attribute grammars could be added to 
Haskell, using Bird's replace-minimum as a running example.  None of 
this has yet been implemented, and it might very well turn out that 
part of these ideas must be revised when implementation is attempted.  
I'd appreciate to hear your comments on this.  Note that I do not 
explain the How, What, and Why of attribute grammars.

--Syntax and semantics--
In a given program context, every value within a type has a fixed set 
of typed attributes.  The names of these attributes are simple 
identifiers that are unique within this set, and every attribute is 
either synthesized or inherited.  For example, given the datatype

> data Tree = Leaf Int | Fork Tree Tree

one can declare that every  Tree  throughout the program has a  min  
attribute by adding this top-level declaration:

> attr Tree where synthesized min :: Int

For values that have multiple types (such as the empty list  [] ) we 
demand that the attributes of a more generally typed value form a 
subset of those of a more specificly typed value.  For instance, if all 
values of type  [a]  have a synthesized integer  length  attribute, 
then also all values of types  [Int]  and  [Char]  have one:

> attr [a] where synthesized length :: Int

Conceptually, attribute values are stored in bindings.  This makes it 
possible to decorate the same  Tree  with different attribute values in 
different contexts.  Attribute values are indicated by these notations:

  variable^attributeName
  variable!attributeName

The first is used for synthesized attributes, the second for inherited 
ones.  (Note that this notation conflicts with current Haskell 
operators; ideally we would use upward and downward arrows.)  To 
compute the smallest value of a  Tree  we could simply take its  min  
attribute:

> minTree :: Tree -> Int
> minTree t = t^min

But for this to work we must also say how the  min  attribute is 
calculated:

> eq t :: Tree of
>    Leaf n   where t^min = n
>    Fork l r where t^min = l^min `min` r^min

These equations relate the attribute values of a  Tree  with those of 
its subvalues, which are described using pattern matching.  (It is also 
possible to use guards to limit matching of a pattern.)  To demonstrate 
the use of inherited attributes, we add an inherited attribute  num  
that is simply distributed throughout a tree.

> attr Tree where inherited num :: Int

> eq t :: Tree of
>    Fork l r where l!num = t!num
>                   r!num = t!num

Finally, we add a synthesized attribute containing a tree of the same 
structure, but with every leaf replaced with the  num  attribute:

> attr Tree where synthesized tree :: Tree

> eq t :: Tree of
>    Leaf n   where t^tree = Leaf t!num
>    Fork l r where t^tree = Fork l^tree r^tree

Now we can implement Bird's replace-minimum example by

> repMin :: Tree -> Tree
> repMin t = t^tree where t!num = t^min

This transforms a tree into one with the same structure, but every leaf 
containing the smallest value of the tree.

--Remarks--
There are a couple of remarks to be made about the above syntax and 
semantics.

It is not strictly necessary to use different operators for inherited 
and synthesized attributes, but the distinction gives better 
readablity, and it enables the compiler/interpreter/translator to catch 
more errors.

When using the translation detailed below, it is easy to add guards to 
the pattern matching in an equations declaration.

We have stated that, in a given context, every value (within a type) 
has a fixed set of attributes.  In the terminology of Johnsson, this 
means that every value has one attribute sort.  (It is therefore not 
necessary to add attribute sort annotations to bindings.)  To emulate 
attribute sorts, the programmer can declare a new type with the same 
structure as an existing one, and give it different attributes.

Every kind of value can be attributed: we are not restricted to 
abstract datatypes.

We have associated attributes with values, instead of with types.  This 
has been done to support the use of local attributes, which are only 
attached to a limited set of values from a datatype.

The syntax above could be extended to allow `remote upward attribute 
sets' as in SSL; these are easily translated into additional 
attributes.

--Translation--
The implementation of the new declarations is most easily done by 
translating to `native' Haskell, roughly as follows:

* Translate the equations to so-called visit functions; for each type, 
these compute the synthesized attributes from the value and its 
inherited attributes.  The  attr  declarations give their types, and 
the  eq  declarations their definitions.

* Add calls to the visit functions in places where synthesized 
attributes are used.

* Replace all signs  ^  and  !  by an apostrophe.

In attribute grammar literature this is known as the CIRC mapping.  For 
the above example, this leads to

> visitTree :: Tree -> (Int, Tree)
> visitTree (Leaf n)   t'num = (t'min, t'tree)
>    where t'min  = n
>          t'tree = Leaf t'num
> visitTree (Fork l r) t'num = (t'min, t'tree)
>    where t'min  = l'min `min` r'min
>          l'num  = t'num
>          r'num  = t'num
>          t'tree = Fork l'tree r'tree
>          (l'min, l'tree) = visitTree l l'num
>          (r'min, r'tree) = visitTree r r'num

> minTree :: Tree -> Int
> minTree t = t'min
>             where (t'min, _) = visitTree t undefined

> repMin :: Tree -> Tree
> repMin t = t'tree
>    where t'num = t'min
>          (t'min, t'tree) = visitTree t t'num

(Note: this code has not been tested.)  The important thing about this 
translation is that it can be done automatically.

Of course, an implementation may use a different translation, or may 
even choose to implement attribute decoration on dedicated hardware, as 
long as the result is exactly equivalent to the above translation.

The easiest implementation is probably as as pre-processor, feeding the 
result into a native Haskell compiler.  A disadvantage of this approach 
is that it might become more difficult to trace errors.

Any comments?

  <><

Marnix
--
Marnix Klooster
[EMAIL PROTECTED]   //   [EMAIL PROTECTED]




Reply via email to