Hi

Something like this would do?

if_ = Compound $ If [(IntLit 6, Suite [] [Break])] Nothing
while_ = Compound $ While (IntLit 6) (Suite [] [if_]) Nothing

f = Program [while_]

-- this one fails
-- f2 = Program [if_]


newtype Ident = Id String

data BinOp = Add
           | Sub

data Exp = IntLit Integer
         | BinOpExp BinOp Exp Exp

data NormalCtx
data LoopCtx

data Statement ctx where
  Compound :: Compound ctx -> Statement ctx
  Pass     :: Statement ctx
  Break    :: Statement LoopCtx

newtype Global = Global [Ident]

data Suite ctx = Suite [Global] [Statement ctx]

type Else ctx = Suite ctx

data Compound ctx where
    If    :: [(Exp, Suite ctx)] -> Maybe (Else ctx) -> Compound ctx
While :: Exp -> (Suite LoopCtx) -> Maybe (Else LoopCtx) -> Compound ctx

newtype Program = Program [Statement NormalCtx]

Daniel

On Feb 20, 2008, at 5:12 PM, Roel van Dijk wrote:

Hello everyone,

I am trying to create an AST for Python. My approach is to create a
data type for each syntactic construct. But I am stuck trying to
statically enforce some constraints over my statements. A very short
example to illustrate my problem:


newtype Ident = Id String

data BinOp = Add
           | Sub

data Exp = IntLit Integer
         | BinOpExp BinOp Exp Exp

data NormalCtx
data LoopCtx

data Statement ctx where
  Compound :: Compound -> Statement ctx
  Pass     :: Statement ctx
  Break    :: Statement LoopCtx

newtype Global = Global [Ident]

data Suite ctx = Suite [Global] [Statement ctx]

type Else = Suite NormalCtx

data Compound = If [(Exp, Suite NormalCtx)] (Maybe Else)
              | While Exp (Suite LoopCtx) (Maybe Else)

newtype Program = Program [Statement NormalCtx]


The "global" statement makes an identifier visible in the local scope.
It holds for the entire current code block. So it also works
backwards, which is why I didn't make it a statement but part of a
suite (= block of statements).

Some statements may occur in any context, such as the "pass"
statement. But others are only allowed in certain situations, such as
the "break" statement. This is why I defined the Statement as a GADT.
I just supply the context in which the statement may be used and the
typechecker magically does the rest.

Feeling very content with this solution I tried a slightly more
complex program and discovered that my AST can not represent this
Python program:

for i in range(10):
  if i == 6:
    break

The compound if statement is perfectly valid nested in the loop
because the Compound constructor of Statement allows any context. But
the suites inside the clauses of the if statement only allow normal
contexts. Since Break has a LoopCtx the typechecker complains.

Is there some other way to statically enforce that break statements
can only occur _nested_ inside a loop? There is a similar problem with
return statements that may only occur in functions. These nested
statements should somehow 'inherit' a context, if that makes any sense
:-)

I know I can simply create separate data types statements that can
occur inside loops and function bodies. But that would make the AST a
lot more complex, something I try to avoid. Python's syntax is already
complex enough!

Most of these constraints are not in the EBNF grammar which can be
found in the language reference, but they are specified in the
accompanying text. The cpython interpreter will generate SyntaxError's
when you violate these constraints.

See also Python's language reference:
http://docs.python.org/ref/ref.html (see sections 6 and 7)
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to