On Friday 11 May 2007 21:26:42 Mike Mattie wrote:

> I am very interested in chromatic's pheme language. I have been reading
> through the code and looking at your TODO list. I thought I would tackle
> some of the easier issues to get a handle on PIR and help out a bit.

Great!

> questions:
>
> 1. Are you targeting r5 or r6? I think r6 would be a better fit for
> parrot myself. In particular the spec for (library foo) aka name-spaces
> would help pheme integrate with parrot/other languages better.

I haven't made a particular choice yet.  I put it on hiatus because I hated 
the hack for dealing with special forms (particularly 'if'), and hope to port 
the compiler to HLLCompiler.pir and the PAST step to PAST-pm at some point in 
the near future.

r6 is fine.

> I decided to start with something easy: whitespace.
>
> I looked up r6 which has a nice BNF grammar that is a useful starting
> point. I came up with the rules below:
>
> rule ignore { [ <comment> | <delimiter> ]* }
>
> token comment { ; \N* <eol> }
>
> rule delimiter { <blank> | <eol> }
> token blank { <[\ \t]>+ }
> token eol { \n\r? }
>
> I know almost zero about PGE, I am reading docs etc. but basically what
> I would like to do is build tokens out of tokens. Ideally I would like
> to make both "ignore" and "delimiter" be tokens, not rules.
>
> This is more of a writing convenience. In a difficult sed script I built in
> the past to go through and convert a bunch of broken C++ decls I would use
> shell variables to store regex building blocks, and then assemble those
> building blocks into higher level expressions with basic string
> interpolation.
>
> question:
>
> 1. Can tokens be used to build tokens? In Perl 5 I would compile a regex
> with string interpolation to get this sort of functionality. If so, is there
> a name for this feature in PGE?

Patrick or Allison knows better.
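
For comparison, the "building blocks" approach described in the quoted text
(small patterns stored in variables, then interpolated into larger ones) can
be sketched in Python. This is only an illustration of the idea, not PGE
syntax; the names BLANK, EOL, COMMENT, DELIMITER, IGNORE and skip_ignored are
made up for the example, and the EOL pattern assumes LF or CRLF line endings.

```python
import re

# Building blocks, loosely mirroring the quoted grammar (illustrative names).
BLANK = r"[ \t]+"
EOL = r"\r?\n"                      # assumption: accept LF and CRLF
COMMENT = rf";[^\n\r]*(?:{EOL})"    # ';' up to and including the line ending
DELIMITER = rf"(?:{BLANK}|{EOL})"

# Higher-level pattern assembled from the blocks by string interpolation.
IGNORE = re.compile(rf"(?:{COMMENT}|{DELIMITER})*")


def skip_ignored(text: str, pos: int = 0) -> int:
    """Return the index of the first character after any blanks,
    line endings, and comments starting at pos."""
    return IGNORE.match(text, pos).end()
```

Calling skip_ignored on a source string gives the offset at which the next
real token starts, which is roughly the job the quoted "ignore" rule has.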

> 2. token eol { \n\r? }
>
>    This is pretty clearly for handling Windows line terminators. This is
> the sort of thing that should be pushed down into parrot. A special built-in
> "eol" or "end-of-line" token could help get this stuff out of
> parrot. Is this RT worthy? Something like this would definitely fit with
> the "conservation of cruft" principle.

That seems reasonable to me.

> 3. Is there a tool for pretty printing an AST dump? I am thinking of
> dumping the AST using dump, then using a classic tree drawing algorithm,
> and drawing a tree using SVG. Something like that could probably be done
> easily in Perl 5. Is there a tool like this?

HLLCompiler can do this with little effort.  I usually modify 
lib/PhemeCompiler.pir and uncomment the appropriate line in compile().  It 
really needs a port to HLLCompiler.
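
Independent of Parrot, the tree-drawing idea from the question is easy to
prototype as indented text before going as far as SVG. The sketch below is
Python, not PIR, and does not use any HLLCompiler API; the (label, children)
tuple shape and the sample tree are assumptions made for the example, not
actual Pheme output.

```python
def render_tree(node, indent=0):
    """Return a list of indented lines for a tree of (label, children) tuples."""
    label, children = node
    lines = ["  " * indent + str(label)]
    for child in children:
        # Each level of nesting adds two spaces of indentation.
        lines.extend(render_tree(child, indent + 1))
    return lines


# Hypothetical tree for the form (if #t 1 2); the node labels are invented.
sample = ("application", [
    ("atom: if", []),
    ("atom: #t", []),
    ("atom: 1", []),
    ("atom: 2", []),
])

print("\n".join(render_tree(sample)))
```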

> 4. How do you debug the AST? Any recommended tools?

The brute force application of your head against a wall.

> atom handling:
>
> I noticed that atom handling looked very alpha. It looks like you want to
> distinguish between symbols "foo" | "foo-bar" , and literal values "#t" |
> "#f". This is really nasty to do at a lexical level.
>
> A nicer way to do this would be to form a string token like this:
>
> token string { <!reserved>+ }
>
> token reserved {
>   # r5 reserved
>   # future reserved
>   <[
>     \( \) \# '
>
>     \[ \] \{ \}
>   ]>
> }
>
> At this point most languages are going to need post-lexical analysis of
> the string to distinguish literal values from symbols. A syntax like this
> would be nice:
>
> token truth:string {
>   # <[tf]>
> }
>
> token integral:string {
>   \d+
> }
>
> token symbol:string fallback
>
> This syntax would indicate that after the token string has been lexed that
> it is again analyzed by a regex, and converted to either a truth value,
> integral value, or a symbol if all else fails.
>
> If this is not already implemented I would like to create a TODO RT for
> it.

Be my guest.
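
As a rough picture of what the proposed feature would do, here is the
post-lexical step written out by hand in Python: an already-lexed string is
tried against a truth pattern, then an integral pattern, and falls back to a
symbol. Nothing here is PGE or Pheme code; the function name classify, the
tag names, and the returned tuple shape are assumptions for the example.

```python
import re
from typing import Tuple, Union

TRUTH = re.compile(r"#[tf]")   # the literals #t and #f
INTEGRAL = re.compile(r"\d+")  # one or more digits


def classify(lexeme: str) -> Tuple[str, Union[bool, int, str]]:
    """Classify a lexed string as a truth value, an integer, or a symbol."""
    if TRUTH.fullmatch(lexeme):
        return ("truth", lexeme == "#t")
    if INTEGRAL.fullmatch(lexeme):
        return ("integral", int(lexeme))
    # Fallback: anything that is not a recognised literal is a symbol.
    return ("symbol", lexeme)
```

The order of the checks plays the role of the "fallback" marker in the quoted
syntax: symbol is simply what remains when no literal pattern matches the
whole string.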

I'm not sure what Patrick's new languages/abc plan means for Pheme, but it's 
unlikely I'll have time to absorb the new features and get Pheme up to date 
with regard to HLLCompiler in the next couple of weeks.

-- c
