Hello, I am very interested in chromatic's pheme language. I have been reading through the code and looking at your TODO list. I thought I would tackle some of the easier issues to get a handle on PIR and help out a bit.
questions:

1. Are you targeting R5RS or R6RS? I think R6RS would be a better fit for Parrot myself. In particular, the spec for (library foo), aka namespaces, would help pheme integrate better with Parrot and other languages.

I decided to start with something easy: whitespace. I looked up R6RS, which has a nice BNF grammar that makes a useful starting point. I came up with the rules below:

    rule ignore { [ <comment> | <delimiter> ]* }
    token comment { ; \N* <eol> }
    rule delimiter { <blank> | <eol> }
    token blank { <[\ \t]>+ }
    token eol { \n\r? }

I know almost zero about PGE (I am reading the docs etc.), but basically what I would like to do is build tokens out of tokens. Ideally I would like to make both "ignore" and "delimiter" tokens, not rules. This is more of a writing convenience. In a difficult sed script I built in the past to convert a bunch of broken C++ decls, I used shell variables to store regex building blocks, and then assembled those building blocks into higher-level expressions with basic string interpolation.

questions:

1. Can tokens be used to build tokens? In Perl 5 I would compile a regex with string interpolation to get this sort of functionality. If so, is there a name for this feature in PGE?

2. token eol { \n\r? }

   This is pretty clearly for handling Windows line terminators. This is the sort of thing that should be pushed down into Parrot. A special builtin "eol" or "end-of-line" token could help get rid of this sort of thing. Is this RT worthy? Something like this would definitely fit with the "conservation of cruft" principle.

3. Is there a tool for pretty-printing an AST dump? I am thinking of dumping the AST using dump, then using a classic tree-drawing algorithm to draw the tree as SVG. Something like that could probably be done easily in Perl 5. Is there a tool like this?

4. How do you debug an AST? Any recommended tools?

atom handling:

I noticed that atom handling looked very alpha.
It looks like you want to distinguish between symbols ("foo" | "foo-bar") and literal values ("#t" | "#f"). This is really nasty to do at a lexical level. A nicer way to do this would be to form a string token like this:

    token string { <!reserved>+ }
    token reserved {
        # r5 reserved
        # future reserved
        <[ \( \) \# ' \[ \] \{ \} ]>
    }

At this point most languages are going to need post-lexical analysis of the string to distinguish literal values from symbols. A syntax like this would be nice:

    token truth:string { # <[tf]> }
    token integral:string { \d+ }
    token symbol:string fallback

This syntax would indicate that after the token string has been lexed, it is analyzed again by a regex and converted to a truth value, an integral value, or, if all else fails, a symbol. If this is not already implemented I would like to create a TODO RT for it.

Thanks for any comments/suggestions.

Cheers,
Mike Mattie - [EMAIL PROTECTED]
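
P.S. To make the post-lexical idea a bit more concrete, here is a rough sketch in Python rather than PGE. It is only an illustration of the behaviour I have in mind: the names classify_atom, TRUTH, and INTEGRAL are made up for this example, and nothing here is an existing PGE or pheme feature. It takes a string that has already been lexed, tries the truth and integral patterns in order, and falls back to a symbol.

```python
import re

# Hypothetical patterns mirroring the proposed truth:string and
# integral:string tokens; the names are illustrative only.
TRUTH = re.compile(r"#[tf]")
INTEGRAL = re.compile(r"\d+")


def classify_atom(text):
    """Classify an already-lexed string as a truth value, integer, or symbol."""
    # The whole string must match, not just a prefix.
    if TRUTH.fullmatch(text):
        return ("truth", text == "#t")
    if INTEGRAL.fullmatch(text):
        return ("integral", int(text))
    # Fallback: anything that is not a known literal is a symbol.
    return ("symbol", text)


if __name__ == "__main__":
    for atom in ["#t", "#f", "42", "foo-bar"]:
        print(atom, "->", classify_atom(atom))
```

The order of the checks matters only in that the symbol case comes last, which is what the "fallback" keyword in the proposed syntax is meant to express.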