This week's Pugs-Rule developments.

A 20-minutes hack had paid off handsomely: In 33 lines, we have a fast, pure Haskell implementation of basic Perl 6 Rules support. Adding support for each new construct should take only 2 lines -- one at the OpTable precedence parser below to recognize it, and one to compile it to the underlying PArrows library.

(As an aside, I'm glad to have finally grokked what an Arrow is. As usual with lambdafolk technologies such as ~~Monads~~Actions, it is very simple yet tricky to convey, and could really use a better name like Flow.)

Of course, the real work is done somewhere else. Aside from using PArrows for the heavy lifting in top-down parsing, we have the bottom-up OpTable.hs precedence parser (a straight port of Parrot's PGE::OPTable). This is my first successful use of GHC's implicit parameters, which are exactly like "env" variables as recently specced in the lexical syntax spec (S02), allowing a much faster and concise implementation than Reader monads.

Currently the OpTable port covers all features found in the PIR base code, and I plan to add list/chain-associative infix operators to it, so we can perform interesting optimizations like turning /cat|car|cow/ into /c[a[t|r]|ow]/.

Much kudos to Patrick Michaud for doing all the design work for Rules and OpTable, so this coding monkey can just plunge ahead without any head-banging against the wall.

Luqui is also doing well in implementing both libraries in Perl 6, using a role-based composition architecture to handle multiple input types and evaluation strategies. More details is available in its architecture document.

With luck, we can ship 6.28.0 with this Haskell-native Rule support, without having PGE (and thus Parrot) as a dependency. We'd still use PGE when compiling to Parrot, and rely on luqui's Parse::Rule when targetting _javascript_ and Perl 5. PArrows can compile to _javascript_ natively, which may speed up things a bit... We'll see. :-)

Second day of Text.Parser.Rule.

Moreover, list-associative infix support landed in OpTable parser, so /(a|b|c)/ is now parsed as alternation of a three-element list, instead of (a|(b|c)) in the current PGE implementation.

Multipositional error reporting.

I have rewritten half of PArrow for this. The neat thing is that if it matches at any given point, none of the expensive error collection would take place, so there should be no severe performance drawbacks.

Also, the returned match object is now compatible with PGE's structure, so hooking it back to Pugs is trivial.

Day 4 of Text.Parser.Rule.

Today I've rewritten PArrow's primitive combinators to support proper backtracking and non-greedy matching, as well as subrule support. This now works correctly:

miniLang :: Grammar miniLang = grammar [ "literal" ~:~ "nil | true | false" , "call" ~:~ "\\. <method>" , "method" ~:~ "\\w+" , "expr" ~:~ "<literal> <call>*" ]

ghci> "nil.foo.bar" ~~ miniLang .<> "expr" Right (MatchObj { matchString = "nil.foo.bar" , matchSubPos = fromList [] , matchSubNam = { "call" := MatchSeq (fromList [ MatchStr "." , MatchNam "method" (MatchStr "bar") ]) , "literal" := MatchStr "nil" } })

Both named and positional captures are supported, as well as negated and noncapturing variants (<!name> and <?name> respectively). This seems sufficient for porting the surface language parser of PIL^N and PIL² from Parsec; tewk already volunteered to give it a try.

If it turns out that we need more features (e.g. lookaround, integration with OpTable, callbacks and backreferences), I'll definitely hack them in. :-)

Mixing Rules with OpTable.

As Patrick and Larry had observed, Perl 6 is better parsed with a combination of top-down (Rules) and bottom-up (OpTable) parsers. To reach this goal, today I hacked in Dynamic Rules support to Text.Parser.Rule, so we can define a rule term with any function that can parse a string into anything:

Because of the flexible Dynamic return type, the term derived from OpTable doesn't have to create a MatchObj structure, but can simply return the matched Rule. We can also hook Parsec and parser methods defined in Perl 6 to this engine -- it's very likely to happen during the development towards 6.281, if not sooner.

In other news, tewk finished porting the PIL^N minilang parser to Rules. It does not run yet, because I hadn't got around to add enumerated character class support, but there's always tomorrow... :-)

This week's Pugs-Rule developments.

Perl 6 Rules implementations in Haskell and Perl 6.

Second day of Text.Parser.Rule.

Multipositional error reporting.

Day 4 of Text.Parser.Rule.

Mixing Rules with OpTable.

Reply via email to