Perl 6 Rules implementations in Haskell and Perl 6.A 20-minutes hack had paid off handsomely: In 33 lines, we have a fast, pure Haskell implementation of basic Perl 6 Rules support. Adding support for each new construct should take only 2 lines -- one at the OpTable precedence parser below to recognize it, and one to compile it to the underlying PArrows library. (As an aside, I'm glad to have finally grokked what an Arrow is.
As usual with lambdafolk technologies such as Of course, the real work is done somewhere else. Aside from using PArrows for the heavy lifting in top-down parsing, we have the bottom-up OpTable.hs precedence parser (a straight port of Parrot's PGE::OPTable). This is my first successful use of GHC's implicit parameters, which are exactly like "env" variables as recently specced in the lexical syntax spec (S02), allowing a much faster and concise implementation than Reader monads. Currently the OpTable port covers all features found in the PIR base code, and I plan to add list/chain-associative infix operators to it, so we can perform interesting optimizations like turning /cat|car|cow/ into /c[a[t|r]|ow]/. Much kudos to Patrick Michaud for doing all the design work for Rules and OpTable, so this coding monkey can just plunge ahead without any head-banging against the wall. Luqui is also doing well in implementing both libraries in Perl 6, using a role-based composition architecture to handle multiple input types and evaluation strategies. More details is available in its architecture document. With luck, we can ship 6.28.0 with this Haskell-native Rule support,
without having PGE (and thus Parrot) as a dependency. We'd still use
PGE when compiling to Parrot, and rely on luqui's Parse::Rule when
targetting _javascript_ and Perl 5. PArrows can compile
to _javascript_ natively, which may speed up things a bit... We'll
see. :-) Second day of Text.Parser.Rule.I managed to transcribe most of PGE::P6Rule to Text.Parser.Rule, so about 80% of rule syntax is now parsed correctly into a Rule tree. Instead of using GADTs, I went for a more conventional cascading data type approach, with five levels of rule nodes:
Moreover, list-associative infix support landed in OpTable parser, so /(a|b|c)/ is now parsed as alternation of a three-element list, instead of (a|(b|c)) in the current PGE implementation. Multipositional error reporting.It's past midnight here, so I'll let the commit log speak for itself:
I have rewritten half of PArrow for this. The neat thing is that if
it
matches at any given point, none of the expensive error collection
would take place, so there should be no severe performance drawbacks. Day 4 of Text.Parser.Rule.Today I've rewritten PArrow's primitive combinators to support proper backtracking and non-greedy matching, as well as subrule support. This now works correctly:
Both named and positional captures are supported, as well as negated and noncapturing variants (<!name> and <?name> respectively). This seems sufficient for porting the surface language parser of PILN and PIL2 from Parsec; tewk already volunteered to give it a try. If it turns out that we need more features (e.g. lookaround, integration with OpTable, callbacks and backreferences), I'll definitely hack them in. :-) Mixing Rules with OpTable.As Patrick and Larry had observed, Perl 6 is better parsed with a combination of top-down (Rules) and bottom-up (OpTable) parsers. To reach this goal, today I hacked in Dynamic Rules support to Text.Parser.Rule, so we can define a rule term with any function that can parse a string into anything:
As a concrete example, here is a grammar that parses a named rule declaration:
And the ruleTable is defined as an OpTable:
Because of the flexible Dynamic return type, the term derived from OpTable doesn't have to create a MatchObj structure, but can simply return the matched Rule. We can also hook Parsec and parser methods defined in Perl 6 to this engine -- it's very likely to happen during the development towards 6.281, if not sooner. In other news, tewk finished porting the PILN minilang parser to Rules. It does not run yet, because I hadn't got around to add enumerated character class support, but there's always tomorrow... :-) |
signature.asc
Description: OpenPGP digital signature