[svn:parrot] r10051 - trunk/compilers/pge

pmichaud Wed, 16 Nov 2005 14:07:09 -0800

Author: pmichaud
Date: Wed Nov 16 14:06:57 2005
New Revision: 10051

Modified:
   trunk/compilers/pge/README
Log:
Updated PGE's README to reflect current usage and status.



Modified: trunk/compilers/pge/README
==============================================================================
--- trunk/compilers/pge/README  (original)
+++ trunk/compilers/pge/README  Wed Nov 16 14:06:57 2005
@@ -2,23 +2,18 @@
 
 =head1 Parrot Grammar Engine (PGE)
 
-This is the second implementation of a regular expression/rules/grammar
-engine designed to run in Parrot.  It's still a work in progress, and
-some parts of the implementation are designed simply to "bootstrap" 
-us along (i.e., some parts such as the parser and generator are 
-expected to be discarded).  The current work is also largely incomplete 
--- although it has support for groups (capturing and non-capturing), 
-quantifiers, alterations, etc., many of the standard assertions and 
-character classes are not implemented yet but will be coming soon.
-
-The previous version of PGE used a C-based parser and generator,
-but after the capture semantics were redesigned (Winter 2005) Pm
-decided that it would be better to just write the parser and generator
-in Parrot.  Thus, the current version.
-
-In addition we'll be looking at writing a parser and compiler for
-Perl *5* regular expressions, but the focus for the time being is
-(obviously) on Perl 6.
+This is a regular expression/rules/grammar engine/parser designed to
+run in Parrot.  It's still a work in progress, but has a lot of
+nice features in it, including support for Perl 6 rule expressions,
+globs, shift-reduce parsing, and ("coming soon") some support for
+Perl 5 regular expressions.
+
+A nice feature of PGE is that one can easily combine many
+different parsing styles into a single interface.  PGE uses
+Perl 6 rules for its top-down parsing, an operator precedence
+parser for bottom-up (shift/reduce) parsing, and allows control
+to pass freely between the two styles as well as to custom parsing
+subroutines.
 
 =head1 Installation
 
@@ -39,18 +34,18 @@ C<parrot demo.pir>.  The demo understand
     trace             - toggle pattern execution tracing
     next              - repeat last match on target string
 
-=head1 Using PGE
+=head1 PGE's rule engine  (PGE::P6Rule)
 
 Once PGE is compiled and installed, you generally load it using
 the load_bytecode operation, as in
 
     load_bytecode "PGE.pbc"          
 
-This imports the C<PGE::p6rule> subroutine, which can be used to
-compile Perl 6 rules.  A sample compile sequence would be:
+This imports the PGE::P6Rule compiler, which can be used to compile
+strings of Perl 6 rules.  A sample compile sequence would be:
 
     .local pmc p6rule_compile
-    find_global p6rule_compile, "PGE", "p6rule"    # get the compiler
+    p6rule_compile = compreg "PGE::P6Rule"         # get the compiler
 
     .local string pattern       
     .local pmc rulesub                     
@@ -66,13 +61,16 @@ to get back a C<PGE::Match> object:
 
 The Match object is true if it successfully matched, and contains
 the strings and subpatterns that were matched as part of the capture.
-The C<dump> method can be used to quickly view the results of
-the match:
+Parrot's "Data::Dumper" can be used to quickly view the results
+of the match:
+
+    load_bytecode "dumper.imc"
+    load_bytecode "PGE/Dumper.pir"
 
   match_loop:
     unless match goto match_fail                   # if match fails stop
     print "match succeeded\n"
-    match."dump"()                                 # display matches
+    _dumper(match)                      
     match."next"()                                 # find the next match
     goto match_loop
 
@@ -86,26 +84,17 @@ the rule subroutine -- just use
 
 and you can print/inspect the contents of $S0 to see the generated code.
 
-=head1 Known Limitations
-
-Since the Parrot rewrite, PGE knows and uses as much of Unicode strings
-as Parrot does.  
+See the STATUS file for a list of implemented and yet-to-be-implemented
+features.
 
-Some backslashes aren't implemented yet, although the major ones
-are (\d, \s, \n, \D, \S, \N).
+=head1 Known limitations of the rule engine
 
 PGE doesn't (yet) properly handle nested repetitions of zero-length 
 patterns in groups -- that's coming soon.
 
-This is just the first-cut framework for building the 
-remainder of the engine, so many items (lookaround, 
-conjunctions, closures, and hypotheticals)
-just aren't implemented yet.  They're on their way!
-
-Also, many well-known optimizations (e.g., Boyer-Moore) aren't 
-implemented yet -- my primary goals at this point are to
-"release early, often" and to get sufficient features in place so
-that more people can be testing and building upon the engine.
+Many well-known optimizations (e.g., Boyer-Moore) aren't
+implemented yet, although a variety of optimizations are being
+added as we generate code.
 
 Lastly, error handling needs to be improved, but this will likely
 be decided as we discover how PGE integrates with the rest of
@@ -119,17 +108,24 @@ that can match strings.  So, PGE consist
 (for each pattern matching language), an intermediate expression
 format, and a code generator.
 
-The generated code uses bsr/ret for its internal subroutine calls
-(also optimized for tailcalls) and then uses Parrot calling 
-conventions for all interfaces with external callers/callees.
-This should give some performance improvements.
+The parsers can be written using PIR subroutines or PGE's built-in
+operator precedence (shift/reduce) parser; the parser for Perl 6
+rule expressions is built with the operator precedence parser.
+This parser produces a parse tree (in the form of a Match object)
+for a given Perl 6 rule expression.  The parse tree then goes through
+semantic analysis and reduction phases before being sent
+to code generation to produce a PIR subroutine.  
+
+The generated PIR code uses bsr/ret for its internal backtracking
+(optimized for tailcalls) and uses Parrot calling conventions for 
+all interfaces with external callers/callees such as subrules.
 
 PGE also uses Parrot coroutines for the matching
 engine, so that after a successful match is found, the 
 next match within the same string can be found by simply 
-returning control to the matching coroutine (which then
+returning control to the matching coroutine, which then
 picks up from where it had previously left off until
-another match is discovered).
+another match is discovered.
 
 The code still needs a fair amount of commenting.  In general,
 if you have a question about a particular section of code,

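The coroutine-based matching described in the last hunk can be illustrated by analogy. The following is a minimal Python sketch, not PGE code: a generator stands in for the Parrot coroutine, and the hypothetical function match_literal (an assumption made only for this illustration) handles nothing more than a literal pattern. After each match is yielded, asking for the next one returns control to the generator, which picks up from where it left off.

```python
from typing import Iterator, Tuple


def match_literal(pattern: str, target: str) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) offsets for each occurrence of pattern in target.

    Hypothetical helper for illustration only; it is not part of PGE.
    The generator keeps its scan position between calls, much as the
    README describes the matching coroutine doing.
    """
    if not pattern:
        return  # an empty pattern yields no matches in this sketch
    pos = 0
    while True:
        start = target.find(pattern, pos)
        if start < 0:
            return  # no further matches: the generator is exhausted
        yield start, start + len(pattern)
        pos = start + 1  # resume scanning just past the previous start


matcher = match_literal("ab", "abcabab")
first = next(matcher)   # run until the first match is found
rest = list(matcher)    # resume the same generator for the remaining matches
print(first, rest)
```

Here next(matcher) plays roughly the role of the README's match."next"() call, in that the caller asks for one more match and the suspended matcher continues rather than starting over.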