I've just checked in (r9843) a new version of PGE (the grammar engine) with some substantial changes to its internal calling sequences and data structures. For those using PGE through its defined interfaces, things work largely the same; anyone developing for PGE or relying on PGE's internals may see some differences, as described below.

The biggest difference is that single-element captures in Match objects are now internally represented with the same structure as seen by the "outside world". For example, with an expression like

    rule = p6rule(":w (mv) [ (\w+)]*")

$/[0] (aka $0) ends up with a single Match object, while $/[1] (aka $1) is an array of Match objects because of the "*" quantifier.

In previous versions of PGE, the PGE::Match class internally stored all captures (quantified or not) in arrays, and used an "isarray" property on the array to indicate if it was to act as a single Match object or an array of Match objects. In the version I've just checked in, the "isarray" property is gone, and the $0, $1, $2, ... captures are stored internally as single Match objects (unquantified) or arrays as specified by the rule. In particular, this means one can now use the "get_array" and "get_hash" methods on Match objects and get exactly the correct structure.

Other key differences in this new version:

- PGE's internal rule calling conventions (e.g., to PIR-coded subrules such as <alpha>, <upper>, etc.) are now consistent with rules generated by PGE itself. Thus, if one wants to call the <alpha> rule directly, it can be done with:

      .local pmc alpha
      alpha = find_global "PGE::Rule", "alpha"
      $P0 = alpha("Some string")

  and $P0 will be a Match object for the "S". Note that many of PGE's built-in rules tend to act as if the :p modifier is set -- in this case anchoring the match to the beginning of the string.

- The PIR code that PGE generates can now be stored externally and directly included by other PIR modules. For example, when a previous version of PGE was loaded, the initialization code executed at load-time would dynamically compile and install <ident> and <name> subrules, thus slowing down program initialization. In this new version, the PIR code for <ident> and <name> is generated as part of building PGE, so that PGE.pbc already contains the bytecode for these precompiled rules when it is loaded.

- PGE Match objects can now distinguish array keys from hash keys that begin with a digit. Previously Match objects assumed that any key starting with a digit was addressing solely the array component of the Match object.

- A number of performance enhancements and code cleanups, especially in the code that handles matching of quantified groups and subrules.

Questions, comments, feedback welcomed as always. My next area of focus is on providing subrules that can match quoted and bracketed constructs (similar to Text::Balanced), and on completing a shift/reduce parser that integrates with PGE's rule matching capabilities.

Pm