Thinking a bit more, I am still torn between the two approaches.

For simplicity, I like the idea of having the methods do the consumption, but 
having the methods return objects that do the parsing has two really big 
advantages:

1) It makes the whole thing more powerful from the perspective of subclassing.  
You can write a grammar for language X that has no actions.  You can then 
subclass this grammar (in Smalltalk or whatever), call super to get the rules, 
and then just send an -instantiating message to attach actions to them.  If 
we're using local variables for intermediate results, then we have no way of 
accessing them from subclasses.
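To make this concrete, here is a rough Ruby sketch of the idea (every name in it is invented for illustration: Rule, NumberGrammar, the action attribute; the real code is not Ruby): the superclass builds action-free rule objects, and a subclass calls super and attaches an action to the rule it gets back, without copying the grammar.

```ruby
# Hypothetical sketch: rules are objects, so a subclass can attach
# actions to inherited rules instead of copying the grammar.
class Rule
  attr_accessor :action

  def initialize(&match)
    @match = match
  end

  def parse(input)
    result = @match.call(input)
    # Run the attached action, if any, on a successful match.
    result && @action ? @action.call(result) : result
  end
end

# An action-free grammar: the rules only describe the syntax.
class NumberGrammar
  def number
    @number ||= Rule.new { |input| input[/\A\d+/] }
  end
end

# A subclass reuses the inherited rule and bolts an action onto it.
class EvaluatingNumbers < NumberGrammar
  def number
    rule = super
    rule.action = ->(text) { text.to_i }
    rule
  end
end
```

With this, NumberGrammar.new.number.parse("42") returns the matched text "42", while EvaluatingNumbers.new.number.parse("42") returns the integer 42; the grammar description itself is written exactly once.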

It seems that none of the other OMeta implementations really use this much.  
For example, every implementation of the OMeta grammar itself seems to copy the 
grammar description entirely and add actions, which, to me, completely defeats 
the point of having an OO grammar description framework to start with.

2) It's easier to do recursive rules.  For example, my listOf() can take any 
parsing expression as an argument, while the one from the original OMeta 
implementation seems to handle only single-selector rules or terminals.  This 
means that I can build a list of arbitrary things much more easily than they 
can, and I can write very complex rules more easily.
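As a sketch of what this buys you, here is a hedged Ruby approximation (list_of, the lambda-based expressions, and the separator default are all made up; my actual listOf() is not Ruby): because the combinator takes an arbitrary parsing expression, a "list of pairs" needs no extra rule machinery.

```ruby
require 'strscan'

# Hypothetical sketch: a parsing expression is anything that responds to
# call(scanner) and returns a parsed value on success or nil on failure.
def list_of(expr, separator = /\s*,\s*/)
  lambda do |scanner|
    first = expr.call(scanner)
    return nil unless first
    items = [first]
    while scanner.scan(separator)
      item = expr.call(scanner)
      return nil unless item  # a separator must be followed by an item
      items << item
    end
    items
  end
end

number = ->(s) { s.scan(/\d+/) }
# A compound expression: two numbers joined by a colon.
pair = ->(s) { (a = number.call(s)) && s.scan(/:/) && (b = number.call(s)) && [a, b] }

list_of(pair).call(StringScanner.new("1:2, 3:4"))  # => [["1", "2"], ["3", "4"]]
```

(The sketch doesn't backtrack on a trailing separator; a real implementation would restore the scanner position on failure.)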

2.5) It's easier (potentially) to make concurrent.  All of the parsing state is 
stored on the stack, so we can do 'or' rules in separate threads if we want and 
see if any of them are matched.  This is really trivial to do with the existing 
futures code in EtoileThread; memoise the parsing expressions, send some of 
them an inNewThread message, and then replace the or: rule with one that tries 
to match them all but doesn't test the results until after sending the 
messages.  
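A toy Ruby version of that shape (Thread stands in for the EtoileThread futures; everything here is an invented approximation, not the real code):

```ruby
# Hypothetical sketch of a concurrent 'or': start every alternative in
# its own thread, then inspect the results only after all of them have
# been sent off, keeping ordinary ordered-choice semantics.
def concurrent_or(*alternatives)
  lambda do |input|
    # Each alternative gets its own copy of the input, so no parsing
    # state is shared between threads (in the real design it would all
    # live on each thread's stack).
    futures = alternatives.map { |alt| Thread.new { alt.call(input.dup) } }
    # Test the results in declaration order: the first alternative that
    # matched wins, exactly as with a sequential or:.
    futures.each do |future|
      result = future.value
      return result if result
    end
    nil
  end
end

digits = ->(s) { s[/\A\d+/] }
word   = ->(s) { s[/\A[a-z]+/] }

concurrent_or(digits, word).call("hello")  # => "hello"
```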

My gut feeling is that we should actually keep most of the existing PEG code 
(especially now I've got it working mostly how I want it to be working ;-) but 
remove the combination rules and just inline them into rules.  Then it becomes 
easier for the rules to collect the actions.  Most importantly, we remove the 
OMCapturingExpression, and make each rule do the capturing itself, so that each 
rule has its own capture namespace.
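A minimal Ruby sketch of the per-rule capturing (CapturingRule and the capture hash are invented stand-ins for what OMCapturingExpression currently does globally):

```ruby
require 'strscan'

# Hypothetical sketch: each rule invocation gets a fresh capture hash,
# so a capture named in one rule can never clobber a capture of the
# same name in another rule.
class CapturingRule
  def initialize(&body)
    @body = body
  end

  def parse(scanner)
    captures = {}  # the rule's own capture namespace
    @body.call(scanner, captures) && captures
  end
end

assignment = CapturingRule.new do |s, c|
  (c[:name] = s.scan(/[a-z]+/)) &&
    s.scan(/\s*=\s*/) &&
    (c[:value] = s.scan(/\d+/))
end

assignment.parse(StringScanner.new("x = 10"))
# captures {name: "x", value: "10"}; a failed match returns nil
```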

David

On 22 Jan 2010, at 13:06, Guenther Noack wrote:

> Hi David,
> 
> Ok, no worries. I know these debugging sessions.
> 
> It's interesting that you talk about implementing chained parse
> failures. I also did a little ad-hoc implementation of this in my Ruby
> experiment. When implementing this, don't forget the case where the OR
> rule fails. When that happens, I believe you probably want to see an
> error message like "OR with 3 options failed, reasons were: (list of
> three other parse failures)". I didn't think about that initially, and
> it currently makes things a bit complicated.
> 
> In both my and the other Ruby implementation, parse failures are
> exceptions. It's surely debatable whether this is good style, but the
> generated code for the rules is much more straightforward when it
> doesn't need to care about failures so much. Failure backtraces on a
> rule-level (instead of method level) can be done by encapsulating rule
> calls in an "apply" method, which attaches rule name information to
> exceptions as they fly by.
> 
> It's pretty exciting to see the OMeta stuff working. Since yesterday
> evening, my little Ruby hack can parse all kinds of fancy recognizers
> already. It still needs a Rubyish-language parser for the host
> expressions in the OMeta grammar. Maybe that can be bootstrapped from
> the JavaScript implementation as well. It's *exciting*! :-)
> 
> -Günther

_______________________________________________
Etoile-dev mailing list
Etoile-dev@gna.org
https://mail.gna.org/listinfo/etoile-dev
