Thinking a bit more, I am still torn between the two approaches. For simplicity, I like the idea of having the methods do the consumption, but having the methods return objects that do the parsing has two really big advantages:
1) It makes the whole thing more powerful from the perspective of subclassing. You can write a grammar for language X that has no actions. You can then subclass this grammar (in Smalltalk or whatever), call super to get the rules, and then just send an -instantiating message to attach actions to them. If we're using local variables for intermediate results, then we don't have any way of accessing them from subclasses. It seems that none of the other OMeta implementations really use this much. For example, every implementation of the OMeta grammar itself seems to copy the grammar description entirely and add actions, which, to me, completely defeats the point of having an OO grammar description framework in the first place.

2) It's easier to do recursive rules. For example, my listOf() can take any parsing expression as an argument, while the one from the original OMeta implementation seems to only be able to handle single-selector rules or terminals. This means that I can do a list of arbitrary things much more easily than they can, and I can more easily write very complex rules.

2.5) It's (potentially) easier to make concurrent. All of the parsing state is stored on the stack, so we can run 'or' alternatives in separate threads if we want and see whether any of them match. This is really trivial to do with the existing futures code in EtoileThread: memoise the parsing expressions, send some of them an inNewThread message, and then replace the or: rule with one that tries to match them all but doesn't test the results until after sending the messages.

My gut feeling is that we should actually keep most of the existing PEG code (especially now that I've got it working mostly the way I want ;-) but remove the combination rules and just inline them into rules. Then it becomes easier for the rules to collect the actions. Most importantly, we remove the OMCapturingExpression and make each rule do the capturing itself, so that each rule has its own capture namespace.
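To make point 1 and point 2 concrete, here is a rough Python sketch of the rules-as-objects idea (all names here are hypothetical, not the actual Étoilé or OMeta API): each rule method returns a parsing-expression object, so a subclass can reuse the grammar unchanged and merely attach an action, and listOf() can take any parsing expression, not just a terminal.

```python
class Fail(Exception):
    """Raised when a parsing expression does not match."""

class Expr:
    """A parsing expression: parse(text, pos) -> (value, new_pos) or raises Fail."""
    def __init__(self, fn):
        self.fn = fn
        self.action = None
    def parse(self, text, pos=0):
        value, pos = self.fn(text, pos)
        if self.action is not None:
            value = self.action(value)   # action attached by a subclass
        return value, pos
    def set_action(self, action):
        # Stands in for the -instantiating message mentioned above.
        self.action = action
        return self

def literal(s):
    """Match a literal string."""
    def fn(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        raise Fail("expected %r at %d" % (s, pos))
    return Expr(fn)

def list_of(item, sep):
    """A separated list of *any* parsing expression, not just a terminal."""
    def fn(text, pos):
        value, pos = item.parse(text, pos)
        results = [value]
        while True:
            try:
                _, p = sep.parse(text, pos)
                value, pos = item.parse(text, p)
                results.append(value)
            except Fail:
                return results, pos
    return Expr(fn)

class Ones:
    """A grammar with no actions: a comma-separated list of '1's."""
    def number_list(self):
        return list_of(literal("1"), literal(","))

class CountingOnes(Ones):
    """Subclass reuses the rule via super and only attaches an action."""
    def number_list(self):
        return super().number_list().set_action(len)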
David

On 22 Jan 2010, at 13:06, Guenther Noack wrote:

> Hi David,
>
> Ok, no worries. I know these debugging sessions.
>
> It's interesting that you talk about implementing chained parse
> failures. I also did a little ad-hoc implementation of this in my Ruby
> experiment. When implementing this, don't forget the case that the OR
> rule fails. When that happens, I believe you probably want to see an
> error message like "OR with 3 options failed, reasons were: (list of
> three other parse failures)". I didn't think about that initially and
> this currently makes things a bit complicated.
>
> In both my and the other Ruby implementation, parse failures are
> exceptions. It's surely debatable whether this is good style, but the
> generated code for the rules is much more straightforward when it
> doesn't need to care about failures so much. Failure backtraces on a
> rule level (instead of method level) can be done by encapsulating rule
> calls in an "apply" method, which attaches rule name information to
> exceptions as they fly by.
>
> It's pretty exciting to see the OMeta stuff working. Since yesterday
> evening, my little Ruby hack can parse all kinds of fancy recognizers
> already. It still needs a Rubyish-language parser for the host
> expressions in the OMeta grammar. Maybe that can be bootstrapped from
> the JavaScript implementation as well. It's *exciting*! :-)
>
> -Günther

_______________________________________________
Etoile-dev mailing list
Etoile-dev@gna.org
https://mail.gna.org/listinfo/etoile-dev
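Günther's exception-based failure scheme could be sketched roughly as follows (a Python illustration with assumed names, not his actual Ruby code): an apply() wrapper tags the failure with each rule's name as the exception propagates, and an OR rule that exhausts its options raises one failure that aggregates all the branch failures, exactly the "OR with 3 options failed" case he warns about.

```python
class ParseFailure(Exception):
    def __init__(self, reason, causes=()):
        super().__init__(reason)
        self.reason = reason
        self.causes = list(causes)   # sub-failures, e.g. from an OR rule
        self.rule_trace = []         # rule names, attached as the exception flies by

def apply(rule_name, rule, text, pos):
    """Invoke a rule, tagging any failure with the rule's name."""
    try:
        return rule(text, pos)
    except ParseFailure as f:
        f.rule_trace.append(rule_name)
        raise

def alt(*named_rules):
    """OR rule: if every option fails, report all of their reasons."""
    def parse(text, pos):
        failures = []
        for name, rule in named_rules:
            try:
                return apply(name, rule, text, pos)
            except ParseFailure as f:
                failures.append(f)
        raise ParseFailure(
            "OR with %d options failed" % len(named_rules), causes=failures)
    return parse

def lit(s):
    """A trivial rule for demonstration: match a literal string."""
    def rule(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        raise ParseFailure("expected %r at %d" % (s, pos))
    return rule
```

The rule bodies stay straightforward because they never inspect failure values, matching the point in the email that exception-based failures keep the generated rule code simple.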