On the subject of multi-threaded implementation, let me vote +1 for providing a mechanism for target implementors to take advantage of this. :) Multi-core systems are the norm now. In my job, we spend a LOT of time determining how best to extract maximum work in minimum time, and parallel programming is a big part of that.
On Sun, Apr 18, 2010 at 4:28 PM, Terence Parr <[email protected]> wrote: > > On Apr 18, 2010, at 4:18 PM, Cliff Hudson wrote: > > > Hmm, it seems to me that there should be a way to record the set of > > recursive actions and appropriate pointers into the lexed string such > that > > you can replicate the logical state of the DFA when you execute the > actions > > once an alternative is definitely selected. Is there more to the state > of > > the system at any given possible action point than a pointer to the start > of > > the current substring, its length, and maybe a pointer to the already > > matched token stream? I am possibly out of my depth here on > understanding > > how the lexing system really works. Or I have not adequately explained > my > > idea :) > > hiya. :) Well, imagine that you are modifying "global" state as you match > characters in identifier; this is something done in actions that ANTLR can > analyze. There is no way to "roll this back". ANTLR can only provide the > information you specified about where it is and what character it was > parsing--it can't deal with other member variables that you have updated. > Plus tracking all that information for input character could get expensive > even if it was limited to that. > > I'm leaning towards a trivial NFA implementation (Thompson's algorithm from > the 60s as described by Russ Cox) > > http://swtch.com/~rsc/regexp/regexp2.html > > because it allows us to save partial matches like id=ID very easily, unlike > a DFA. Unfortunately, we really need identifiers and keywords to be > efficient because that's what most input streams consist of. I'm not sure I > could get the NFA to go that fast compared to a DFA. On the other hand, the > new NFA-based mechanism would likely be faster than the current v3 mechanism > which is guaranteed to recognize the first few characters of any token > twice. so, in the end I should go for the smallest implementation and the > simplest that gives us the capabilities. That would spell Thompson's > NFA-based algorithm. There's even a chance that I could make it go faster > using two or three threads instead of just one to do the NFA simulation. > > choices choices > > Ter > > List: http://www.antlr.org/mailman/listinfo/antlr-interest > Unsubscribe: > http://www.antlr.org/mailman/options/antlr-interest/your-email-address > List: http://www.antlr.org/mailman/listinfo/antlr-interest Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address -- You received this message because you are subscribed to the Google Groups "il-antlr-interest" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.
