On Sun, 2007-03-18 at 14:07 +0100, Nikolai Weibull wrote: > On 3/18/07, Asiri Rathnayake <[EMAIL PROTECTED]> wrote: > > > I read several articles on possible NFA implementations ( took some time > > to understand ). But I think implementing an NFA based approach is quite > > possible ( Russ Cox has provided a sample implementation in his > > article ). But I'm having a hard time understanding the existing > > approach taken :'( . > > The current implementation is also an NFA-based one. What's more > interesting is how the NFA is simulated. The current implementation > uses backtracking, that is, it tries a solution and checks if it > works. If so, then accept. Otherwise, try the next one. > > Then there's the parallel one that keeps track of all the states that > we're currently in that accepts as soon as an accepting/final state is > reached.
I think by parallel you are referring to the usual mechanism for simulating the NFA ( on the fly subset construction ? ). Simply, we have a set of states ( initially S0 ) and the we keep eating a symbol at a time, calculate the next possible set of states ( including E-closure ) and whenever the current set of states contain an accepting state of the original NFA, Voila! > And then there's the cached DFA implementation that builds the DFA > representation of the NFA as needed while traversing the input. This > is the most difficult one to implement, but also the fastest for most > inputs. Sorry, but I'm in confusion here. Are you telling me that the current regxp engine has a cached DFA implementation ? ( I thought such an implementation was the aim of this project :( ) > Russ Cox's article is good, but doesn't show anyting new, even though > it has been received like as it would. And he really skims over some > of the important issues that one has to consider when using regular > expressions for doing anything other than just accepting or rejecting > input as being part of a regular language. Perhaps most important is > submatch reporting, which causes all kinds of difficulties that he > simply doesn't mention. I agree. Just out of curiosity, I referred to the book "Compilers - Principles, Techniques and Tools by Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman". It has a very nice explanation of NFA, NFA->DFA, Regxp->NFA and even Regxp->DFA ( I'm still trying to understand the last one ). I used the article by Russ Cox to verify what I have learnt from the above book. But I believe that above article is a good starter. > > nikolai Thank you so much for the feed back. :) - Asiri