On Thu, 2007-03-15 at 23:35 +0100, Nikolai Weibull wrote: nikolai,
> On 3/15/07, Asiri Rathnayake <[EMAIL PROTECTED]> wrote: > > Hi all, > > > > I went through the above idea presented for google summer of code 2007 > > and found it interesting. In my opinion, incorporating Thompson NFA into > > regexp from the ground up is pretty cool. I also went through the > > alternate TRE (http://laurikari.net/tre/index.html) but couldn't verify > > whether its an implementation of Thompson NFA ( I will have to dig into > > it's source and find it out ). I would like to know further details (if > > any) regarding this project and the possible mentor(s). > > It has its own finite automata, which is based on a Thompson NFA, > adding so called /tags/ in the mix for doing subgroup matching and > fuzzy matching. > > It comes with three matchers/drivers: A backtracking one for when the > pattern requires it (backrefences), a multithreaded/parallel one, that > is, the standard two tables of active states one, and one for fuzzy > matching (approximate matching). A multithreaded matcher might be useful to vim but do we have a need for fuzzy matching ? > There's a nice paper about it as well (too actually, if you count the > first one he wrote about his first implementation for . > > I actually wrote a simplification of his library, removing the > approximate matching stuff, as part of my master's, which is well > documented. I still haven't had time to put up the PDF, though. Also, it would be great if you can publish your simplified version of TRE and the relevant documentation. :) > Sadly, both his and my implementation suffer from a bug in the > subgroup matching of "opposing" repetitions, which would have to be > fixed. Neither of us have been able to do so yet. > > Anyway, it would take an immense amount of work to turn Vim onto a new > regex implementation. Vim has a whole range of its own stuff, like > matching cursor positions and so on, and is tightly bound to the > buffer implementation with its memlines and whatnot. Not to > dishearten you, but I don't think this is a project that can be > completed over a summer (not that it has to be, but you may want to > keep that in mind). > > Also, there's a TDFA (tagged, deterministic finite automaton) > implementation written in Haskell > > http://www.mail-archive.com/glasgow-haskell-users@haskell.org/msg11442.html > > Quite cool. > > nikolai Thanks a bunch for the information and the advices, I'll give my best attempts to finish the project in the summer, if it takes longer, so be it. - Asiri