On 3/22/07, Asiri Rathnayake <[EMAIL PROTECTED]> wrote:
As you might know, the reg_comp() method is called twice when compiling a regular expression: first to determine the size of the compiled expression, and then to actually compile it. I was wondering whether this could be used to our advantage. On the first pass, we look for occurrences of special characters and set a flag in regprog_T accordingly. If no such character is found, we branch off the second pass into one of our own routines, which compiles the expression into our own structures (say, a state diagram). We would also have to change other functions a bit so that they check the flag and call the new routines where appropriate. What do you think?
That sounds like a good way of determining whether the old engine will be required or if a new one (with more "limited" functionality) should be used.

One way of keeping this information as local as possible would be to keep a set of function pointers with the compiled regex that point to the appropriate functions to execute them on some input. For example, you could have something like this:

    typedef struct
    {
        int         (*exec)();
        int         regstart;
        char_u      reganch;
        char_u      *regmust;
        int         regmlen;
        unsigned    regflags;
        char_u      reghasz;
        char_u      program[1];     /* actually longer.. */
    } regprog_T;

and change vim_regexec() to call the exec() function of the regprog_T in the regmatch_T that it gets passed. You'd then set exec() to point to either vim_old_regexec() or vim_new_regexec() (or similarly named functions) in vim_regcomp() depending on the type of regex we have.

I guess it could be some flag field as well, but this makes it possible to add a third matcher, should we so desire... like a Boyer-Moore-Horspool matcher for fixed strings.

nikolai
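
To make the dispatch idea above concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration only: sketch_prog_T is a trimmed-down stand-in for regprog_T (not Vim's real layout), the sketch_* function names are hypothetical, the set of "special" characters is a guess, and the two matchers are placeholders rather than real engines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for regprog_T: only the dispatch pointer and the
 * pattern text. This is NOT Vim's actual structure. */
typedef struct sketch_prog sketch_prog_T;
struct sketch_prog {
    int (*exec)(const sketch_prog_T *prog, const char *input);
    const char *pattern;
};

/* Placeholder for the existing backtracking engine. It does no matching;
 * it returns -1 only to show that this path was selected. */
static int sketch_old_regexec(const sketch_prog_T *prog, const char *input)
{
    (void)prog;
    (void)input;
    return -1;
}

/* Placeholder for a new, more limited engine: a plain substring search.
 * Returns 1 if the pattern occurs in the input, 0 otherwise. */
static int sketch_new_regexec(const sketch_prog_T *prog, const char *input)
{
    return strstr(input, prog->pattern) != NULL;
}

/* "First pass" check: does the pattern contain any character that we
 * (by assumption) treat as special? */
static int sketch_has_special(const char *pattern)
{
    return strpbrk(pattern, "\\[]*.^$~") != NULL;
}

/* "Compile" step: record the pattern and choose the engine once. */
static void sketch_regcomp(sketch_prog_T *prog, const char *pattern)
{
    prog->pattern = pattern;
    prog->exec = sketch_has_special(pattern) ? sketch_old_regexec
                                             : sketch_new_regexec;
}

/* Callers never test a flag; they just go through the stored pointer. */
static int sketch_regexec(const sketch_prog_T *prog, const char *input)
{
    return prog->exec(prog, input);
}
```

With this shape, compiling "foo" would select the substring matcher and compiling "fo*" would select the old-engine placeholder, while sketch_regexec() stays the same in both cases. A third matcher would only need one more branch in sketch_regcomp().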