faster DFA.C state merge

Ivan Yanikov Sat, 24 Aug 2013 23:43:33 -0700

Hello! Thanks for great command line tool.

I have a large regexp, which works very slow. I take a profiler, and it shows, that bottleneck is 'insert function', which is called too often from this code during file matching, and makes O(N^2)assigmend, O(N) memory reallocatons:


-------------------
for (j = 0; j < d->states[0].elems.nelem; ++j) { 
insert(d->states[0].elems.elems[j], &follows); }
-------------------

I have a patch, which makes dfa-states merge in dfa.c faster. Also I rewrite 
this code, using priority-queue data-structure:

-------------------
 /* Find the union of the follows of the positions of the group. This is a 
hideously inefficient loop.  Fix it someday. */
 for (j = 0; j < grps[i].nelem; ++j)
   for (k = 0; k < d->follows[grps[i].elems[j]].nelem; ++k)
     insert (d->follows[grps[i].elems[j]].elems[k], &follows);
-------------------

Will it be usefull for you?

How I can properly send/commit it for review?
Currently I have copied the patch to http://pastebin.com/p6TbvM8Y and  
https://gist.github.com/dobrokot/6331243
it need little work for proper styling and optimization shortcut for common 
cases when merging 2 sets or only one set(i.e. copy).

Thanks for attention, Ivan Yanikov.

faster DFA.C state merge

Reply via email to