Hi Andrzej,
Yes, it seems like a good option. However, it is GPL, and I noticed in
one of the posts that this license is no good for apache.org :).
Regards,
Gal
Andrzej Bialecki wrote:
Gal Nitzan wrote:
Hi,
Well, the reason for this plugin is that I wish to crawl many sites,
but they all must be in my list. If it were implemented with regular
expressions, the filter would still have to loop over 100K expressions
for each URL to find a match, right?
No, that's the whole point - using the library I mentioned you can
build a _single_ finite state automaton from all expressions. No
looping, just traversing a tree (or whatever equivalent structure they
use).
100k regexps is still a lot, so I'm not totally sure it would be much
faster, but perhaps worth checking.
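
To illustrate the "single structure, no looping" idea above, here is a
minimal sketch. It assumes the list holds literal host names rather than
full regular expressions, and it uses a hand-rolled trie in place of the
library mentioned, whose API is not shown in this thread. The names
build_trie and matches are made up for the example.

```python
def build_trie(patterns):
    """Merge all literal patterns into one nested-dict trie.

    Each dict maps a character to the next node. The key None marks
    a node where some pattern ends (an accepting state).
    """
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node[None] = True
    return root


def matches(trie, text):
    """Return True if text equals one of the patterns in the trie.

    The walk follows one edge per input character, so the cost depends
    on len(text), not on how many patterns were merged into the trie.
    """
    node = trie
    for ch in text:
        node = node.get(ch)
        if node is None:
            return False
    return None in node


# Hypothetical host list, for illustration only.
allowed_hosts = build_trie(["apache.org", "example.com", "example.org"])
print(matches(allowed_hosts, "example.org"))
print(matches(allowed_hosts, "example.net"))
```

A real automaton built from regular expressions handles wildcards and
repetition as well, but the lookup has the same shape: one pass over the
URL, regardless of how many expressions were combined. Whether that is
actually faster at 100K expressions would still need to be measured.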