Gal Nitzan wrote:
Hi,
Well, the reason for this plugin is that I wish to crawl many sites, but
they all must be in my list. If it were implemented with regular
expressions, the filter would still have to loop over 100K expressions
for each URL to find a match, right?
No, that's the whole point - using the library I mentioned, you can build
a _single_ finite state automaton from all expressions. No looping, just
traversing a tree (or whatever equivalent structure they use).
100k regexps is still a lot, so I'm not totally sure it would be much
faster, but it's perhaps worth checking.
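
To illustrate the single-automaton idea, here is a minimal Python sketch. It is not the library mentioned above and it handles only literal URL prefixes, not full regular expressions. The class name PrefixAutomaton and the example URLs are hypothetical. The point it shows is that all entries are merged into one structure, so matching a URL costs one pass over its characters regardless of how many entries are in the list.

```python
# Sentinel key marking a node where some listed prefix ends (an accepting state).
_ACCEPT = object()


class PrefixAutomaton:
    """A trie over literal URL prefixes, i.e. a simple deterministic automaton.

    Assumption: entries are plain prefixes. Real regular expressions would
    need a proper regex-to-DFA construction, which this sketch does not do.
    """

    def __init__(self, prefixes):
        self.root = {}
        for prefix in prefixes:
            node = self.root
            for ch in prefix:
                # Shared leading characters reuse the same nodes.
                node = node.setdefault(ch, {})
            node[_ACCEPT] = True

    def matches(self, url):
        """Return True if any stored prefix is a prefix of url."""
        node = self.root
        for ch in url:
            if _ACCEPT in node:
                return True
            node = node.get(ch)
            if node is None:
                return False
        return _ACCEPT in node


# Hypothetical allow-list; in practice this could hold 100k entries.
allowed = PrefixAutomaton([
    "http://example.com/",
    "http://www.example.org/docs/",
])

print(allowed.matches("http://example.com/index.html"))
print(allowed.matches("http://example.net/"))
```

The work per URL is bounded by the URL's length, not by the number of entries. Whether that beats a loop over compiled regexps at this scale would still need measuring.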
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com