Jack Tang wrote:
Hi all

RegExp is widely used in Nutch, and I am now wondering whether the
JDK/Jakarta classes are fast enough.
Here are the benchmarks I found on the web:
http://tusker.org/regex/regex_benchmark.html

It seems dk.brics.automaton.RegExp is the fastest among the libs.

It's not only faster, it also scales better for large and complex expressions. It is also possible to build automata from several expressions with AND/OR operators, which is the use case we have in regexp-urlfilter.
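
To make that use case concrete, here is a minimal sketch of OR-combining several filter expressions into one matcher. It deliberately uses only java.util.regex so that it runs on a plain JDK; it is a baseline for comparison, not the automaton approach itself. With dk.brics.automaton the analogous step would be to build one automaton per expression and take their union. The class name CombinedUrlFilter and the example expressions are hypothetical and are not taken from Nutch.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Nutch: OR-combines several
// expressions into a single compiled java.util.regex.Pattern.
final class CombinedUrlFilter {
    private final Pattern combined;

    CombinedUrlFilter(List<String> expressions) {
        // Wrap each expression in a non-capturing group so that the '|'
        // separates whole expressions rather than fragments of them.
        String joined = expressions.stream()
                .map(e -> "(?:" + e + ")")
                .collect(Collectors.joining("|"));
        this.combined = Pattern.compile(joined);
    }

    // True if the whole URL matches at least one of the expressions.
    boolean accepts(String url) {
        return combined.matcher(url).matches();
    }
}
```

A filter built from the two expressions `http://.*\.org/.*` and `.*\.html` would accept a URL that matches either one. The backtracking engine still tries the alternatives one after another, which is where a single combined automaton should have the advantage as the number of expressions grows.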

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
