FWIW, I found a regexp benchmark at http://tusker.org/regex/regex_benchmark.html

The times I got on my laptop were:
org.apache.regexp.*                        14078
java.util.regex.Pattern                      656
jregex.Pattern                              1000
org.apache.oro.text.regex.Perl5Matcher      1891
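
For anyone who wants a rough sanity check without downloading the benchmark, here is a minimal sketch of a timing loop. It is not the code from the benchmark page: the class name RegexTiming, the sample pattern, the inputs, and the iteration count are all made up for illustration, and it only exercises java.util.regex, so the numbers it prints are not comparable to the ones above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical harness, not the tusker.org benchmark itself.
public class RegexTiming {

    // Counts how many of the inputs contain at least one match.
    static int countMatches(Pattern pattern, String[] inputs) {
        int hits = 0;
        for (String input : inputs) {
            Matcher m = pattern.matcher(input);
            if (m.find()) {
                hits++;
            }
        }
        return hits;
    }

    // Runs the match loop `iterations` times and returns elapsed milliseconds.
    static long timeMillis(Pattern pattern, String[] inputs, int iterations) {
        long start = System.nanoTime();
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += countMatches(pattern, inputs);
        }
        long elapsed = (System.nanoTime() - start) / 1000000L;
        // Use the accumulated value so the loop cannot be optimized away.
        if (sink < 0) {
            throw new IllegalStateException("unexpected negative match count");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Sample pattern and inputs are assumptions chosen for illustration.
        Pattern pattern = Pattern.compile("href=\"([^\"]*)\"");
        String[] inputs = {
            "<a href=\"index.html\">home</a>",
            "<p>no link here</p>",
            "<a href=\"a?b=c&amp;d=e\">query</a>"
        };
        long ms = timeMillis(pattern, inputs, 100000);
        System.out.println("java.util.regex.Pattern " + ms);
    }
}
```

A single run like this includes JIT warm-up, so treat the result as a ballpark figure only. Comparing against org.apache.regexp would need an equivalent loop written against that library's own API.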

Ralph


Ralph Goers wrote:

I found this bug report for regexp with many duplicates. Apparently it is pretty popular. I guess Vadim is aware of this. :-) The bug is closed as WONTFIX.

http://issues.apache.org/bugzilla/show_bug.cgi?id=764

Ralph Goers wrote:

I cannot believe this didn't get a stack overflow exception. We just happened to request a stack trace of a test system at this time.

I also find myself wondering if EncodeURLTransformer shouldn't be changed somehow.

"http-8088-Processor20" daemon prio=1 tid=0x8a3b8408 nid=0xf55 runnable [8ad00000..8ad0f8c8]
   at org.apache.regexp.RE.matchNodes(Unknown Source)
   at org.apache.regexp.RE.matchNodes(Unknown Source)