I found this bug report against regexp, which has many duplicates, so the problem is apparently pretty common. I guess Vadim is aware of it. :-) The bug is closed as WONTFIX.

http://issues.apache.org/bugzilla/show_bug.cgi?id=764

Ralph Goers wrote:

I cannot believe this didn't throw a stack overflow exception. We just happened to request a stack trace from a test system at the time.

I also find myself wondering if EncodeURLTransformer shouldn't be changed somehow.

"http-8088-Processor20" daemon prio=1 tid=0x8a3b8408 nid=0xf55 runnable [8ad00000..8ad0f8c8]
   at org.apache.regexp.RE.matchNodes(Unknown Source)
   at org.apache.regexp.RE.matchNodes(Unknown Source)