Re: Perl5Util performance
Thanks Daniel. Sounds like I should be moving to java.util.regex. I do like the convenience of the pattern caching but I guess it's easy enough to set that up myself for java.util.regex.

Duke

On 3/29/06, Daniel F. Savarese [EMAIL PROTECTED] wrote:

> In message [EMAIL PROTECTED], Duke Tantiprasut writes:
> > I'm curious why there is such a significant jump from the Perl5Matcher compared to the java.util.regex?
>
> A hefty chunk of that time comes from converting strings to char[] before matching. I've tuned that benchmark before and trimmed 25% of the time just by using PatternMatcherInput instead of String. It's not exactly a rigorous benchmark anyway. Measurements I've made in the past show that the performance of the packages depends heavily on the input and how the regular expressions are written. Two equivalent regular expressions can have very different performance characteristics.
>
> That said, ORO is behind the times on performance, having been designed originally to get the most out of JDK 1.0.2. A question that bears revisiting is if Perl5Matcher needs to bother converting to char[] anymore. In JDK 1.0.2 and 1.1 days it was a big performance win, but unless you're working with your input as char[] from the start, I bet these days it would be faster to not make the conversion and work directly with String (or CharSequence) if we're willing to abandon JDK 1.2/1.3 compatibility. But now that there's a java.util.regex, the primary reason to use ORO appears to be if you're still on 1.2/1.3...
>
> In response to the email Subject, Perl5Util is a convenience class and will always be slower than using Perl5Matcher directly because Perl5Util has to parse the native Perl-style representation of expressions :(
>
> daniel
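The do-it-yourself pattern caching for java.util.regex mentioned in this reply could look something like the following. This is only a minimal sketch: the class name RegexCache and its get method are hypothetical helpers, not part of ORO or the JDK, and the cache is unbounded for simplicity. It also assumes Java 8 or later for computeIfAbsent.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an ORO or JDK class): caches compiled
// java.util.regex patterns keyed by their source expression.
final class RegexCache {
    // Unbounded for simplicity; a long-running application would
    // probably want a size limit or an eviction policy.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    private RegexCache() {}

    // Compiles the expression on first use and returns the same
    // compiled Pattern instance on later calls.
    static Pattern get(String regex) {
        return CACHE.computeIfAbsent(regex, Pattern::compile);
    }

    public static void main(String[] args) {
        Matcher m = get("(\\w+)@(\\w+)\\.org").matcher("duke@example.org");
        if (m.find()) {
            // Capture groups are available as usual.
            System.out.println(m.group(1) + " at " + m.group(2));
        }
        // The second lookup reuses the cached instance.
        System.out.println(get("a+b") == get("a+b"));
    }
}
```

Pattern instances are immutable and safe to share between threads, which is why only the Pattern is cached here and a fresh Matcher is created for each input.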
Perl5Util performance
Hi All,

Below is an interesting benchmark result comparing a number of regex engines.

http://tusker.org/regex/regex_benchmark.html

I'm curious why there is such a significant jump from the Perl5Matcher compared to the java.util.regex? The DFA-based engines such as JREXX look really fast, but I'm not sure if they allow you to get the matched group results.

Duke
Re: Perl5Util performance
In message [EMAIL PROTECTED], Duke Tantiprasut writes:
> I'm curious why there is such a significant jump from the Perl5Matcher compared to the java.util.regex?

A hefty chunk of that time comes from converting strings to char[] before matching. I've tuned that benchmark before and trimmed 25% of the time just by using PatternMatcherInput instead of String. It's not exactly a rigorous benchmark anyway. Measurements I've made in the past show that the performance of the packages depends heavily on the input and how the regular expressions are written. Two equivalent regular expressions can have very different performance characteristics.

That said, ORO is behind the times on performance, having been designed originally to get the most out of JDK 1.0.2. A question that bears revisiting is if Perl5Matcher needs to bother converting to char[] anymore. In JDK 1.0.2 and 1.1 days it was a big performance win, but unless you're working with your input as char[] from the start, I bet these days it would be faster to not make the conversion and work directly with String (or CharSequence) if we're willing to abandon JDK 1.2/1.3 compatibility. But now that there's a java.util.regex, the primary reason to use ORO appears to be if you're still on 1.2/1.3...

In response to the email Subject, Perl5Util is a convenience class and will always be slower than using Perl5Matcher directly because Perl5Util has to parse the native Perl-style representation of expressions :(

daniel
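For comparison, java.util.regex already takes the "work directly with String (or CharSequence)" approach: Pattern.matcher accepts any CharSequence. The sketch below illustrates that with the JDK API only; it says nothing about how ORO is or would be implemented. The class name CharSequenceMatch, the countMatches method, and the sample expression are made up for illustration.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: matches the same expression against several
// CharSequence implementations without converting between them.
final class CharSequenceMatch {
    // Sample expression, chosen only for this illustration.
    private static final Pattern KEY_VALUE = Pattern.compile("([a-z]+)=(\\d+)");

    private CharSequenceMatch() {}

    // Counts the matches in any CharSequence; this method performs no
    // String or char[] conversion of its own.
    static int countMatches(CharSequence input) {
        Matcher m = KEY_VALUE.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "a=1 bb=22 ccc=333";
        char[] chars = text.toCharArray();

        // A String is matched as-is.
        System.out.println(countMatches(text));
        // Input that starts life as char[] can be wrapped, not copied.
        System.out.println(countMatches(CharBuffer.wrap(chars)));
        // Any other CharSequence works the same way.
        System.out.println(countMatches(new StringBuilder(text)));
    }
}
```

Whether this is actually faster than matching over a char[] depends on the input and the expression, as noted above, so it is worth measuring rather than assuming.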