Sorry, I should have written to the list. I also just want to say that I agree with Andreas: in Java we use regexp if everything else fails (:-))
Regards,
P.

---------- Forwarded message ----------
From: P. Troshin <[email protected]>
Date: 22 October 2012 21:44
Subject: Re: [Biojava-l] regex performance in Java
To: Hilmar Lapp <[email protected]>

Hi Hilmar,

I think this is one of the myths; I do not think there is a difference. It might have been true long ago, but I do not think this is still the case. Last time we compared Perl, Python and Java performance, the former (Perl) was last by a large margin :-). However, I never had to make a direct comparison of regexp performance.

Googling for "perl vs java regexp speed comparison" brings up a few links. I had a quick look at one result only (http://onlyjob.blogspot.co.uk/2011/03/perl5-python-ruby-php-c-c-lua-tcl.html), which claimed that Perl regexp is faster than Java. Unfortunately the author of the test clearly lacked understanding of Java, and as a result the test compared the performance of String concatenation (which is notoriously bad in Java, as Strings are immutable) rather than the regexp performance itself. I guess this is an easy mistake to make though. Hence the advice - if you are doing a lot of String manipulation, use the StringBuilder class, not the String itself.

If you have a Java implementation which is lacking, I am sure people on this list will have no problem optimizing it!

Regards,
Peter

On 22 October 2012 15:52, Hilmar Lapp <[email protected]> wrote:
> I know that this is really a Java language topic, but since parsing biological
> data formats is so rife with regular expression applications, I'm curious
> what the experience is among the Biojava people with the use of regular
> expressions in Java.
>
> They (at least as in java.util.regex) have been reported to me as performing
> much slower (by several orders of magnitude) than the regex implementation in
> Perl, and some simple benchmarking tests seem to bear that out. Even after
> scrutinizing the benchmark and finding nothing obvious, I'm still skeptical
> as to why this would be the case - naively I would have assumed that the
> underlying runtime library is implemented in C in both cases. But perhaps
> this is not true?
>
> Any experience people have had here speed-wise (or tricks or things not to
> do for Java regexes) would be appreciated.
>
> -hilmar
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
> ===========================================================
>
> _______________________________________________
> Biojava-l mailing list - [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l
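
P.S. To make the StringBuilder advice in the message above concrete, here is a minimal sketch. It is not taken from the benchmark discussed in the thread: the class name, the method names and the FASTA-style header pattern are all hypothetical, chosen only for illustration. Both methods run the same precompiled regexp over the same lines; the only difference is how the result is accumulated, which is the part a naive benchmark can end up measuring instead of the regexp itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; none of these names come from BioJava.
final class RegexConcatSketch {

    // Compile the pattern once and reuse it; recompiling inside a loop
    // adds avoidable cost. Assumed pattern: the identifier that follows
    // '>' at the start of a FASTA-style header line.
    private static final Pattern HEADER_ID = Pattern.compile("^>(\\S+)");

    // Slow accumulation: Strings are immutable, so every += allocates a
    // new String and copies everything collected so far.
    static String collectIdsWithString(String[] lines) {
        String out = "";
        for (String line : lines) {
            Matcher m = HEADER_ID.matcher(line);
            if (m.find()) {
                out += m.group(1) + "\n";
            }
        }
        return out;
    }

    // Same regexp work, but the result is appended to a growable buffer
    // and converted to a String only once at the end.
    static String collectIdsWithBuilder(String[] lines) {
        StringBuilder out = new StringBuilder();
        for (String line : lines) {
            Matcher m = HEADER_ID.matcher(line);
            if (m.find()) {
                out.append(m.group(1)).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] lines = { ">seq1 human", "ACGT", ">seq2 mouse", "TTGA" };
        System.out.print(collectIdsWithBuilder(lines));
    }
}
```

Both methods return the same text, so when timing regexp code it is worth keeping the accumulation strategy identical (or out of the timed region altogether), so that any difference measured really comes from the matching.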
