Sorry, I should have written to the list. I also just want to say that I
agree with Andreas: in Java we use regexps only if everything else fails
(:-))

Regards,
P.



---------- Forwarded message ----------
From: P. Troshin <[email protected]>
Date: 22 October 2012 21:44
Subject: Re: [Biojava-l] regex performance in Java
To: Hilmar Lapp <[email protected]>


Hi Hilmar,

I think this is one of the myths; I do not think there is a
difference. It might have been true long ago, but I do not think this
is still the case. The last time we compared Perl, Python and Java
performance, Perl came last by a large margin :-). However,
I have never had to make a direct comparison of regexp performance.
Googling for "perl vs java regexp speed comparison" brings up a few
links. I had a quick look at one result only
(http://onlyjob.blogspot.co.uk/2011/03/perl5-python-ruby-php-c-c-lua-tcl.html),
which claimed that Perl regexps are faster than Java's. Unfortunately the
author of the test clearly lacked an understanding of Java, and as a
result the test compared the performance of String concatenation
(which is notoriously slow in Java, as Strings are immutable) rather
than the regexp performance itself. I guess this is an easy mistake to
make, though. Hence the advice: if you are doing a lot of String
manipulation, use the StringBuilder class, not String itself.
If you have a Java implementation which is lacking, I am sure people on
this list will have no problem optimizing it!
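
To illustrate the point about concatenation, here is a minimal sketch
(not taken from the benchmark linked above; the class name, method
names and the sample input are made up for illustration). Both methods
run the same precompiled regexp; the only difference is how the matches
are accumulated, so any timing gap between them would come from String
handling and not from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not part of BioJava.
class RegexConcatSketch {

    // Compile the pattern once and reuse it; recompiling inside a
    // loop adds avoidable overhead.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Slow variant: Strings are immutable, so every += copies
    // everything accumulated so far into a new String.
    static String collectWithString(String input) {
        String out = "";
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            out += m.group() + ";";
        }
        return out;
    }

    // Preferred variant: StringBuilder appends to a growable buffer.
    static String collectWithBuilder(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            out.append(m.group()).append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Both calls produce the same text for the same input.
        System.out.println(collectWithString("chr1:100-200"));
        System.out.println(collectWithBuilder("chr1:100-200"));
    }
}
```

On small inputs the two variants will be indistinguishable; the copying
cost of the first one should only show up when many matches are
accumulated, which is typically what a benchmark loop does.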

Regards,
Peter



On 22 October 2012 15:52, Hilmar Lapp <[email protected]> wrote:
> I know that this is really a Java language topic, but since parsing biological
> data formats is so rife with regular expression applications, I'm curious
> what the experience is among the Biojava people with the use of regular
> expressions in Java.
>
> They (at least as in java.util.regex) have been reported to me as performing 
> much slower (by several orders of magnitude) than the regex implementation in 
> Perl, and some simple benchmarking tests seem to bear that out. Even after 
> scrutinizing the benchmark and finding nothing obvious, I'm still skeptical 
> as to why this would be the case - naively I would have assumed that the 
> underlying runtime library is implemented in C in both cases. But perhaps 
> this is not true?
>
> Any experience people have had here speed-wise (or tricks or things not to
> do for Java regexes) would be appreciated.
>
>         -hilmar
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
> ===========================================================
>
> _______________________________________________
> Biojava-l mailing list  -  [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l

