Hi - It's often hard to compare a Perl library to BioJava without knowing what the Perl library does. BioJava does a reasonable amount of checking (that the symbols used match the alphabet, etc.) and does most of its work on Symbols as objects; the Perl library probably handles everything as strings.
You can cut down on overhead if you only want a particular part of the sequence. Matthew and I were just discussing how a custom listener for a particular field in a file can perform as fast as grep. If you are only interested in the sequence information, for example, you could ignore all the rest; by default it gets processed and stored as annotations and features of the object.

- Mark

> -----Original Message-----
> From: David P Dean [mailto:deandp@;groton.pfizer.com]
> Sent: Wednesday, 30 October 2002 10:02 a.m.
> To: [EMAIL PROTECTED]
> Subject: [Biojava-l] readGenbank performance
>
>
> Hi,
> I'm new to BioJava and am very keen to learn more about it.
> I've got a routine to read some Genbank sequences and do
> stuff and that works fine. But I'm surprised it doesn't run
> faster. A basic read loop like:
>
> sit = SeqIOTools.readGenbank(br);
> while( sit.hasNext() ) {
> Sequence entry = sit.nextSequence();
>
> takes about 90 seconds to read 10,000 Genbank EST entries on
> my Sparc Ultra 10. A comparable perl library I have that
> iterates over the set and parses all the records takes about
> half the time. Is this expected, or any suggestions?
>
> I have downloaded and built biojava-live and am game to tweak
> things. Is there any kind of profiling tool that would show
> where the time is going? Also, I am using an older Solaris
> JVM, 1.3.0. Could this be a factor?
>
> Thanks!
> David Dean
> ----
> Count your blessing.
>
> _______________________________________________
> Biojava-l mailing list - [EMAIL PROTECTED]
> http://biojava.org/mailman/listinfo/biojava-l
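
A minimal sketch of the custom-listener idea discussed in the reply, written in plain Java with no BioJava dependency. Every name in it (RecordListener, RecordAdapter, SequenceOnlyListener, ListenerSketch) is hypothetical and is not BioJava's API; BioJava's own listener interface reports more kinds of events. The toy reader assumes each record starts with a LOCUS line, the residues follow an ORIGIN line, and the record ends with "//". The point is only that a listener which ignores header and feature events never pays for storing them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical callback interface (not BioJava's): the reader reports events
// and the listener decides what is worth keeping.
interface RecordListener {
    void startRecord();
    void headerLine(String line);    // every line before ORIGIN
    void sequenceLine(String line);  // raw lines of the ORIGIN block
    void endRecord();
}

// No-op base class, so a listener overrides only the events it cares about.
class RecordAdapter implements RecordListener {
    public void startRecord() {}
    public void headerLine(String line) {}
    public void sequenceLine(String line) {}
    public void endRecord() {}
}

// Keeps only the residues; header and feature lines are never stored.
class SequenceOnlyListener extends RecordAdapter {
    private final StringBuilder current = new StringBuilder();
    int records = 0;     // number of completed records
    long residues = 0;   // total residues over all records
    String last = "";    // residues of the most recent record

    @Override public void startRecord() { current.setLength(0); }

    @Override public void sequenceLine(String line) {
        // ORIGIN lines look like "        1 acgtac gtacgt": keep letters only.
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (Character.isLetter(c)) current.append(c);
        }
    }

    @Override public void endRecord() {
        records++;
        residues += current.length();
        last = current.toString();
    }
}

class ListenerSketch {
    // Toy line-based reader for GenBank-style flat files (assumed layout only).
    static void parse(BufferedReader in, RecordListener listener) throws IOException {
        boolean inRecord = false;
        boolean inOrigin = false;
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith("//")) {
                if (inRecord) listener.endRecord();
                inRecord = false;
                inOrigin = false;
            } else if (line.startsWith("LOCUS")) {
                inRecord = true;
                listener.startRecord();
                listener.headerLine(line);
            } else if (inRecord && line.startsWith("ORIGIN")) {
                inOrigin = true;
            } else if (inOrigin) {
                listener.sequenceLine(line);
            } else if (inRecord) {
                listener.headerLine(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up two-line record, just enough to exercise the reader.
        String sample = "LOCUS       EXAMPLE\n"
                      + "DEFINITION  made-up record\n"
                      + "ORIGIN\n"
                      + "        1 acgtac gtacgt\n"
                      + "//\n";
        SequenceOnlyListener listener = new SequenceOnlyListener();
        parse(new BufferedReader(new StringReader(sample)), listener);
        System.out.println(listener.records + " record(s), "
                + listener.residues + " residues");
    }
}
```

Timing a listener like this against the full read loop on the same file would give a rough idea of how much of the 90 seconds goes into building annotations and features rather than reading residues.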
