Andy -
 
String manipulation is notoriously slow in Java (due to the Security model, which 
needs Strings to be immutable), so every manipulation requires a new String. If you are 
dealing with a small alphabet and are going to be treating each Symbol as a String, 
then it may pay to create one of each String ("a", "c", etc.) and intern them using the 
String.intern() method.
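 
As a rough sketch of that idea (the class name SymbolStrings, the method of() and the 
"acgt" alphabet are made up for illustration and are not part of BioJava), you could 
build the one-character Strings once and hand out the same instances every time:

```java
import java.util.HashMap;
import java.util.Map;

public class SymbolStrings {
    // Hypothetical small alphabet; substitute the tokens you actually use.
    private static final String ALPHABET = "acgt";

    // One canonical String per symbol, created once when the class loads.
    private static final Map<Character, String> CACHE = new HashMap<>();

    static {
        for (char c : ALPHABET.toCharArray()) {
            // intern() returns the canonical instance from the JVM string pool.
            CACHE.put(c, String.valueOf(c).intern());
        }
    }

    // Returns the shared String for a known symbol; characters outside
    // the alphabet fall back to a freshly created String.
    public static String of(char c) {
        String s = CACHE.get(c);
        return (s != null) ? s : String.valueOf(c);
    }

    public static void main(String[] args) {
        String first = of('a');
        String second = of('a');
        // Both lookups return the same cached object, so no new String
        // is allocated per symbol.
        System.out.println(first == second);
    }
}
```

Whether this buys anything measurable depends on how often the lookup runs, so it is 
worth timing before and after.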
 
I'm not sure if it will improve performance, but you might be able to avoid using 
Strings and chars at all by using Edits and cross product alphabets.
 
See the alphabets and symbols section of http://www.biojava.org/docs/bj_in_anger/ and 
also http://www.biojava.org/docs/bj_in_anger/edit.htm
 
Hope this helps,
 
Mark

        -----Original Message----- 
        From: Andy Hammer [mailto:[EMAIL PROTECTED] 
        Sent: Sat 13/09/2003 12:05 p.m. 
        To: bio java 
        Cc: 
        Subject: [Biojava-l] Performance of SymbolList.subStr() function
        
        

        This blows my mind!
        In both blocks of code:  seq = si.nextSequence();
        //just a protein sequence
        
        In this block of code, my system would take hours and
        often crash with an out of memory error.
        
             for(int i = 1; i <= seqLength; i++){
                String seqString = seq.subStr(i,i);
                char protein = seqString.charAt(0);
                // add a 3 char string to the newSeq StringBuffer
                newSeq = newSeq.append(newCodon(protein));
             }
        
        I altered the above block to:
        
            String seqString = seq.seqString();
            for(int i = 0; i < seqLength; i++){
              char protein = seqString.charAt(i);
              newSeq = newSeq.append(newCodon(protein));
            }
        
        This program used to take 4 hours to complete.  With
        this simple change it is now done in 30 minutes!
        Obviously, the SymbolList.subStr() function was
        severely hampering my efficiency.  I guess it was poor
        use of the SymbolList.subStr() function.  My rationale
        was that I didn't want to create an unneeded String
        simply to represent my already existing SymbolList.  I
        am just trying to learn from this to become a better
        programmer and would appreciate any comments on this.
        
        Thanks!
        
        _______________________________________________
        Biojava-l mailing list  -  [EMAIL PROTECTED]
        http://biojava.org/mailman/listinfo/biojava-l
        


