Once upon a time, Karin Lagesen wrote:

> I am using the SeqIOTools.readFastaDNA() method to get hold of sequences
> which are stored in a file. These are, as far as I can tell, in the
> correct FASTA format. However, whenever the FASTA description line
> contains a parenthesis, like this for instance:
>
> >(gi|16127994:1080570-1080686, 1080677-1081408)
>
> this sequence does not get read. Is this a bug or is it a feature? And
> if it is a feature, could somebody tell me how to work around it?

What errors are you seeing? Or is the sequence just disappearing
completely?

Reading FASTA files containing parentheses works for me. The one caveat
is that BioJava determines the name of the sequence from the text between
the '>' and the first ' ' character. So in this case, BioJava will, by
default, name your sequence "(gi|16127994:1080570-1080686,", which might
not be what you want.

Where did you get this file? Is this another special case of FASTA format
that BioJava ought to understand?

In the meantime, you can get at the complete description line of a
sequence using code like:

    SequenceIterator si = SeqIOTools.readFastaDNA(...);
    while (si.hasNext()) {
        Sequence seq = si.nextSequence();
        // The full description line is stored as an annotation property.
        System.out.println(seq.getAnnotation().getProperty(
            FastaFormat.PROPERTY_DESCRIPTIONLINE
        ));
    }

Thomas.
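
For illustration, the naming rule described above (the name is the text between the '>' and the first space) can be sketched in plain Java with no BioJava dependency. The class FastaHeader and the method nameOf are hypothetical names made up for this example. They are not part of BioJava, and the real parser may differ in its details.

```java
// Illustrative only: mimics the naming rule described in this message.
// FastaHeader and nameOf are hypothetical, not BioJava classes or methods.
final class FastaHeader {
    private FastaHeader() {}

    /** Returns the text between the leading '>' and the first space. */
    static String nameOf(String descriptionLine) {
        // Drop the leading '>' if it is present.
        String body = descriptionLine.startsWith(">")
                ? descriptionLine.substring(1)
                : descriptionLine;
        // Everything up to the first space is treated as the name.
        int firstSpace = body.indexOf(' ');
        return firstSpace < 0 ? body : body.substring(0, firstSpace);
    }

    public static void main(String[] args) {
        String header = ">(gi|16127994:1080570-1080686, 1080677-1081408)";
        // prints "(gi|16127994:1080570-1080686,"
        System.out.println(nameOf(header));
    }
}
```

The space after the comma is what cuts the name short, so a header written without that space would keep both ranges in the name under this rule.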