It should parse everything up to the first space as the unique id. Lots of extra info gets added to the header. You should find a getOriginalHeader method that preserves the contents of the header. I use this when writing the sequences back to disk to restore the original header.
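To make the idea concrete, here is a minimal stand-alone sketch of that behavior: take everything before the first space as the id and keep the untouched header alongside it. This is not the BioJava source. The class HeaderSketch, the ParsedHeader type, and the parse method are hypothetical names made up for illustration, and the sample header is invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from BioJava.
final class HeaderSketch {

    /** Holds the short id together with the unmodified header text. */
    static final class ParsedHeader {
        final String id;
        final String originalHeader;

        ParsedHeader(String id, String originalHeader) {
            this.id = id;
            this.originalHeader = originalHeader;
        }
    }

    /** Uses the text before the first space as the id and keeps the full header. */
    static ParsedHeader parse(String headerLine) {
        // Drop the leading '>' of a FASTA header line if it is present.
        String header = headerLine.startsWith(">") ? headerLine.substring(1) : headerLine;
        int space = header.indexOf(' ');
        // With no space in the header, the whole header is the id.
        String id = space < 0 ? header : header.substring(0, space);
        return new ParsedHeader(id, header);
    }

    public static void main(String[] args) {
        // Key the map by the short id; the full header stays available for writing back out.
        Map<String, ParsedHeader> byId = new LinkedHashMap<>();
        ParsedHeader p = parse(">read1 length=120 xy=3"); // invented sample header
        byId.put(p.id, p);
        System.out.println(p.id);
        System.out.println(byId.get("read1").originalHeader);
    }
}
```

If you need the full header as the map key instead, the same sketch works by keying on originalHeader rather than id.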
You can also write your own custom header parser, which is how we support the different known FASTA headers. If you have extra information in the header, you can formally associate it with the sequence at parse time. We can also add support for your header if it is standard output from a device.

Thanks

Scooter

----- Reply message -----
From: "Hannes Brandstätter-Müller" <[email protected]>
To: "biojava-l" <[email protected]>
Subject: [Biojava-l] FASTA Header Parser
Date: Wed, Jan 11, 2012 9:30 am

Hi there -

I just came across a puzzling "feature" of the GenericFastaHeaderParser. It seems to throw away everything in the header after (and including) "length=" (see GenericFastaHeaderParser.java lines 71-76) ... Why?

Also, is there a FASTA header parser I can use that does not mess about with the header? I would really like to have the full header as the key (still working on my FASTA/QUAL parsing), and having it only in the originalHeader, not as the HashMap key, really breaks stuff.

Hannes

_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
