Nope, the whole header ends up in the HashMap key, except for everything after length=. The whitespace before that is still left in the header that is used as the key.
Either make it work like you say or, even better, leave the header as-is. I need to find the sequence quickly; I don't want to iterate over all my 35k sequences and look up the original headers.

Hannes

On Wed, Jan 11, 2012 at 15:38, Scooter Willis <[email protected]> wrote:
> It should parse until the first space as the unique id. Lots of extra info
> gets added in to the header. You should find a getOriginalHeader method that
> will preserve the contents of the header. I use this when writing the
> sequences back to disk to restore the original header.
>
> You can also do your own custom header parser which we use to support the
> known different fasta headers. If you have extra information in the header
> you can formally associate that with the sequence at the time of the parse.
> We can also add support for your header if it is standard output from a
> device.
>
> Thanks
>
> Scooter
>
>
> ----- Reply message -----
> From: "Hannes Brandstätter-Müller" <[email protected]>
> To: "biojava-l" <[email protected]>
> Subject: [Biojava-l] FASTA Header Parser
> Date: Wed, Jan 11, 2012 9:30 am
>
>
>
> Hi there -
>
> I just came across a puzzling "feature" of the GenericFastaHeaderParser.
> It seems to throw away everything in the header after (and including)
> "length="
> (see GenericFastaHeaderParser.java lines 71-76)
>
> ... Why?
>
> Also, is there a Fasta Header Parser I can use that does not mess
> about with the header?
>
> I really would like to have that as key (still working on my
> FASTA/QUAL parsing) and not having that (only in the originalHeader,
> not in the Hashmap key) really breaks stuff.
>
> Hannes
> _______________________________________________
> Biojava-l mailing list - [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l
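
For illustration only, here is a minimal sketch in plain Java of the behaviour asked for at the top of this thread: index FASTA records by the unmodified header line so a sequence can be looked up directly. It deliberately does not use any BioJava classes. The class name `FullHeaderFastaIndex` and the methods `parse` and `firstToken` are hypothetical names invented for this example, and the "first token" helper only mirrors the "parse until the first space" rule described in the quoted reply, not the actual GenericFastaHeaderParser code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of BioJava): indexes FASTA records by their full header. */
class FullHeaderFastaIndex {

    /**
     * Reads FASTA text and returns a map of header -> sequence.
     * The key is the header line exactly as written, minus the leading '>'.
     */
    static Map<String, String> parse(Reader in) {
        Map<String, String> index = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String header = null;
            StringBuilder seq = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(">")) {
                    // A new record starts: store the previous one, if any.
                    if (header != null) {
                        index.put(header, seq.toString());
                    }
                    header = line.substring(1); // keep everything, including "length=..."
                    seq.setLength(0);
                } else if (header != null) {
                    seq.append(line.trim()); // join wrapped sequence lines
                }
            }
            if (header != null) {
                index.put(header, seq.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return index;
    }

    /** Returns the header text up to the first whitespace, i.e. a short unique id. */
    static String firstToken(String header) {
        return header.trim().split("\\s+", 2)[0];
    }
}
```

With an index like this, a lookup by the original header is a single `index.get(header)` call instead of a loop over all 35k sequences; the short id is still available through `firstToken` when needed.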
