On Wed, 25 Jun 2003, Russell Smithies wrote:

> Looks good but doesn't do what I need but I don't think it was ever going to
> :-(
>
> The blast XML data has loads of info in it (I guess thats the reason for the
> format) but I want to be able to get at individual tags, not just hits. For
> example, some of the stats data (Statistics_entropy, Statistics_eff-space
> etc.) or other hit data (Hsp_align-len, Hsp_pattern-from etc.) instead of
> just hitID and e-value might be useful?
> I guess I'll have to implement some new bits (from
> SimpleSeqSimilaritySearchSubHit?) but not exactly sure where.
>

Ah, OK. I have picked up most but not all the fields.
Hsp_align-len is picked up and placed in an alignmentSize attribute. The others are not, but it should not be difficult to parse them and add them to the SAX output stream. If a suitable fit with BlastLikeDataSetCollection.dtd can be achieved, it should be possible to map them over readily. If not, we will have to extend the DTD appropriately without breakage.

However, not all the data can be mapped to the SeqSimilarity stuff, so you may have to place a listener to handle those fields yourself.

I don't see Hsp_pattern-from in my XML output. Do you have an output file with it? This parser was written by reverse engineering the semantics from the output ;-). I seem to recall that the semantics of orientation were weird.

Regards,
David
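
For the fields that do not map onto the SeqSimilarity objects, a listener along the following lines might be enough. This is only a rough sketch under stated assumptions: it reads the raw BLAST XML with the JDK's own SAX parser rather than hooking into the BioJava parser's event stream, and the class name BlastTagCollector and its collect method are made up for illustration, not part of BioJava. It assumes the tags of interest (Statistics_entropy, Hsp_align-len and so on) are leaf elements holding plain text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not a BioJava class): collects the text of chosen leaf tags. */
final class BlastTagCollector extends DefaultHandler {
    private final Set<String> wanted;
    // Tag name -> every value seen for that tag, in document order.
    private final Map<String, List<String>> values = new LinkedHashMap<>();
    // Non-null only while the parser is inside one of the wanted elements.
    private StringBuilder text;

    BlastTagCollector(String... tagNames) {
        wanted = new HashSet<>(Arrays.asList(tagNames));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default parser is not namespace-aware, so qName carries the tag name.
        text = wanted.contains(qName) ? new StringBuilder() : null;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // characters() may be called several times per element, so accumulate.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text != null && wanted.contains(qName)) {
            values.computeIfAbsent(qName, k -> new ArrayList<>()).add(text.toString().trim());
        }
        text = null;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Hand back an empty source instead of fetching any external DTD.
        return new InputSource(new StringReader(""));
    }

    /** Parses the XML text and returns the collected values keyed by tag name. */
    static Map<String, List<String>> collect(String xml, String... tagNames)
            throws ParserConfigurationException, SAXException, IOException {
        BlastTagCollector handler = new BlastTagCollector(tagNames);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

Calling BlastTagCollector.collect(xml, "Statistics_entropy", "Hsp_align-len") would then give a map from each tag name to the list of values found. Tags that never occur in the file, as Hsp_pattern-from may not, are simply absent from the map.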