Looks good, but it doesn't do what I need, and I don't think it was ever going to :-(
The blast XML data has loads of info in it (I guess that's the reason for the format), but I want to be able to get at individual tags, not just hits. For example, some of the stats data (Statistics_entropy, Statistics_eff-space etc.) or other hit data (Hsp_align-len, Hsp_pattern-from etc.) might be useful, instead of just hitID and e-value.
I guess I'll have to implement some new bits (from SimpleSeqSimilaritySearchSubHit?) but I'm not exactly sure where. Any ideas?

Thanx,
Russell

----- Original Message -----
From: "David Huen" <[EMAIL PROTECTED]>
To: "Russell Smithies" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED] Org" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, June 25, 2003 2:28 PM
Subject: Re: [Biojava-l] SAX parser demo

> Hi,
> OK, I have uploaded a demo to CVS. It is at biojava-live/demos/blastxml.
> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger
> ported to use the BlastXML parser. You will need to do a "cvs update -d"
> to create the new directories for the demos and for the DTD directory.
>
> I have added a facade to the BlastXML parsing framework. The facade is
> called BlastXMLParserFacade and is used identically to the way the existing
> BlastLikeSAXParser is used with blast text output. I think this will make
> it easier for users all round: that both have the same interface. You can
> look in that class to see how the BJ parsing framework is actually set up.
>
> I won't have more time available to work on this for a bit but bug reports
> are welcome for eventual fixes. As previously mentioned, running multiple
> sequence queries on a database with NCBI blast results in the concatenation
> of all the Blast XML outputs resulting in an almighty completely non-XML
> compliant file (multiple <xml> and <DOCTYPE> elements for example).
> Parsing those requires a hack I have previously described but it is ugly,
> ugly, ugly. Maybe the latest NCBI version might have fixed this problem
> but I haven't looked.
>
> Best wishes,
> David Huen
> P.S. It is really really bedtime, guys.....
> P.P.S There is an ugly entity resolver hack I will need to clean up later
> too.
>
_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
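
For getting at individual tags such as Statistics_entropy or Hsp_align-len, one possible stopgap is to bypass the BioJava framework and read the BLAST XML with a plain JDK SAX handler. The sketch below is only that: BlastTagCollector and its collect method are hypothetical names, not part of BioJava, and the empty-DTD entity resolver is an assumption made here so the parser does not go looking for NCBI_BlastOutput.dtd (it is not the resolver hack mentioned in the quoted message). It also assumes a single well-formed XML document, so concatenated multi-query output would still need splitting first.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not a BioJava class): collects the text of chosen leaf tags. */
class BlastTagCollector extends DefaultHandler {
    private final Set<String> wanted;
    private final Map<String, List<String>> values = new LinkedHashMap<>();
    private StringBuilder text; // non-null only while inside a wanted element

    BlastTagCollector(Set<String> wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Assumes wanted tags are leaf elements (text only), as the Hsp_* and Statistics_* tags are.
        if (wanted.contains(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so accumulate.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text != null && wanted.contains(qName)) {
            values.computeIfAbsent(qName, k -> new ArrayList<>()).add(text.toString().trim());
            text = null;
        }
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Assumption: hand back an empty DTD instead of fetching the one named in the DOCTYPE.
        return new InputSource(new StringReader(""));
    }

    /** Parses one XML document and returns tag name -> values in document order. */
    static Map<String, List<String>> collect(Reader xml, Set<String> wanted)
            throws ParserConfigurationException, SAXException, IOException {
        BlastTagCollector handler = new BlastTagCollector(wanted);
        SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(xml), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java BlastTagCollector blast_output.xml
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            Map<String, List<String>> found =
                    collect(in, Set.of("Statistics_entropy", "Statistics_eff-space", "Hsp_align-len"));
            found.forEach((tag, vals) -> System.out.println(tag + " = " + vals));
        }
    }
}
```

This keeps every occurrence of a tag in document order but does not tie a value back to its hit or HSP. Doing that would mean tracking the enclosing Hit and Hsp elements as well, which is probably where extending the BioJava search handlers becomes the better route.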