Hi -

It depends on how closely you want to bind it to biojava. If you don't need biojava objects, just write a SAX handler that listens for the elements you want. If you do want to bind it to biojava objects, I would suggest modifying the parser to put the extra info into an Annotation object.

- Mark
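A minimal sketch of the first option, using only the SAX classes that ship with the JDK and no biojava at all. The class name `BlastTagListener`, the `collect` helper and the tiny inline XML are made up for illustration; only the tag names (Hsp_align-len, Statistics_entropy) come from this thread. It assumes the tags of interest are leaf elements that contain text only, and it answers any DTD lookup with an empty document so the parser does not try to fetch the DTD, which is an assumption about what is acceptable for your data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of selected elements from BLAST XML.
public class BlastTagListener extends DefaultHandler {
    private final Set<String> wanted;
    private final Map<String, List<String>> values = new LinkedHashMap<>();
    private StringBuilder text; // non-null only while inside a wanted element

    public BlastTagListener(Set<String> wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (wanted.contains(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so append.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (text != null && wanted.contains(qName)) {
            values.computeIfAbsent(qName, k -> new ArrayList<>()).add(text.toString().trim());
            text = null;
        }
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Assumption: we do not need the DTD, so hand back an empty one.
        return new InputSource(new StringReader(""));
    }

    // Parses the XML string and returns tag name -> values in document order.
    public static Map<String, List<String>> collect(String xml, String... tags) throws Exception {
        BlastTagListener handler = new BlastTagListener(new HashSet<>(Arrays.asList(tags)));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        // Simplified stand-in for real BLAST XML output.
        String xml = "<BlastOutput><Hit><Hsp><Hsp_align-len>42</Hsp_align-len></Hsp></Hit>"
                + "<Statistics><Statistics_entropy>0.14</Statistics_entropy></Statistics>"
                + "</BlastOutput>";
        System.out.println(collect(xml, "Hsp_align-len", "Statistics_entropy"));
    }
}
```

Because the handler keys on element names only, an Hsp value is not tied back to its hit here; tracking the enclosing Hit would need a little extra state in startElement and endElement.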
-----Original Message-----
From: Russell Smithies [mailto:[EMAIL PROTECTED]
Sent: Wed 25/06/2003 4:46 p.m.
To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Org
Cc:
Subject: Re: [Biojava-l] SAX parser demo

Looks good, but it doesn't do what I need, though I don't think it was ever going to :-(

The blast XML data has loads of info in it (I guess that's the reason for the format), but I want to be able to get at individual tags, not just hits. For example, some of the stats data (Statistics_entropy, Statistics_eff-space etc.) or other hit data (Hsp_align-len, Hsp_pattern-from etc.), instead of just hitID and e-value, might be useful.

I guess I'll have to implement some new bits (from SimpleSeqSimilaritySearchSubHit?) but I'm not exactly sure where. Any ideas?

thanx
Russell

----- Original Message -----
From: "David Huen" <[EMAIL PROTECTED]>
To: "Russell Smithies" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED] Org" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, June 25, 2003 2:28 PM
Subject: Re: [Biojava-l] SAX parser demo

> Hi,
> OK, I have uploaded a demo to CVS. It is at biojava-live/demos/blastxml.
> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger
> ported to use the BlastXML parser. You will need to do a "cvs update -d"
> to create the new directories for the demos and for the DTD directory.
>
> I have added a facade to the BlastXML parsing framework. The facade is
> called BlastXMLParserFacade and is used identically to the way the existing
> BlastLikeSAXParser is used with blast text output. I think this will make
> it easier for users all round: that both have the same interface. You can
> look in that class to see how the BJ parsing framework is actually set up.
>
> I won't have more time available to work on this for a bit but bug reports
> are welcome for eventual fixes.
> As previously mentioned, running multiple
> sequence queries on a database with NCBI blast results in the concatenation
> of all the Blast XML outputs, resulting in an almighty completely non-XML
> compliant file (multiple <?xml ...?> and <!DOCTYPE> declarations, for example).
> Parsing those requires a hack I have previously described but it is ugly,
> ugly, ugly. Maybe the latest NCBI version might have fixed this problem
> but I haven't looked.
>
> Best wishes,
> David Huen
> P.S. It is really really bedtime, guys.....
> P.P.S. There is an ugly entity resolver hack I will need to clean up later
> too.
>
_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
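
For the concatenated multi-query output David mentions in the quoted message, one possible workaround is to cut the file into separate documents before parsing. This is only a sketch and is not necessarily the hack he refers to, which is not shown in this thread. The class name `ConcatenatedXmlSplitter` is made up, and the code assumes every report starts with its own `<?xml` declaration and that the whole file fits in memory as a string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits concatenated XML reports into single documents.
public class ConcatenatedXmlSplitter {
    private static final String MARKER = "<?xml";

    // Returns one string per document, each starting at its XML declaration.
    public static List<String> split(String concatenated) {
        List<String> docs = new ArrayList<>();
        int start = concatenated.indexOf(MARKER);
        while (start >= 0) {
            // The next declaration, if any, marks the end of this document.
            int next = concatenated.indexOf(MARKER, start + MARKER.length());
            String doc = (next < 0)
                    ? concatenated.substring(start)
                    : concatenated.substring(start, next);
            docs.add(doc.trim());
            start = next;
        }
        return docs;
    }

    public static void main(String[] args) {
        // Two tiny stand-in reports glued together, as a multi-query run would produce.
        String glued = "<?xml version=\"1.0\"?><BlastOutput>first</BlastOutput>\n"
                + "<?xml version=\"1.0\"?><BlastOutput>second</BlastOutput>\n";
        for (String doc : split(glued)) {
            System.out.println(doc);
        }
    }
}
```

Each piece can then be handed to the parser on its own. For very large outputs a streaming split would be kinder to memory than loading everything into one string.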