I have developed a parsing framework called LSAX that I would like to submit to BioJava. It was inspired by the work of Cambridge Antibody Technology (Simon Brocklehurst et al.) on the BioJava BlastLikeSaxParser. The idea is the same: create a bridge between XML applications and non-XML data. The difference between the CAT parser and LSAX is in the design of the raw file parser. I use Lex (actually JFlex) to tokenize the raw data files and generate Start, Data, and End SAX events. I have developed two parsers using this framework: an NCBI Blast parser and a Fasta parser. The advantage of using Lex is that you can specify the rules of your parser at a high level with regular expressions. The actual parser is then auto-generated by JFlex and is often faster than a parser you would write by hand.
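As a rough illustration of the idea (not the LSAX code itself), the sketch below turns Fasta text into Start, Data, and End SAX events. It makes several assumptions: java.util.regex patterns stand in for the rules a JFlex specification would contain, and the class name FastaSaxSketch and the element names FastaFile, Sequence, Header, and Residues are hypothetical, chosen only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: regex rules stand in for a JFlex-generated scanner.
class FastaSaxSketch {
    // Token rules, analogous to the regular expressions in a lexer specification.
    private static final Pattern HEADER = Pattern.compile("^>(.*)$");
    private static final Pattern RESIDUES = Pattern.compile("^[A-Za-z*-]+$");
    private static final Attributes NO_ATTRS = new AttributesImpl();

    /** Scans Fasta text line by line and reports Start, Data, and End events. */
    static void parse(String fasta, ContentHandler handler) throws SAXException {
        boolean inSequence = false;
        handler.startDocument();
        handler.startElement("", "FastaFile", "FastaFile", NO_ATTRS);
        for (String line : fasta.split("\\r?\\n")) {
            Matcher header = HEADER.matcher(line);
            if (header.matches()) {
                // A new header closes any sequence that is still open.
                if (inSequence) {
                    handler.endElement("", "Sequence", "Sequence");
                }
                handler.startElement("", "Sequence", "Sequence", NO_ATTRS);
                inSequence = true;
                emit(handler, "Header", header.group(1));
            } else if (inSequence && RESIDUES.matcher(line).matches()) {
                emit(handler, "Residues", line);
            }
            // Lines that match no rule (for example blank lines) are skipped.
        }
        if (inSequence) {
            handler.endElement("", "Sequence", "Sequence");
        }
        handler.endElement("", "FastaFile", "FastaFile");
        handler.endDocument();
    }

    /** Emits one Start, Data, End triple for a simple text element. */
    private static void emit(ContentHandler handler, String name, String text)
            throws SAXException {
        handler.startElement("", name, name, NO_ATTRS);
        char[] chars = text.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endElement("", name, name);
    }

    /** Collects the events into a tag-like trace so they are easy to inspect. */
    static String trace(String fasta) {
        StringBuilder out = new StringBuilder();
        try {
            parse(fasta, new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes a) {
                    out.append('<').append(qName).append('>');
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    out.append(ch, start, length);
                }

                @Override
                public void endElement(String uri, String local, String qName) {
                    out.append("</").append(qName).append('>');
                }
            });
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace(">seq1 test\nACGT\nTTGA\n"));
    }
}
```

Any SAX ContentHandler can be passed to parse in place of the tracing handler, which is what lets ordinary XML tooling consume the non-XML input. A generated scanner would replace the line loop and the two patterns; the event-emitting side would look much the same.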
Let me know if you would like to include this in BioJava,

-R

Thomas Down wrote:

> On Fri, Oct 12, 2001 at 05:22:14PM -0700, Robert Hubley wrote:
> > Hi all,
> >
> > I have some code that I thought I could contribute to the
> > biojava project. What is the procedure for doing so?
>
> Hi...
>
> Most of the day-to-day business of BioJava takes place on
> this mailing list, so you've come to the right place. Could
> you post a brief description of the code you've written?
>
> If you decide you want to contribute some code, small patches
> can be checked in by one of the existing developers. But if it's
> more than 2 or 3 files, it may be easier to get a read/write
> account for accessing the project's CVS repository, then check
> things in yourself.
>
> Thanks,
>
> Thomas.