> I also have a question. Can it cope specifying formats that span multiple
> lines? Or is it limited to treating non-XML files as being essentially
> record-based i.e. dealing with single lines at a time? Sometimes, it's
> useful (and necessary) to be able to read ahead several lines of a file,
> before actually parsing i.e. emitting SAX events.

That's a good question. The short answer is yes, although you complicate
your specification every time you have to do it.

Basically there are two types of "lookahead". The first is line lookahead
(assuming you are parsing line by line), which lets you match a pattern
based on what lies ahead of it in the line. The second is trickier: you can
use parser states together with the ability to "pushback" data onto the
processing stream. Lastly, you can incorporate traditional parser code into
the "action" of a particular rule.

They all have their drawbacks... and cause one to reflect on ever writing a
report-like output procedure into any future program. :-)

I have written my code to fit into the BioJava project as:

    org.biojava.bio.program.lsax

Should I just wrap it up and send it to you?

-R
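
To make the "pushback" idea concrete, here is a rough sketch in plain Java. It is not the lsax code or its API: the class name PushbackLineParser, the element names "report" and "entry", and the rule that an indented line continues the previous one are all assumptions made up for illustration. It only uses the standard org.xml.sax interfaces from the JDK. The parser reads ahead over several lines, pushes back the first line that does not belong to the current entry, and only then emits the SAX events for that entry.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of BioJava or lsax.
class PushbackLineParser {
    private final BufferedReader in;
    // Lines that were read ahead but belong to a later entry.
    private final Deque<String> pushedBack = new ArrayDeque<>();

    PushbackLineParser(BufferedReader in) {
        this.in = in;
    }

    // Serve pushed-back lines first, then fall through to the reader.
    private String nextLine() throws IOException {
        return pushedBack.isEmpty() ? in.readLine() : pushedBack.pop();
    }

    private void pushback(String line) {
        pushedBack.push(line);
    }

    // Assumed format rule: a non-blank line starting with whitespace
    // continues the entry begun on an earlier line.
    private static boolean isContinuation(String line) {
        return !line.trim().isEmpty() && Character.isWhitespace(line.charAt(0));
    }

    void parse(ContentHandler handler) throws IOException, SAXException {
        Attributes none = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "report", "report", none);
        String line;
        while ((line = nextLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // blank lines only separate entries
            }
            StringBuilder text = new StringBuilder(line.trim());
            String ahead;
            // Read ahead as long as the following lines continue this entry.
            while ((ahead = nextLine()) != null && isContinuation(ahead)) {
                text.append(' ').append(ahead.trim());
            }
            if (ahead != null) {
                pushback(ahead); // not ours: hand it back to the stream
            }
            // Only now, with the whole entry in hand, emit SAX events.
            char[] chars = text.toString().toCharArray();
            handler.startElement("", "entry", "entry", none);
            handler.characters(chars, 0, chars.length);
            handler.endElement("", "entry", "entry");
        }
        handler.endElement("", "report", "report");
        handler.endDocument();
    }

    // Convenience wrapper: parse a string and collect the text of each entry.
    static List<String> entries(String input) {
        List<String> result = new ArrayList<>();
        DefaultHandler collector = new DefaultHandler() {
            private final StringBuilder current = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                current.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                current.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("entry".equals(qName)) {
                    result.add(current.toString());
                }
            }
        };
        try {
            new PushbackLineParser(new BufferedReader(new StringReader(input))).parse(collector);
        } catch (IOException | SAXException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String report = "Query: seq1\n  length 120\n  score 42\n\nQuery: seq2\n  length 80\n";
        for (String entry : entries(report)) {
            System.out.println(entry);
        }
    }
}
```

The pushback queue is what keeps the main loop simple: the read-ahead loop can overshoot by one line without the outer loop needing to know about it. The cost is the one mentioned above, namely that every such rule is extra state the format specification has to carry.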

begin:vcard
n:Hubley;Robert
tel;fax:(206) 732-1299
tel;work:(206) 732-1292
x-mozilla-html:FALSE
url:www.systemsbiology.org
org:Institute for Systems Biology;Computational Biology
version:2.1
email;internet:[EMAIL PROTECTED]
title:Software Engineer
adr;quoted-printable:;;Institute for Systems Biology=0D=0A4225 Roosevelt Way NE=0D=0ASTE 200;Seattle;WA;98105-6099;USA
x-mozilla-cpt:;-9792
fn:Robert Hubley
end:vcard