Hi -

The general pattern of input in BioJava is:

    BufferedReader -> SequenceFormat -> SeqIOListener -> SequenceIterator

A specific example would be:

    BufferedReader -> FastaFormat -> SequenceBuilderFactory -> SequenceIterator

The format object is responsible for parsing the input and generating events that the SequenceBuilderFactory listens for and uses to build one or more Sequence objects. The SequenceFormat implementation usually draws on a SymbolTokenization object, which determines how characters are mapped to BioJava Symbols (e.g. 'a' is mapped to adenine).

Basically, you make a parser that emits callback events to a registered SeqIOListener. If the listener is a SequenceBuilderFactory, it generates Sequence objects based on those events. It is very similar to the SAX API for XML parsing.

Hope this helps,

Mark

"Gang Wu" <[EMAIL PROTECTED]>
06/04/2004 12:52 AM
Please respond to gwu

To: Mark Schreiber/GP/[EMAIL PROTECTED]
cc: "Bio-Java" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
Subject: RE: [Biojava-l] Genbank ASN.1 or XML parser

Hi,

I would be glad to write the parser. Since I am pretty new to the BioJava project, can anybody give me a guide on how to start?

Thanks.

- Gang

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 02, 2004 8:59 PM
To: [EMAIL PROTECTED]
Cc: Bio-Java; [EMAIL PROTECTED]
Subject: Re: [Biojava-l] Genbank ASN.1 or XML parser

Hello -

At the moment there are no parsers for XEMBL, GenBankXML or ASN.1. They could all be made fairly easily if someone had the time. GenBankXML could draw on a SAX or DOM parser to pass events to the BioJava SequenceBuilders (using some kind of adapter). ASN.1 would need a more custom parser, but because it is highly structured that shouldn't be too hard.

Any volunteers?
- Mark

"Gang Wu" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
06/02/2004 11:21 PM
Please respond to gwu

To: "Bio-Java" <[EMAIL PROTECTED]>
cc:
Subject: [Biojava-l] Genbank ASN.1 or XML parser

Hi everyone,

I just tried out the APIs for parsing GenBank format files. They work well, but I still wonder whether there are APIs for parsing GenBank files in ASN.1 or XML format, because the GenBank flat-file format was designed for human readers and the ASN.1 and XML formats should be more reliable for data exchange.

Gang Wu

_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
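
As a rough illustration of the event-driven pattern Mark describes at the top of this thread (a format object parses the input and fires callbacks at a registered listener, which builds the sequences), here is a minimal, self-contained Java sketch. It deliberately does not use BioJava: SeqEventListener, SimpleFastaFormat, SimpleSequenceBuilder, SimpleSequence and FastaEventDemo are hypothetical stand-ins invented for this example, not the real SeqIOListener, SequenceFormat or SequenceBuilderFactory signatures, and symbol tokenization is left out entirely.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface (NOT BioJava's SeqIOListener).
interface SeqEventListener {
    void startSequence(String name);
    void addSymbols(String symbols);
    void endSequence();
}

// Plain holder for one finished sequence (hypothetical, for illustration).
class SimpleSequence {
    final String name;
    final String residues;

    SimpleSequence(String name, String residues) {
        this.name = name;
        this.residues = residues;
    }
}

// Hypothetical format object: it only parses FASTA text and emits events.
// It never builds sequence objects itself.
class SimpleFastaFormat {
    void parse(BufferedReader reader, SeqEventListener listener) throws IOException {
        boolean inSequence = false;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                // A new header closes the previous record, if any.
                if (inSequence) {
                    listener.endSequence();
                }
                listener.startSequence(line.substring(1).trim());
                inSequence = true;
            } else if (inSequence) {
                listener.addSymbols(line);
            }
        }
        if (inSequence) {
            listener.endSequence();
        }
    }
}

// Hypothetical listener that assembles sequences from the events it receives.
class SimpleSequenceBuilder implements SeqEventListener {
    private final List<SimpleSequence> built = new ArrayList<>();
    private String currentName;
    private StringBuilder currentResidues;

    @Override
    public void startSequence(String name) {
        currentName = name;
        currentResidues = new StringBuilder();
    }

    @Override
    public void addSymbols(String symbols) {
        currentResidues.append(symbols);
    }

    @Override
    public void endSequence() {
        built.add(new SimpleSequence(currentName, currentResidues.toString()));
    }

    List<SimpleSequence> sequences() {
        return built;
    }
}

class FastaEventDemo {
    // Wires reader -> format -> listener and returns what the listener built.
    static List<SimpleSequence> readAll(String fastaText) {
        SimpleSequenceBuilder builder = new SimpleSequenceBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(fastaText))) {
            new SimpleFastaFormat().parse(reader, builder);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return builder.sequences();
    }

    public static void main(String[] args) {
        String fasta = ">seq1 demo\nacgt\nttga\n>seq2\nggcc\n";
        for (SimpleSequence s : readAll(fasta)) {
            System.out.println(s.name + " -> " + s.residues);
        }
    }
}
```

The point of the split is the same as in SAX: the parser knows nothing about what gets built, so a GenBankXML or ASN.1 parser would in principle only need to emit the same kind of events to reuse an existing builder. How that maps onto the actual BioJava interfaces should be checked against the BioJava Javadoc rather than taken from this sketch.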