Hi Alexandre,

To parse the ClustalW results I use a SequenceAlignmentSAXParser and a custom implementation of DefaultHandler which I call 'SequenceCollectionContentHandler'.
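In case the SAX callback flow is unfamiliar, here is a minimal sketch of the same pattern using only the JDK's own SAX parser, so it runs without BioJava. The XML snippet and the names SaxCollectSketch, CollectingHandler and collect are made up for illustration; only the start/characters/end callback pattern carries over to the BioJava parser.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCollectSketch {

    /** Collects the text of each <Sequence sequenceName="..."> element. */
    static final class CollectingHandler extends DefaultHandler {
        final Map<String, String> sequences = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();
        private String name; // non-null only while inside a Sequence element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // The default JDK factory is not namespace-aware, so qName is used here.
            if ("Sequence".equals(qName)) {
                name = atts.getValue("sequenceName");
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may deliver one element's text in several chunks, so append.
            if (name != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("Sequence".equals(qName) && name != null) {
                sequences.put(name, text.toString().trim());
                name = null;
            }
        }
    }

    /** Parses the given XML text and returns a name-to-residues map. */
    public static Map<String, String> collect(String xml) throws Exception {
        CollectingHandler handler = new CollectingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.sequences;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input, only to drive the callbacks.
        String xml = "<Alignment>"
                + "<Sequence sequenceName=\"seq1\">ACGT-ACGT</Sequence>"
                + "<Sequence sequenceName=\"seq2\">ACGTTACGT</Sequence>"
                + "</Alignment>";
        System.out.println(collect(xml));
    }
}
```

The handler keeps its state (current name, text buffer) in fields because SAX delivers the start tag, the text and the end tag as three separate calls.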
The code for the custom DefaultHandler class is:

import java.util.Map;

import org.biojava.bio.seq.DNATools;
import org.biojava.bio.seq.ProteinTools;
import org.biojava.bio.seq.RNATools;
import org.biojava.bio.seq.Sequence;
import org.biojava.bio.symbol.Alphabet;
import org.biojava.bio.symbol.IllegalSymbolException;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public final class SequenceCollectionContentHandler extends DefaultHandler {

    private final Map sequenceMap;
    private final Alphabet alphabet;
    private String currentSeqName;
    // characters() may be called several times for one element,
    // so the text is accumulated instead of overwritten
    private final StringBuilder currentSeq = new StringBuilder();

    /**
     * Creates a new <code>SequenceCollectionContentHandler</code> instance.
     *
     * @param map
     *            The map to be filled with sequences
     * @param alphabet
     *            The alphabet to be used
     */
    public SequenceCollectionContentHandler(Map map, Alphabet alphabet) {
        this.sequenceMap = map;
        this.alphabet = alphabet;
    }

    // This method is called when an element is encountered
    public final void startElement(String namespaceURI, String localName,
            String qName, Attributes atts) {
        if (localName.equals("Sequence")) {
            startCurrentSequence(atts);
        }
    }

    /*
     * (non-Javadoc)
     *
     * @see org.xml.sax.ContentHandler#characters(char[], int, int)
     */
    public final void characters(char[] ch, int start, int length)
            throws SAXException {
        this.currentSeq.append(ch, start, length);
    }

    /*
     * (non-Javadoc)
     *
     * @see org.xml.sax.ContentHandler#endElement(java.lang.String,
     *      java.lang.String, java.lang.String)
     */
    public final void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (localName.equals("Sequence")) {
            endCurrentSequence();
        }
    }

    private void startCurrentSequence(Attributes atts) {
        // Start a fresh buffer for the new sequence
        this.currentSeq.setLength(0);
        // getLocalName(0) returns null when there are no attributes
        String attName = atts.getLocalName(0);
        if ("sequenceName".equals(attName)) {
            this.currentSeqName = atts.getValue(0);
        }
    }

    private void endCurrentSequence() {
        String residues = this.currentSeq.toString();
        try {
            // Build the sequence with the tools matching the alphabet
            Sequence seq = null;
            if (this.alphabet.equals(DNATools.getDNA())) {
                seq = DNATools.createDNASequence(residues, currentSeqName);
            } else if (this.alphabet.equals(RNATools.getRNA())) {
                seq = RNATools.createRNASequence(residues, currentSeqName);
            } else if (this.alphabet.equals(ProteinTools.getAlphabet())) {
                seq = ProteinTools.createProteinSequence(residues, currentSeqName);
            }
            if (seq != null) {
                this.sequenceMap.put(currentSeqName, seq);
            }
        } catch (IllegalSymbolException e) {
            System.err.println(this.getClass()
                    + " - IllegalSymbolException: " + e.getMessage());
        }
    }
}

Then, the code to use the SequenceAlignmentSAXParser and the handler could be:

// copy and paste from here
// Besides the handler above, this needs SequenceAlignmentSAXParser, Alignment
// and SimpleAlignment from BioJava (check your BioJava version for their
// packages), plus java.io.*, java.util.HashMap, java.util.Map,
// org.xml.sax.ContentHandler, org.xml.sax.InputSource and
// org.xml.sax.SAXException.
File alnFile = new File("/your/aln/file"); // put here the path to the aln output file from ClustalW
Alphabet alphabet = ...; // put here the alphabet to be used (e.g. DNATools.getDNA())
Map seqMap = new HashMap(); // this map will be filled with the sequences from the alignment
SequenceAlignmentSAXParser parser = new SequenceAlignmentSAXParser();
ContentHandler handler = new SequenceCollectionContentHandler(seqMap, alphabet);
try {
    // Open the alignment file; this is where FileNotFoundException can occur
    BufferedReader contents = new BufferedReader(
            new InputStreamReader(new FileInputStream(alnFile)));
    parser.setContentHandler(handler);
    parser.parse(new InputSource(contents));
} catch (FileNotFoundException fnfe) {
    System.out.println(fnfe.getMessage());
    System.out.println("Couldn't open file");
} catch (IOException ioe) {
    ioe.printStackTrace();
} catch (SAXException se) {
    System.err.println(se.getMessage());
    se.printStackTrace();
}
// Finally I create the alignment object using the Map
Alignment alignment = new SimpleAlignment(seqMap);
// end of copy

So you now have an Alignment instance which contains all the sequences in the alignment. I know there are better approaches, but this one works for me.

If you have any questions, don't hesitate to ask again!

Cheers,
Bruno

_______________________________________________
Biojava-l mailing list
http://biojava.org/mailman/listinfo/biojava-l