That's right, ClustalW can output in several formats, including FASTA. It would be nice to have BioJava able to read and write the ClustalW format, as it is so widely used. How easy is it to write something like this? Maybe when I start to learn more about Java I could have a go at doing this.
Nath

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: 15 May 2006 10:16
> To: Richard Holland
> Cc: [email protected]; [EMAIL PROTECTED]
> Subject: Re: [Biojava-l] Creating an alignment object
>
> I think ClustalW can output alignments as fasta alignment format which
> biojava definitely can read.
>
> - Mark
>
> Richard Holland <[EMAIL PROTECTED]>
> Sent by: [EMAIL PROTECTED]
> 05/12/2006 04:34 PM
>
> To: [EMAIL PROTECTED]
> cc: [email protected], (bcc: Mark Schreiber/GP/Novartis)
> Subject: Re: [Biojava-l] Creating an alignment object
>
> Sorry for the delay in replying - I had to leave work a bit early
> yesterday.
>
> > Nope, I don't need to generate an alignment, I already have an alignment in
> > a file created by third party software (clustalw).
>
> There is nothing that I know of in BioJava that reads ClustalW files
> directly into Alignment objects. (If someone else knows different,
> please correct me). There are certainly methods in BioJava which read
> the alignments from ClustalW into a set of String objects, each one
> representing a member sequence (see SequenceAlignmentSAXParser), but I
> don't know of anything more detailed than that.
>
> The third-party package called Strap which I mentioned yesterday happily
> reads/writes many of the major alignment formats, and has wrappers for
> running ClustalW and other aligners programmatically and reading back in
> the results, so it is definitely worth a look. You can use a lot of its
> functions without having to run the GUI, including reading/writing
> various alignment formats.
>
> > In fact, the app I'd
> > eventually like to have written in Java would include some sort of wrapper
> > for clustalw in order to construct the alignments from a set of unaligned
> > sequences, but algorithms implemented in Biojava would also be a welcome
> > addition to the app.
>
> If you want to wrap clustalw, the simplest way would be to create
> Sequence objects in BioJava, write them out to Fasta using the BioJava
> sequence IO tools, and then use Java's Runtime.exec() (or one of the
> alternatives to it, such as ProcessBuilder) to run ClustalW. However you
> still then have the problem of reading the output back in again.
>
> The classes in org.biojava.bio.alignment that I mentioned yesterday
> implement several useful alignment algorithms which you can use as an
> alternative to ClustalW.
>
> > But first things first.
> > If I didn't have any sequences or an alignment in any files, what is the
> > easiest way to get an alignment object in Java to have a play around with?
>
> Make an instance of FlexibleAlignment from org.biojava.bio.alignment,
> and use its methods to add sequences to it. It doesn't do any aligning
> itself - it is just a placeholder to contain sequences and information
> about how they align. You have to use its methods to add and remove
> sequences from the alignment, to add/remove gaps and deletions, and get
> things like consensus sequences etc.
>
> Technically I suppose you could use FlexibleAlignment in conjunction
> with SequenceAlignmentSAXParser to read alignment members as strings,
> construct sequences based on them, and add them to the alignment object,
> but I haven't tried this myself. It'd probably require some extra
> processing to convert the dashes (gaps) in the input strings into
> proper gaps in the alignment.
>
> > Is there a way to just "magically" create a default alignment of say 5
> > sequences with 20 positions?
>
> You'd have to manually create 5 sequences yourself and add them to a
> FlexibleAlignment as described above.
>
> cheers,
> Richard
>
> --
> Richard Holland (BioMart Team)
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> UNITED KINGDOM
> Tel: +44-(0)1223-494416
>
> _______________________________________________
> Biojava-l mailing list - [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l
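
On the question of how easy a ClustalW reader would be to write: the text format is simple enough that a first pass fits in a few dozen lines of plain Java. The sketch below is not BioJava code. ClustalAlnSketch and parse are hypothetical names, and it assumes the usual .aln layout: a first line beginning with CLUSTAL, then blocks of "name, whitespace, aligned chunk" rows, with conservation lines that start with whitespace. It only returns each member as a gapped string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of BioJava): ClustalW .aln text -> name/gapped-row map. */
class ClustalAlnSketch {

    /** Returns the aligned rows in file order, keyed by sequence name. */
    public static Map<String, String> parse(String alnText) {
        String[] lines = alnText.split("\\r?\\n");
        // Assumption: the header is the very first line of the file.
        if (lines.length == 0 || !lines[0].startsWith("CLUSTAL")) {
            throw new IllegalArgumentException("missing CLUSTAL header line");
        }

        Map<String, StringBuilder> rows = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            // Blank lines separate blocks; conservation lines start with whitespace.
            if (line.trim().isEmpty() || Character.isWhitespace(line.charAt(0))) {
                continue;
            }
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                throw new IllegalArgumentException("malformed alignment line: " + line);
            }
            // fields[0] is the name, fields[1] the aligned chunk for this block.
            // A trailing residue count, if present, is ignored.
            rows.computeIfAbsent(fields[0], k -> new StringBuilder()).append(fields[1]);
        }

        // Every row of an alignment should have the same number of columns.
        Map<String, String> result = new LinkedHashMap<>();
        int width = -1;
        for (Map.Entry<String, StringBuilder> e : rows.entrySet()) {
            String row = e.getValue().toString();
            if (width >= 0 && row.length() != width) {
                throw new IllegalArgumentException("row length mismatch for " + e.getKey());
            }
            width = row.length();
            result.put(e.getKey(), row);
        }
        return result;
    }

    public static void main(String[] args) {
        String aln = "CLUSTAL W (1.83) multiple sequence alignment\n\n"
                + "seq1  AC-GT\n"
                + "seq2  ACGGT\n"
                + "      ** **\n";
        for (Map.Entry<String, String> e : parse(aln).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The gaps are still plain '-' characters in the returned strings, so the extra processing Richard mentions (turning the dashes into proper gaps when building sequences for a FlexibleAlignment) would still have to be done as a separate step.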
