I've been doing some work on the multiple alignment code in BioJava. I've found that the existing org.biojava.bio.seq.io.MSFAlignmentFormat class doesn't really seem to work, so I have been re-engineering it to work and to be more generic. I'll contribute this work when it is complete.

My question relates to how alignments should work. Currently org.biojava.bio.symbol.Alignment is composed of org.biojava.bio.symbol.SymbolList objects, which seems very reasonable. To specify an alignment with gaps, you would presumably use an org.biojava.bio.symbol.GappedSymbolList, which is a SymbolList that allows gaps to be inserted and deleted. This GappedSymbolList seems to wrap another SymbolList (without gaps) that contains the underlying sequence. (A SymbolList that contained gaps could be generated directly, but as the sequence is immutable you wouldn't be able to add or delete gaps, hence the need for the GappedSymbolList.)

Now to my point. I want to be able to use an alignment that contains sequences that can have features. I want to be able to add a feature (e.g. a PROSITE motif) to the underlying sequence, but be able to access it both through the underlying coordinates of the protein and through the coordinates within the alignment, i.e. through both the GappedSymbolList and the SymbolList it wraps. To do this I would need to create an org.biojava.bio.symbol.Alignment that contains a GappedSymbolList that wraps a SimpleSequence to which Features could be attached. However, I see no way of getting hold of the SimpleSequence from the GappedSymbolList that wraps it.

Are there any solutions here?

many thanks

Tim

--------------------------------------------
Dr. Tim Dudgeon
OSI Pharmaceuticals,
Watlington Road, Oxford, OX4 6LT, UK
Tel: +44 (01865) 871 244
email: [EMAIL PROTECTED]
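
For illustration only, here is a minimal sketch of the arrangement the question asks for: a gapped view that keeps a reference to the feature-bearing sequence it wraps and can translate between the two coordinate systems. None of this is the BioJava API. The classes SketchFeature, SketchSequence and SketchGappedView, and the methods getSource(), addGaps(), sourceToView() and viewToSource(), are hypothetical names invented for this example, and positions are assumed to be 1-based.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a feature on an ungapped sequence (1-based, inclusive). */
final class SketchFeature {
    final String type;
    final int start;
    final int end;

    SketchFeature(String type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }
}

/** Hypothetical stand-in for an ungapped sequence that can carry features. */
final class SketchSequence {
    final String residues;
    final List<SketchFeature> features = new ArrayList<>();

    SketchSequence(String residues) {
        this.residues = residues;
    }
}

/** Hypothetical gapped view that remembers the sequence it wraps. */
final class SketchGappedView {
    private final SketchSequence source;
    // One entry per view column: the 1-based source position, or 0 for a gap.
    private final List<Integer> columns = new ArrayList<>();

    SketchGappedView(SketchSequence source) {
        this.source = source;
        for (int i = 1; i <= source.residues.length(); i++) {
            columns.add(i);
        }
    }

    /** The accessor the question is looking for: the wrapped, feature-bearing sequence. */
    SketchSequence getSource() {
        return source;
    }

    /** Inserts length gap columns, the first of which ends up at view position pos. */
    void addGaps(int pos, int length) {
        for (int i = 0; i < length; i++) {
            columns.add(pos - 1, 0);
        }
    }

    /** Maps a source position to its column in the gapped view. */
    int sourceToView(int sourcePos) {
        if (sourcePos < 1 || sourcePos > source.residues.length()) {
            throw new IndexOutOfBoundsException("source position " + sourcePos);
        }
        return columns.indexOf(sourcePos) + 1;
    }

    /** Maps a view column to a source position, or returns 0 if the column is a gap. */
    int viewToSource(int viewPos) {
        return columns.get(viewPos - 1);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int p : columns) {
            sb.append(p == 0 ? '-' : source.residues.charAt(p - 1));
        }
        return sb.toString();
    }
}

final class SketchGappedViewDemo {
    public static void main(String[] args) {
        SketchSequence seq = new SketchSequence("MKVLAT");
        seq.features.add(new SketchFeature("motif", 3, 4));

        SketchGappedView view = new SketchGappedView(seq);
        view.addGaps(3, 2);
        System.out.println(view); // prints MK--VLAT

        // Report each feature in both coordinate systems.
        for (SketchFeature f : view.getSource().features) {
            System.out.println(f.type + " source " + f.start + ".." + f.end
                    + " view " + view.sourceToView(f.start) + ".." + view.sourceToView(f.end));
        }
    }
}
```

With an accessor like getSource(), a feature only needs to be stored once, on the ungapped sequence, and its alignment coordinates can be derived on demand, so they stay correct when gaps are added later.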