Hans Rudolf Hotz wrote:
> A few months back, I played around with the source code and changed one
> of the library files (ajalign.c). This now allows the display of up to 20
> characters, by using a new output format "pairln" for sequence alignment
> programs, like matcher or needle. This is in comparison to the default
> which displays only the first 6 characters, or "pair" which displays the
> first 13 characters, eg:

We can make the ID arbitrarily long for a "new" alignment format. We will need formats similar to the existing matcher and needle outputs to avoid breaking too many existing parsers (I remember when NCBI changed the use of a blank at the start of each line of blast output and almost all parsers had to change). The new formats are easy to make (as you found out) from the existing ones.

We need to decide what to do with the standard alignment formats that have 6 characters in their definition (I assume this goes back to the days of PIR database identifiers, when FASTP was first written). Since many existing identifiers do not fit in 6 characters, we could make up unique identifiers for them: truncate each identifier, then make the names unique if two of them match.

Or should we change the existing formats to allow longer IDs? What do the authors of parsers think?

regards,

Peter
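
To make the truncate-and-uniquify option concrete, here is a minimal sketch in Python. It is only an illustration, not how EMBOSS does or would do it: the function name make_unique_ids, the width parameter, and the choice of a numeric suffix to resolve clashes are all assumptions made for this example.

```python
def make_unique_ids(names, width=6):
    """Truncate each name to `width` characters and resolve clashes.

    Hypothetical helper: a clash is resolved by replacing the tail of the
    truncated name with a running number (1, 2, 3, ...), so the result
    never exceeds `width` characters.
    """
    seen = set()
    result = []
    for name in names:
        candidate = name[:width]
        counter = 1
        while candidate in seen:
            suffix = str(counter)
            if len(suffix) > width:
                raise ValueError("too many clashing names for this width")
            # Keep as much of the original name as still fits before the suffix.
            candidate = name[:width - len(suffix)] + suffix
            counter += 1
        seen.add(candidate)
        result.append(candidate)
    return result


if __name__ == "__main__":
    ids = make_unique_ids(["HBA_HUMAN", "HBA_HUMAN_V2", "P12345", "HBB"])
    print(ids)
```

The catch with any scheme like this is that the shortened names no longer match the identifiers in the input, so a parser would need some way to map them back to the originals.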
