Jack,
We have a 454 genome sequence that is fragmented into many contigs. We used a FASTA file of all contigs to start genome annotation, and used Artemis to determine ORFs. Unfortunately, some ORFs cross the contig border into the next (arbitrarily positioned) contig. This makes annotation, especially automated BLAST searches, more difficult, and sometimes we miss information. Is there a way to tell Artemis not to cross contig borders while defining ORFs? If that is not possible, I would like to request such an option for future releases ;) I think that more and more people will have fragmented genome data for quick/early screening.
For our 454 data, we first re-order the contigs to best match a related genome, if possible. We then join the contigs with this joiner sequence:

NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN

It has stop AND start codons in all six reading frames. Not only does this prevent ORFs from crossing contig boundaries, it also gives interrupted genes a start codon, which allows gene/ORF predictors to pick up partial genes.

The only downside is that your predicted genes will have this fake start zone in their sequence, which needs to be removed if you want to design oligos for an array, etc.

As a "bonus", if your sequence doesn't have N's in it, the NNNNN parts can be used to re-discover the original contigs.

Ref: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1216834

Hope this helps,

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University, AUSTRALIA
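
For anyone who wants to script the joining step described above, here is a minimal Python sketch. It is an illustration only, not the script used for the data above: the function names (join_contigs, split_contigs, frames_with_stop) are hypothetical, the contigs are assumed to be plain uppercase strings that contain no N's and are already in the desired order, and splitting is done on the full joiner rather than on the NNNNN runs alone. The last function only checks the stop-codon property of the joiner on both strands.

```python
# Joiner sequence quoted from the message above.
JOINER = "NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def join_contigs(contigs):
    """Concatenate contigs (already ordered) with the joiner between them."""
    return JOINER.join(contigs)


def split_contigs(joined):
    """Recover the original contigs; assumes the contigs contain no N's."""
    return joined.split(JOINER)


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def frames_with_stop(seq):
    """Count how many of the six reading frames of seq contain a stop codon."""
    count = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            codons = (strand[i:i + 3] for i in range(frame, len(strand) - 2, 3))
            if any(codon in STOP_CODONS for codon in codons):
                count += 1
    return count


if __name__ == "__main__":
    contigs = ["ATGAAACCCGGG", "CCCGGGTTTCAT"]  # made-up example contigs
    joined = join_contigs(contigs)
    print(joined)
    print(split_contigs(joined) == contigs)
    print(frames_with_stop(JOINER))
```

The joined string can then be written out as a single FASTA record and loaded into Artemis. Coordinates of predicted genes will refer to the joined sequence, so split_contigs (or the joiner positions) is needed to map them back to individual contigs.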
