Jack,

We have a 454 genome sequence, which is fragmented into many contigs.
We used a fasta file of all contigs to start genome annotation. Artemis
was used to determine ORFs. Unfortunately some ORFs cross the contig
border to the next (random position) contig. This makes annotation,
especially automated blasts, more difficult, and sometimes we miss
info. Is there a way to tell Artemis not to cross contig borders while
defining ORFs? If it is not possible I request an option for this for
future releases ;) I think that more and more people will have
fragmented genome data for quick/early screening.

For our 454 data, we first re-order the contigs to best match a
related genome if possible, then we join the contigs with this joiner
sequence:

NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN

It has stop AND start codons in all six reading frames. Not only does
this prevent ORFs crossing contig boundaries, it also gives interrupted
genes a start codon, so gene/ORF predictors can pick up partial genes.
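
If you want to check the frame coverage yourself, here is a rough
Python sketch (not part of Artemis). The function name frames_with and
the +1..+3 / -1..-3 frame labels are just conventions made up for this
example, and the stop codon set assumes the standard genetic code.

```python
JOINER = "NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN"
STOPS = {"TAA", "TAG", "TGA"}   # assumption: standard genetic code
STARTS = {"ATG"}                # assumption: ATG only, extend if needed

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def frames_with(seq, codons):
    """Return the set of frames in which any of `codons` occurs.

    Frames are labelled +1..+3 on the given strand and -1..-3 on the
    reverse complement, counted from the start of each strand.
    """
    hits = set()
    for sign, strand in ((1, seq), (-1, revcomp(seq))):
        for i in range(len(strand) - 2):
            if strand[i:i + 3] in codons:
                hits.add(sign * (i % 3 + 1))
    return hits


if __name__ == "__main__":
    print("stop frames: ", sorted(frames_with(JOINER, STOPS)))
    print("start frames:", sorted(frames_with(JOINER, STARTS)))
```

Run it on the exact string you paste into your own pipeline, since a
single dropped or extra base changes which frames are covered.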

The only downside is that your predicted genes will have this fake
start zone in their sequence, which needs to be removed if you want to
design oligos for an array etc.
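
One way to strip the fake start zone is to clip each predicted gene
back to the contig it sits in. A minimal Python sketch, assuming you
already have the contig positions in the joined sequence as 0-based,
end-exclusive (start, end) pairs; clip_to_contig is a hypothetical
helper name, not an Artemis function.

```python
def clip_to_contig(gene_start, gene_end, spans):
    """Clip a predicted gene to the contig it overlaps most.

    Coordinates are 0-based and end-exclusive. Returns the clipped
    (start, end), or None if the gene lies entirely in joiner sequence.
    """
    best = None
    for c_start, c_end in spans:
        start = max(gene_start, c_start)
        end = min(gene_end, c_end)
        # Keep the overlap only if it is non-empty and the longest so far.
        if start < end and (best is None or end - start > best[1] - best[0]):
            best = (start, end)
    return best


if __name__ == "__main__":
    # Two made-up contigs of 100 and 157 bp separated by a 43 bp joiner.
    spans = [(0, 100), (143, 300)]
    print(clip_to_contig(120, 200, spans))
```

Note the clipped start is no longer guaranteed to be in frame with the
rest of the gene, which is fine for oligo design but matters if you
translate the result.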

As a "bonus", if your sequence doesn't have N's in it, the NNNNN parts
can be used to re-discover the original contigs.
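
The joining and re-discovery steps can be sketched in a few lines of
Python. This assumes the contigs are already ordered, are upper-case,
and contain no N's of their own; join_contigs and contig_spans are
names made up for this example.

```python
JOINER = "NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN"


def join_contigs(contigs):
    """Concatenate ordered contigs with the joiner between neighbours."""
    return JOINER.join(contigs)


def contig_spans(pseudo):
    """Return 0-based, end-exclusive (start, end) spans of the contigs.

    Assumes the original contigs contain no N's, so every occurrence
    of the joiner marks a contig boundary.
    """
    spans = []
    start = 0
    while True:
        hit = pseudo.find(JOINER, start)
        if hit == -1:
            spans.append((start, len(pseudo)))
            return spans
        spans.append((start, hit))
        start = hit + len(JOINER)


if __name__ == "__main__":
    contigs = ["ACGTACGT", "GGGCCC", "TTTT"]  # toy sequences
    pseudo = join_contigs(contigs)
    for s, e in contig_spans(pseudo):
        print(s, e, pseudo[s:e])
```

The spans give you the contig coordinates in the joined sequence, which
is handy for mapping features back to the original contigs.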

Ref: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1216834

Hope this helps,

--
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University, AUSTRALIA

_______________________________________________
Artemis-users mailing list
[email protected]
http://lists.sanger.ac.uk/mailman/listinfo/artemis-users
