Good point, Torsten. This is the way I do it too; apologies for
forgetting to mention the start codons in my earlier post! I tend to
use more Ns at either end too, so that the join is really obvious by
eye when looking at a "fake" translated ORF in later analyses.
It is a great idea to re-order your contigs first, but don't get too
carried away by this "virtual scaffold": many of the gaps between
contigs will be due to repeats (e.g. IS elements, rRNA genes), so
in reality there are likely to be some rearrangements in contig
order / orientation relative to the reference genome.
Cheers
Scott
On 26/06/2007, at 8:47 AM, Torsten Seemann wrote:
For our 454 data, we first re-order the contigs to best match a
related genome if possible, then we join the contigs with this joiner
sequence:
NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN
It has stop AND start codons in all six reading frames. Not only does
this prevent ORFs crossing contig boundaries, it also gives interrupted
genes a start codon, allowing a gene/ORF predictor to pick up
partial genes.
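
For anyone wanting to script this, here is a minimal Python sketch of the
joining step, plus a small check of which reading frames on each strand
carry a stop or start codon. The function names (join_contigs, frames_with,
reverse_complement) are made up for illustration and are not part of Artemis
or any other tool. It assumes the contigs are already ordered and oriented,
are upper-case ACGTN only, and it counts only ATG as a start codon
(alternative starts such as GTG or TTG are ignored). It is worth re-running
the check on whatever string ends up in your pipeline, since a single
inserted or deleted base changes which frames are covered.

```python
# Joiner sequence copied verbatim from the message above.
LINKER = "NNNNNCATTCCATTCATTAATTAATTAATGAATGAATGNNNNN"

STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG"}  # assumption: only ATG is counted; GTG/TTG are ignored

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def frames_with(seq, codons):
    """Return the set of frames (0, 1, 2) on this strand that contain
    at least one of the given codons."""
    return {i % 3 for i in range(len(seq) - 2) if seq[i:i + 3] in codons}


def join_contigs(contigs, linker=LINKER):
    """Concatenate contigs, placing the linker between each adjacent pair.
    Assumes the contigs are already in the desired order and orientation."""
    return linker.join(contigs)


if __name__ == "__main__":
    # Report frame coverage for both strands of the linker.
    strands = (("forward", LINKER), ("reverse", reverse_complement(LINKER)))
    for name, strand in strands:
        print(name,
              "stop frames:", sorted(frames_with(strand, STOPS)),
              "ATG frames:", sorted(frames_with(strand, STARTS)))
```

The frame numbers are relative to the start of the linker itself, so they
only tell you how many distinct frames are covered, not which frame a
particular ORF in the joined sequence will hit.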
---
Scott Beatson PhD
NHMRC Howard Florey Research Fellow
School of Molecular and Microbial Sciences
University of Queensland
Brisbane QLD 4072
Australia
Tel: +61 7 33654863
Fax: +61 7 33654699
_______________________________________________
Artemis-users mailing list
http://lists.sanger.ac.uk/mailman/listinfo/artemis-users