Re: [Artemis-users] multiple contig reading problem
Dear Tim,

That solved the problem. In the event, I needed union's -source option to capture the contig names, plus a few find-and-replace edits to get everything displaying well (e.g. changing 'source' features to 'fasta_record' features, with the name as a 'label' rather than a 'note' so that it is visible), but the end result was very much what I was after.

Many thanks,

Chris

Tim Carver wrote:
> Hi Chris
>
> You need to use something like the EMBOSS application 'union'. Separate
> them into individual EMBL files and concatenate them into a single EMBL
> entry file (use the -feature option and -sformat embl).
>
> Regards
> Tim
>
> On 20/3/08 15:18, "Chris Knight" <[EMAIL PROTECTED]> wrote:
> > I am having difficulties opening an EMBL file in Artemis:
> >
> > The file in question contains a single genome divided into several
> > hundred contigs. These contigs are listed in the file I have as separate
> > entries, each with a separate sequence (SQ) entry- I'd like to read them
> > all in (and have them appear as separate contigs), however I can only
> > persuade Artemis to read the first contig.
> >
> > The separation between the contigs at present is a line containing only
> > two forward slashes between the end of the preceding contig's sequence
> > entry (SQ section) and the beginning of the next contig (ID section).
> >
> > I've tried manipulating the file with Readseq v 2.1.26, which will
> > happily output everything to fasta format, which allows me to read all
> > the contigs into Artemis correctly. However, I then lose the annotation
> > in the embl file. Readseq will separate out the annotation into a
> > separate .fff file (by using -unpair=1), however, this file is in gff
> > format v2 and it doesn't seem to read in as an entry into artemis which
> > wants gff v3 (or rather the file reads, but appears as an empty entry).
> >
> > Apologies if I've missed something obvious, but any help much appreciated,
> >
> > Thanks,
> >
> > Chris
> >
> > I'm using Artemis release 10 on a Mac running OSX 10.5.2

--
Dr Christopher Knight                  Michael Smith Building
Wellcome Trust RCD Fellow              Faculty of Life Sciences
Tel: +44 (0)161 2755378                The University of Manchester
room B.2012                            Oxford Road
www.dbkgroup.org/MCISB/people/knight/  Manchester M13 9PT
· . ,,><(((°>                          UK
___
Artemis-users mailing list
Artemis-users@sanger.ac.uk
http://lists.sanger.ac.uk/mailman/listinfo/artemis-users
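The find-and-replace step Chris describes could be scripted roughly as follows. This is only a sketch under assumptions: it presumes that the joined file uses the standard EMBL feature-table layout ("FT" lines with the key starting in column 6), and that each 'source' feature carries the contig name in a /note qualifier, as the message implies. The function name relabel_source_features is hypothetical and is not part of Artemis or EMBOSS.

```python
import re

# A line that starts a new feature: "FT", three spaces, the key, then the location.
FEATURE_START = re.compile(r"^FT   (\S+)\s+(\S.*)$")
# A qualifier line carrying a /note, e.g. 'FT                   /note="contig_1"'.
NOTE_QUALIFIER = re.compile(r"^(FT\s+)/note=")


def relabel_source_features(embl_text):
    """Rename 'source' features to 'fasta_record' and their /note to /label.

    Other features, and their /note qualifiers, are left untouched.
    """
    out = []
    in_source = False
    for line in embl_text.splitlines():
        start = FEATURE_START.match(line)
        if start:
            in_source = start.group(1) == "source"
            if in_source:
                # Rebuild the line so the location stays in column 22.
                line = "FT   " + "fasta_record".ljust(16) + start.group(2)
        elif not line.startswith("FT"):
            in_source = False          # left the feature table
        elif in_source:
            line = NOTE_QUALIFIER.sub(r"\1/label=", line)
        out.append(line)
    return "\n".join(out) + "\n"
```

The qualifier value is copied unchanged, so any quoting around the contig name is preserved. Whether Artemis displays the result as intended should be checked on a small test file first.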
Re: [Artemis-users] multiple contig reading problem
Hi Chris

You need to use something like the EMBOSS application 'union'. Separate them into individual EMBL files and concatenate them into a single EMBL entry file (use the -feature option and -sformat embl).

Regards
Tim

On 20/3/08 15:18, "Chris Knight" <[EMAIL PROTECTED]> wrote:

> I am having difficulties opening an EMBL file in Artemis:
>
> The file in question contains a single genome divided into several
> hundred contigs. These contigs are listed in the file I have as separate
> entries, each with a separate sequence (SQ) entry- I'd like to read them
> all in (and have them appear as separate contigs), however I can only
> persuade Artemis to read the first contig.
>
> The separation between the contigs at present is a line containing only
> two forward slashes between the end of the preceding contig's sequence
> entry (SQ section) and the beginning of the next contig (ID section).
>
> I've tried manipulating the file with Readseq v 2.1.26, which will
> happily output everything to fasta format, which allows me to read all
> the contigs into Artemis correctly. However, I then lose the annotation
> in the embl file. Readseq will separate out the annotation into a
> separate .fff file (by using -unpair=1), however, this file is in gff
> format v2 and it doesn't seem to read in as an entry into artemis which
> wants gff v3 (or rather the file reads, but appears as an empty entry).
>
> Apologies if I've missed something obvious, but any help much appreciated,
>
> Thanks,
>
> Chris
>
> I'm using Artemis release 10 on a Mac running OSX 10.5.2
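The two steps Tim outlines (separate the entries, then join them with union) might be scripted along these lines. This is a sketch, not a tested recipe. The -feature and -sformat embl options come from the message above and -source from Chris's follow-up. The "@" list-file input, -osformat and -outseq are the usual EMBOSS conventions and are assumed here to behave as documented for the installed version. All function names and file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def split_embl_entries(text):
    """Split multi-entry EMBL text into entries, each ending with its '//' line."""
    entries, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "//":              # '//' terminates an EMBL entry
            entries.append("\n".join(current) + "\n")
            current = []
    if any(item.strip() for item in current):  # trailing entry lacking '//'
        entries.append("\n".join(current) + "\n//\n")
    return entries


def build_union_command(list_file, out_file):
    """Build (but do not run) a union command line for a list of EMBL files."""
    return [
        "union",
        "-sequence", f"@{list_file}",   # '@' marks an EMBOSS list file (assumed)
        "-sformat", "embl",             # input format, as suggested in the thread
        "-feature",                     # carry the annotation across
        "-source",                      # record each contig as a 'source' feature
        "-osformat", "embl",            # assumed: write the result as EMBL
        "-outseq", str(out_file),
    ]


def split_and_join(source, out_dir, joined):
    """Write one file per entry, list them, then run union if it is installed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, entry in enumerate(split_embl_entries(source.read_text()), start=1):
        path = out_dir / f"contig_{number:04d}.embl"
        path.write_text(entry)
        paths.append(path)
    list_file = out_dir / "contigs.list"
    list_file.write_text("".join(f"{path}\n" for path in paths))
    command = build_union_command(list_file, joined)
    if shutil.which("union"):
        subprocess.run(command, check=True)
    else:
        print("union not found; the command would be:", " ".join(command))
    return command
```

A call such as split_and_join(Path("contigs.embl"), Path("contigs_split"), Path("joined.embl")) would then produce the single joined entry, with all three paths being placeholders.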
[Artemis-users] multiple contig reading problem
I am having difficulties opening an EMBL file in Artemis:

The file in question contains a single genome divided into several hundred contigs. These contigs are listed in the file as separate entries, each with its own sequence (SQ) section. I'd like to read them all in (and have them appear as separate contigs), but I can only persuade Artemis to read the first contig.

At present the contigs are separated by a line containing only two forward slashes, between the end of the preceding contig's sequence (SQ section) and the beginning of the next contig (ID section).

I've tried manipulating the file with Readseq v 2.1.26, which will happily output everything to FASTA format, and that lets me read all the contigs into Artemis correctly. However, I then lose the annotation in the EMBL file. Readseq will write the annotation to a separate .fff file (using -unpair=1), but this file is in GFF version 2 and doesn't seem to read in as an entry in Artemis, which wants GFF version 3 (or rather, the file reads but appears as an empty entry).

Apologies if I've missed something obvious, but any help much appreciated,

Thanks,

Chris

I'm using Artemis release 10 on a Mac running OSX 10.5.2

--
Dr Christopher Knight                  Michael Smith Building
Wellcome Trust RCD Fellow              Faculty of Life Sciences
Tel: +44 (0)161 2755378                The University of Manchester
room B.2012                            Oxford Road
www.dbkgroup.org/MCISB/people/knight/  Manchester M13 9PT
· . ,,><(((°>                          UK