Bruno, That is what happened to me using union (and I did use "-feature"): the features were incorrectly mapped. I was using EMBOSS v6.5.7. Thanks for the tip on ugene.
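For reference, the remapping that went wrong here can be sketched as follows: each feature on chromosome N has to be shifted by the combined length of all the sequences that precede it in the concatenated assembly. This is only a minimal illustration in Python, not what union or Artemis does internally. The function names (fasta_lengths, offsets, shift_gff) and the merged sequence ID "concatenated" are hypothetical, and it assumes tab-separated 9-column GFF lines whose first column matches the FASTA record IDs.

```python
from itertools import accumulate


def fasta_lengths(fasta_text):
    """Return [(record_id, length), ...] in file order for a multi-FASTA string."""
    records = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first word after '>'
            records.append([line[1:].split()[0], 0])
        elif records:
            records[-1][1] += len(line)
    return [(name, length) for name, length in records]


def offsets(lengths):
    """Map each record ID to the number of bases preceding it in the concatenation."""
    starts = [0] + list(accumulate(n for _, n in lengths))[:-1]
    return {name: start for (name, _), start in zip(lengths, starts)}


def shift_gff(gff_text, offset_by_id, merged_id="concatenated"):
    """Shift start/end (columns 4 and 5) of each feature by its sequence's offset.

    Comment/directive lines and lines without 9 columns are passed through
    unchanged (so directives such as ##sequence-region are NOT updated here).
    """
    out = []
    for line in gff_text.splitlines():
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 9:
            out.append(line)
            continue
        shift = offset_by_id[cols[0]]
        cols[3] = str(int(cols[3]) + shift)
        cols[4] = str(int(cols[4]) + shift)
        cols[0] = merged_id  # hypothetical ID for the concatenated sequence
        out.append("\t".join(cols))
    return "\n".join(out)


if __name__ == "__main__":
    fasta = ">chr1\nACGTACGTAC\n>chr2\nGGGGCCCC\n"
    gff = (
        "chr1\tsrc\tgene\t1\t4\t.\t+\t.\tID=g1\n"
        "chr2\tsrc\tgene\t3\t6\t.\t-\t.\tID=g2"
    )
    print(shift_gff(gff, offsets(fasta_lengths(fasta))))
```

A feature at 3..6 on the second record is moved past the 10 bases of the first record; features on the first record keep their coordinates. Multi-segment locations (joins) in EMBL/GenBank feature tables would need each segment shifted the same way, which this sketch does not cover.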
The alternative that worked for me in Artemis was basically what Tim described: load an assembly sequence (multi-FASTA) entry plus a GFF file entry. I could then write it back out as an .embl file with the coordinates remapped to the concatenated assembly. Loading the GFF (which I downloaded from EuPathDB) into Artemis took a lot of trial and error and massaging of the GFF file. I also had to edit the feature table of the final .embl file to get gene IDs and products displayed correctly.

On Tue, Feb 25, 2014 at 10:31 AM, Bruno Donzelli <b...@cornell.edu> wrote:

> There was a bug introduced in the later versions of the "union" routine
> in EMBOSS and I am not sure if it was fixed. As far as I know union may
> work with GFF files but messes up the location of the features in
> concatenated genbank and embl files. I reported the bug almost 2 years ago.
> I have to use the emboss version 6.3.1 to concatenate files correctly.
>
> Another program that can be used to join files is ugene. Go to workflow
> designer/samples/merge sequences and shift corresponding annotation. Select
> the correct files for input and output. Occasionally this program will
> introduce 2 quotes ("") in place of one ("). Just search and replace using
> a text editor.
>
> Bruno
>
> On 25/02/14 04:51, Tim Carver wrote:
>
> In Artemis, with a multi-fasta sequence file, the options for the
> annotation file are to use a GFF file or to concatenate in some way the
> EMBL/GenBank feature table (adjusting the coordinates to match the correct
> position of the assembly). This is what 'union' should do with EMBL files.
> I am not sure why this wasn't successful for you. You obviously do need to
> use '-feature' with union to get the feature table included:
>
>     union -feature -osf embl entry.embl
>
> Using GFF and union are the options used here.
>
> Regards
> Tim
>
> On 24/02/2014 20:05, "Steven Sullivan" <sulli...@nyu.edu> wrote:
>
> ENA (EMBL) provides TEXT and FASTA file downloads for eukaryotic
> assemblies. The FASTA download is a single multi-fasta file containing
> separate records for each chromosome. The TEXT download is a single EMBL
> feature table concatenating all the feature tables of the individual
> chromosomes. It does not contain the DNA sequence.
>
> Loading these two files into Artemis yields a view of the entire assembly
> as a concatenated sequence, but only the features for the first chromosome
> in the feature file are loaded.
>
> I understand that this issue has been brought up before (e.g.
> https://www.mail-archive.com/artemis-users%40sanger.ac.uk/msg00690.html).
> What I don't see is a workaround. Mention was made of the EMBOSS 'union'
> command, which I have tried, but I am unable to make that generate an
> .embl file that contains the correctly remapped coordinates of the features
> onto the concatenated sequence. The closest I came to success was an .embl
> file that mapped the first chromosome features only, and incorrectly, onto
> the concatenated sequence.
>
> Is there a 'correct' way to load a multifasta record and its annotation
> into Artemis? The Artemis user manual is rather opaque on this topic.
>
> --
> Bruno Donzelli
> Research Associate
> Dept. of Plant Pathology and Plant-Microbe Biology, Cornell University
> Robert W. Holley Center for Agriculture and Health
> 538 Tower Road, Cornell University
> Ithaca, NY 14853
> Phone: 607 255-2179

--
Dr. Steven Sullivan
Center for Genomics & Systems Biology
New York University
12 Waverly Place
New York, NY 10003
_______________________________________________ Artemis-users mailing list Artemis-users@sanger.ac.uk http://publists.sanger.ac.uk/mailman/listinfo/artemis-users