A bug was introduced in later versions of the EMBOSS "union" routine, and I am not sure whether it has been fixed. As far as I know, union may work with GFF files, but it messes up the locations of the features in concatenated GenBank and EMBL files. I reported the bug almost 2 years ago. I have to use EMBOSS version 6.3.1 to concatenate files correctly.

Another program that can be used to join files is UGENE. Go to workflow designer / samples / "merge sequences and shift corresponding annotation", then select the correct files for input and output. Occasionally this program will introduce two quotes ("") in place of one ("). Just search and replace using a text editor.


On 25/02/14 04:51, Tim Carver wrote:
Re: [Artemis-users] loading a eukaryotic genome assembly (multifasta) and annotation into Artemis 16.0.0

In Artemis, with a multi-fasta sequence file, the options for the annotation file are to use a GFF file or to concatenate the EMBL/GenBank feature tables in some way (adjusting the coordinates to match the correct position in the assembly). This is what 'union' should do with EMBL files. I am not sure why this wasn't successful for you. You obviously do need to use '-feature' with union to get the feature table included:

union --feature --osf embl  entry.embl

Using GFF and union are the options used here.
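
For anyone who wants to check the coordinates that a concatenation tool produces, the adjustment described above can be sketched in a few lines of Python. This is only an illustration, not how union or Artemis work internally: the function names (fasta_lengths, record_offsets, shift_feature) and the demo sequences are made up for this example, and it assumes simple start..end locations that are 1-based and inclusive, as in EMBL feature tables.

```python
from itertools import accumulate


def fasta_lengths(text):
    """Return [(record_id, sequence_length), ...] in file order."""
    lengths = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # The record id is the first word of the header line.
            lengths.append([header[0] if header else "", 0])
        elif lengths:
            lengths[-1][1] += len(line)
    return [(name, n) for name, n in lengths]


def record_offsets(lengths):
    """Map each record id to the number of bases that precede it."""
    starts = [0] + list(accumulate(n for _, n in lengths))[:-1]
    return {name: off for (name, _), off in zip(lengths, starts)}


def shift_feature(record_id, start, end, offsets):
    """Move a 1-based, inclusive location onto the concatenated sequence."""
    off = offsets[record_id]
    return start + off, end + off


# Hypothetical three-record multi-fasta, used only for illustration.
demo = ">chr1\nACGTACGTAC\n>chr2\nGGGCCC\nAT\n>chr3\nTTTT\n"
lengths = fasta_lengths(demo)
offsets = record_offsets(lengths)

# A feature at 3..6 on chr2 lies after the 10 bases of chr1.
print(shift_feature("chr2", 3, 6, offsets))
```

A feature on the first record keeps its coordinates, and every later record is shifted by the total length of the records before it. Comparing a few features by hand against these offsets is a quick way to spot a file in which the locations were not remapped.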


On 24/02/2014 20:05, "Steven Sullivan" <> wrote:

    ENA (EMBL) provides TEXT and FASTA file downloads for eukaryotic
    assemblies.  The FASTA download is a single multi-fasta file
    containing separate records for each chromosome. The TEXT download
    is a single EMBL feature table concatenating all the feature
    tables of the individual chromosomes.  It does not contain the DNA

    Loading these two files into Artemis yields a view of the entire
    assembly as a concatenated sequence, but only the features for the
    first chromosome in the feature file are loaded.

    I understand that this issue has been brought up before. (e.g.
     What I don't see is a workaround.  Mention was made of the EMBOSS
    'union' command, which I have tried, but I am unable to make
    that generate an .embl file that contains the correctly remapped
    coordinates of the features onto the concatenated sequence. The
    closest I came to success was an .embl file that mapped the first
    chromosome features only, and incorrectly, onto the concatenated

    Is there a 'correct' way to load a multifasta record and its
    annotation into Artemis?  The Artemis user manual is rather opaque
    on this topic.

Artemis-users mailing list

Bruno Donzelli
Research Associate
Dept. of Plant Pathology and Plant-Microbe Biology, Cornell University
Robert W. Holley Center for Agriculture and Health
538 Tower Road, Cornell University
Ithaca, NY 14853
Phone: 607 255-2179
