Currently, if you fetch from Uniprot, PDB or EMBL, a compound sequence name is made, e.g.

UNIPROT|accession|accession|accession|...|name|name|...

PDB|pdbId|name|chain|id (?)

EMBL|accession


but if fetching from Pfam, Rfam or Ensembl the sequence name is just the 
accession id.


Is there a rationale to this?

I would like to know since SequenceIdMatcher depends on it.

It has to know to look for a sequence called "UNIPROT|P1560" to resolve a 
UNIPROT database reference, but not to include the source database if resolving 
an ENSEMBL reference, which seems ad hoc.
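
To make the asymmetry concrete, here is a rough sketch of the kind of special-casing I mean. It is not the actual SequenceIdMatcher code: the class SeqNameSketch, the method lookupKey and the PREFIXED set are all made up for illustration, the list of prefixed sources is only what I have observed above, and the Ensembl accession is just an example id.

```java
import java.util.Locale;
import java.util.Set;

public class SeqNameSketch {
    // Assumption: sources whose fetched sequence names start with "SOURCE|",
    // based only on the examples in this message, not on Jalview's code.
    private static final Set<String> PREFIXED = Set.of("UNIPROT", "PDB", "EMBL");

    /**
     * Returns the string to look for at the start of a sequence name when
     * resolving a database reference (hypothetical helper).
     */
    static String lookupKey(String source, String accession) {
        String src = source.toUpperCase(Locale.ROOT);
        // Prefixed sources need "SOURCE|accession"; the rest use the bare accession.
        return PREFIXED.contains(src) ? src + "|" + accession : accession;
    }

    public static void main(String[] args) {
        System.out.println(lookupKey("UNIPROT", "P1560"));
        System.out.println(lookupKey("ENSEMBL", "ENSG00000139618"));
    }
}
```

The point is that the caller needs a hard-coded list of which sources put the database name into the sequence name, and that list is exactly what seems ad hoc.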


Or does this problem go away when 'primary db reference' (JAL-2106) is, well, 
resolved? I guess that would remove the overloading of the sequence name with 
this information.


Any thoughts?


thanks


Mungo Carstairs
Jalview Computational Scientist
The Barton Group
Division of Computational Biology
School of Life Sciences
University of Dundee, Dundee, Scotland, UK.
http://www.jalview.org/
http://www.compbio.dundee.ac.uk/

The University of Dundee is a registered Scottish Charity, No: SC015096
