Re: [ccp4bb] Insertion codes

Robbie Joosten Wed, 04 May 2011 00:01:53 -0700

Hi Ed,

> Personally I don't care one way or the other, but it may be pointed out
> that if D25 is actually number 37 in a homologous protein, it should be
> D37. Just as acknowledgement of the (somewhat purist) point of view
> that the residue number should denote its linear distance from the
> N-terminus.
> 
But which N-terminus should we use? The N-terminus of the protein, the one of 
the construct, or the N-terminus of what is ordered in the PDB file? And what 
about deletions? Isn't it useful to have gaps in the residue numbering to 
indicate a deletion?
 
Getting proper residue numbering is difficult and there will always be 
exceptions. Dealing with all the different possible schemes is a nightmare. 
That is why residue numbering is always one of the first topics in structural 
bioinformatics. The PDB now seems to follow the numbering from UniProt, which 
makes things a lot clearer, but fusion proteins now lead to crazy jumps in the 
residue numbering, resulting in chains with numbers going from 100 to 1200 and 
back to 300. 
For many well-studied groups of proteins, insertion codes help the biological 
interpretation of the structures. Unfortunately, insertion codes are 
surprisingly poorly supported by software that uses PDB files, especially 
outside crystallography (but even CCP4 software has some remaining problems). I 
hope this thread will at least increase awareness of the existence of insertion 
codes. It is very much needed...
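
For anyone who has not met insertion codes before, here is a minimal Python 
sketch of what handling them means in practice. It assumes fixed-column PDB 
format ATOM/HETATM records, where column 22 holds the chain ID, columns 23-26 
the residue number and column 27 the insertion code. The function names and 
the atom_record helper are made up for this illustration and do not come from 
CCP4 or any other package. The point is only that a residue is identified by 
(chain, number, insertion code), not by the number alone.

```python
def residue_id(record):
    """Return (chain, number, insertion code) for an ATOM/HETATM record."""
    chain = record[21]             # column 22: chain identifier
    number = int(record[22:26])    # columns 23-26: residue sequence number
    icode = record[26].strip()     # column 27: insertion code, "" if blank
    return chain, number, icode


def residues_in_order(lines):
    """List distinct residue identifiers in the order they appear."""
    found = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            rid = residue_id(line)
            # A new residue starts whenever any part of the identifier changes.
            if not found or found[-1] != rid:
                found.append(rid)
    return found


def atom_record(serial, chain, number, icode=" "):
    # Hypothetical helper: builds a truncated CA record for the demo only.
    return f"ATOM  {serial:5d}  CA  ASP {chain}{number:4d}{icode}"


demo = [
    atom_record(1, "A", 52),
    atom_record(2, "A", 52, "A"),
    atom_record(3, "A", 53),
]
print(residues_in_order(demo))
# → [('A', 52, ''), ('A', 52, 'A'), ('A', 53, '')]
```

Software that keys residues on the number alone would merge 52 and 52A into a 
single residue here.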
 
Cheers,
Robbie


> 
> Cheers,
> 
> Ed.
> 
> -- 
> "Hurry up before we all come back to our senses!"
> Julian, King of Lemurs
