Hi Meindert,

The PDB will let you do what you want, and as a result there are a few PDB
entries with crazy residue numbering. I would use insertion codes only for
real insertions or engineered linkers. As Nat said, they are a nightmare
for programmers, which is why many programs support them poorly. So go with
Mitch's suggestion and offset the residue numbers of the second protein by
some value that makes it clear that its residues do not come from another
part of the first protein. You can add a comment in REMARK 999 if you want
to provide extra explanation.
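
If you want to script the offset, here is a minimal sketch in plain Python. It assumes the second protein currently sits in its own chain (a hypothetical chain "B") and that an offset of 1000 is acceptable; both are illustrative choices, not a convention. The function name is made up, and it only touches the fixed-column residue number field (columns 23-26) of ATOM, HETATM and ANISOU records, so anything else that refers to residue numbers (LINK, SSBOND, TLS definitions in REMARK 3, etc.) would still need to be updated by hand.

```python
def offset_residue_numbers(pdb_lines, chain_id, offset):
    """Return a copy of pdb_lines with the residue numbers of one chain shifted.

    Hypothetical helper: only ATOM/HETATM/ANISOU records are changed.
    """
    shifted = []
    for line in pdb_lines:
        is_coordinate_record = line.startswith(("ATOM  ", "HETATM", "ANISOU"))
        if is_coordinate_record and len(line) >= 26 and line[21] == chain_id:
            new_number = int(line[22:26]) + offset
            # The resSeq field is only four characters wide.
            if not -999 <= new_number <= 9999:
                raise ValueError(f"residue number {new_number} does not fit in 4 columns")
            line = line[:22] + f"{new_number:4d}" + line[26:]
        shifted.append(line)
    return shifted


# Example: protein A keeps 200-300, protein B (170-350) becomes 1170-1350.
example = [
    "ATOM   2743  CA  THR A 300      -9.899   6.476  21.720  1.00 27.53           C",
    "ATOM   2750  CA  VAL B 170      -6.589   4.599  21.939  1.00 32.82           C",
]
for record in offset_residue_numbers(example, "B", 1000):
    print(record)
```

With an offset of 1000 the original number of the second protein can still be read off directly (1170 is B170), which is the point of choosing a round value.
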
According to the PDB standard you do not need a LINK: the connectivity of
residues is implied by the order in which they appear in the SEQRES records.
That said, programs may do quite different things here. FYI, many programs
assume that residue numbering is unidirectional, i.e. always increasing (or,
in some double-stranded DNA molecules in the PDB, always decreasing). So
avoid things like going from residue 299 to 300 to 170 to 171. This can
cause big problems, for instance when you define a TLS group from residue
200 to residue 173.
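
A quick way to check a file for this before deposition or refinement is sketched below. This is just an illustration under my own assumptions, not what any particular program does: the function name is hypothetical, it looks only at ATOM records, and it takes the first change in residue number within a chain as that chain's direction, then reports every later step that goes the other way.

```python
def find_direction_changes(pdb_lines):
    """List (chain, previous_number, number) where numbering changes direction.

    Hypothetical helper: HETATM records (waters, ligands) are ignored.
    """
    last_number = {}  # chain ID -> most recent residue number
    direction = {}    # chain ID -> +1 (increasing) or -1 (decreasing)
    changes = []
    for line in pdb_lines:
        if not line.startswith("ATOM  ") or len(line) < 26:
            continue
        chain = line[21]
        number = int(line[22:26])
        previous = last_number.get(chain)
        # Same number means same residue (or an insertion code): no step.
        if previous is not None and number != previous:
            step = 1 if number > previous else -1
            if chain not in direction:
                direction[chain] = step
            elif step != direction[chain]:
                changes.append((chain, previous, number))
        last_number[chain] = number
    return changes


# Example with the numbering to avoid: 299, 300, 170, 171 in one chain.
records = [f"ATOM  {i:5d}  CA  GLY A{n:4d}" for i, n in enumerate([299, 300, 170, 171], 1)]
print(find_direction_changes(records))
```

For the 299, 300, 170, 171 case this reports the jump from 300 to 170; a chain that runs consistently downwards, like the DNA entries mentioned above, reports nothing.
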

Cheers,
Robbie


> -----Original Message-----
> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
> Nat Echols
> Sent: Tuesday, October 23, 2012 19:01
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Convention on residue numbering of fusion proteins?
> 
> On Tue, Oct 23, 2012 at 9:55 AM, Meindert Lamers
> <mlamers@mrc-lmb.cam.ac.uk> wrote:
> > Is there any convention on the numbering of residues in a fusion
> > protein?
> >
> > I have a structure of two domains fused together but would like to
> > keep the biological numbering intact.
> > 1st domain: residue 200-300 (protein A).
> > 2nd domain: residue 170-350 (protein B).
> > The fusion is between A300 and B170
> >
> > Is it OK to label them chain A and B and create a LINK between the two
> > (thus keeping the biological residue number intact).
> > Or do I have to start the 2nd domain with residue number 301 (and
> > lose all biological information).
> 
> You could use the insertion code: the first domain could be residues
> 200A - 300A, the second domain would be residues 170B - 350B, e.g.
> 
> ATOM   2743  CA  THR A 300A     -9.899   6.476  21.720  1.00 27.53           C
> ATOM   2750  CA  VAL A 170B     -6.589   4.599  21.939  1.00 32.82           C
> 
> but the chain ID stays the same, with no BREAK or TER record (and no LINK
> required).  The insertion code can be a pain to deal with from a
> programmer's perspective, and it makes it more difficult to specify
> residue ranges, but I think this is exactly what it's supposed to be
> used for.
> 
> -Nat
