Hi Peter,

Thanks for the info. I'd better go check whether my code assumes insertion 
codes are not digits.
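
As a note to self, here is a minimal Python sketch of the fixed-column approach. It assumes standard ATOM/HETATM records, with the residue number in columns 23-26 and the insertion code in column 27. The function name and the made-up atom prefix are only for illustration, not taken from any existing program:

```python
def split_residue_id(atom_line):
    """Split the residue number and insertion code of a PDB ATOM/HETATM line.

    The two fields are read by column position, never by guessing where
    the digits stop, so an insertion code that is itself a digit is safe.
    """
    res_seq = int(atom_line[22:26])      # columns 23-26: residue number
    ins_code = atom_line[26:27].strip()  # column 27: insertion code ('' if blank)
    return res_seq, ins_code


# Columns 1-22 of a made-up CA atom in chain A (illustrative only).
prefix = "ATOM      1  CA  ALA A"

# The four residues from the example quoted below, as they sit in columns 23-27.
for field in ("  85 ", "  851", "  852", "  86 "):
    print(split_residue_id(prefix + field))
# (85, '')
# (85, '1')
# (85, '2')
# (86, '')
```

Anything that first joins the two fields into one string such as "851" and tries to split it again afterwards has already lost the information.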

Cheers,
Robbie 

> Date: Wed, 5 Dec 2012 17:57:58 +0000
> From: pkel...@globalphasing.com
> Subject: Re: [ccp4bb] thanks god for pdbset
> To: CCP4BB@JISCMAIL.AC.UK
> 
> Hi Robbie,
> 
> On Wed, 2012-12-05 at 17:02 +0100, Robbie Joosten wrote:
> > Hi Ian,
> > 
> > It's easy to forget about LINK records and such when dealing with the
> > coordinates (I recently had to fix a bug in my own code for that). 
> > The problem with insertion codes is that they are very poorly defined in the
> > PDB standard. Does 128A come before or after 128? There is no strict rule
> > for that; instead they are used in order of appearance. This makes it hard
> > for programmers to stick to agreed standards, so people tend to ignore
> > insertion codes altogether. They are really poorly supported by many
> > programs. Perhaps switching to mmCIF gets rid of the problem.
> 
> Properly used, the PDB exchange dictionary for mmCIF can indeed sort
> this out. In addition to the PDB-style residue number + insertion code,
> it has an item for the residue sequence number in the chain (running
> from 1 .. n). The relevant item names are:
> 
>   _atom_site.pdbx_PDB_residue_no
>   _atom_site.pdbx_PDB_ins_code
> 
> and:
>   _entity_poly_seq.num
> 
> One thing to be careful of is the case where the insertion code is a digit
> (which does happen sometimes). I have seen code many times where an
> assumption is made that the insertion code is not a digit, and this
> assumption is used to separate the residue number from the insertion
> code (e.g. a user is asked to enter a residue number + insertion code as
> a single item). If the insertion code is a digit, this won't work.
> 
> This is easy to handle in the fixed-width PDB format:
> 
>    85
>    851
>    852
>    86
> 
> but if it gets written to mmCIF incorrectly as:
> 
> loop_
> _atom_site.pdbx_PDB_residue_no
> _atom_site.pdbx_PDB_ins_code
>    85  .
>    851 .
>    852 .
>    86  .
> 
> instead of the correct:
> 
> loop_
> _atom_site.pdbx_PDB_residue_no
> _atom_site.pdbx_PDB_ins_code
>    85  .
>    85  1
>    85  2
>    86  .
> 
> it can be really hard to sort out later on.
> 
> Regards,
> Peter.
> 
> -- 
> Peter Keller                                     Tel.: +44 (0)1223 353033
> Global Phasing Ltd.,                             Fax.: +44 (0)1223 366889
> Sheraton House,
> Castle Park,
> Cambridge CB3 0AX
> United Kingdom
                                          
