Dear Ed,
I do not think there is any split opinion as far as users are concerned:
Programs that pay attention to insertion codes will also handle regular
pdb files correctly so nothing changes for the users. The advantage is
that in the odd case one loads a vintage pdb file with insertion codes,
one can run all calculations without having to renumber the file first.
For software developers, it is quite a different story, and here I
suspect that the split will be closer to 90/10 in favor of
not implementing insertion codes. At a time when everybody can download
pdb files and view the structures in 3D on their iPhones, there is no
longer a need for insertion codes. There is still the problem with the
existing literature, but I am sure one could make a pdf reader which
automatically renumbers the residues mentioned in the papers.
I fully agree that insertion codes are a mess and, hopefully, people
will find a way to get rid of them (or keep them in parallel to
consecutive numbers) once the cif format gets adopted for protein
coordinate files.
Cheers,
Herman
________________________________
From: Ed. Pozharski [mailto:[email protected]]
Sent: Thursday, February 21, 2013 1:46 PM
To: Schreuder, Herman R&D/DE; [email protected]
Subject: Re: [ccp4bb] Link problem with Refmac.
We might have just found a new recurring discussion - what to do
with insertion codes! I am sure the opinion split is close to 50/50.
Personally, I don't think insertion codes make sense in the
first place. Are catalytic triad residues always the same distance from
the N-terminus? No. The residue number designates its position in the
sequence, not its relationship to other like-minded enzymes. That is a
useful (as it defines peptide bonds) and consistent definition. Even
with antibodies, where these can be indeed interpreted as insertions,
the mess is enormous since only half the structures conform to Wu&Kabat
numbering (and there are two of those).
It is true, however, that sorting that pays attention to insertion
codes is just as easy to implement as residue matching based on
structural alignment.
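
As a rough illustration of the sorting half of that claim, here is a minimal Python sketch. ResidueId, sort_key and label are made-up names, not taken from any existing program, and the sketch assumes that insertion-coded residues follow the unlettered residue in alphabetical order (60, 60A, 60B, ...), which is typical but not guaranteed for every file.

```python
from typing import NamedTuple


class ResidueId(NamedTuple):
    """Hypothetical residue identifier: chain, number, insertion code."""
    chain: str
    seq: int
    icode: str = ""  # empty string when the residue has no insertion code


def sort_key(res: ResidueId) -> tuple:
    # Tuples compare element by element, so residues are ordered by chain,
    # then by number, then by insertion code. The empty string sorts before
    # any letter, so 60 comes before 60A and 60B.
    return (res.chain, res.seq, res.icode)


def label(res: ResidueId) -> str:
    # Human-readable label such as "A60B".
    return f"{res.chain}{res.seq}{res.icode}"


residues = [
    ResidueId("A", 60, "B"),
    ResidueId("A", 61),
    ResidueId("A", 60),
    ResidueId("A", 60, "A"),
    ResidueId("A", 59),
]
ordered = [label(r) for r in sorted(residues, key=sort_key)]
print(ordered)
```

The point of keeping the insertion code as a separate field is that the residue number stays an integer, so numeric comparisons keep working.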
Cheers,
Ed.
-------- Original message --------
From: [email protected]
Date:
To: [email protected]
Subject: Re: [ccp4bb] Link problem with Refmac.
I fully second this. The treatment of insertion codes in many programs
is a mess, leaving you with the option to renumber (and lose contact
with what is often a huge body of existing literature), or to stuff the pdb
with link and gap records (if recognized at all by the program used).
It would be a great help if the programmers would use a simple distance
criterion (e.g. N - C distance < 2.0 A) to decide whether amino acids
are linked, instead of forcing a link between residues which are more
than 10 A apart as in the current case.
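
A minimal sketch of such a distance criterion, in Python. The function name and the way the coordinates are passed in are assumptions made for illustration; this is not how Refmac or any other program actually decides. The coordinates are the C atom of THR 58 and the N atom of MET 72 from the input file quoted further down in this thread.

```python
import math

PEPTIDE_MAX_DIST = 2.0  # Angstrom; the cutoff suggested above


def is_peptide_linked(c_xyz, n_xyz, max_dist=PEPTIDE_MAX_DIST):
    """Return True if the C atom of one residue and the N atom of the
    next are close enough to be joined by a peptide bond."""
    return math.dist(c_xyz, n_xyz) < max_dist


# C of THR A 58 and N of MET A 72, taken from the quoted input coordinates.
c_thr58 = (15.573, 2.346, 18.631)
n_met72 = (13.605, -6.845, 13.378)

distance = math.dist(c_thr58, n_met72)
print(f"C(58) - N(72) distance: {distance:.3f} A")
print("linked" if is_peptide_linked(c_thr58, n_met72) else "not linked")
```

With these coordinates the distance comes out at about 10.77 A, matching the value in the Refmac warning, so a 2.0 A cutoff would leave the two residues unlinked.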
Cheers,
Herman
-----Original Message-----
From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Robbie Joosten
Sent: Monday, February 18, 2013 10:57 PM
To: [email protected]
Subject: Re: [ccp4bb] Link problem with Refmac.
Hi Ian,
I avoid renumbering whenever I can. If I do have to renumber things
(e.g. to get proper connectivity in PDB entry 2j8g), I do it by hand. So
no help there.
As for dealing with insertion codes in general, why not try to convince
the developers of the 'brain-damaged' programs to support insertion
codes? I've asked quite a few for this sort of update and many were very
helpful. The problem is that most developers discover the existence of
insertion codes after they set up a data structure for the coordinates.
Adding support afterwards can be quite a hassle. The more users ask for
such support, the more likely it will be implemented.
Cheers,
Robbie
> -----Original Message-----
> From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Ian Tickle
> Sent: Monday, February 18, 2013 19:40
> To: [email protected]
> Subject: Re: [ccp4bb] Link problem with Refmac.
>
> Hi Robbie
>
>
> OK I just realised what's going on. In my script I renumber the input PDB
> file (starting at 1 for each chain and incrementing by 1) and keep the
> mapping so I can renumber it back afterwards for human consumption. So
> you're completely correct: there is indeed a residue A59 after renumbering!
> This is to avoid headaches with brain-damaged programs that can't cope with
> insertion codes and residue numbers out of sequence. So I guess I'm
> going to have to be smarter in my renumbering program and make sure I
> maintain any increasing gaps in the numbering which indicate real gaps
> in the sequence and only renumber over insertions and decreasing gaps.
> It doesn't actually matter what the new numbers are since the user never
> sees them.
>
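
The smarter renumbering described above could look roughly like the Python sketch below. It is not Ian's actual script: renumber_chain and its input format (a list of (number, insertion code) pairs in file order for one chain) are assumptions for illustration, and it takes at face value that every increasing jump in the numbering is a real gap in the sequence.

```python
def renumber_chain(residues):
    """Map original (number, insertion_code) pairs to new numbers.

    Increasing gaps in the original numbering are preserved; insertions
    (same number, different code) and decreasing jumps just get the next
    consecutive number.
    """
    mapping = {}
    new_num = 0
    prev_seq = None
    for seq, icode in residues:
        if prev_seq is not None and seq - prev_seq > 1:
            new_num += seq - prev_seq  # keep the size of the gap
        else:
            new_num += 1  # consecutive, insertion, or decreasing jump
        mapping[(seq, icode)] = new_num
        prev_seq = seq
    return mapping


chain_a = [(57, ""), (58, ""), (72, ""), (73, ""), (73, "A"),
           (73, "B"), (74, ""), (10, "")]
forward = renumber_chain(chain_a)
# Inverse mapping, for renumbering back afterwards.
backward = {new: old for old, new in forward.items()}
print(forward)
```

In this example residues 58 and 72 become 2 and 16, so the break between them stays visible to a program that only looks at residue numbers, while 73, 73A and 73B become 17, 18 and 19.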
>
> But this must be a common problem: how do others handle this? E.g.
> pdbset blindly renumbers with an increment of 1 (and anyway it doesn't
> renumber any LINK, SSBOND & CISPEP records as I do) so it would have
> the same problem.
>
>
> Cheers
>
>
> -- Ian
>
>
>
> On 18 February 2013 17:09, Robbie Joosten <[email protected]> wrote:
>
>
> Hi Ian,
>
> The warning refers to a MET 59 in chain A whereas you only have MET 72.
> That is very suspicious. Non-sequential residues further apart than x
> Angstrom automatically get a gap record. Have you tried a newer version
> of Refmac, because this feature was added quite a while ago?
> What is your setting for 'MAKE CONN' when you run Refmac?
>
> Cheers,
> Robbie
>
>
>
>
> > -----Original Message-----
> > From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Ian Tickle
> > Sent: Monday, February 18, 2013 17:32
> > To: [email protected]
> > Subject: [ccp4bb] Link problem with Refmac.
> >
> >
> > All, I'm having a problem with Refmac (v. 5.7.0025) that I don't
> > understand. It's linking 2 residues that it shouldn't be. Here's the
> > relevant message in the log file:
> >
> > WARNING : large distance for conn:TRANS dist = 10.768
> > ch:AA res: 58 THR --> 59 MET ideal_dist= 1.329
> >
> > Note that there are no LINK (or LINKR) records in the PDB header.
> >
> >
> > Here are the input co-ords for the relevant residues (not linked):
> >
> > ATOM    887  N   THR A  58      13.587   1.365  19.814  1.00 14.28      A    N
> > ATOM    888  CA  THR A  58      14.743   1.126  18.960  1.00 17.64      A    C
> > ATOM    890  CB  THR A  58      14.325   0.613  17.567  1.00 17.69      A    C
> > ATOM    892  OG1 THR A  58      13.605   1.650  16.879  1.00 15.24      A    O
> > ATOM    894  CG2 THR A  58      13.505  -0.658  17.658  1.00 17.33      A    C
> > ATOM    898  C   THR A  58      15.573   2.346  18.631  1.00 22.80      A    C
> > ATOM    899  O   THR A  58      15.144   3.492  18.842  1.00 20.41      A    O
> > ATOM    956  N   MET A  72      13.605  -6.845  13.378  1.00 43.23      A    N
> > ATOM    957  CA  MET A  72      12.268  -6.980  12.733  1.00 39.06      A    C
> > ATOM    959  CB  MET A  72      12.308  -6.361  11.331  1.00 42.06      A    C
> > ATOM    962  CG  MET A  72      12.455  -4.846  11.320  1.00 43.45      A    C
> > ATOM    965  SD  MET A  72      13.020  -4.153   9.755  1.00 46.07      A    S
> > ATOM    966  CE  MET A  72      14.695  -4.789   9.653  1.00 49.84      A    C
> > ATOM    970  C   MET A  72      11.544  -8.344  12.624  1.00 36.94      A    C
> > ATOM    971  O   MET A  72      10.314  -8.353  12.558  1.00 34.24      A    O
> >
> >
> > Here are the same residues (linked) after refinement:
> >
> > ATOM    887  N   THR A  58      14.212   0.104  18.340  1.00 43.09      A    N
> > ATOM    888  CA  THR A  58      14.332  -1.166  17.541  1.00 45.12      A    C
> > ATOM    890  CB  THR A  58      12.906  -1.657  17.309  1.00 39.26      A    C
> > ATOM    892  OG1 THR A  58      12.400  -1.039  16.117  1.00 38.40      A    O
> > ATOM    894  CG2 THR A  58      12.010  -1.301  18.435  1.00 33.96      A    C
> > ATOM    898  C   THR A  58      14.805  -1.376  16.064  1.00 59.98      A    C
> > ATOM    899  O   THR A  58      15.304  -0.470  15.386  1.00 69.73      A    O
> > ATOM    901  N   MET A  72      14.609  -2.641  15.623  1.00 61.67      A    N
> > ATOM    902  CA  MET A  72      13.990  -2.997  14.308  1.00 60.32      A    C
> > ATOM    904  CB  MET A  72      14.898  -2.730  13.093  1.00 71.29      A    C
> > ATOM    907  CG  MET A  72      14.126  -2.345  11.812  1.00 73.22      A    C
> > ATOM    910  SD  MET A  72      12.912  -3.499  11.087  1.00 69.42      A    S
> > ATOM    911  CE  MET A  72      13.917  -4.503   9.996  1.00 63.68      A    C
> > ATOM    915  C   MET A  72      13.413  -4.438  14.205  1.00 59.57      A    C
> > ATOM    916  O   MET A  72      12.199  -4.599  14.130  1.00 60.33      A    O
> >
> >
> > Residues 59-71 are present but in a poorly defined loop so I definitely
> > do not want residues 58 & 72 linked! I'm puzzled because I'm sure it
> > never used to do this, i.e. you had to specify a LINK if you wanted one
> > and Refmac was smart enough to recognise that residues across a break
> > should not be linked.
> > So how do I tell it NOT to link them?
> >
> >
> > Cheers
> >
> >
> > -- Ian
> >
>
>