We might have just found a new recurring discussion - what to do with insertion codes! I am sure the opinion split is close to 50/50.
Personally, I don't think insertion codes make sense in the first place. Are catalytic triad residues always the same distance from the N-terminus? No. The residue number designates its position in the sequence, not its relationship to other like-minded enzymes. That is a useful (as it defines peptide bonds) and consistent definition. Even with antibodies, where these can indeed be interpreted as insertions, the mess is enormous since only half the structures conform to Wu & Kabat numbering (and there are two of those). It is true, however, that it's just as easy to implement sorting that pays attention to insertion codes as it is to implement residue matching based on structural alignment.

Cheers,
Ed.

-------- Original message --------
From: [email protected]
Date:
To: [email protected]
Subject: Re: [ccp4bb] Link problem with Refmac.

I fully second this. The treatment of insertion codes by many programs is a mess, leaving you with the option to renumber (and lose contact with an often huge body of existing literature), or to stuff the pdb with link and gap records (if recognized at all by the program used). It would be a great help if the programmers would use a simple distance criterion (e.g. N - C distance < 2.0 A) to decide whether amino acids are linked, instead of forcing a link between residues which are more than 10 A apart as in the current case.

Cheers,
Herman

-----Original Message-----
From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Robbie Joosten
Sent: Monday, February 18, 2013 10:57 PM
To: [email protected]
Subject: Re: [ccp4bb] Link problem with Refmac.

Hi Ian,

I avoid renumbering whenever I can. If I do have to renumber things (e.g. to get proper connectivity in PDB entry 2j8g), I do it by hand. So no help there.

As for dealing with insertion codes in general, why not try to convince the developers of the 'brain-damaged' programs to support insertion codes? I've asked quite a few for this sort of update and many were very helpful.
The problem is that most developers discover the existence of insertion codes after they set up a data structure for the coordinates. Adding support afterwards can be quite a hassle. The more users ask for such support, the more likely it will be implemented.

Cheers,
Robbie

> -----Original Message-----
> From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Ian Tickle
> Sent: Monday, February 18, 2013 19:40
> To: [email protected]
> Subject: Re: [ccp4bb] Link problem with Refmac.
>
> Hi Robbie
>
> OK I just realised what's going on. In my script I renumber the input
> PDB file (starting at 1 for each chain and incrementing by 1) and keep
> the mapping so I can renumber it back afterwards for human consumption.
> So you're completely correct: there is indeed a residue A59 after
> renumbering! This is to avoid headaches with brain-damaged programs
> that can't cope with insertion codes and residue numbers out of
> sequence. So I guess I'm going to have to be smarter in my renumbering
> program and make sure I maintain any increasing gaps in the numbering
> which indicate real gaps in the sequence and only renumber over
> insertions and decreasing gaps. It doesn't actually matter what the new
> numbers are since the user never sees them.
>
> But this must be a common problem: how do others handle this? E.g.
> pdbset blindly renumbers with an increment of 1 (and anyway it doesn't
> renumber any LINK, SSBOND & CISPEP records as I do) so it would have
> the same problem.
>
> Cheers
>
> -- Ian
>
> On 18 February 2013 17:09, Robbie Joosten <[email protected]> wrote:
>
> Hi Ian,
>
> The warning refers to a MET 59 in chain A whereas you only have MET 72.
> That is very suspicious. Non-sequential residues further apart than x
> Angstrom automatically get a gap record. Have you tried a newer version
> of Refmac, because this feature was added quite a while ago?
> What is your setting for 'MAKE CONN' when you run Refmac?
>
> Cheers,
> Robbie
>
> > -----Original Message-----
> > From: CCP4 bulletin board [mailto:[email protected]] On Behalf Of Ian Tickle
> > Sent: Monday, February 18, 2013 17:32
> > To: [email protected]
> > Subject: [ccp4bb] Link problem with Refmac.
> >
> > All, I'm having a problem with Refmac (v. 5.7.0025) that I don't
> > understand. It's linking 2 residues that it shouldn't be. Here's the
> > relevant message in the log file:
> >
> > WARNING : large distance for conn:TRANS dist = 10.768
> > ch:AA res: 58 THR --> 59 MET ideal_dist= 1.329
> >
> > Note that there are no LINK (or LINKR) records in the PDB header.
> >
> > Here are the input co-ords for the relevant residues (not linked):
> >
> > ATOM    887  N   THR A  58      13.587   1.365  19.814  1.00 14.28      A    N
> > ATOM    888  CA  THR A  58      14.743   1.126  18.960  1.00 17.64      A    C
> > ATOM    890  CB  THR A  58      14.325   0.613  17.567  1.00 17.69      A    C
> > ATOM    892  OG1 THR A  58      13.605   1.650  16.879  1.00 15.24      A    O
> > ATOM    894  CG2 THR A  58      13.505  -0.658  17.658  1.00 17.33      A    C
> > ATOM    898  C   THR A  58      15.573   2.346  18.631  1.00 22.80      A    C
> > ATOM    899  O   THR A  58      15.144   3.492  18.842  1.00 20.41      A    O
> > ATOM    956  N   MET A  72      13.605  -6.845  13.378  1.00 43.23      A    N
> > ATOM    957  CA  MET A  72      12.268  -6.980  12.733  1.00 39.06      A    C
> > ATOM    959  CB  MET A  72      12.308  -6.361  11.331  1.00 42.06      A    C
> > ATOM    962  CG  MET A  72      12.455  -4.846  11.320  1.00 43.45      A    C
> > ATOM    965  SD  MET A  72      13.020  -4.153   9.755  1.00 46.07      A    S
> > ATOM    966  CE  MET A  72      14.695  -4.789   9.653  1.00 49.84      A    C
> > ATOM    970  C   MET A  72      11.544  -8.344  12.624  1.00 36.94      A    C
> > ATOM    971  O   MET A  72      10.314  -8.353  12.558  1.00 34.24      A    O
> >
> > Here are the same residues (linked) after refinement:
> >
> > ATOM    887  N   THR A  58      14.212   0.104  18.340  1.00 43.09      A    N
> > ATOM    888  CA  THR A  58      14.332  -1.166  17.541  1.00 45.12      A    C
> > ATOM    890  CB  THR A  58      12.906  -1.657  17.309  1.00 39.26      A    C
> > ATOM    892  OG1 THR A  58      12.400  -1.039  16.117  1.00 38.40      A    O
> > ATOM    894  CG2 THR A  58      12.010  -1.301  18.435  1.00 33.96      A    C
> > ATOM    898  C   THR A  58      14.805  -1.376  16.064  1.00 59.98      A    C
> > ATOM    899  O   THR A  58      15.304  -0.470  15.386  1.00 69.73      A    O
> > ATOM    901  N   MET A  72      14.609  -2.641  15.623  1.00 61.67      A    N
> > ATOM    902  CA  MET A  72      13.990  -2.997  14.308  1.00 60.32      A    C
> > ATOM    904  CB  MET A  72      14.898  -2.730  13.093  1.00 71.29      A    C
> > ATOM    907  CG  MET A  72      14.126  -2.345  11.812  1.00 73.22      A    C
> > ATOM    910  SD  MET A  72      12.912  -3.499  11.087  1.00 69.42      A    S
> > ATOM    911  CE  MET A  72      13.917  -4.503   9.996  1.00 63.68      A    C
> > ATOM    915  C   MET A  72      13.413  -4.438  14.205  1.00 59.57      A    C
> > ATOM    916  O   MET A  72      12.199  -4.599  14.130  1.00 60.33      A    O
> >
> > Residues 59-71 are present but in a poorly defined loop so I
> > definitely do not want residues 58 & 72 linked! I'm puzzled because
> > I'm sure it never used to do this, i.e. you had to specify a LINK if
> > you wanted one and Refmac was smart enough to recognise that residues
> > across a break should not be linked. So how do I tell it NOT to link
> > them?
> >
> > Cheers
> >
> > -- Ian
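For reference, the distance criterion Herman suggests in this thread (peptide C to N distance below 2.0 A) can be sketched in a few lines of Python. This is only an illustration and says nothing about how Refmac itself decides connectivity. The function name is_peptide_linked and the constant names are made up for this sketch; the coordinates are the C of THR 58 and the N of MET 72 copied from the ATOM records quoted in Ian's original message.

```python
import math

# Backbone atoms copied from the ATOM records quoted in this thread.
C_THR58_INPUT = (15.573, 2.346, 18.631)    # C of THR A 58, input model
N_MET72_INPUT = (13.605, -6.845, 13.378)   # N of MET A 72, input model
C_THR58_REFINED = (14.805, -1.376, 16.064) # C of THR A 58, after refinement
N_MET72_REFINED = (14.609, -2.641, 15.623) # N of MET A 72, after refinement

# Cutoff in Angstrom, taken from Herman's suggestion (an assumption, not a Refmac default).
MAX_PEPTIDE_BOND = 2.0


def is_peptide_linked(c_xyz, n_xyz, cutoff=MAX_PEPTIDE_BOND):
    """Return True if the C...N distance is short enough to be a peptide bond."""
    return math.dist(c_xyz, n_xyz) < cutoff


if __name__ == "__main__":
    for label, c_xyz, n_xyz in (
        ("input", C_THR58_INPUT, N_MET72_INPUT),
        ("refined", C_THR58_REFINED, N_MET72_REFINED),
    ):
        d = math.dist(c_xyz, n_xyz)
        print(f"{label}: C58-N72 = {d:.3f} A, linked = {is_peptide_linked(c_xyz, n_xyz)}")
```

On the input coordinates this gives the same 10.768 A that the Refmac warning reports, so a check of this kind would have left residues 58 and 72 unlinked. On the refined coordinates the distance has collapsed to a bonded value, which is the forced link Ian is complaining about.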

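The smarter renumbering Ian describes (maintain increasing gaps, count straight through insertions and decreasing gaps, and keep the mapping so the original numbers can be restored) might look roughly like this. It is a sketch under assumptions, not Ian's script: the function name renumber_chain and the example residue list are hypothetical, residues are given as (number, insertion code) pairs in file order for a single chain, and applying the same mapping to LINK, SSBOND and CISPEP records is not shown.

```python
def renumber_chain(residues):
    """Renumber one chain from 1, preserving increasing gaps in the numbering.

    residues: list of (resseq, icode) tuples in file order.
    Returns (new_numbers, back_map), where back_map maps each new number
    to the original (resseq, icode) so the renumbering can be undone.
    """
    new_numbers = []
    previous = None
    current = 0
    for resseq, _icode in residues:
        if previous is None:
            current = 1
        else:
            step = resseq - previous
            # step > 1: increasing gap, treated as a real break and kept as is.
            # step <= 1: ordinary successor, insertion code (step == 0) or
            # decreasing numbering (step < 0); just advance by one.
            current += step if step > 1 else 1
        new_numbers.append(current)
        previous = resseq
    back_map = dict(zip(new_numbers, residues))
    return new_numbers, back_map


# Invented example: a break between 58 and 72, and an insertion 73A.
EXAMPLE = [(57, ""), (58, ""), (72, ""), (73, ""), (73, "A"), (74, "")]
NEW_NUMBERS, BACK_MAP = renumber_chain(EXAMPLE)

if __name__ == "__main__":
    for new in NEW_NUMBERS:
        old_seq, old_icode = BACK_MAP[new]
        print(f"{old_seq}{old_icode} -> {new}")
```

With this scheme the break between 58 and 72 survives as a jump in the new numbers, so a program that links sequentially numbered residues no longer sees a residue 59 directly after 58, while the insertion 73A simply gets the next integer.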