[ccp4bb] Does any software use non-TRIPOS sections in mol2 files
Is it safe to assume that the section headers in mol2 files are all @<TRIPOS>something, or is the initial @ all that is guaranteed? http://tripos.com/data/support/mol2.pdf has everything as TRIPOS, but that document defines the 'Tripos Mol2 File Format', and I don't know whether someone else has defined a different class of records for their own program's use.

Tom
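One way to check a particular collection of files empirically is to list every line that begins with '@' and see which ones fall outside the @<TRIPOS> namespace. The following is only a sketch under that assumption; the sample text and the `@<VENDOR>EXTRA` header are invented for illustration and do not come from any real program.

```python
from collections import Counter


def section_headers(mol2_text):
    """Count each distinct '@...' header token found at the start of a line."""
    counts = Counter()
    for line in mol2_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@"):
            # Keep only the header token itself, ignoring anything after it.
            counts[stripped.split()[0]] += 1
    return counts


def non_tripos_headers(mol2_text):
    """Return the headers that do not use the @<TRIPOS> prefix."""
    return sorted(
        header
        for header in section_headers(mol2_text)
        if not header.startswith("@<TRIPOS>")
    )


# Hypothetical sample: '@<VENDOR>EXTRA' is made up for this example.
SAMPLE = """\
@<TRIPOS>MOLECULE
example
1 0 1
SMALL
NO_CHARGES

@<TRIPOS>ATOM
1 O1 0.0 0.0 0.0 O.3 1 LIG 0.0
@<VENDOR>EXTRA
some program-specific record
"""

if __name__ == "__main__":
    print(non_tripos_headers(SAMPLE))
```

Running this over a directory of mol2 files from different programs would show whether non-TRIPOS sections occur in practice, which is a separate question from what the format guarantees.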
Re: [ccp4bb] Announcing a Web Server for the Grade ligand restraints generator.
On 20 Mar 2012, at 12:34, Eleanor Dodson wrote:

> I would like to use this to check an existing ligand. I have the PDB refined according to a cif file, and that cif file used for input to REFMAC and phenix. I don't want to lose the atom names assigned there, so is it possible to start GRADE with one of those inputs, or do I have to convert it to a MOL2 file (I guess that is a SYBYL file?)

You will have to convert it to a mol2 file; we have had very acceptable results doing this using openbabel, simply

    obabel ligand.pdb -Oligand.mol2

at least in the case where the ligand has all its hydrogen atoms present and named. If the ligand doesn't have hydrogen atoms, you will have to use

    obabel ligand_noH.pdb -h -Oligand_H.mol2

then edit ligand_H.mol2 so that all the hydrogen atoms have different names (I appreciate this is tedious; it will be automatic in the next version), then use that as input.

Tom
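The manual renaming step could be scripted along the following lines. This is a rough sketch rather than anything Grade itself does: it assumes the mol2 ATOM records are whitespace-separated with the atom name in the second field and the atom type in the sixth, and it treats any type of `H` or `H.something` as hydrogen. The function name and the H1, H2, ... naming scheme are choices made for this example.

```python
def is_hydrogen(atom_type):
    """True for SYBYL-style hydrogen types such as 'H' or 'H.spc'."""
    return atom_type.split(".")[0] == "H"


def _atom_section_flags(lines):
    """Yield (line, in_atom_section) pairs for every line of the file."""
    in_atoms = False
    for line in lines:
        if line.startswith("@"):
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            yield line, False
        else:
            yield line, in_atoms


def rename_hydrogens(mol2_text):
    """Give every hydrogen in the ATOM section a unique name H1, H2, ..."""
    lines = mol2_text.splitlines()

    # Pass 1: remember the names of the non-hydrogen atoms so we never reuse one.
    used = set()
    for line, in_atoms in _atom_section_flags(lines):
        fields = line.split()
        if in_atoms and len(fields) >= 6 and not is_hydrogen(fields[5]):
            used.add(fields[1])

    # Pass 2: rewrite hydrogen names; all other lines pass through unchanged.
    out = []
    counter = 0
    for line, in_atoms in _atom_section_flags(lines):
        fields = line.split()
        if in_atoms and len(fields) >= 6 and is_hydrogen(fields[5]):
            counter += 1
            while f"H{counter}" in used:
                counter += 1
            fields[1] = f"H{counter}"
            used.add(fields[1])
            line = " ".join(fields)  # column alignment is not preserved
        out.append(line)
    return "\n".join(out) + "\n"


# Invented three-atom fragment, only to exercise the function.
SAMPLE = """\
@<TRIPOS>MOLECULE
LIG
 3 2 1
SMALL
NO_CHARGES

@<TRIPOS>ATOM
 1 C1 0.0 0.0 0.0 C.3 1 LIG 0.0
 2 H 1.0 0.0 0.0 H 1 LIG 0.0
 3 H 0.0 1.0 0.0 H 1 LIG 0.0
@<TRIPOS>BOND
 1 1 2 1
 2 1 3 1
"""

if __name__ == "__main__":
    print(rename_hydrogens(SAMPLE))
```

Bond records refer to atoms by number rather than by name, so renaming should not disturb connectivity; it is still worth inspecting the output before using it as input.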
Re: [ccp4bb] sudden drop in R/Rfree
On 2 Mar 2012, at 16:02, Regina Kettering wrote:

> Rajesh; I am not sure that you have a high enough data:refinement parameters ratio to refine TLS. It just adds more parameters to refine that can lead to over-refinement of your model, especially at 3.3 A.

TLS only adds twenty parameters per chain, so it's a really parsimonious thing to do at low resolution. I'd say that adding lots of waters at 3.3A (at four parameters per added water) was much more likely to be the cause of a very wide R/Rfree gap.

I'm a bit worried that a user working at low resolution on a protein with more than one chain per ASU is not using NCS from the very beginning; that's another good way of adding more restraints and effectively getting the parameters-to-data ratio down (because the 'parameters' in that ratio is really 'parameters minus K * number of restraints'; there is scope for a lot of debate as to the right value of K, which clearly depends on the strength of the restraints).

If he's using the Global Phasing refinement software, I would strongly suggest that Rajesh use targeting to the initial molecular replacement result throughout the refinement, as yet a third way of adding more restraints.

Tom Womack (Global Phasing)

> HTH, Regina
>
> From: Rajesh kumar ccp4...@hotmail.com
> To: CCP4BB@JISCMAIL.AC.UK
> Sent: Friday, March 2, 2012 10:54 AM
> Subject: [ccp4bb] sudden drop in R/Rfree
>
> Dear All, I have a 3.3 A data for a protein whose SG is P6522. Model used was wild type structure of same protein at 2.3 A. After molecular replacement, first three rounds of refinement the R/Rf was 26/32.8, 27.1/31.72 % and 7.35/30.88 % respectively. In the fourth round I refined with TLS and NCS and added water and the R/Rf dropped to 19.34/26.46. It has almost 7% difference. I also see lot of unanswerable density in the map where lot of waters were placed. Model fits to the map like a low resolution data with most of side chains don't have best density.
>
> I was not expecting such a sudden drop in the R/Rfree and a difference is 7.2%. I am wondering if I am in right direction. I am not sure if this usual for 3.3A data or in general any data if we consider the difference. I appreciate your valuable suggestions.
>
> Thanks
> Raj
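The bookkeeping in the reply above can be put into numbers with a short sketch. The per-item counts (twenty parameters per TLS group, four per water) come from the message; the chain count, water count, restraint count and the value of K below are purely hypothetical inputs chosen to show the arithmetic, not values from Rajesh's structure.

```python
TLS_PARAMS_PER_GROUP = 20  # from the message: twenty parameters per chain
PARAMS_PER_WATER = 4       # from the message: four parameters per added water


def tls_parameters(n_tls_groups):
    """Parameters added by refining one TLS group per chain."""
    return TLS_PARAMS_PER_GROUP * n_tls_groups


def water_parameters(n_waters):
    """Parameters added by placing waters."""
    return PARAMS_PER_WATER * n_waters


def effective_parameters(n_parameters, n_restraints, k):
    """'parameters minus K * number of restraints', as described above."""
    return n_parameters - k * n_restraints


if __name__ == "__main__":
    # Hypothetical example: 2 chains, 150 waters.
    print("TLS adds   ", tls_parameters(2))      # 40
    print("waters add ", water_parameters(150))  # 600
    # Hypothetical: 10000 parameters, 8000 restraints, K = 0.5.
    print("effective  ", effective_parameters(10000, 8000, 0.5))  # 6000.0
```

With these made-up numbers the waters add fifteen times as many parameters as TLS does, which is the comparison the reply is drawing.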
Re: [ccp4bb] Sub-angstrom resolution
On 11 Jan 2012, at 02:13, Artem Evdokimov wrote:

> There are two sides to this question: the scientific one is actually easier to answer in generic terms - but I also would like to point out the very recent example of a mystery that required very high resolution (and orthogonal techniques) to answer, namely the puzzle of the light atom in the center of the MoFe nitrogenase protein. Highly recommended reading.

That does sound interesting: could you give a reference? I can find various papers about small slices of the puzzle, but not a review article.

Tom
Re: [ccp4bb] Sub-angstrom resolution
On 11 Jan 2012, at 11:36, Thomas Womack wrote:

> On 11 Jan 2012, at 02:13, Artem Evdokimov wrote:
>
> > There are two sides to this question: the scientific one is actually easier to answer in generic terms - but I also would like to point out the very recent example of a mystery that required very high resolution (and orthogonal techniques) to answer, namely the puzzle of the light atom in the center of the MoFe nitrogenase protein. Highly recommended reading.
>
> That does sound interesting: could you give a reference? I can find various papers about small slices of the puzzle, but not a review article.

http://www.sciencemag.org/content/334/6058/974.full is the work using orthogonal techniques to figure out what the light atom actually was, with a discussion at http://www.sciencemag.org/content/334/6058/914.full

The high-resolution structure that revealed that there was a light atom there is from 2002: http://www.sciencemag.org/content/297/5587/1696.full with discussion at http://www.sciencemag.org/content/297/5587/1654.full

Tom
[ccp4bb] Making fixes to cif2mtz easier
The patch in https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=CCP4BB;325e1870.1112 solves the problem with 3u57.

A few days ago I read the interesting paper http://journals.iucr.org/d/issues/2011/01/00/dz5216/dz5216bdy.html referring to a crystal structure containing a diselenide bond; I downloaded the model and structure factors for 2xsk, and cif2mtz refused to convert them. This turns out to be because the Bijvoet pairs measured in the header were described as

    _refln.pdbx_F_meas_plus
    _refln.pdbx_F_meas_plus_sigma
    _refln.pdbx_F_meas_minus
    _refln.pdbx_F_meas_minus_sigma

while both the ccp4-6.2.0/lib/data/cif_mm.dic and the http://mmcif.pdb.org/dictionaries/ascii/mmcif_pdbx_v40.dic dictionaries require these to be

    _refln.pdbx_F_plus
    _refln.pdbx_F_plus_sigma
    _refln.pdbx_F_minus
    _refln.pdbx_F_minus_sigma

It was trivial to fix the problem with a text editor (and this is a case where the right answer is to get wwpdb to fix the sf.cif file at their end), but this led to some discussion at Global Phasing as to what could be done to make cif2mtz handle sf.cif files with unusual data items without requiring modifying and recompiling the source code each time. A summary of where we got to is at http://www.globalphasing.com/buster/wiki/index.cgi?CCP4cif2mtzImproveIdeas

I would appreciate any comments on how to proceed in this direction.

Tom
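The text-editor fix can be expressed as a small pre-processing step that runs before cif2mtz. The sketch below is not part of cif2mtz and is not one of the proposals on the wiki page; it assumes the data names appear one per line in a loop_ header, which is the usual layout in sf.cif files, and the `data_example` block in the sample is invented.

```python
# Mapping from the names found in the 2xsk sf.cif to the dictionary names.
RENAMES = {
    "_refln.pdbx_F_meas_plus": "_refln.pdbx_F_plus",
    "_refln.pdbx_F_meas_plus_sigma": "_refln.pdbx_F_plus_sigma",
    "_refln.pdbx_F_meas_minus": "_refln.pdbx_F_minus",
    "_refln.pdbx_F_meas_minus_sigma": "_refln.pdbx_F_minus_sigma",
}


def fix_data_names(cif_text, renames=RENAMES):
    """Rewrite lines that consist solely of a data name listed in `renames`.

    Only whole-line matches are touched, so reflection data rows and
    unrelated items are passed through unchanged.
    """
    out = []
    for line in cif_text.splitlines():
        token = line.strip()
        if token in renames:
            line = line.replace(token, renames[token])
        out.append(line)
    return "\n".join(out) + "\n"


# Invented miniature file, just to exercise the function.
SAMPLE = """\
data_example
loop_
_refln.index_h
_refln.index_k
_refln.index_l
_refln.pdbx_F_meas_plus
_refln.pdbx_F_meas_plus_sigma
_refln.pdbx_F_meas_minus
_refln.pdbx_F_meas_minus_sigma
1 0 0 10.0 1.0 9.5 1.1
"""

if __name__ == "__main__":
    print(fix_data_names(SAMPLE))
```

Keeping the mapping in a plain dictionary means a newly encountered variant needs only one more entry rather than a rebuild, which is the spirit of the discussion above; a table-driven approach inside cif2mtz itself would be a different and more thorough design.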
[ccp4bb] How to get cif2mtz to handle new fields
The current version of the mmcif_pdbx dictionary at http://mmcif.pdb.org/dictionaries/mmcif_pdbx.dic/Index/index.html defines

    _refln.pdbx_DELFWT
    _refln.pdbx_DELPHWT
    _refln.pdbx_FWT
    _refln.pdbx_PHWT

as fields in which you can deposit coefficients from the computation of weighted Fo-Fc and 2Fo-Fc maps; this is marvellous, since previously it's been very unclear how you deposit maps.

However, when I make an mmcif file with entries for these fields and pass it through the version of cif2mtz in ccp4-6.2.0, I get

    Line 77:data name _refln.pdbx_DELFWT not present in dictionary
    Line 78:data name _refln.pdbx_FWT not present in dictionary
    Line 79:data name _refln.pdbx_DELPHWT not present in dictionary
    Line 80:data name _refln.pdbx_PHWT not present in dictionary

Is it possible to edit the dictionary? It appears to be supplied as lib/cif_mmdic.lib, which is a binary file; presumably that's produced with some kind of compiler from some kind of source file, but I'm not sure how to start looking for the compiler and the source.

Yours sincerely,
Thomas Womack (Global Phasing)
Re: [ccp4bb] Another paper structure retracted
On 11 Aug 2011, at 17:40, Diana Tomchick wrote:

> A quick glance at the header of the PDB file shows that there is one glaring discrepancy between it and the table in the paper that hasn't been mentioned yet in this forum. The data completeness (for data collection) reported in the paper is 95.7%, but in the header of the PDB file (actually, in both the 2QNS and the 3KJ5 depositions) the data completeness (for data collection) is reported as only 59.4%.

This is nastily anisotropic data; the Sawaya diffraction anisotropy server lists principal components 23.2, -9.9, -13.3A^2; the resolution cut-off is roughly where the C* axis goes to F/sigF=2, and there's a good deal of information left on the A* and B* axes. There is also a large cone of missing data around the A* axis, and both missing and poorly-correlated reflections at low resolution - beamstop issues? The peptide is arranged roughly parallel to the B axis.

It's not an irredeemably bad apo structure; there are a few peptide flips and I can rebuild quickly to R/Rfree 0.188/0.259 against the aniso-corrected data from the Sawaya server (first step in rebuilding was deleting the C chain, and it's not coming back).

Tom
Re: [ccp4bb] Off Topic: PDB validation server
On 8 Jul 2011, at 19:13, Katherine Sippel wrote:

> I know that the PDB updated its validation server in May as described in their news link but it seemed to indicate an increase in output options rather than a change in criteria. Is anyone aware of what changes were made to the validation server in regards to the preferred geometrical and stereochemical features?

As far as I can tell empirically, if I run the validation server today it complains about

a) waters which make a perfectly good contact with a residue in a different ASU
b) waters which make a perfectly good contact with metal ions or with other waters which themselves make a perfectly good contact with the protein

and this means it's really not much use for validation of large complicated proteins with hundreds of waters.

Tom
Re: [ccp4bb] Follow-up: non-waters among structured solvent atoms
On 16 Jun 2011, at 17:19, Pavel Afonine wrote:

> Hi,
>
> On Thu, Jun 16, 2011 at 7:49 AM, Jan Dohnalek dohnalek...@gmail.com wrote:
>
> > Modeling more UNKNOWN atoms might be the future for these cases?
>
> one needs to specify chemical element type in 77-78 position, otherwise these records are useless.

But if you know the chemical element type then there's no point in calling it UNK.

BUSTER uses the scattering factors for oxygen for modelling X, on the grounds that you'll have put in an X because it doesn't look enough unlike water to be obviously something else.

Tom
[ccp4bb] A small bug in the CCP4 dictionary?
The restraint dictionary for hydrogenated tryptophan, lib/data/monomers/t/TRP.cif, lists a 15-atom plane for the sidechain, omitting the atom HZ2. Unless this is an exciting result derived from neutron diffraction experiments, would it be possible to fix the dictionary? Yours sincerely, Thomas Womack (Global Phasing)
Re: [ccp4bb] Coot cannot read mtz or pdb files
On 4 Oct 2010, at 11:15, Leiman Petr wrote:

> Dear all, Coot behaves in a very strange way on my student's MacBook (32bit) running MacOS X 10.6.4. Both versions of coot are affected - the precompiled Prof. Scott's one and the compiled from source. It cannot read in MTZ files (quote: This is not an mtz file). PDB files are garbled up on reading as well. Most (but not all) connections are broken. A screenshot is attached.

I think this is a locale issue; try running 'LANG=C coot'. I suspect the parser is assuming that the decimal-point character is 'comma' not 'full-stop', which is why the atoms have been moved to exact-integer locations.

Tom
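The suspected failure mode can be imitated in a few lines. The function below is a deliberately simplified stand-in for a locale-aware number parser (it is not Coot's code, and whether Coot's reader behaves this way is only the guess made above): it reads the leading number from a string, accepting only the given decimal-point character, so with a comma as the separator a PDB-style coordinate is cut off at the full stop.

```python
import re


def parse_leading_number(text, decimal_point="."):
    """Parse the leading number in `text`, treating `decimal_point` as the
    only valid decimal separator. Parsing stops at the first character that
    does not fit, much as a locale-aware C-style parser would.
    Exponents and numbers like '.5' are not handled in this sketch.
    """
    pattern = r"\s*[+-]?\d+(?:" + re.escape(decimal_point) + r"\d*)?"
    match = re.match(pattern, text)
    if match is None:
        return 0.0
    digits = match.group(0).strip().replace(decimal_point, ".")
    return float(digits)


if __name__ == "__main__":
    coordinate = "  12.345"
    print(parse_leading_number(coordinate, decimal_point="."))  # 12.345
    print(parse_leading_number(coordinate, decimal_point=","))  # 12.0
```

Every coordinate losing its fractional part would put atoms on exact-integer positions and break most bonds, which matches the description of the screenshot.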
Re: [ccp4bb] Deposition of riding H
On 15 Sep 2010, at 18:04, Ed Pozharski wrote:

> On Wed, 2010-09-15 at 07:57 -0700, Pavel Afonine wrote:
>
> > if you refined your structure with H, then you should deposit it with H
>
> sure. But the structure is not *refined with hydrogens* when they are in predicted positions. Following the same logic one could suggest that electron density should be deposited, since we can approximate it.

And I notice that a fair number of groups do deposit electron density - at least, they deposit PHIC and sometimes even HL coefficients in the sf.cif file. HL coefficients in the sf.cif file can get badly corrupted in the deposition process, but they definitely show willing.

> I think it's useful to limit the information presented in a pdb-file to what was actually refined + specific instructions on how the refinement was done.

I suppose I come to this from a background where every deposition is a fresh new test-case for new refinement software; it's only lack of download bandwidth and CPU power that makes me not want to start from the images.

I like the idea that what you deposit is the output of a well-defined refinement, which means that you need to deposit the instructions for doing the refinement, and the model you used as input. There's a perfectly good PDB protocol for multi-MODEL files. Nobody does such depositions, I think the PDB would complain if you tried, and there's the problem of endless regression.

I would be very happy if every PDB deposition with 'METHOD: MOLECULAR REPLACEMENT' had an extra MODEL in it containing the input to the molrep tool, and some REMARK lines describing how molrep was used; I would not complain if this was made compulsory for depositions which nowadays say 'STARTING MODEL: NULL'. 26 of the 130 depositions with method MOLECULAR REPLACEMENT this week have starting model NULL, as well as seven depositions with method FOURIER SYNTHESIS and starting model NULL. (Why do MAD and SAD depositions still have a STARTING MODEL field?)
(while we're on the subject of riding hydrogens, I would invite people to admire the conformations of the hydrogens in such places as the C-alpha of residues A45 and A57 of deposition 2x5n - it's clearly a software bug rather than any mistake on the part of the authors, but nonetheless striking) Tom
Re: [ccp4bb] Low-resolution structure refinement with Refmac
On 27 Aug 2010, at 10:55, Petr Kolenko wrote:

> Dear crystallographers, I have a structure at 3.3A resolution, 16 identical chains in AU, merohedral twinning present. I started to refine using NCS restraints with chain A as a reference chain. Current Rwork/Rfree is 21/25. There is almost nothing to refine manually in whole structure now. But, refinement without NCS restraints results in Rwork/Rfree of about 17/28. What should I do? Or is it possible to deposit the structure refined using NCS restraints in final refinement?

This seems like a really well-done NCS refinement; using the multi-fold NCS is allowing you to get what is an excellent Rfree and Rwork-Rfree gap for 3.3A data. Definitely deposit the NCS-restrained version; refining without NCS restraints just increases the number of parameters by a factor of sixteen and spends most of those on fitting noise.

1% Ramachandran outliers at 3.3A also seems entirely reasonable.

Tom
Re: [ccp4bb] Should I be worried about negative electron density?
On 19 May 2010, at 00:36, Paul Emsley wrote:

> Jay Pan wrote:
>
> > Hello Everyone, I have a reasonably well fitted electron density map through molecular replacement. However, there is always some red region left no matter how hard I tried when the mtz file is loaded into Coot. Is this because my model is still not good enough or it’s natural to most model fittings. In another word, should I be worried about the red region? Thanks in advance.
>
> Turn up the contour level and make it go away - that's what I do :) 3 or 3.5 sigma peaks are typical. Metals, carboxyls and disulfides are often associated with relatively strong negative density, some people try adjust their model to compensate (and others not, of course). As a rule of thumb, if you have 5 sigma peaks at the end of your refinement, that might be worrying/interesting.

The median height of the tallest positive peak after autoBUSTER re-refinement for the PDB *depositions* during April this year is about 7.2, and of the tallest negative peak about -5.0.

25th/50th/75th percentiles:

    negative peaks  -5.9 / -5.0 / -5.6
    positive peaks   6.1 /  7.2 /  8.6

Tom Womack (Global Phasing)
Re: [ccp4bb] Distinguishing Between Na+ and H2O
The deposition 3fiy from the start of last year might be of interest:

    FORMUL   2   NA   199(NA 1+)
    FORMUL  20  HOH  *256(H2 O)

It is annoying that the periodic table offers such a discrete range of sizes for 1+ ions; I hoped the lanthanide contraction would provide a heavy sodium substitute with lots of anomalous scattering, but (if I believe http://en.wikipedia.org/wiki/Ionic_radius) no ... Ag+ is the closest match in size (still 15% or so bigger) but silver(I) compounds are usually insoluble; La3+ is the same size as Na+, and LaCl3 is nicely soluble, but obviously it coordinates very differently.

Tom
[ccp4bb] Haem-cysteine interactions
One of the features that Global Phasing's routine runs of deposited PDB structures often pick up is very close contacts between the SG of cysteine residues and the CAB and CAC atoms of the propenyl groups on HEM ligands, not described by LINK cards in the header of the deposited structure. There is generally a CXXC motif in the protein which provides two cysteines to hold a haem in place.

Am I right that, in general, a haem bound to cysteines should be modelled as the molecular entity called HEC by the PDB, with the CAB and CAC atoms essentially tetrahedral, the CAB-CBB and CAC-CBC bond lengths the same as a carbon-carbon single bond, and a link from SG to CAB with angle and bond lengths around the SG as in methionine?

There are a number of high-resolution structures deposited recently containing haems near cysteines, and in most of them a re-refinement gives substantial positive difference density about the CM atoms; there is even occasionally a sign of some kind of longer tail coming out from the CMB position. It seems to be purely positive density, rather than the dipole that tends to be diagnostic of anisotropy.

My current thought is that even a 1.4A structure of a haem-containing protein (I'm looking at an autoBUSTER re-refinement of the 3fo3 deposition from EMBL Hamburg) may well come from a crystal sufficiently well-ordered that we're seeing hydrogens - one of these structures has at least one isoleucine with green blobs at positions which would be reasonable for every hydrogen on the side-chain - but I would be interested to know other people's experience and interpretation.

Tom Womack (Global Phasing)
Re: [ccp4bb] Eleven plausible phasing elements remain unused
On Wed, 2009-04-01 at 14:33 -0700, Ethan Merritt wrote:

> On Wednesday 01 April 2009 07:21:16 Thomas Womack wrote:
>
> > A perusal of the PDB reveals that the game of Periodic Table bingo still has eleven rounds to run: scandium, titanium, germanium, zirconium, niobium, neodymium, dysprosium, thulium, hafnium, bismuth and thorium remain absent from PDB entries.
>
> Does this imply that there is a PDB entry containing Radon?

I defined 'plausible' as a half-life greater than a billion years, though I wouldn't have been totally amazed to see a plutonium or technetium derivative.

Elements with half-lives of 10^6 to 10^9 years for the most stable isotope are Np, Tc, Cm, Pu; the next shortest is 31 kyears for 231Pa. The long-lived curium-247 and plutonium-244 isotopes are neutron-heavy and inconvenient to produce.

The web-accessible subset of the ICSD features a technetium arsenide, a plutonium boride, a sodium neptunate(VII) and an americium iodide.

Tom
[ccp4bb] Eleven plausible phasing elements remain unused
A perusal of the PDB reveals that the game of Periodic Table bingo still has eleven rounds to run: scandium, titanium, germanium, zirconium, niobium, neodymium, dysprosium, thulium, hafnium, bismuth and thorium remain absent from PDB entries. OK, many of these are elements that would rather be refractory oxides or jet-engine components than hexammines, and niobium chloride clusters don't seem to be as water-stable as Ta6Br14, but why have neodymium, dysprosium and thulium so consistently been left out there in the cold rather than admitted to the warmish embrace of carboxyl groups? There must somewhere be a protein with a site that cries out for ThCl2(2+), an unexpectedly water-stable cation. Tom