Re: [ccp4bb] add ligand with AceDRG
Indeed, as Diana points out: PDB's own components.cif defines LIG as:

  _chem_comp.id LIG
  _chem_comp.name "3-PYRIDIN-4-YL-2,4-DIHYDRO-INDENO[1,2-.C.]PYRAZOLE"
  _chem_comp.type NON-POLYMER
  _chem_comp.pdbx_type HETAIN
  _chem_comp.formula "C15 H11 N3"

So they probably should fix that. Also, that chem_comp.name seems to be associated with a variety of ligand IDs with different formulae, and it also turns up as a synonym of others. Things seem to be a little wayward in there.

Phil Jeffrey
Princeton

On 4/26/24 10:40 AM, Diana Tomchick wrote:
But I think that is a mistake; if you search for LIG in the PDB, it brings up a definite ligand that has that 3-letter code.
Diana
Sent from my iPhone

On Apr 26, 2024, at 8:04 AM, Deborah Harrus wrote:
Dear all, just to clarify, "LIG" is also a reserved code, so it's safe to use. See https://www.wwpdb.org/news/news?year=2023#656f4404d78e004e766a96c6
Kind regards,
Deborah Harrus
PDBe
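If you want to check locally what the PDB currently ships for a given three-letter code, a minimal Python sketch using gemmi (assuming you have a local copy of components.cif from the wwPDB; the file path is a placeholder):

  import gemmi

  doc = gemmi.cif.read('components.cif')   # each chem comp is its own data block
  block = doc.find_block('LIG')
  if block:
      for tag in ('_chem_comp.id', '_chem_comp.name',
                  '_chem_comp.type', '_chem_comp.formula'):
          print(tag, block.find_value(tag))
  else:
      print('no block for that code')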
Re: [ccp4bb] request for applications
:: I expect to have ~ $1e12 USD on current ledgers.

Presumably via the Bankman-Fried algorithm.

Phil

On 4/1/24 3:01 AM, James Holton wrote:
Hey Everyone, it may sound like an incredibly boring thing that there has never been a formal mathematical proof that finding the prime factors of very large numbers doesn't have a more efficient algorithm than simply trying every single one of them. Nevertheless, to this day, encryption keys and indeed blockchain-based cryptocurrencies hinge upon how computationally hard it is to find these large prime factors. And yet, no one has ever proven that there is not a more efficient way.
[snip]
Re: [ccp4bb] Fragile Crystals
Hello Morgan

In addition to the other good suggestions, I have a few observations of my own.

If your crystals crack without handling or adding anything to the drop, then they are extremely environment-sensitive. If that's the case, testing at room temperature will be problematic, because that tends to be somewhat stressful on the crystal either mechanically (ye olde capillary mount method) or via dehydration (loop mounts with the sleeve).

Growing in the presence of at least a little cryoprotectant, as per Vaheh, would be less stressful than multi-step processes like Tao-Hsin's advice, unless your crystals can re-anneal after stress. Mounting directly from the drop is probably essential, and mounting under oil is a good thing to try in addition - apart from anything else, oil on the drop slows down the environmental changes. MiTeGen mounts might be less stressful on some crystals than standard nylon loops if they are mechanically sensitive. Spending some time optimizing the mechanics of your freezing technique might help significantly in reducing the amount of time your crystal dehydrates while moving through air. (Jim Pflugrath's article is full of useful information: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4461322/ )

Small crystals often freeze more smoothly than large ones - even for robust crystals like tetragonal lysozyme. Try a lot of crystals - I've had projects where two different crystals in the same loop from the same drop showed radically different diffraction. I've also encountered several cases where the appearance of disorder varies within a crystal when using a microfocus synchrotron beamline (I mostly use FMX and AMX at NSLS-II).

Lastly, really cranky crystals ring a distant bell of something we encountered in the p19(INK4d)-Cdk6 structure back in the 1990s. I think it was Jie-Oh Lee who did the hard work on this, but in many instances crystals cracked in situ merely on opening the drop, and the fix was to add a cross-linker to the well, reseal the drop and wait for the cross-linker to diffuse: "The crystals were pretreated with glutaraldehyde (diffused into the drop from a reservoir of 30% glutaraldehyde) to reduce their tendency to crack and lose diffraction along b* and c*." https://www.nature.com/articles/26155#Sec9 Most crystals don't love being cross-linked, and I would call this a successful instance of a desperation maneuver.

Good luck.
Phil Jeffrey
Princeton

On 11/22/23 11:44 AM, Blake, Morgan Elizabeth wrote:
Hello, I am a PhD student working on a crystallography project to wrap up my dissertation research. I have purified a complex of two proteins, and I can consistently grow crystals in 10% PEG 3350, 0.2 M KSCN, 0.1 M BIS-TRIS propane pH 7.5. These crystals have sharp edges and can grow to a large size (greater than 0.5 mm), but the crystals seem to be very fragile. When we open the drops to harvest the crystals, we have little time to harvest them before they crack. When we move the crystals to a cryoprotectant, over time they start fracturing. We've tried using different percentages of glycerol, ethylene glycol, PEG 400, and oil for cryoprotectants with no success. Needless to say, the crystals do not diffract well, with spot patterns that look very streaky/mosaic, which I presume is due to the defects that we see in harvesting/handling. We have screened for alternate crystallization conditions, but we seem to get the same morphology in other conditions.
Does anyone have suggestions for additives we could use post-crystallization to help stabilize our crystals? Thanks for your advice!
Re: [ccp4bb] Cannot select any recommended SG for the protein BsAlaDH
1. Completeness is primarily an issue with using the right point group and crystal system, not the actual space group (e.g. in the primitive orthorhombic Laue class mmm, the space groups P222, P2221, P21212 and P212121 should all have essentially the same completeness).

2. If "refinalize" in CrysAlisPro doesn't let you choose the right point group and system, then you should process the data with another program. XDS, MOSFLM, DIALS, autoPROC etc. might work, and I have to believe they'd be better at scaling your data.

3. If you can export the unscaled data from CrysAlisPro, you might be able to feed it into POINTLESS and AIMLESS for scaling.

4. On the model front, go find an AlphaFold model - they have worked for me multiple times in molecular replacement so far.

Phil Jeffrey
Princeton

On 8/2/23 3:00 PM, CENGIZ KAAN FERAH wrote:
Hello, I'm trying to process the data I gathered from XRD for the protein BsAlaDH. Unfortunately, with the method I know of in CrysAlisPro, I cannot select the recommended space group for the protein. This results in the data not being complete. Still, I can get good unit cells and angles. Another problem is that this protein has no structure published in the PDB, and the homolog proteins do not have high similarities, so I cannot really find a suitable space group. Can someone give me a hand with this issue? Thank you.
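Point (1) is easy to see concretely - a small Python sketch using gemmi (method names as I recall them from the gemmi Python API; the grouping logic is mine, nothing CrysAlisPro-specific):

  import gemmi

  # Space groups in the same Laue class merge to essentially the same
  # completeness; list the primitive orthorhombic (mmm) ones:
  for sg in gemmi.spacegroup_table():
      if sg.laue_str() == 'mmm' and sg.hm.startswith('P '):
          print(sg.number, sg.hm)   # P 2 2 2, P 2 2 21, P 21 21 2, P 21 21 21, ...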
Re: [ccp4bb] CIF file problem
Hello Ning

CheckCIF checks small-molecule crystallographic CIF files - the dictionaries and the expectations on the contents are not the same as for mmCIF, although the underlying syntax is the same. I'm unaware of anything that does the equivalent of CheckCIF for macromolecular CIF files.

Phil Jeffrey
Princeton

On 4/20/23 4:02 PM, Ning Li wrote:
Hi all, does anybody know why I got this error message:
  Checking for embedded fcf data in CIF ...
  No extractable fcf data in found in CIF
as I uploaded the CIF file to https://checkcif.iucr.org/ for structure validation? The CIF file was directly from phenix.refine during structure refinement. Appreciate your help.
Ning
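The nearest scriptable thing to a "CheckCIF lite" for mmCIF is a syntax/bookkeeping check - a sketch with gemmi (this catches malformed CIF only, nothing like CheckCIF's chemistry and geometry tests):

  import sys
  import gemmi

  try:
      doc = gemmi.cif.read(sys.argv[1])   # raises on CIF syntax errors
      doc.check_for_missing_values()      # raises if any tag lacks a value
      doc.check_for_duplicates()          # raises on duplicated tags/frames
      print('syntax OK, blocks:', [b.name for b in doc])
  except Exception as err:
      print('problem:', err)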
Re: [ccp4bb] Problem with crystal structure solution
Hello Gargi

I don't think you mean Fc. Apart from anything else, there's not enough room for 4 more domains per Fab. I think you mean the CL:CH1 domain dimer of the Fab. Fabs have a well-characterized variability in the "elbow angle" between the VL:VH domain pair and the CL:CH1 domain pair. I suspect you've tried molecular replacement without trying a range of different elbow angles.

I see +ve difference map peaks within domains and -ve difference map peaks along the polypeptide chain, which is refinement's way of saying that it doesn't want atoms in the latter location and does want atoms in the former. There's also some quite serious clashing (chain interdigitation) in places, which can't possibly be correct.

The easiest way to fix this is to re-do your molecular replacement with separate VL:VH and CL:CH1 models, which will inherently allow their correct relative placement to be modeled. If you managed to get a TFZ of 42 with flawed models, this should be a pretty easy thing to pull off, but if not please contact me off-list.

Phil Jeffrey
Princeton

On 3/27/23 1:52 PM, Kher, Gargi M wrote:
Hello, I obtained diffraction data for one of my crystallographic projects. Data collection determined the space group to be P1 at ~2.67 Å. I solved it using MR (it contains a Fab-Fab complex) and the Phaser solution (TFZ of 42.0), which is close to the Matthews coefficient prediction, placed 16 Fabs in the unit cell (8 copies of the same Fab-Fab complex). The Matthews coefficient was predicted to be 2.44 with 9 copies in the ASU, but for 8 and 10 copies in the unit cell, the Matthews coefficients were 2.75 and 2.20, respectively. I did set up MR jobs searching for 9 and 10 Fab-Fab copies in the unit cell, but still only 8 Fabs were placed. While most of the Fabs fit nicely in the density, 6 Fcs (chains e, A, M, Q, U, and c) do not fit within the electron density. I have tried re-processing my data, but P1 seems to be the "correct" space group, and Xtriage does not indicate any red flags. There is some extra positive density visible close to these Fcs. I believe these 6 Fcs might be in a different orientation/position than how they're currently being placed, and should fit into the additional positive density I'm seeing. However, I have been unable to place them correctly, either by using rigid-body refinement in Phenix for these 6 Fc domains or by doing it manually. Does anyone have ideas as to what might be happening here and what I could do to try and fix this? I've attached my PDB and MTZ files for your reference.
Thank you,
Gargi Kher
Re: [ccp4bb] To Trim or Not to Trim
On 3/10/23 4:05 AM, Julia Griese wrote:
Hi all, my impression has been that the most common approach these days is to "let the B-factors take care of it", but I might be wrong. Maybe it's time to run another poll? Personally, I call any other approach R-factor cosmetics. The goal in model building is not to achieve the lowest possible R-factors, it's to build the most physically meaningful, most likely to be correct, model.

And I could call your approach "model cosmetics". If you can't see the side-chain, you don't know where it is, and you probably don't even know where the centroid of the distribution is. Only in the case of very short side-chains with few rotamers can you make a reasonable volume approximation to where the side-chain is and "let the B-factors" smear out the density to cover a range of the projected conformations. For longer side-chains, if you put it in a single conformation, you are very likely NOT coming close to correctly modeling the actual distribution of conformations.

So let's circle back to "most likely to be correct model" and ask what we *actually* know about where the atoms are. Put your disordered Arg in with 10 alternate conformations, each with a refined relative occupancy, and then let the B-factors smear that lot out - that's your better model.

Phil Jeffrey
Princeton
Re: [ccp4bb] what would be the best metric to assess the quality of an mtz file?
CCP4's PEAKMAX program would be quite scriptable.

Phil

On 10/27/21 1:58 PM, Murpholino Peligro wrote:
So... how can I get a metric for noise in electron density maps? First thing that occurred to me: open in Coot and do Validate -> Difference Map Peaks -> get number of peaks (is this scriptable?). Or, second: phenix.real_space_correlation detail=residue file.pdb file.mtz
Thanks again
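A sketch of the PEAKMAX route from Python (keyword spelling from memory - check the PEAKMAX documentation; the map and output file names are placeholders):

  import subprocess

  # Run CCP4's peakmax on a difference map, then count the peaks it writes.
  keywords = "threshold rms 3.0\nend\n"
  subprocess.run(['peakmax', 'mapin', 'fofc.map', 'xyzout', 'peaks.pdb'],
                 input=keywords, text=True, check=True)

  n_peaks = sum(1 for line in open('peaks.pdb')
                if line.startswith(('ATOM', 'HETATM')))
  print(n_peaks, 'difference map peaks above 3 sigma')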
Re: [ccp4bb] Lowering R factor from small molecule structure
Unlike macromolecular crystallography, small-molecule crystallography is infrequently starved for data. So it makes no sense at all to extend your data to e.g. I/sigI of 1.0 and Rmeas > 80% unless you want your R1 to be >10% for no good reason or utility - which is what was behind my suggestion: test to see if the data cutoff is an issue. It's also about the fastest test you can do in SHELXL.

> Yes, ANIS and adding hydrogens (in SHELXL) are good things to do - with 0.8Å data most small molecule crystallographers would do this as a first step after fitting all the non-H atoms.

Actually, adding aniso B's and hydrogens too soon will mess up your disorder modeling, so blanket statements like that work for well-behaved structures but not so much for more challenging ones. E.g. of the four structures I've done this week, one had significant main-molecule disorder, so that comes ahead of adding hydrogens, and refining unrestrained aniso B (as is the default) for disordered atoms is asking for trouble. It's not as cookie-cutter as you represent, and I stick to all my suggestions.

Phil Jeffrey
Princeton

On 6/4/21 4:27 AM, Harry Powell - CCP4BB wrote:
Hi, yes, ANIS and adding hydrogens (in SHELXL) are good things to do - with 0.8Å data most small-molecule crystallographers would do this as a first step after fitting all the non-H atoms. One thing I can't agree with is cutting the resolution of your data _unless_ you have a very, very good reason to do so. Normal small-molecule refinements will use data to ~0.8Å and not use a cut-off based on resolution or I/sig(I). A good dataset will often go to higher resolution, and small-molecule crystallographers will be very happy to use these data (unless, as I say, they have a very good reason not to), and would certainly have to "explain to the referees" why they didn't if they ignored a systematic chunk. Something else that you might not have thought of - have you actually told SHELXL what the reflection data are, i.e. are they F, F^2, intensity? It's perfectly possible to solve a small-molecule structure by e.g. telling the program you're giving it F^2 but actually giving it F, but refinement would be somewhat less straightforward. SHELXL normally uses F^2 in refinement; macromolecular programs still normally use F (AFAIK). What programs did you use for processing the diffraction data? Of course, lowering the R factor is not the objective of the exercise - a lower R-factor is a consequence of having a model that fits the data better. I would be strongly inclined to ask a small-molecule crystallographer (or someone with a strong background in it) to have a look at your data & model - they could probably give you a definitive answer by return of e-mail. Just my two ha'porth.
Harry

On 4 Jun 2021, at 03:10, Jon Cooper wrote:
Agreed, ANIS is the command to try.
Sent from ProtonMail mobile

On 3 Jun 2021, 20:18, Philip D. Jeffrey wrote:
R1 of 17% is bad for a small molecule. 0.8 Å is in the eye of the beholder - if you're using macromolecular cutoffs then these might be too aggressive for small-molecule-type refinement stats - try a more conservative cutoff like 0.9 and see how that changes R1. However, I suspect it's more to do with how your model is fitting the data. Have you refined anisotropic B-factors? Have you added hydrogens?
I would suggest non-CCP4 programs like Olex2 or ShelXle as the interface for the refinements - I use the latter and it's somewhat Coot-like, with useful features particular to small molecules. Also PLATON has some things (like expand-to-P1 and SQUEEZE) that, respectively, might be useful to explore space-group issues and disordered solvent. PLATON also has a means to check for some forms of twinning.

Phil Jeffrey
Princeton

From: CCP4 bulletin board on behalf of Jacob Summers
Sent: Thursday, June 3, 2021 2:49 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Lowering R factor from small molecule structure

Greetings! I am currently trying to reduce the R factor of a cyclic small-molecule peptoid in ShelXle. The maximum resolution of the data is 0.8 angstroms. The molecule itself fits the density very well, but there are a few unexplained densities around the molecule which do not seem to be anything in the crystallization conditions. The R1 factor of the refinement is 17.07% but I am unsure how to lower this value. Any ideas on how to better refine this molecule, or fill densities, to lower the R1 factor? I do not have much experience working with small-molecule refinement or with ShelX.
Thanks so much,
Jacob Summers
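The cutoff test suggested above is a one-line change to the .ins file - a sketch (SHEL and ANIS are standard SHELXL instructions; the HFIX atom name is hypothetical):

  SHEL 999 0.9   ! use only data between 999 and 0.9 A for this refinement run
  ANIS           ! anisotropic Uij for non-H atoms (once the model warrants it)
  HFIX 43 C5     ! riding aromatic H on a (hypothetical) carbon C5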
Re: [ccp4bb] (scattering factors) f and f" for Sr Heavy atom
http://skuld.bmsc.washington.edu/scatter/AS_periodic.html
http://skuld.bmsc.washington.edu/scatter/data/Sr.dat

would probably work for initial values. 9700 eV for that wavelength.
https://people.mbi.ucla.edu/sumchan/crystallography/ang-eV_convertor.html

Phil Jeffrey
Princeton

On 1/22/21 2:36 PM, rohit kumar wrote:
Hello all, I have data collected at a wavelength of 1.2782 Å (for a Sr heavy atom) at a resolution of 1.6 Å. I was trying to run Crank in CCP4 for SAD phasing, and it asks me to fill in the values of the scattering factors f' and f" for the heavy atom. Can anyone please help with how to calculate, or where to find, these f' and f" values for Sr? Please let me know if you need any information from my side. Thank you in advance.
Regards,
Dr. Rohit Kumar Singh
Postdoctoral fellow
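The wavelength-to-energy conversion behind that 9700 eV figure is one line of arithmetic - a minimal Python sketch (12398.42 eV*Å is just hc in those units):

  wavelength = 1.2782                 # Angstrom
  energy_ev = 12398.42 / wavelength   # E[eV] = hc / lambda
  print(round(energy_ev))             # -> 9700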
Re: [ccp4bb] Finding partial occupancy monomer by MR ?
Thanks for the suggestions.

The idea that it's related to a trigonal space group and twinning or a pseudo space group is an interesting one, but this is C2221 and the intensity stats don't show twinning. Twinned P21 -> C2221 doesn't solve the non-unit occupancy in this case. Since the other monomers are full-occupancy, it can't be 3 overlapping dimers, so the phenomenon is rather unusual in my finite experience. (Also, there is only one set of Se peaks for this 4th monomer.)

I used Herman's suggestion of finding 3 monomers first (with very large RFZ/TFZ/LLG, since the monomers had been refined against the data) because that's very fast. And then Phaser took a long while to not find the 4th monomer. Once I figure out how to make modern versions of Phaser "fail quickly" like the older versions, I'll scan a range of homology % and see if that changes anything.

Phil

On 12/10/20 9:46 AM, Schreuder, Herman /DE wrote:
Dear Phil,
0.32 is awfully close to 1/3, which brings a nice mathematical puzzle to my mind: to see if the 1/3 occupancy is somehow related to the 3 fully occupied monomers... It may also be related to a (trigonal??) space group... You have probably already tried it, but Phaser has the option to give it already-solved molecules and ask it to search for additional molecules. Here I would indeed lower the expected % homology significantly, to crudely compensate for the low occupancy. In contrast to the advice of Dale, I would play around with the % homology to find the value which works best.
My 2 cents,
Herman

-----Original Message-----
From: CCP4 bulletin board, on behalf of Phil Jeffrey
Sent: Thursday, December 10, 2020 14:49
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Finding partial occupancy monomer by MR ?

Preamble: I have an interesting crystal form with 3 monomers (~400 aa) at full occupancy and apparently one at much reduced occupancy. It was built recently from Se-SAD and was in moderately good condition: Rfree = 32% for the trimer at 2.6 Å. In recent refinement cycles it became obvious that there was a 4th monomer in a region of weaker/choppy 2Fo-Fc and Fo-Fc density that corresponded to a "confusing" set of low-occupancy SeMet sites found by SHELXD and Phaser-EP. The experimental map was bad in that region and was probably flattened during density modification anyway, in retrospect.

Question: Phaser failed to find the 4th monomer after trivially finding the other 3 with a recent version of the monomer. I'm wondering if there's a way to indicate "this one is partial occupancy" to Phaser, or if there's a way to improve the odds of success beyond just lowering the expected % homology. Or if anyone has had success with other programs. This is perhaps a rare edge case, but I naively expected Phaser to work. In the end I used the weak SeMet sites to locate the monomer, and the occupancy appears to be around 0.32 in refinement.
Cheers,
Phil Jeffrey
Princeton
[ccp4bb] Finding partial occupancy monomer by MR ?
Preamble: I have an interesting crystal form with 3 monomers (~400 aa) at full occupancy and apparently one at much reduced occupancy. It was built recently from Se-SAD and was in moderately good condition: Rfree = 32% for the trimer at 2.6 Å. In recent refinement cycles it became obvious that there was a 4th monomer in a region of weaker/choppy 2Fo-Fc and Fo-Fc density that corresponded to a "confusing" set of low-occupancy SeMet sites found by SHELXD and Phaser-EP. The experimental map was bad in that region and was probably flattened during density modification anyway, in retrospect.

Question: Phaser failed to find the 4th monomer after trivially finding the other 3 with a recent version of the monomer. I'm wondering if there's a way to indicate "this one is partial occupancy" to Phaser, or if there's a way to improve the odds of success beyond just lowering the expected % homology. Or if anyone has had success with other programs. This is perhaps a rare edge case, but I naively expected Phaser to work. In the end I used the weak SeMet sites to locate the monomer, and the occupancy appears to be around 0.32 in refinement.

Cheers,
Phil Jeffrey
Princeton
Re: [ccp4bb] Model refinement problems when upgrading Phenix
Hello Juan

First, there's a phenix.refine bulletin board, on which you might attract the attention of the developers, which might help: http://www.phenix-online.org/mailman/listinfo/phenixbb

I've been using 1.17-3644 without issues after transitioning from something older. Consider downgrading to something new-ish.

At first glance this looks like a change in the weighting between the X-ray term and the (sum of the) geometric terms - if you are using a wxc_scale command or an explicit weighting value, I'd turn that off and see what happens. But you say that you've been using weight optimization, which seems to suggest otherwise. What are your Rwork, Rfree, RMSD bonds, RMSD angles, and Ramachandran stats for the same model in the two program versions? If the weighting has changed, Rwork vs geometry should be a pretty easy indicator. If you get worse geometry with the same Rwork, that's a lot more troubling.

And try REFMAC. REFMAC is usually faster, and on a couple of high-resolution projects gave a significant drop in Rfree. Usually they are comparable, but it's worth running both to see what happens.

Things that traditionally give me issues in phenix.refine: the real-space refine subprocess sometimes "unrefines" my structure (try turning it off), and there appears to be enough of a difference between the first and subsequent passes of the weight estimation that the weight-optimization scheme gets thrown off.

Phil Jeffrey
Princeton

On 7/21/20 10:40 AM, JUAN ESTEVEZ GALLEGO wrote:
Dear all, I have been working on the refinement of a crystal structure using phenix.refine from the 1.12-2828-Intel-Linux-2.6 version of Phenix. I have recently replaced my computer with a MacBook and upgraded Phenix to the 1.18.2-3874-MacOS version. However, I found that the refinement introduced a huge number of outliers, especially C-beta and Ramachandran. The structure is at 2.5 Å resolution and I used the following refinement strategy: XYZ coordinates, real-space, individual B-factors, occupancies, X-ray/stereochemistry weight optimization, no experimental phase restraint, automatic metal and ligand linking, and automatic correction of N/H/Q errors. I also tried using TLS instead of individual B-factors, but the problems persist. Does anybody know why this could be happening? Thanks a lot for your help!
Best,
Juan
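If you want to A/B test the weighting directly, command-line overrides are the quickest route - a sketch (parameter names from memory of phenix.refine of that vintage; verify against phenix.refine --show-defaults):

  phenix.refine model.pdb data.mtz \
      strategy=individual_sites+individual_adp \
      optimize_xyz_weight=False wxc_scale=0.5

The strategy line omits the real-space refinement step, and the last two arguments replace weight optimization with a fixed X-ray/stereochemistry weight, so you can vary wxc_scale by hand and watch Rwork against the geometry RMSDs.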
Re: [ccp4bb] number of frames to get a full dataset?
The people that already use multiplicity are going to find reasons why it's the superior naming scheme - although the underlying reason has a lot to do with negative associations with 'redundant', perhaps heightened in the current environment. And conversely, 'redundant' works for many others - Graeme's pragmatic defense of multiplicity actually works both ways: any person who takes the trouble to read the stats table, now exiled to Supplementary Data, knows what it means.

Surely, then, the only way forward in this almost totally irrelevant discussion is to come up with a universally-loathed nomenclature that pleases nobody, preferably an acronym whose origins will be lost to history and the dusty CCP4 archives (which contain threads similar to this one). I humbly submit: NFDOF, the Nearly Futile Data Overcollection Factor? [*]

Or, even better, could we not move on to equally pointless discussions of the inappropriateness of "R-factor"? I have a long history of rearguard action trying to give stupid acronyms a wider audience, so you're guaranteed to hear from me on this for years. (Personally I'm pining for Gerard Kleywegt to resume his quest for overextended naming rationales, of which ValLigURL is a personal 'favo[u]rite'. But I'm just old-fashioned.)

Ironically,
Phil Jeffrey
Princeton

[* I too have collected 540 degrees in P1 to solve a SAD structure, just because I could, hence "nearly".]
[** The actual answer to this thread is: history is written by the authors of scaling programs - and I think the Americans are currently losing at this game, thus perilously close to making themselves redundant.]

On 6/30/20 4:14 AM, Winter, Graeme (DLSLtd,RAL,LSCI) wrote:
Or, we could accept the fact that crystallographers are kinda used to the multiplicity of an individual Miller index being different from the multiplicity of observations, and in Table 1 know which one you mean? Given that they add new information (at the very least to the scaling model), they are strictly not "redundant". The amount that anyone outside of methods development cares about the "epsilon" multiplicity of reflections is ... negligible? Sorry for chucking pragmatism into a dogmatic debate.
Cheerio, Graeme
Re: [ccp4bb] refinement of 0.73A data in shelxl
That doesn't sound right re: PART numbers. Classically:

  PART 1   majority disordered atoms, with FVAR/occupancy of e.g. "21." instead of the usual "11."
  PART 2   minority disordered atoms, with FVAR/occupancy of e.g. "-21."
  PART 0   back to the undisordered atoms

The 21.000/-21.000 pairs make the sum of the occupancies add to 1.0, but the actual value of each group is defined by the second free variable. See http://shelx.uni-goettingen.de/shelxl_html.php#PART - the "PART 1" atoms would not interact with the "PART 2" atoms. There's even an example for a disordered SER in the documentation.

PART -n is used for disorders that overlap with themselves on symmetry axes: "If n is negative, the generation of special position constraints is suppressed and bonds to symmetry generated atoms with the same or a different non-zero PART number are excluded; this is suitable for a solvent molecule disordered on a special position of higher symmetry than the molecule can take".

I use PART 1/PART 2/PART 0 all the time in "small molecule world", but I've used PART -1 precisely once.

Phil Jeffrey
Princeton

On 2/6/20 4:15 PM, Tim Gruene wrote:
Dear Matthias, some developers introduce new features of their refinement programs with the words "... which has been there in SHELXL since the beginning of time". If you are only looking for two conformations, you are looking for the combination of free variable number N with PART N and PART -N. In case you deal with more than two conformations, take a look at SUMP (as Jon suggested). The use of free variables is easier to explain right at the computer, so please ask a colleague near your office who is familiar with SHELXL for the details.
Best, Tim

On Thursday, February 6, 2020 8:10:01 PM CET, Barone, Matthias wrote:
Sorry if the mail was not clear. I figured that out now, yes. As I wrote in the update, I found this stupid error I made, and now everything looks good. Now that I have a feeling for how SHELXL works, I miss one of its features in the PDB format, namely the possibility to link the occupancies of a double conformation to another moiety, say a water or a double conformation of the ligand. Is there a way to use something similar to FVAR in a PDB file?
Dr. Matthias Barone
AG Kuehne, Rational Drug Design, Leibniz-Forschungsinstitut für Molekulare Pharmakologie (FMP), Robert-Rössle-Strasse 10, 13125 Berlin, Germany. Phone: +49 (0)30 94793-284

From: bogba...@yahoo.co.uk
Sent: Thursday, February 6, 2020 5:01:14 PM
To: Barone, Matthias
Cc: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] refinement of 0.73A data in shelxl

Hello, hope I can help. OK, so here is the DISP table...

  SFAC C H CL N O
  DISP $C   0.00510  0.00239    15.73708
  DISP $H  -0.2      0.0         0.66954
  DISP $CL  0.18845  0.21747  1035.16450
  DISP $N   0.00954  0.00480    28.16118
  DISP $O   0.01605  0.00875    47.79242

If we take these coordinates...

  N    3  0.414964 -0.147635  0.116896  11.0      0.19533  0.44341 =
  H0A  2  0.427823 -0.138656  0.123256  11.0     -1.5
  C    1  0.348035 -0.160776  0.110979  11.0      0.20723  0.28451 =
  O    4  0.363785 -0.174154  0.102906  11.0      0.21226  0.22954 =
  SG   5  0.177303  0.101267  0.040572  10.04000  0.06849  0.03024 =
  O    4  0.241304  0.071735  0.038567  10.96000  0.14982  0.12755 =

... the first N (followed by 3) is being assigned the scattering factors of chlorine, because that element is 3rd in the SFAC list. The SG (followed by 5) is being assigned the scattering factors of O, because the latter is 5th in the SFAC list. I think you need to check that these assignments and the chlorine occupancy are OK.
Jon Cooper

On 6 Feb 2020 11:13, "Barone, Matthias" wrote:
Dear community, here is an update on my SHELXL problem.
I solved it after an epiphany last night in bed... I tried countless things to get the positive density on the Cl under control. Markus suggested that the density came from a radiolysed chloride, so I tried to superimpose chlorinated and radiolysed ligands. However, that did not lead to anything fruitful. Remember that I tried to incorporate the DISP of Cl into the .ins file. This is the original of the protein .ins, with chloride just pasted in as the last element:

  SFAC C H N O S CL
  DISP $C   0.00510  0.00239    15.73708
  DISP $H  -0.2      0.0         0.66954
  DISP $N   0.00954  0.00480    28.16118
  DISP $O   0.01605  0.00875    47.79242
  DISP $S   0.15995  0.16998   812.87489
  DISP $CL  0.18845  0.21747  1035.16450

The upper list only creates positive density on the chloride; the rest of the map is clean and looks the same as if you omitted the DISP line of Cl altogether. The following list is coming from the .ins file of the converted PRODRG file:

  SFAC C H CL N O
  DISP $C   0.00510  0.00239    15.73708
  DISP
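To make the PART/FVAR bookkeeping explained above concrete, a sketch of a two-conformer atom pair (atom names, coordinates and values are hypothetical; see the PART section of the SHELXL manual):

  FVAR  0.25143  0.70   ! fv(1) = overall scale, fv(2) = major conformer occupancy
  ...
  PART 1
  OG1  4  0.1234  0.2345  0.3456  21.000  0.05   ! occupancy = 1.0 * fv(2)
  PART 2
  OG2  4  0.1300  0.2400  0.3300 -21.000  0.05   ! occupancy = 1.0 * (1 - fv(2))
  PART 0

The same second free variable can be attached to any other atom (a water, a ligand conformer) by giving it a sof of 21.000 or -21.000, which is exactly the occupancy-linking being asked about.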
Re: [ccp4bb] Another difficult MR case
Are you *sure* there's no translational NCS? For example, your first molecular replacement solution out of Phenix shows

  EULER 293.6 27.7 288.7 FRAC -0.02 0.02 0.02

(that's "first molecule at origin in P1") and

  EULER 294.0 27.9 288.8 FRAC -0.37 0.02 0.02

which is essentially the same orientation, and a translation down one crystallographic axis (a*). And this suggests to me that either Xtriage or Phaser is missing something here. Does Phaser find translational NCS in its initial data analysis? Unmodeled translational NCS could cause significant problems with the molecular replacement search.

Phil Jeffrey
Princeton

On 8/29/19 11:28 AM, Napoleão wrote:
Dear all, sorry for the long post. I have a data set obtained from a crystal produced after incubating a protease with a protein which is mostly composed of an antiparallel beta sheet. I have tried numerous approaches to solve it, and failed. Molecular replacement using Phaser and the protease or the protein as a template yields no solution. However, molecular replacement using only part of the beta sheet yields LLG=320 TFZ==28.0 (see below). The apparently good data extend to 1.9 Å as processed by XDS, and the space group is P1 (Pointless agrees). XDS info below:

  SPACE_GROUP_NUMBER= 1
  UNIT_CELL_CONSTANTS= 44.43 72.29 77.30 97.802 89.939 101.576
  a = 9.647E-01, b = 3.176E-03, ISa = 18.07

  RESOLUTION  NUMBER OF REFLECTIONS         COMPLETENESS  R-FACTOR          COMPARED  I/SIGMA  R-meas  CC(1/2)  Anomal  SigAno  Nano
  LIMIT       OBSERVED  UNIQUE   POSSIBLE   OF DATA       observed expected                                     Corr
   1.90       24890     19149    23814      80.4%         58.1%    63.7%    11482     0.77     82.2%   63.8*     3      0.694    492
   total      163756    125884   146938     85.7%         10.6%    10.8%    75744     3.78     15.0%   99.0*    -3      0.761   5834

Xtriage in Phenix 1.16-3549 gives me all green lights (print below), suggesting the data present no twinning, no translational NCS, no ice rings, and are not anisotropic. http://fullonline.org/science/phenix_xtriage_green.png

Molecular replacement in Phaser yields single solutions like:

  Solution annotation (history):
  SOLU SET RFZ=3.0 TFZ=* PAK=0 LLG=29 RFZ=2.8 TFZ=8.8 PAK=1 LLG=310 TFZ==27.6 LLG=320 TFZ==28.0
  SOLU SPAC P 1
  SOLU 6DIM ENSE ensemble1 EULER 293.6 27.7 288.7 FRAC -0.02 0.02 0.02 BFAC -6.03
  SOLU 6DIM ENSE ensemble1 EULER 294.0 27.9 288.8 FRAC -0.37 0.02 0.02 BFAC -6.52
  SOLU ENSEMBLE ensemble1 VRMS DELTA -0.1983 RMSD 0.49 #VRMS 0.21

or partial solutions like:

  Partial Solution #1 annotation (history):
  SOLU SET RFZ=3.7 TFZ=* PAK=0 LLG=32 RFZ=2.8 TFZ=13.0 PAK=0 LLG=317 TFZ==30.2 LLG=331 TFZ==30.5 RFZ=2.4 TFZ=7.2 PAK=0 LLG=464 TFZ==18.5 RFZ=2.7 TFZ=5.7 PAK=1 LLG=501 TFZ==6.8 LLG=509 TFZ==6.6
  SOLU SPAC P 1
  SOLU 6DIM ENSE ensemble1 EULER 85.4 153.0 138.5 FRAC -0.01 -0.00 -0.00 BFAC -12.30
  SOLU 6DIM ENSE ensemble1 EULER 86.2 153.2 139.5 FRAC -0.36 -0.01 -0.01 BFAC -9.16
  SOLU 6DIM ENSE ensemble1 EULER 83.8 152.3 135.9 FRAC -0.00 0.00 -0.25 BFAC 1.52
  SOLU 6DIM ENSE ensemble1 EULER 191.2 109.1 39.3 FRAC -0.27 -0.01 0.22 BFAC 10.18
  SOLU ENSEMBLE ensemble1 VRMS DELTA -0.0447 RMSD 0.49 #VRMS 0.44

However, after 1 refinement round in phenix.refine (final: r_work = 0.4881, r_free = 0.5009) I got densities that are part good and part bad, and if I delete the bad parts and refine again, the good parts become bad. Please check the prints:
http://fullonline.org/science/good_part_of_density.png
http://fullonline.org/science/bad_part_of_density.png

What is the explanation for these molecular replacement results? What else should I try? ARCIMBOLDO takes 2+ days to run and yields no good solution. Thank you!
Regards,
Napo
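The giveaway in the two EULER/FRAC lines quoted at the top of this post can be checked with three lines of arithmetic - a sketch (numbers copied from the Phaser output above):

  import numpy as np

  frac1 = np.array([-0.02, 0.02, 0.02])   # copy 1, essentially same orientation
  frac2 = np.array([-0.37, 0.02, 0.02])   # copy 2
  dt = (frac1 - frac2) % 1.0              # fractional translation between copies
  print(dt)                               # -> [0.35 0. 0.]: a pure translation
                                          #    along a, a candidate tNCS vector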
Re: [ccp4bb] problem with the symmetry water molecule
Hello Firdous

You are seeing two because you are displaying crystallographic symmetry, and so you see the water's symmetry mate. Coot only places one (check the PDB file) but displays a second generated by symmetry. It pays to place that water molecule as precisely as possible on the symmetry axis - i.e. make it as close as possible to its symmetry mate - so that refinement programs will treat it as a special-position water and eliminate the extra one.

Phil Jeffrey
Princeton

On 3/11/19 12:09 PM, Firdous Tarique wrote:
Hello everyone. I am having a difficult time fitting a water molecule which is right at the centre of symmetry. Every time I try to fit one water molecule, it fits two, because the symmetry atom is at the same place. What is the best way to solve this problem? I am talking about the water molecule where two molecules are placed at one position (4th position in the semicircle, having both pink and purple).
Thanks
Firdous
[attachment: Screen Shot 2019-03-11 at 12.00.21 PM.png]
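One way to make the water "as close as possible to its symmetry mate" is to snap it onto the axis by averaging it with that mate - a small numpy sketch in fractional coordinates (R and t stand for the relevant symmetry operator; the values below are an example two-fold along c):

  import numpy as np

  R = np.array([[-1, 0, 0], [0, -1, 0], [0, 0, 1]])  # example 2-fold along c
  t = np.zeros(3)
  x = np.array([0.503, -0.002, 0.25])                # water, fractional coords

  mate = R @ x + t
  x_special = (x + mate) / 2.0   # for a 2-fold, the midpoint lies on the axis
  print(x_special)               # -> [0. 0. 0.25]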
Re: [ccp4bb] translational NCS & twinning
Donghyuk

The combination of two things gives me cause for concern:

1. You've reindexed something that apparently scaled OK in point group 622 into point group 2, with a smaller cell. Since it's hard to fake that sort of data agreement in 622, I assume your data is at the very least pseudo-622.

2. You've modeled that additional symmetry using a whole array of twin operators and some non-crystallographic symmetry.

This may in fact be the correct model, but there's a significant risk that you're inappropriately modeling something.

Let's assume for a moment that your small-cell C2 refinement with 6 twin operators improves the model somewhat. Now go back and generate scaled data sets in all possible point groups suggested by Pointless for that data, and try to find molecular replacement solutions with your partially-refined model, testing every possible space group in every possible point group based on the Pointless suggestions. The idea is to model more of the 622 pseudo-symmetry as crystallographic symmetry, using fewer twin operators in refinement. If you test all of these combinations, which might be quite extensive, you might find one or more that fit nearly as well as your current C2 solution and have higher symmetry. You should take a hard look at those potential solutions.

Only when you've thoroughly exhausted alternative molecular replacement solutions can you be confident that your C2 model is in fact the only reasonable explanation of your data. But as it stands it is rather atypical, and it warrants further investigation.

Additional evidence that your C2 cell is the only reasonable model would be identifiable electron-density differences between each chain in your (presumed) multi-chain model.

Cheers,
Phil Jeffrey
Princeton

On 1/10/19 11:08 AM, Donghyuk Shin wrote:
Dear Jacob Keller and Vipul, thank you both very much for the reply. Regarding the R-values, I am just wondering whether the huge gap between refinements with and without the twin operators is possible even if the crystal is not twinned?
Best wishes,
Donghyuk
[ccp4bb] Refmac: removing selected ligand hydrogens after making a link
I've got a couple of instances where I have non-standard amino acids, nevertheless present in the monomer dictionary, that have additional non-peptide covalent linkages. I've figured out how to define these, but if I opt to output hydrogens as a diagnostic, I see that Refmac doesn't delete the ligand hydrogens that were present at the linkage point. Nothing catastrophic happens in refinement, but extra atoms lying along other covalent bonds make me a little queasy.

Is there something (non-obvious) I can put in an additional user-defined .cif library to do this? Or do I simply define a new version of the monomer (without the errant hydrogen) and hope that it overwrites the previous definition? I'm doing this at borderline atomic resolution.

Thanks,
Phil Jeffrey
Princeton
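One avenue that might work (untested, from memory of the monomer library's modification mechanism - compare the tag names against the mods in $CLIBD_MON/list/mon_lib_list.cif before trusting this): define a _chem_mod that deletes the offending hydrogen, and hang it off the link via the link's mod_id field. Something like:

  data_mod_DELH1
  loop_
  _chem_mod_atom.mod_id
  _chem_mod_atom.function
  _chem_mod_atom.atom_id
  _chem_mod_atom.new_atom_id
   DELH1  delete  HB2  .

with _chem_link.mod_id_1 (or mod_id_2) in the link definition set to DELH1. The mod id, the data block name and the atom name HB2 are all hypothetical here.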
Re: [ccp4bb] high TFZ score but 50% of Rfree
R = 50% at a resolution of 2.2 Å is a lot different to 50% at, say, 3.5 Å resolution. What happens if you refine it at 3.0 or 3.5 Å? What's the model vs target sequence % identity?

Phil Jeffrey
Princeton

On 11/17/17 3:23 PM, Yue Li wrote:
Dear all, I have several datasets (the best reaching 2.2 Å resolution) giving space group C2 with two molecules in the asymmetric unit (65.1% solvent content). When running MR using a template (<20% sequence identity to the target molecule), I got a solution with high TFZ (23.7) and LLG (842). However, Rfree sticks at 50% in structure refinement using Phenix. There are no complaints from Xtriage - no twinning, no translational NCS. I think the structure solution looks reasonable, in that it can explain the formation of the three disulfide bonds through the sequence threading. I tried to search for three molecules in the asymmetric unit, but the final solution gives me two molecules. Do you have any suggestions for this high-Rfree problem? Thank you very much for your help.
All the very best,
Simon
Re: [ccp4bb] AW: Another troublesome dataset (High Rfree after MR)
Rarely do I disagree with the wit and wisdom of James Holton, but R1 is not a property that Macromolecular World is unaware of. R1 is just Rwork:

  R1 = Σ | |Fo| - |Fc| | / Σ |Fo|

However, e.g. George Sheldrick's SHELXL reports it based on a 4 sig(F) cutoff as well as on all data. Example:

  R1 = 0.0421 for 27579 Fo > 4sig(Fo) and 0.0488 for all 30318 data
  wR2 = 0.1153, GooF = S = 1.083, Restrained GooF = 1.083 for all data

(this Small Molecule World structure is not yet finished). wR2 is a weighted R-factor based on |F|^2. See http://shelx.uni-ac.gwdg.de/SHELX/shelxl_user_guide.pdf

The CIF file stores the two different R1 values as:

  _refine_ls_R_factor_all  0.0488
  _refine_ls_R_factor_gt   0.0421

So don't expect that labeling anything "R1" uniquely defines whatever sigma cutoff you are actually using. It's not implicit: you must specify it. But preferably don't report it at all, and just use it for diagnostic purposes.

Phil Jeffrey
Princeton

On 10/16/17 11:02 AM, James Holton wrote:
If you suspect that weak data (such as all the spot-free hkls beyond your anisotropic resolution limits) are driving up your Rwork/Rfree, then a good sanity check is to compute "R1". Most macromolecular crystallographers don't know what "R1" is, but it is not only commonplace but required in small-molecule crystallography. All you do
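Computing R1 with and without the cutoff is a couple of lines once you have Fo, sig(Fo) and Fc arrays in hand - a Python sketch (array names are placeholders):

  import numpy as np

  def r1(fo, fc, sig_fo=None, cutoff=None):
      """R1 = sum| |Fo| - |Fc| | / sum|Fo|, optionally for Fo > cutoff*sig(Fo)."""
      fo, fc = np.abs(fo), np.abs(fc)
      if cutoff is not None:
          keep = fo > cutoff * sig_fo
          fo, fc = fo[keep], fc[keep]
      return np.sum(np.abs(fo - fc)) / np.sum(fo)

  # r1(fo, fc)                      -> R1 for all data (_refine_ls_R_factor_all)
  # r1(fo, fc, sig_fo, cutoff=4.0)  -> SHELXL-style R1 for Fo > 4 sig(Fo)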
[ccp4bb] Job Opening - Biophysics Facility Manager, Chemistry and Molecular Biology Depts, Princeton University
Biophysics Facility Manager
Associate Professional Specialist Rank

The Departments of Chemistry and Molecular Biology at Princeton University seek an Associate or more senior Professional Specialist to manage a new state-of-the-art Biophysics Core Facility. The successful candidate will play a leading role in equipping the Facility with the latest instrumentation and in interfacing with a vibrant community of faculty and research scientists. This individual will also serve as Analytical Spectroscopist for the Spectrometry and Small Instruments Core Facilities.

Applicants must have a Ph.D. and demonstrated proficiency working with macromolecules using technologies such as analytical ultracentrifugation, surface plasmon resonance (SPR), microscale thermophoresis, and/or isothermal titration calorimetry (ITC). Applicants must apply online at https://www.princeton.edu/acad-positions/position/3421 and submit a cover letter, CV, and the names and email addresses of 3 references. Appointment is for one year, with renewal contingent on satisfactory performance. This position is subject to the University's background check policy.

Princeton University is an Equal Opportunity/Affirmative Action Employer, and all qualified applicants will receive consideration for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity or expression, national origin, disability status, protected veteran status, or any other characteristic protected by law.

For questions not covered by the application URL above, email Prof. Hughson at hugh...@princeton.edu

Cheers,
Phil Jeffrey
Princeton
[ccp4bb] Assistant Professor position in Cryo-Electron Microscopy/Tomography at Princeton University
The Molecular Biology Department at Princeton University invites applications for a tenure-track faculty position at the Assistant Professor level. We are seeking a colleague whose research will leverage high-resolution cryo-electron microscopy and/or cryo-electron tomography in the study of outstanding biological questions. The successful candidate will join a friendly, highly collaborative faculty and will have access to superb resources, including a new 300 kV Titan Krios TEM. We seek faculty members with a strong commitment to teaching, mentoring, and fostering a climate that embraces both excellence and diversity.

See the link: https://puwebp.princeton.edu/AcadHire/apply/application.xhtml?listingId=2821

Applications must be received by October 31, 2017.

Note: I'm just the messenger and not involved at all in the search process.

Phil Jeffrey
Macromolecular crystallography facility manager
Princeton
Re: [ccp4bb] Problem with a cell content
Hello Anna

You've already found the correct number of molecules in the asymmetric unit. 21% Rwork is a quite respectable value for a structure at this resolution, and while 80% solvent is a relatively rare occurrence, it's not unprecedented (a couple of years back I did one at 3.0 Å with 75% solvent - PDB 4U6U). If you were missing half your asymmetric unit from your model, Rwork would be held up in the mid-30% range and there would be regions of relatively high difference density outside the model.

Phil Jeffrey
Princeton

On 7/11/17 12:31 PM, Koromyslova, Anna wrote:
Dear CCP4 members, I am working on a structure of a protein in complex with an antibody fragment (approx. 50 kDa together). Molecular replacement with closely related proteins always comes up with one complex in the asymmetric unit, although the MW of protein to which Matthews applies is 125 kDa and corresponds to two complexes. Phaser gives two warnings: "Large non-origin Patterson peak indicates that translational NCS is present" and "Solutions with Z-scores greater than 27.2 (the threshold indicating a definite solution) were rejected for failing packing test". I couldn't get a solution with two subunits, although I have tried multiple combinations, including only the conserved parts of both proteins and different space groups including P1. Phenix AutoBuild also yielded only one complex. So, the question is whether I can use the structure as is, despite the very high solvent content (80%), or should I try something else. I would be very grateful for any suggestions. When the solution with a single complex is refined, the statistics are the following:

  R-work                0.2129
  R-free                0.2459
  Matthews coefficient  6.22
  Percentage solvent    80.22
  Resolution range (Å)  48.34 - 2.9 (2.98 - 2.9)
  Space group           P 62 2 2
  Unit cell             167.45 167.45 143.538 90 90 120
  Multiplicity          19.1 (18.3)
  Completeness (%)      99.44 (94.39)
  Mean I/sigma(I)       24.59 (2.71)
  Wilson B-factor       64.28
  R-merge               0.1256 (1.186)
  R-meas                0.1291
  CC1/2                 0.999 (0.85)
  CC*                   1 (0.959)

Thank you very much for your help,
Anna

Dr. Anna Koromyslova, Postdoctoral researcher
German Cancer Research Center (DKFZ), F150, Im Neuenheimer Feld 242, D-69120 Heidelberg, Germany
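The Matthews-to-solvent arithmetic in Anna's table is easy to verify - a one-function Python sketch (1.23 is the usual constant, roughly the 0.74 cm^3/g protein partial specific volume times 1.66):

  def solvent_fraction(vm):
      # Matthews (1968): protein volume fraction ~ 1.23/Vm, Vm in A^3/Da
      return 1.0 - 1.23 / vm

  print(solvent_fraction(6.22))  # -> 0.802, the ~80% quoted above
  print(solvent_fraction(2.44))  # -> ~0.50, a much more typical value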
[ccp4bb] RuH3R system bits and pieces (USA)
I've just started decommissioning our aged Rigaku X-ray system prior to replacement, and before I consign every bit to surplus and the scrap yard, there's a possibility that someone else in the USA is nursing an old RuH3R/RaxisIV++ system and could find use for a board or TMP controller or ...

The only item that might have residual value is the optical upgrade that we did 7 years ago - a Xenocs Fox2D Cu 25-25P multilayer. I have one anode rebuild kit that dates from mid-2013. I have an unopened (but decade+ old) box of filaments (CN4892V2).

Please email me directly, not to the list.

Thanks,
Phil Jeffrey
Princeton
Re: [ccp4bb] Fwd: how to calculate a difference map between two heterodimers in heterotetrameric protein
Tricky - perhaps this could be viewed as "anti-averaging", methodologically.

1. Use USF programs MAMA and IMP to generate a mask and optimize the NCS operator (or skip this step if you feel you know yours accurately).

2. Use CCP4's MAPROT to rotate the map of one monomer onto the other.

3. Conceivably use USF program COMDEM to create the "difference map", assuming it will tolerate a weighting of -1. Or perhaps MAPROT will take a negative scale. Or MAPMASK - there's a range of map manipulation programs that can scale an input map, but I've never tried to invert one.

Unless you actually wanted Fourier coefficients, it shouldn't be impossible to create a masked volume of the difference between two maps after rotating one of them.

Phil Jeffrey
Princeton

On 3/10/17 5:24 PM, Oleg Zadvornyy wrote:
Dear All, we are working on a tetrameric protein containing 2 heterodimers which are related to each other by a 2-fold symmetry. There are differences at the active site between the two heterodimers in the crystals, and we would like to make a difference map to compare one heterodimer to the other. I would really appreciate your advice and suggestions on how to perform this comparison.
Thank you,
Oleg

Oleg A. Zadvornyy, PhD
Institute of Biological Chemistry, Washington State University
237 Clark Hall, Pullman, WA 99163
Tel: (509)-335-9837, Lab: (509)-335-1958
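Once the MAPROT step has put the two monomer maps on the same grid, the subtraction itself is simple - a hedged gemmi/numpy sketch (file names are placeholders; this assumes identical grids and ignores masking):

  import numpy as np
  import gemmi

  m1 = gemmi.read_ccp4_map('monomerA.map')
  m2 = gemmi.read_ccp4_map('monomerB_rotated_onto_A.map')

  a1 = np.array(m1.grid, copy=False)   # no-copy view onto the map values
  a2 = np.array(m2.grid, copy=False)
  a1 -= a2                             # i.e. the "weighting of -1", in place

  m1.update_ccp4_header()
  m1.write_ccp4_map('difference.map')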
Re: [ccp4bb] CCP4BB Digest - 13 Jan 2017 to 14 Jan 2017 (#2017-15)
On 1/19/17 3:54 PM, Panneerselvam, Saravanan wrote: We observed additional density around ADP that fits perfectly like a gamma phosphate Hello Saravanan At 1.4 Angstrom resolution wouldn't that suggest that you've somehow got ATP in there ? I don't think I understand the other option - were you proposing an ADP-O-C(O)2 arrangement to explain the density ? Surely that has a rather different shape, considerably different scattering power at the center of the terminal group (C vs P) and probably different X-O bond lengths. All of these should show in the density maps at 1.4 Å, although the bond length issue could be quite subtle. Phil Jeffrey Princeton mimicking like ATP bound state, surrounded and coordinated by two metal ions (resolution is 1.4A). There is a change in space group (from I212121 to P212121) and further important conformation changes are observed around the ATP binding pocket and a distant region. This is the only xtal we obtained in this space group, and all other xtals (measured 10 xtals) from the same plate belong to I212121. Thanks for your help and time! Saravanan
Re: [ccp4bb] OT: mapping PDB to mmCIF data quantities
Thanks Jose - I missed that one. REMARK 2 is somewhat ambiguous with: _refine.ls_d_res_high and _reflns.d_resolution_high although the former makes more sense and seems to be what corresponds to REMARK 2. Haven't yet seen an entry with only _reflns.d_resolution_high and not _refine.ls_d_res_high but there are several where the resolution of refinement is apparently significantly higher than the resolution of the source data: 1AU7, 1AW7 etc. Cheers Phil Jeffrey Princeton On 7/8/15 10:04 AM, Jose Manuel Duarte wrote: This looks like the mapping you are after: http://mmcif.wwpdb.org/docs/pdb_to_pdbx_correspondences.html It maps only the structured PDB data items to their equivalent mmCIF items. For instance REMARK 2 is not there, but REMARK 200 is. The resolution value should then be in REMARK 200 RESOLUTION RANGE HIGH (corresponding to mmCIF data item _refine.ls_d_res_high). Jose
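[A quick command-line check of which of the two items a given entry actually carries - the entry file here is a hypothetical local copy of the mmCIF:

grep -E '_refine\.ls_d_res_high|_reflns\.d_resolution_high' 1au7.cif
]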
Re: [ccp4bb] crystal habit/morphology and the relationship to unit cell contents
I would have thought that what the indexing routine defined as [001] vs [00-1] would be essentially random as one would obtain the equivalent indexing in 622 in both up and down alignment of the crystallographic a/b/c axes with respect to crystal morphology. Phil Jeffrey Princeton On 6/1/15 1:44 PM, Scott Lovell wrote: Hi Paul, If you have access to diffractometer equipped with a 4-circle goniometer, you should be able to index the faces of the crystals. All you need to do is collect some images to index the lattice and determine the orientation matrix. Most instruments have software that allows one to then orient specific faces or crystallographic directions relative to various directions of the instrument (eg. camera, phi axis, direct beam, etc). So after indexing, you could then orient the [001] direction of the crystal towards the camera to determine if this is the top or the base. You can also determine the direction of the a/b axes [100] and [010] relative to the crystal and index the other faces. If you can also measure the interfacial angles, this may help you to confirm the indices. If you do this for a number of samples, is the top face always the [001] direction or is it the [00-1] direction for other crystals? Assuming that you are growing these crystals by hanging drop, my guess is that the base is in contact with the coverslip during growth and you observe this half pyramid habit. If you were to grow the crystals using the floating drop method, to prevent contact with the plate materials, would the crystals form a bipyramidal habit? Or do you see crystals in the current drop that have the same habit but are not in contact with the plate materials? Scott -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Paul Paukstelis Sent: Monday, June 01, 2015 11:21 AM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] crystal habit/morphology and the relationship to unit cell contents I'm interested in knowing how to figure out the relationship between the unit cell contents and the crystal habit in these crystals (small attachment, two roughly orthogonal views). Space group is P64 (enantiomeric) , and you can clearly see the six-fold. The question becomes how to determine which direction the screw axis is going with respect to top and the base of the pyramidal crystals (right image) so I can gauge how/why the crystals grow this way based on the cell contents. Thanks in advance. --paul
Re: [ccp4bb] proton scattering by X-rays
Mark, In the small-molecule crystal structures I work with it's relatively common to see localized difference electron density along covalent bonds or in the places you'd expect to see lone pairs during refinement after you've fit and modeled the atoms reasonably well and the phases are pretty good. It's usually not as strong as difference density for hydrogens, before you put them in, but it's often pretty clearly visible once you have. (I use SHELXLE as an interface for small molecule refinements because of a somewhat Coot-like experience in viewing maps). Phil Jeffrey Princeton What you CAN do in fact is appropriately subtract spherical electron density from the experimental density and see what is left (i.e. directional ED that is 'surplus'). I tried to quickly find a paper on that, they exist, and they show that experimental density does confirm what we learn in chemistry class, orbitals are not imaginary. Mark
Re: [ccp4bb] Molecular Replacement model preparation
That document is fairly old and is in dire need of revision to reflect the modern arsenal of programs. Nevertheless: Putting the hinge axis along Z was a trick told to me by Steven Sheriff back in the days when we worked on Fab structures - which after all are classical examples of hinged molecules. One would search with separate domain fragments - split either side of the hinge - and the Z-orientation trick makes it easier to spot pairs of peaks from each search model that are related to each other. In the Fab world we searched with Fv models (VH:VL heterodimer) and CH1:CL constant region heterodimeric models. Peaks related solely by hinge motion would have similar alpha and beta angles and potentially different gamma (Crowther convention Eulerian angles). Historical note: this was back in the days when it was possible to remember the names of all the Fab fragments that were in PDB and their respective IDs. This ploy was more important in the days before Phaser or Molrep, which will now gleefully try a long list of rotation function peaks for you quite quickly, so manually parsing the list of rotation function peaks is rather unnecessary. And perhaps counter-productive. Split your molecule apart at the hinge, giving fragment1 and fragment2. Attempt to find both fragments independently. Choose the one that gives the best results: TFZ score or LLG score or discrimination between possible space groups or whatever you like. Then, attempt to find the *other* fragment in the context of that first solution (see the sketch after this message). Phil Jeffrey Princeton On 10/5/14 3:34 AM, Luzuokun wrote: Dear all, I'm doing molecular replacement using Phaser. My protein is predicted to have two domains with a "hinge" linking them. The model sequence identity is 0.27. But the MR result is poor. I've tried other programmes (Molrep, MrBump, Balbes, ...) But no improvement was observed. I think that this is due to the "open" or "closed" conformation around the hinge. I was told that I could place the Z axis along the hinge (http://xray0.princeton.edu/~phil/Facility/Guides/MolecularReplacement.html), could anyone tell me more details about how to do next? Thanks! Lu Zuokun
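[To make the two-stage recipe concrete, a minimal Phaser script sketch - untested, with placeholder file names, IDENTITY taken from Lu's 27% figure, and keywords worth checking against the Phaser documentation. With both SEARCH lines in a single MR_AUTO job, Phaser places frag1 first and then hunts for frag2 in the context of that partial solution, which is the second step described above:

phaser <<EOF
MODE MR_AUTO
HKLIN data.mtz
LABIN F=F SIGF=SIGF
ENSEMBLE frag1 PDB fragment1.pdb IDENTITY 27
ENSEMBLE frag2 PDB fragment2.pdb IDENTITY 27
COMPOSITION PROTEIN SEQUENCE full.seq NUMBER 1
SEARCH ENSEMBLE frag1 NUMBER 1
SEARCH ENSEMBLE frag2 NUMBER 1
ROOT hinge_mr
EOF

To follow the advice literally, run two single-fragment jobs first and keep whichever fragment scores better as frag1.]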
Re: [ccp4bb] Extract Euler angles from fractional coordinate matrix
The orthogonal/fractional matrix is outlined here: http://www.iucr.org/__data/assets/pdf_file/0009/7011/19_06_cowtan_coordinate_frames.pdf Sorry to say I apparently ditched my old Fortran o2f and f2o programs to do that. Bear in mind, however, that orthogonal has no fixed orientation with respect to fractional - for most space groups ncode 1 is often used but for primitive monoclinic ncode 3 is sometimes used, and I think the matrix shown in Kevin Cowtan's document above corresponds to ncode 1. Phil Jeffrey Princeton On 9/4/14 3:55 PM, Chen Zhao wrote: I am sorry, just to clarify, the fractional coordinate matrix I referred to is a rotational matrix in the fractional coordinate system. On Thu, Sep 4, 2014 at 3:52 PM, Chen Zhao c.z...@yale.edu mailto:c.z...@yale.edu wrote: Hi all, I am just curious whether there are some tools extracting the Euler angles from a fractional coordinate matrix. I have no luck searching it online. Alternatively, I found the analytical solution for the Euler angles from an orthogonal coordinate matrix. So in the worst case, my problem reduces to calculating the transformation matrix between the fractional and orthogonal coordinate system. I feel a little bit at a loss because it is 6 years since I last studied linear algebra. How can I calculate this for a specific unit cell? Thanks a lot in advance! Sincerely, Chen
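[For reference, the ncode 1 orthogonalization matrix described in Kevin Cowtan's document (a along x, b in the x-y plane); with it, a rotation F expressed in fractional coordinates converts to the orthogonal frame, from which the Euler angles follow analytically:

O = \begin{pmatrix} a & b\cos\gamma & c\cos\beta \\ 0 & b\sin\gamma & c\,\frac{\cos\alpha-\cos\beta\cos\gamma}{\sin\gamma} \\ 0 & 0 & \frac{V}{ab\sin\gamma} \end{pmatrix},
\qquad V = abc\sqrt{1-\cos^2\alpha-\cos^2\beta-\cos^2\gamma+2\cos\alpha\cos\beta\cos\gamma},
\qquad R_{\mathrm{orth}} = O\,F\,O^{-1}
]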
Re: [ccp4bb] Proper detwinning?
Chris, To change the axis ordering for e.g. changing which cell edge is the P21 B axis use an hkl matrix command. Probably can do this via the Macros during scaling but I distrust this and just edit scl.in by hand and run it as scalepack scl.in For k,l,h reindexing use hkl matrix 0 1 0 0 0 1 1 0 0 For l, h, k reindexing use hkl matrix 0 0 1 1 0 0 0 1 0 Systematic absences would be an anecdotal indicator that it is/isn't P212121. That would show strong systematic absences for (h,0,0), (0,k,0), (0,0,l) reflections. (Or reflexions if one prefers). While not impossible I would think it statistically unlikely to observe such absences if the data was really P1, P2 or P21. Going back to the images to eyeball the actual reflections on the display can be pretty illuminating. I don't remember Scalepack giving much detail in postrefinement but paying attention to the positional chi^2 values during integration might give clues about how far from 90 those unit cell axes are wandering if you try integrating in different space groups. There's also a method (or was, last time I tried it) to change the way Scalepack postrefines unit cell dimensions (value per frame or value per crystal) which might also help. More hacking of scl.in might be required. However I'm usually pretty happy if my R-free drops 12% at 2.0 Angstrom resolution when going from P21 to P1. I would look for legitimate deviations between previously identical monomers in the map and probably consider using NCS to reduce the random deviation between monomers that actually are identical by symmetry. You may have assigned the crystallographic 21 down the wrong unit cell axis in that P21 test case. Phil Jeffrey Princeton On 7/11/14 7:33 PM, Chris Fage wrote: Nat and Misha, Thank you for the suggestions. Xtriage does indeed detect twinning in P1, reporting similar values for |L|, L^2, and twin fraction as in P212121. The unit cell dimensions for the 2.0-A structure (P1) are: 72.050 105.987 201.142 89.97 89.98 89.94 P 1 The unit cell dimensions for the 2.8-A structure (P212121) are: 75.456 115.154 202.022 90.00 90.00 90.00 P 21 21 21 I have been processing in HKL2000, which only recognizes one set of unit cell parameters for each Bravais lattice (does anyone know how to change this?). Specifically, for a primitive monoclinic unit cell it estimates: 104.53 71.82 200.99 89.86 91.80 91.16 This is the unit cell which refined to Rwork/Rfree ~ 27%/34%. Indexing in mosflm gives three options for primitive monoclinic: 105.6 71.7 200.9 90.0 90.1 90.0 71.7 105.6 201.0 90.0 89.9 90.0 71.7 200.9 105.6 90.0 90.3 90.0 Attempting to integrate in any of these space groups leads to a fatal error in subroutine MASKIT. I can also use the index multiple lattices feature to get a whole slew of potential space group; however, integrating reflections leads to the same fatal error. Finally, Zanuda tells me that P212121 is the best space group, according to R-factors. However, I do not believe P212121 is the correct assignment. Best, Chris On 7/10/14, Isupov, Michail m.isu...@exeter.ac.uk wrote: I would recommend to run ZANUDA in the default mode from ccp4i or on CCP4 web server. ZANUDA has resolved several similar cases for me. Misha From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Chris Fage [cdf...@gmail.com] Sent: 10 July 2014 01:14 To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Proper detwinning? 
Hi Everyone, Despite modelling completely into great electron density, Rwork/Rfree stalled at ~38%/44% during refinement of my 2.0-angstrom structure (P212121, 4 monomers per asymmetric unit). Xtriage suggested twinning, with |L| = 0.419, L^2 = 0.245, and twin fraction = 0.415-0.447. However, there are no twin laws in this space group. I reprocessed the dataset in P21 (8 monomers/AU), which did not alter Rwork/Rfree, and in P1 (16 monomers/AU), which dropped Rwork/Rfree to ~27%/32%. Xtriage reported the pseudo-merohedral twin laws below. P21: h, -k, -l P1: h, -k, -l; -h, k, -l; -h, -k, l Performing intensity-based twin refinement in Refmac5 dropped Rwork/Rfree to ~27%/34% (P21) and ~18%/22% (P1). Would it be appropriate to continue with twin refinement in space group P1? How do I know I'm taking the right approach? Interestingly, I solved the structure of the same protein in P212121 at 2.8 angstroms from a different crystal. Rwork/Rfree bottomed out at ~21%/26%. One unit cell dimension is 9 angstroms greater in the twinned dataset than in the untwinned. Thank you for any suggestions! Regards, Chris
Re: [ccp4bb] PDB passes 100,000 structure milestone
As long as it's just a Technical Comments section - an obvious concern would be the signal/noise in the comments themselves. I'm sure PDB would not relish having to moderate that lot. Alternatively PDB can overtly link to papers that discuss technical issues that reference the particular structure - wrong or fraudulent structures are often associated with refereed publications that point that out, and structures with significant errors often show up in that way too. I once did a journal club on Muller (2013) Acta Cryst F69:1071-1076 and wish that could be associated with the relevant PDB file(s). Phil Jeffrey Princeton On 5/14/14 1:37 PM, Gloria Borgstahl wrote: I vote for Z's idea On Wed, May 14, 2014 at 12:32 PM, Zachary Wood z...@bmb.uga.edu mailto:z...@bmb.uga.edu wrote: Hello All, Instead of placing the additional burden of policing on the good people at the PDB, perhaps the entry page for each structure could contain a comments section. Then the community could point out serious concerns for the less informed users. At least that will give users some warning in the case of particularly worrisome structures. The authors of course could still reply to defend their structure, and it may encourage some people to even correct their errors. Best regards, Z *** Zachary A. Wood, Ph.D. Associate Professor Department of Biochemistry Molecular Biology University of Georgia Life Sciences Building, Rm A426B 120 Green Street Athens, GA 30602-7229 Office: 706-583-0304 tel:706-583-0304 Lab: 706-583-0303 tel:706-583-0303 FAX: 706-542-1738 tel:706-542-1738 ***
Re: [ccp4bb] KD of dimerization, off topic
That's an extremely useful link - thanks to Will Stanley for posting that one. For a VP-ITC machine I'd guess that you need to load the injector with about 500ul of protein at a concentration of 80x the Kd or more. Notice that Alan Cooper was injecting 10 microliters of protein at 2mM with a 12 microMolar dissociation constant, per injection. You would probably want to maintain that approximate ratio (~170x) because it's mostly a question of measuring deltaH with a decent signal-to-noise per injection. I recall that it takes up to 500 microLiters to load the injection syringe on a VP-ITC without air gap between plunger tip and injection point - unless someone's got a nice trick to reduce that. The rule of thumb from the VP-ITC manual - and from practical experience on our machine here - for A+B = AB is using at least 10x the Kd in the sample chamber and about 80x the Kd in the injector. That's not exactly the same situation, but 80x vs 170x suggests that the considerations are much the same. Phil Jeffrey Princeton On 2/14/14 12:52 PM, Keller, Jacob wrote: What a nice idea this ITC dilution is--a great example of a wet lab technique learned en passant on the ccp4bb. I wonder what range of Kds could feasibly be measured with existing calorimeter sensitivities? JPK -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Will Stanley Sent: Friday, February 14, 2014 12:18 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] KD of dimerization, off topic Hi Careina, Since alternative methods are being suggested... ITC can be good for quantitating a monomer-dimer equilibrium by diluting dimers out from a concentrated solution (which obviously favours the dimer) - and presuming a reasonable Kon/Koff. Alan Cooper has kindly figured out the data fitting for the rest of us: http://www.chem.gla.ac.uk/staff/alanc/itcdil.pdf I think Alan was using a VP-ITC when he was doing this stuff. Lower volumes - and presumably concentrations if the KD is small enough - are feasible in an ITC200. The protein is recoverable anyway. All the best, Will.
Re: [ccp4bb] Fwd: undefined edensity blob at glutamine sidechain
Priyank, I think it's too far out from the Calpha to be Trp, and also not quite flat enough. The map contains more information than just the shape as to identity. Compare the max peak height of the 2Fo-Fc blob near the Gln to neighboring element heights:
* Much higher than C/N/O ? Probably not citrate, propanol etc.
* Much higher than S ? Probably not Cl, Mg, S, SO4, PO4.
K is only a little heavier than S. So you might cut down the range of possibilities considerably. Partially occupied elements can be an issue, but you could narrow your range of options. You can also do the expedient thing and drop an Au in there with e.g. 0.5 occupancy (or a range of occupancies) and see what the refinement does. Phil Jeffrey Princeton On 12/10/13 7:44 AM, PriyankMaindola wrote: dear members i am trying to solve this crystal structure but I am puzzled by an undefined blob that appeared at a glutamine residue after refinement. I have attached pics of that below. Is it a covalent modification of the acid-amide side chain... as there is no charged environment around and the density seems continuous. please suggest. The following reagents were encountered by the protein during purification, crystallization and soaking: phenyl methyl sulfonyl fluoride, benzamidine, tris, dtt (could it be cyclized dtt?), k[au(cn)2], acidic pH, isopropanol, citrate, sulfate, phosphate, K+, Na+, Cl-. map contour: 2fo-fc: 1 rmsd; fo-fc (green): 3 rmsd -- *Priyank*
Re: [ccp4bb] How can I find the other molecule in the asymmetric unit?
Meisam: Probabilities are just that: many of us have had structures with large solvent contents that are statistically unlikely. Pedantic quibble: "It scales in P21 Space group with 7% linear Rfactor" really means that it scales in primitive monoclinic with a reasonable Rsymm, and I hope you also checked P2 as well as P21 when doing molecular replacement. P2 is rare, but not unprecedented. When you say "refine just the back bone" do you mean you're refining just a poly-ALA model or a non-mutated one ? Because if so, absent the side-chains and any waters, an R-factor of 31% is quite good. If so, add side-chains and then waters and continue refining and see how things look. 3-fold averaging across the current monomers plus your decent resolution should make the sequence interpretation straightforward. But:
* if you can see interpretable density for the fourth molecule, build in secondary structure elements, refine those, repeat until you can see a substructure you recognize, then place the monomer manually.
* use Arp/wArp to autobuild your structure - this has the benefit that often the map you'll get out will be a very good one even if you build the remaining monomer manually. If you're lucky it'll build it all for you.
* Autobuild in Phenix can also do much the same (I would do the first two, and perhaps all three, in parallel until one emerges a clear winner - see the sketch after this message)
* if the above still doesn't resolve things, consider the possibility that the fourth molecule is not what you think it is, or may be statistically disordered
Phil Jeffrey Princeton On 11/21/13 12:35 PM, Meisam wrote: Dear CCP4ers I have a data set that diffracts to 1.96 Å. It scales in P21 Space group with a 7% linear Rfactor. The Matthews coefficient gives 10% probability for 3 molecules in the asymmetric unit, 53% for 4 molecules, and 36% for 5 molecules. Molecular replacement just finds 3 molecules in the asymmetric unit. Running Phaser also gives a partial solution with 3 molecules. When I refine just the back bone of the protein for the 3 molecules the Rfree/Rwork does not go better than 34% / 31%, and when I run the molecular replacement on the refined structure again, and I fix it as a model to search for another molecule, it still does not find it. I have attached a photo to show the density for the fourth molecule in the asymmetric unit. What is the solution to this? Thanks in advance for your help Meisam
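[A sketch of the Phenix route from the third bullet above - argument names from memory, so check phenix.autobuild --help; all file names are placeholders:

phenix.autobuild data=data.mtz model=three_monomers.pdb seq_file=protein.seq

Both this and ARP/wARP run unattended, so the run-in-parallel advice costs little beyond CPU time.]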
Re: [ccp4bb] Orientation of molecules
* Open the molecular replacement solution in Coot
* Display crystal packing (Draw -> Cell & Symmetry), perhaps as Calphas only
* Find the symmetry-related instance of copyB that is in the correct position relative to copyA according to your preferences
* Use File -> Save Symmetry Coordinates to write the structure transposed by that operator (note: select the menu option, then click on the copyB instance)
* Since Coot will write the entire structure transposed by that symop, assemble the desired solution from copyA from the mol.rep. solution and copyB from the transposed solution. I'm a Luddite so I use emacs and/or grep for this (a sketch follows this message).
Phil Jeffrey Princeton 11/21/13 1:11 PM, Appu kumar wrote: Dear All, I think i have not explained my problem precisely. This may be a weird one but let me elaborate more. I have a protein moleculeA, with an N-term and a C-term end. Structurally, it is a dimer with an anti-parallel arrangement, i.e. the two copies associate so that they are arranged in antiparallel fashion (N-term of copyA is beside C-term of copyB). So when i am searching for two copies of the molecule in phaser it is giving me two copies in a parallel arrangement. So my question is, how to tell phaser that after fixing the orientation of the first copy, to change the orientation of the 2nd copy with respect to the first one so that their n-terminal and c-terminal ends lie beside each other. I am looking for your valuable suggestion. Thank you
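[The assembly step as an awk sketch in the same Luddite spirit - chain IDs are hypothetical, column 22 holds the chain ID in PDB format, and TER/ANISOU bookkeeping is left to the reader:

# copyA from the MR solution, copyB from the Coot-transposed file
awk '/^(ATOM|HETATM)/ && substr($0,22,1)=="A"' mr_solution.pdb > assembled.pdb
awk '/^(ATOM|HETATM)/ && substr($0,22,1)=="B"' copyB_transposed.pdb >> assembled.pdb
echo END >> assembled.pdb
]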
Re: [ccp4bb] Weird MR result
Hello Niu, 1. We need extra information. What program did you use ? What's the similarity (e.g. % identity) of your model. What's your space group ? Did you try ALL the space groups in your point group in ALL the permutations (e.g. in primitive orthorhombic there are 8 possibilities). 1a. My best guess on limited info is that you've got a partial solution in the wrong space group with only part of the molecules at their correct position. 2. I recently had a very unusual case where I could solve a structure in EITHER P41212 or P43212 with similar statistics, but that I would see interpenetrating electron density for a second, partial occupancy molecule no matter which of these space groups I tried (and it showed this when I expanded the data to P1). Might conceivably be a 2:1 enantiomorphic twin, in retrospect, but we obtained a more friendly crystal form. I hope you don't have something like that, but it's possible. Phil Jeffrey Princeton On 11/14/13 5:22 PM, Niu Tou wrote: Dear All, I have a strange MR case which do not know how to interpret, I wonder if any one had similar experiences. The output model does not fit into the map at all, as shown in picture 1, however the map still looks good in part regions. From picture 2 we can see even clear alpha helix. I guess this could not be due to some random density, and I have tried to do MR with a irrelevant model without producing such kind of regular secondary structure. This data has a long c axis, and in most parts the density are still not interpretable. I do not know if this is a good starting point. Could any one give some suggestions? Many thanks! Best, Niu
Re: [ccp4bb] Problematic PDBs
From the original ABC transporter retraction: http://www.sciencemag.org/content/314/5807/1875.2.full The Protein Data Bank (PDB) files 1JSQ, 1PF4, and 1Z2R for MsbA and 1S7B and 2F2M for EmrE have been moved to the archive of obsolete PDB entries You can get your hands on them via URLs like: ftp://ftp.rcsb.org/pub/pdb/data/structures/obsolete/XML/js/1jsq.xml.gz Phil Jeffrey Princeton On 10/17/13 10:26 AM, Nat Echols wrote: On Thu, Oct 17, 2013 at 6:51 AM, Lucas lucasbleic...@gmail.com mailto:lucasbleic...@gmail.com wrote: I wonder if there's a list of problematic structures somewhere that I could use for that practice? Apart from a few ones I'm aware of because of (bad) publicity, what I usually do is an advanced search on PDB for entries with poor resolution and bound ligands, then checking then manually, hopefully finding some examples of creative map interpretation. But it would be nice to have specific examples for each thing that can go wrong in a PDB construction. This would be a good place to start: http://www.ncbi.nlm.nih.gov/pubmed/23385452 The retracted ABC transporter structures are also good, although less obvious to the untrained eye. I forget what the PDB IDs are but I'll see if I can dig them up. -Nat
Re: [ccp4bb] A case of perfect pseudomerehedral twinning?
Hello Yarrow, Since you have a refined molecular replacement solution I recommend using that rather than global intensity statistics. Obviously if you solve in P21 and it's really P212121 you should have twice the number of molecules in the asymmetric unit and one half of the P21 asymmetric unit should be identical to the other half. Since you've got decent resolution I think you can determine the real situation for yourself: one approach would be to test whether you can symmetrize the P21 asymmetric unit so that the two halves are identical. You could do this via stiff NCS restraints (cartesian would be better than dihedral). After all, the relative XYZs and even B-factors would be more or less identical if you've rescaled a P212121 crystal form in P21. If something violates the NCS then it can't really be P212121. Alternatively you can look for clear/obvious symmetry breaking between the two halves: different side-chain rotamers for surface side-chains for example. If you've got an ordered, systematic difference in electron density between the two halves of the asymmetric unit in P21 then that's a basis for describing it as P21 rather than P212121. However if the two halves look nearly identical, down to equivalent water molecule densities, then you've got no experimental evidence that P21 with 2x molecules generates a better model than P212121 with 1x molecules. An averaging program would show very high correlation between the two halves of the P21 asymmetric unit if it was really P212121 and you could overlap the maps corresponding to the different monomers using those programs. Phil Jeffrey Princeton
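[One low-effort way to run the symmetrization test is REFMAC5's automatic NCS restraints - not precisely the stiff Cartesian restraints described above, but the closest single-keyword approximation. Keyword spellings from memory and file names are placeholders; tighten the restraint weights if needed:

refmac5 hklin p21_data.mtz xyzin p21_model.pdb hklout symtest.mtz xyzout symtest.pdb <<EOF
ncsr local
ncyc 10
end
EOF

Residues that refuse to obey the restraints - rotamer flips, shifted waters, diverging B-factors - are then the candidates for genuine symmetry breaking.]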
Re: [ccp4bb] mmCIF as working format?
On 8/7/13 8:27 PM, Ethan Merritt wrote: That would be a bug. But it hasn't been true for any version of coot that I have used. As you say, this is a common thing to do and I am certain I would have noticed if it didn't work. I just checked that it isn't true for 0.7.1-pre. Thanks. Turns out I'm using 0.7 and 0.7-pre on the octacore Mac and the laptop I use for building - slightly different versions updated at different times. I'll change versions. Apropos the other point I invariably do segment reordering via xemacs cut and paste although clearly Peek2 needs a reorder command. Phil
Re: [ccp4bb] mmCIF as working format?
Questionable practice is writing an interpretation program for operations that can be handled simply at the command line. Programs that use the API that Eugene implicitly refers to are no panacea, e.g. Coot has strange restrictions on things like changing the chain label that can be fixed in a matter of seconds by editing the PDB file in e.g. xemacs (an example follows at the end of this message). Which means that when I'm building a large structure with multiple chain fragments present during the build process, I've edited those intermediate PDB files tens of times in a single day. While alternative programs exist to do almost everything, I prefer something that works well, works quickly, and provides instant visual feedback. CCP4 and Phenix are stuck in a batch processing paradigm that I don't find useful for these manipulations. While PDB is limited and has a lot of redundant information, it's for the latter reason that it's a rather useful format for quickly making changes in a text editor. It's certainly far faster than using any GUI, and it's also faster than the command line in many instances - and I have my own command line programs for hacking PDB files (and ultimately whatever formats come next). Using mmCIF as an archive format makes sense, but I doubt it's going to make building structures any easier except for particularly large structures where some extended-PDB format might work just as well or better. Phil Jeffrey Princeton On 8/5/13 9:53 AM, Pavel Afonine wrote: Editing (for example, PDB files) by hand is a questionable practice. If you know programming use either existing reliable parsers (available for both, PDB and CIF) or write your own jiffy.
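[The chain-label example as a hypothetical one-liner - column 22 is the chain ID in PDB format; TER records are not handled here:

# relabel chain B as chain D
sed -E 's/^((ATOM  |HETATM).{15})B/\1D/' in.pdb > out.pdb
]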
Re: [ccp4bb] HKL2000 sigma cutoff
Ursula, I/sigI of -3 as I recall. Are you sure that the downstream programs you are using aren't the ones applying the cutoff ? Scalepack is, in general, perfectly happy to write negative intensities to output.sca and certainly is doing so as of HKL3000. Perhaps you need to use the TRUNCATE YES option in Truncate ? Does the output MTZ from Scalepack2mtz show the number of reflections you expect ? Phil Jeffrey Princeton On 7/5/13 3:24 PM, Ursula Schulze-Gahmen wrote: Sorry for the non-CCP4 question. I am confused about the sigma cutoff used by HKL2000 for scaling. I scaled a data set to 3.0 A resolution. I collected a complete dataset to 2.8A, but the I/sigma is about 1.0 at 3.0 A. The scaling logfile in HKL2000 shows 100% completeness in the highest resolution shell, but about 50% of the reflections are below I/sigma = 0 in the highest resolution shell. I am guessing that these negative reflections are not being written out, because the output file from HKL2000 does not have 100% completeness anymore. I would like to include these negative reflections. Is there a setting in HKL2000 that I can change or do I need to switch to a different program. Ursula
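[For reference, a sketch of the CCP4 import route that keeps the negative intensities - keywords from memory and the space group is a placeholder, so check the SCALEPACK2MTZ and TRUNCATE documentation. TRUNCATE YES applies the French & Wilson treatment, so weak and negative intensities still yield sensible amplitudes:

scalepack2mtz hklin output.sca hklout scaled.mtz <<EOF
symmetry P212121
end
EOF
truncate hklin scaled.mtz hklout truncated.mtz <<EOF
truncate yes
end
EOF
]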
Re: [ccp4bb] High Rwork/Rfree values
Haiying, As far as I can tell you've got a successful solution in molecular replacement via Phaser and then gone and refined it in the wrong space group. Based on what you've told us: you took your initial data in primitive orthorhombic and solved for the structure in Phaser while sampling all possible space groups. Phaser is telling you that your *original* data indexing is truly space group P22(1)2(1) and if you take that m.r. solution/data combination and simply *assign* the space group it should work in Refmac. In fact Phaser should have written the correct space group in the PDB file header. If you refine your original MTZ native data file with the PDB file Phaser wrote, what do you get ? You seem to have reindexed the data but not rotated the model (or re-run molecular replacement). That makes the model and data out-of-sync. Phaser does not reindex the data internally, and that's why it tries eight space groups in primitive orthorhombic rather than just the minimal set P222, P222(1), P2(1)2(1)2, P2(1)2(1)2(1). The others that it tries are alternative settings of these space groups (where appropriate). If you want to refine in P2(1)2(1)2 then reindex the data (h,k,l) -> (k,l,h) and re-run molecular replacement with the reindexed MTZ file. If the above is a misinterpretation of what you wrote, my alternative advice on this is:
1. throw the thing at Arp/wArp and look hard at the maps you get out. The structure might have changed more than you thought.
2. rescale the data in P1 and put it into Pointless and/or Xtriage to check for twinning and point group assignment
3. I'm fairly sure that the (72.6, 78.0, 112.5) and (66.5, 70.5, 137.0) cells are unrelated but #2 will show that.
4. If all else fails solve it in P1 and find the space group by inspection afterwards
Phil Jeffrey Princeton
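[The reindexing step as a sketch (POINTLESS syntax; file names hypothetical), after which molecular replacement is re-run against the reindexed file:

pointless hklin native.mtz hklout reindexed.mtz <<EOF
reindex k,l,h
EOF
]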
Re: [ccp4bb] refinement hanging--what am I missing?
Pat, Try TLS - I usually don't invoke it at this type of resolution but in one case I saw it make a surprisingly significant improvement. I would also be tempted to put the structures through Arp/wArp and see if it lowers the R-free any more - rightly or wrongly I view this as the lowest reasonably achievable R-factor with isotropic modeling - and especially look at the maps after it has finished in case it shows up anything you had missed. When I had P21 - P2x212x twinning the R-free held up in the mid-30's at 2 Angstrom resolution so absent any indications in Truncate or Xtriage I wouldn't suggest that. A final question is how much disordered structure is missing from your models ? Could a partly ordered but unmodeled segment be driving up R-free ? Cheers Phil Jeffrey Princeton On 4/26/13 5:38 PM, Patrick Loll wrote: Hi all, Here is a problem that's been annoying me, and demanding levels of thought all out of proportion with the importance of the project: I have two related crystal forms of the same small protein. In both cases, the data look quite decent, and extend beyond 2 A, but the refinement stalls with statistics that are just bad enough to make me deeply uncomfortable. However, the maps look pretty good, and there's no obvious path to push the refinement further. Xtriage doesn't raise any red flags, nor does running the data through the Yeates twinning server. Xtal form 1: P22(1)2(1), a=29.0, b=57.4, c=67.4; 2 molecules/AU. Resolution of data ~ 1.9 Å. Refinement converges with R/Rfree = 0.24/0.27 Xtal form 2: P2(1)2(1)2(1), a=59.50, b=61.1, c=67.2; 4 molecules/AU. Resolution of data ~ 1.7 Å. Refinement converges w/ R/Rfree = 0.21/0.26 As you would expect, the packing is essentially the same in both crystal forms. It's interesting to note (but is it relevant?) that the packing is quite dense--solvent content is only 25-30%. This kind of stalling at high R values smells like a twin problem, but it's not clear to me what specific kind of twinning might explain this behavior. Any thoughts about what I might be missing here? Thanks, Pat --- Patrick J. Loll, Ph. D. Professor of Biochemistry Molecular Biology Director, Biochemistry Graduate Program Drexel University College of Medicine Room 10-102 New College Building 245 N. 15th St., Mailstop 497 Philadelphia, PA 19102-1192 USA (215) 762-7706 pat.l...@drexelmed.edu
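[Picking up the TLS suggestion at the top of Phil's reply, a minimal REFMAC5 sketch - keywords from memory, and groups.tls is a hypothetical TLS-group definition file (e.g. one group per chain):

refmac5 hklin data.mtz xyzin model.pdb tlsin groups.tls tlsout refined.tls \
        hklout refined.mtz xyzout refined.pdb <<EOF
refi tlsc 5
ncyc 10
end
EOF
]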
Re: [ccp4bb] Off-topic: PDB statistics
From my own db program:
Number of entries in histogram: 711
Total number of instances : 78467
 0  48249  0.6149  MOLECULAR REPLACEMENT
 1   8557  0.1091  NULL
 2   5632  0.0718  SAD
 3   5128  0.0654  MAD
 4   3600  0.0459  FOURIER SYNTHESIS
 5   1762  0.0225  OTHER
 6   1171  0.0149  MIR
 7    511  0.0065  SIRAS
 8    505  0.0064  DIFFERENCE FOURIER
 9    392  0.0050  MIRAS
10    229  0.0029  AB INITIO
11    226  0.0029  MR
12    151  0.0019  RIGID BODY REFINEMENT
13    146  0.0019  ISOMORPHOUS REPLACEMENT
14    110  0.0014  AB INITIO PHASING
15    109  0.0014  MULTIPLE ISOMORPHOUS
16     83  0.0011  N/A
17     75  0.0010  SIR
18     70  0.0009  RIGID BODY
19     64  0.0008  DIRECT METHODS
20     50  0.0006  RE-REFINEMENT USING
21     37  0.0005  DIFFERENCE FOURIER PLUS
22     36  0.0005  ISOMORPHOUS
23     34  0.0004  REFINEMENT
24     30  0.0004  MOLREP
25     26  0.0003  SE-MET MAD PHASING
26     25  0.0003  RIGID-BODY REFINEMENT
27     24  0.0003  ISOMORPHOUS METHOD
etc
It's a very heterogeneous field, that REMARK 3 field, and the ones above are the most dominant entries (note the 8,557 that are NULL that are in fact crystal structures). At least in some versions of ADIT the guidance that RCSB gives about this field is very weak, which accounts for the variation. I'm interested in what "ab initio phasing" really means, but I've been too lazy to mine the actual entries for details. Phil Jeffrey Princeton On 4/15/13 9:48 AM, Raji Edayathumangalam wrote: Hi Folks, Does anyone know of an accurate way to mine the PDB for what percent of total X-ray structures deposited as on date were done using molecular replacement? I got hold of a pie chart for the same from my Google search for 2006 but I'd like to get hold of the most current statistics, if possible. The PDB has all kinds of statistics but not one with numbers or percent of X-ray structures deposited sorted by various phasing types or X-ray structure determination methods. For example, an Advanced Search on the PDB site pulls up the following:
Total current structures by X-ray: 78960
48666 by MR
5139 by MAD
5672 by SAD
1172 by MIR
94 by MIR (when the word is completely spelled out)
75 by SIR
5 by SIR (when the word is completely spelled out)
That leaves about 19,000 X-ray structures either solved by other phasing methods (seems unlikely) or somehow unaccounted for in the way I am searching. Maybe the way I am doing the searches is no good. Does someone have a better way to do this? Thanks much. Raji -- Raji Edayathumangalam Instructor in Neurology, Harvard Medical School Research Associate, Brigham and Women's Hospital Visiting Research Scholar, Brandeis University
Re: [ccp4bb] refinement protein structure
That's quite brave - shipping your entire structure to people that could be actual competitors. But it was fun to play at 1.4 Angstrom over lunch.
Practical points:
* not everyone loves 12Mb of attachments in one email in their inbox, so if you do this again please put the files on a webserver and point us there
Structural points:
* the map looks pretty good, but I think the sequence is misassigned in some regions (e.g. A118-A122 etc). Automation is a good tool but a poor master, and extreme caution is required before taking the results too literally. Usually you'd expect a 1.4 Angstrom map to be easy to autobuild but I recently had a sequence misassignment at just that resolution. That map was trivial to interpret with the correct sequence however - one of the joys of working with Arp/wArp at 1.4 Angstrom.
* the large number of positive difference density blobs and water molecules clustered in what otherwise would be the solvent void strongly suggest that there's a second molecule present. If I take redfluorescentprotein_refine_10.pdb (waters removed) and exptl_fobs_phases_freeR_flags.mtz and ask Phaser to look for two molecules, it finds them quite successfully (for the record, an LLG of 15111 using nominal sequence identity of 90%). I will send this to you off-list. Please note that Phaser is using a different origin for this molecular replacement solution so the coordinates and your previous map do not overlap. This rather nicely explains why your structure had an R-factor in the 40's despite being a half-way decent model. The new MR solution has an R-free in the 30's in the phenix.refine job I'm running right now.
Going forward I suggest you utilize the Arp/wArp program to autobuild your structure for you, starting from the molecular replacement solution (or, perhaps, with it stripped to ALA). While you could use Autobuild, this is the CCP4 list and so you should use CCP4 programs. Phil Jeffrey Princeton On 3/27/13 12:22 PM, Tom Van den Bergh wrote: Dear members of ccp4bb, I need some help with the refinement of my structure of a variant of mRFP (monomer red fluorescent protein, sequence in attachment). I have done molecular replacement with phaser with model 2VAD of the protein database. Then i have done some model building with phenix.autobuild. (2 pdb's (overall...), freeR flags and log file attached) When i refine with phenix.refine my structure i get a R-value of 0,42 which is still way too high. (redfluorescent protein.pdb, .mtz and logfile attached) When i look at the structure in coot i find many unmodelled blobs and many outliers in density analysis and rotamer analysis. The problem is that there are so many problems with my structure, that i dont know where to begin. Could you try some refinement for me, because this is the first structure that i need to solve as a student and i dont have too much experience with it. Greetings, Tom
Re: [ccp4bb] vitrification vs freezing
Perhaps it's an artisan organic locavore fruit cake. Either way, your *crystal* is not vitrified. The solvent in your crystal might be glassy but your protein better still hold crystalline order (cf. ice) or you've wasted your time. Ergo, cryo-cooled is the description to use. Phil Jeffrey Princeton On 11/15/12 1:14 PM, Nukri Sanishvili wrote: s: An alternative way to avoid the argument and discussion all together is to use cryo-cooled. Tim: You go to a restaurant, spend all that time and money and order a fruitcake? Cheers, N.
Re: [ccp4bb] Compounded problems for molecular replacement
Hello Jose, Depending on what data integration program you used, trying XDS may help you out a little with spot overlap. Example #3 in my rather out-of-date page: http://xray0.princeton.edu/~phil/Facility/Guides/MolecularReplacement.html illustrates how you could find 8 domains, especially if you pay attention to the rotation angle values for the candidate domain solutions. This example did not have twinning but did have a little pseudo-centering. This is a 15 year-old example from back when I was using AMORE, so I should clearly rewrite that page. Additionally, if the inter-domain flexibility is restricted to rotation about a single axis, it would be a good idea to rotate your model so that this rotation axis is parallel to the Z axis. This was a method that was exploited with Fab structures (whose elbow angle is a fairly restricted rotation). If so oriented, rotation function peaks relating different domains in the same molecule should show very similar alpha, beta and differ in gamma. Good luck, Phil Jeffrey Princeton On 10/26/12 8:27 AM, Seijo, Jose A. Cuesta wrote: Hi all, I am dealing with a molecular replacement problem for a 60KDa protein composed of 2 rigid domains joined by a flexible linker which can move relative to each other. Sequence identity for my best model is 46%, evenly spread, so in principle this should be a tractable problem. Then the problems start to pile up: a) The unit cell is 56.7Å, 288.5Å, 69.4Å, 90, 93.5, 90. Spacegroup P21. Rmerge 12% to 2.4Å. The data also merges relatively well (Rmerge 16%) in P222 with the same a, c and b axes, now of course in that order. In the P21 case, that corresponds to 4 monomers in the asymmetric unit with a solvent content of approx. 50%, giving me 8 domains to find if I separate them. b) The 288 axis means that my data show some overlap in almost all orientations (might be corrected in the future with new datasets), so that my low resolution data are likely unreliable. c) Intensity distributions suggest twinning in either point group. Actually, they are beyond the perfect twinning case, which I attribute to the reflection overlaps making the strong reflections weaker (integration box too small) and the small ones stronger (from tails of adjacent strong ones). Of course the latter would mean that the twin fraction estimation is unreliable, but all moments, etc show perfect twin statistics, so I am assuming that there is indeed perfect twinning of some sort. So, the question is, what is the best strategy to deal with this many (4 or 8) body / noisy / twinned problem? I am trying EPMR with many bodies, but I suspect the twinning would throw it off the right track, and one domain seems to be too little of the diffracting matter to show any sort of discrimination between solutions and non-solutions if I do the usual serial searches. I plan to let autotracing programs be the judge of success, but I am not sure of how well those can deal with twinning. Can Arp-Warp use twinned data? Thanks in advance for any tips. Jose. Jose Antonio Cuesta-Seijo, PhD Carlsberg Laboratory Gamle Carlsberg Vej 10 DK-1799 Copenhagen V Denmark Tlf +45 3327 5332 Email josea.cuesta.se...@carlsberglab.dk
Re: [ccp4bb] Determining Rpim value
If the alternative to reprocessing your data with XDS, iMosflm, Xia2, autoProc etc is unpalatable, might I suggest the nearly-as-unpalatable method as follows: If you can still run Scalepack on all your .x files, put the line NO MERGE ORIGINAL INDEX in the scalepack script file. Get the .sca or .hkl file out of that. Use the following - strictly no warranties - script:
# Assumes scalepack.hkl is created with NO MERGE ORIGINAL INDEX
#
pointless SCAIN scalepack.hkl HKLOUT scalepack.mtz <<EOF
NAME PROJECT mydata CRYSTAL mydata1
CELL 75.2 75.2 135.8 90.000 90.000 90.000
EOF
#
#
scala hklin scalepack.mtz hklout scala.mtz \
      scales scala.scales \
      rogues scala.rogues \
      normplot scala.norm \
      anomplot scala.anom <<EOF
bins 20
resolution 2.9
run 1 all
resolution run 1 high 2.9
name run 1 project AUTOMATIC crystal DEFAULT dataset scalepack
scales constant
exclude sdmin 2.0
sdcorrection fixsdb noadjust norefine both 1.0 0.0
anomalous off
EOF
Corrections, comments or outright repudiation of this script quite welcome - this was my first attempt. Phil Jeffrey Princeton On 9/4/12 6:14 PM, Michelle Deaton wrote: I am trying to obtain an Rpim (precision indicating merging R-factor) value for a dataset that I have already processed with HKL2000/Scalepack and refined. Is there a straightforward way to obtain this value from my data? From what I understand, most of my options involve going back and obtaining unmerged intensities. I am hoping there may be a way for me to avoid having to backtrack that far, as this data is now very far along in the refinement process. Thank you, Michelle Deaton University of Denver Department of Chemistry and Biochemistry
Re: [ccp4bb] value of observed criterion sigma
HKL2000 does not have an observed criterion sigma (F) since Scalepack deals with intensities. Leave that entry blank. Scalepack uses observed criterion sigma (I) = -3. On question #2 you always want to quote the statistics (completeness, Rsym, I/sigI etc) for the highest resolution shell but I'm not sure it makes any sense to report it for the lowest resolution shell unless your data is unusually incomplete there. The default for PDB REMARK 200 is just the high resolution shell and the overall values for the entire dataset. Also be aware that last time I checked the I/sigI reported by Scalepack in the log file is <I>/<sigma(I)> and not <I/sigma(I)> for the shell. The PDB format in REMARK 200 wants the latter. One of these days one hopes RCSB might include Rmeas in REMARK 200. Phil Jeffrey Princeton On 7/31/12 8:54 AM, Faisal Tarique wrote: Dear all i have two basic queries 1) i have processed my data in HKL 2000 and during pdb submission i need to know the value of observed criterion sigma (F) and observed criterion sigma (I). 2) during entering data in the "resolution shell" category, whether one needs to mention the statistics of each and every resolution shell or whether only two entries, i.e. the maximum resolution and minimum resolution entries, are enough in the whole columns. -- Regards Faisal School of Life Sciences JNU
[ccp4bb] SCALA keywords for merging Scalepack (no merge original index) data ?
I'm not exactly a Scala veteran so am looking for advice as to what would be the best way to run Scala in the following scenario:
* data integrated with Denzo
* data scaled with Scalepack and output with NO MERGE ORIGINAL INDEX
* .sca data imported into MTZ via Pointless
Do I just use the SCALA keywords:
run 1 all
onlymerge
anomalous off
or would there be a better set of commands ? I've toyed with:
sdcorrection norefine
scales constant
bfactor off
reject merge 4
anomalous off
which produces similar results. NO MERGE ORIGINAL INDEX data is already scaled and Lp-corrected so I want to avoid applying things twice. Thanks Phil Jeffrey Princeton
Re: [ccp4bb] P21221 to P21212 conversion
The program that does the indexing in HKL is Denzo. Denzo doesn't care about the space group. It cares about the point group (cf. Ethan's point) and the cell dimensions, because it integrates the data without regard to the symmetry expressed in the intensities - however it does take notice of the restrictions placed on cell dimensions by point groups. Denzo therefore picks primitive orthorhombic cells in abc. Scalepack scales the integrated data but does not reindex the data if you tell it the space group is P22121. Therefore unit cell choice in HKL is by default driven by cell edge size. Scalepack has the ability to reindex the data, for those of us that like to work in P21212 rather than P22121. On Mon, May 7, 2012 at 3:33 PM, Ethan Merritt Scaling is done in a point group, not a space group. My quibble with this statement is that the output reflection data from Scalepack differs depending on what space group you tell it, since systematic absences along h00, 0k0 and 00l in P2x2x2x are not written out. The number of reflections affected is quite small, of course. Phil Jeffrey Princeton On 5/7/12 4:48 PM, Jacob Keller wrote: Is it true that HKL adopts the naming convention of putting the screw axes first and then naming abc if possible, whereas CCP4 just makes the cell abc? E.g., would HKL ever output by default a p22121 dataset, or would it automatically be p21212? JPK On Mon, May 7, 2012 at 3:33 PM, Ethan Merritt merr...@u.washington.edu mailto:merr...@u.washington.edu wrote: On Monday, May 07, 2012 01:09:25 pm Shya Biswas wrote: Hi all, I was wondering if anyone knows how to convert the P21221 to P21212 spacegroup in HKL2000. I scaled the data set in P21212 in HKL 2000 but I got a correct MR solution in P21221 spacegroup. Shya: Scaling is done in a point group, not a space group. The point group P222 contains both space groups P2(1)22(1) and P2(1)2(1)2, so your original scaling is correct in either case. It is not clear from your query which of two things happened: 1) The MR solution kept the same a, b, and c axis assignments but made a different call on whether each axis did or did not correspond to a 2(1) screw. In this case you don't need to do anything to your files. Just make sure that you keep the new space group as you go forward into refinement. 2) The MR solution kept the orginal screw-axis identifications but permuted the axes to the standard setting (non-screw axis is labelled c). In this case you will need to construct a file containing the permuted indices. For example, the reflection originally labeled (h=1 k=2 l=3) is now (h=3 k=1 l=2). There are several programs that can help you do this, including the HKL2000 GUI. But you do not need to go back into HKL if you don't want to. You could, for example, use the ccp4i GUI to select - Reflection Data Utilities - Reindex Reflections Define Transformation Matrix by entering reflection transformation h=l k=h l=k Ethan I have a script file that runs with scalepack but was wondering if there is an easier way to do it with HKL2000 gui mode. thanks, Shya -- Ethan A Merritt Biomolecular Structure Center, K-428 Health Sciences Bldg University of Washington, Seattle 98195-7742 -- *** Jacob Pearson Keller Northwestern University Medical Scientist Training Program email: j-kell...@northwestern.edu mailto:j-kell...@northwestern.edu ***
Re: [ccp4bb] indexing(?) question in P21
Wolfram, Did you solve these structures independently by molecular replacement ? It sounds like your two solutions might be related by alternative origins (0,1/2 along a,c). If you translate the second example along the a axis by -a/2 does it refine with similar R-factors ? (A PDBSET sketch follows this message.) Phil Jeffrey Princeton On 4/19/12 1:35 PM, wtempel wrote: Hello all, I am puzzled by this situation: I have two different crystals of the same protein, one in the presence, one in the absence of a ligand. Both structures refine nicely in space group P21. Cell constants (a,b,c,beta) are (i) 61,124,61,119 (a<c by a hair) and (ii) 59,125,61,118. There is a SINGLE protein molecule in the ASU. To facilitate future analysis and comparison between both structures, I have tried (incl. reindexing) to refine both structures with as similar as possible translational/rotational states. I failed to do any better than having them offset by approx. 32A exactly along the a-axis. Considering the pseudo-hexagonal cell and the extent of the offset being so close to a/2, c/2 or b/3, I have the feeling that I am missing something. What could it be? Thank you as always. Wolfram Tempel
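[The -a/2 translation as a PDBSET sketch - keyword spelling from memory; PDBSET takes the cell from the CRYST1 record and SYMGEN applies a fractional operator to all coordinates:

pdbset xyzin model2.pdb xyzout model2_shifted.pdb <<EOF
symgen X-1/2,Y,Z
EOF

If the shifted model refines with essentially the same R-factors, the two solutions were alternative-origin descriptions of the same structure.]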
Re: [ccp4bb] mtz2cif capable of handling map coefficients
Fc doesn't contain the weighting scheme used in the creation of the map coefficients, so Fc would require some sort of program to be run to recreate those for both 2Fo-Fc and Fo-Fc maps. By which time you might as well run a single cycle of the refinement program in question to generate new map coefficients - so I don't see the benefit of Fc. The map coefficients, on the other hand, are a checkpoint of the maps being looked at by the author at the time of deposition and don't require programs beyond a typical visualization program (i.e. Coot) to view. Phil Jeffrey Princeton On 4/5/12 12:00 PM, Ethan Merritt wrote: On Thursday, April 05, 2012 08:25:05 am Francis E Reyes wrote: It seems that deposition of map coefficients is a good idea. Does someone have an mtz2cif that can handle this? Maybe I missed something. What is accomplished by depositing map coefficients that isn't done better by depositing Fo and Fc? Ethan
Re: [ccp4bb] Using intrinsically bound Zn atoms for phasing
Self-referentially: I once used the structural Zn of p53 to do a Zn MAD structure of a p53:53BP1 complex at 2.5 Angstrom with one zinc per 450 residues. Apparently using 1.283, 1.282 and 1.262 Angstroms (i.e. the Zinc edge). http://genesdev.cshlp.org/content/16/5/583.long But of course do your own fluorescence scan. The advantage of structural metals is full occupancy and a relatively lower B-factor. That map was actually pretty good, and since it came out of MLPHARE I don't doubt modern programs like SHARP could make it quite a lot better. Phil Jeffrey Princeton On 3/6/12 3:09 PM, Deepthi wrote: Hi I am trying to solve the structure of an engineered protein. The protein is crystallized with Zn bound to it. We collected a 1.5 Å dataset. Molecular Replacement didn't yield a good match for the protein. I want to try MAD taking advantage of the Zn atoms in the protein. I am not sure what wavelength i should use to collect the diffraction data for Zn. any suggestions? Thank You Deepthi -- Deepthi
Re: [ccp4bb] Merging data collected at two different wavelength
Can I be dogmatic about this ?
"Multiwavelength anomalous diffraction" from Hendrickson (1991) Science Vol. 254 no. 5028 pp. 51-58
"Multiwavelength anomalous diffraction (MAD)" from the CCP4 proceedings http://www.ccp4.ac.uk/courses/proceedings/1997/j_smith/main.html
"Multi-wavelength anomalous-diffraction (MAD)" from Terwilliger Acta Cryst. (1994). D50, 11-16
etc.
I don't see where the problem lies: a SAD experiment is a single-wavelength experiment where you are using the anomalous/dispersive signals for phasing; a MAD experiment is a multiple-wavelength version of SAD. Hopefully one picks an appropriate range of wavelengths for whatever complex case one has. One can have SAD and MAD datasets that exploit anomalous/dispersive signals from multiple difference sources. This after all is one of the things that SHARP is particularly good at accommodating. If you're not using the anomalous/dispersive signals for phasing, you're collecting native data. After all C,N,O,S etc all have a small anomalous signal at all wavelengths, and metalloproteins usually have even larger signals, so the mere presence of a theoretical anomalous difference does not make it a SAD dataset. ALL datasets contain some anomalous/dispersive signals, most of the time way down in the noise. Phil Jeffrey Princeton On 1/18/12 12:48 PM, Francis E Reyes wrote: Using the terms 'MAD' and 'SAD' has always been confusing to me when considering more complex phasing cases. What happens if you have intrinsic Zn's, collect a 3wvl experiment and then derivatize it with SeMet or a heavy atom? Or the MAD+native scenario (SHARP) ? Instead of using MAD/SAD nomenclature I favor explicitly stating whether dispersive/anomalous/isomorphous differences (and what heavy atoms for each) were used in phasing. Aren't analyzing the differences (independent of source) the important bit anyway? F - Francis E. Reyes M.Sc. 215 UCB University of Colorado at Boulder
Re: [ccp4bb] should the final model be refined against full datset
Let's say you have two isomorphous crystals of two different protein-ligand complexes: same protein, different ligand, same xtal form. Conventionally you'd keep the same free set reflections (hkl values) between the two datasets to reduce biasing. However if the first model had been refined against all reflections there is no longer a free set for that model - all hkl's have seen the atoms during refinement - and so your R-free in the second complex is initially biased toward the model from the first complex. [*] The tendency is to do less refinement in these sorts of isomorphous cases than in molecular replacement solutions, because the structural changes are usually far less (it is isomorphous after all), so there's a risk that the R-free will not be allowed to fully float free of that initial bias. That makes your R-free look better than it actually is. This is rather strongly analogous to using different free sets in the two datasets. However I'm not sure that this is as big a deal as it is being made to sound. It can be dealt with straightforwardly. However refining against all the data weakens the use of R-free as a validation tool for that particular model, so the people that like to judge structures based on a single number (i.e. R-free) are going to be quite put out. It's also the case that the best model probably *is* the one based on a careful last round of refinement against all data, as long as nothing much changes. That would need to be quantified in some way(s). Phil Jeffrey Princeton [* Your R-free is also initially model-biased, to varying extents, in cases where the data are significantly non-isomorphous or you're using two different xtal forms] I still don't understand how a structure model refined with all data would negatively affect the determination and/or refinement of an isomorphous structure using a different data set (even without doing SA first). Quyen On Oct 14, 2011, at 4:35 PM, Nat Echols wrote: On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang wrote: Sorry, I don't quite understand your reasoning for how the structure is rendered useless if one refined it with all data. "Useless" was too strong a word (it's Friday, sorry). I guess simulated annealing can address the model-bias issue, but I'm not totally convinced that this solves the problem. And not every crystallographer will run SA every time he/she solves an isomorphous structure, so there's a real danger of misleading future users of the PDB file. The reported R-free, of course, is still meaningless in the context of the deposited model. Would your argument also apply to all the structures that were refined before R-free existed? Technically, yes - but how many proteins are there whose only representatives in the PDB were refined this way? I suspect very few; in most cases, a more recent model should be available. -Nat
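On the mechanics of keeping a common free set: one way is to carry the FreeR_flag column from the first MTZ over to the second. A rough sketch with the gemmi Python library (a later tool than this thread; filenames and the FreeR_flag label are assumptions, and note the CCP4 convention that flag 0 marks the test set):

# Copy FreeR_flag from an isomorphous dataset onto new data so both
# complexes share the same test-set hkls.
import numpy as np
import gemmi

old = gemmi.read_mtz_file('complex1.mtz')           # has FreeR_flag
new = gemmi.read_mtz_file('complex2_noflags.mtz')

iflag = old.column_labels().index('FreeR_flag')
flags = {tuple(map(int, row[:3])): row[iflag]
         for row in np.array(old, copy=False)}

data = np.array(new, copy=False)
# reflections absent from the old file default to 1, i.e. the working set
# under the CCP4 convention where FreeR_flag == 0 is the test set
col = np.array([[flags.get(tuple(map(int, r[:3])), 1.0)] for r in data],
               dtype=np.float32)

new.add_column('FreeR_flag', 'I')
new.set_data(np.hstack([data, col]).astype(np.float32))
new.write_to_file('complex2_flags.mtz')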
Re: [ccp4bb] Apparent twinning in P 1 21 1
Yuri, Detwinning relies on having both twin-related reflections present to calculate either or both of the detwinned data values. Therefore it magnifies incompleteness, depending on where your missing data lie with respect to the twin operator. I'd recommend against trying to do this with a twin fraction close to 0.5. From the DETWIN docs: Itrue(h1) = ((1-tf)*Itw(h1) - tf*Itw(h2)) / (1-2tf) where tf = twin fraction. So 1/(1-2tf) becomes a large number, and it's multiplying a weighted term of the form (Itw(h1) - Itw(h2)), which becomes a very small number as the twin fraction approaches 0.5. The latter difference can easily be less than sigma(I), and so the signal/noise of your data plummets. Better to use REFMAC's and phenix.refine's abilities to compensate for the twin fraction directly in refinement and leave your data as they are. Phil Jeffrey Princeton On 9/29/11 10:03 AM, Yuri Pompeu wrote: After I ran DETWIN with the estimated 0.46 alpha, my completeness for the detwinned data is now down to 54%!!! Is this normal behavior? (I am guessing yes since the lower symmetry untwinned data is P1 21 1)
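A quick numerical illustration of that error propagation (a toy simulation written for this point, not anything from DETWIN itself):

# The detwinned intensity is a difference of two noisy observations scaled
# by 1/(1-2*tf), so sigma(Itrue) blows up as tf approaches 0.5.
import numpy as np

def detwin(i1, i2, tf):
    return ((1.0 - tf) * i1 - tf * i2) / (1.0 - 2.0 * tf)

rng = np.random.default_rng(0)
itrue1, itrue2, sigma, n = 1000.0, 900.0, 30.0, 100000
for tf in (0.10, 0.30, 0.45, 0.49):
    # simulate twinned observations with measurement noise
    o1 = (1 - tf) * itrue1 + tf * itrue2 + rng.normal(0, sigma, n)
    o2 = tf * itrue1 + (1 - tf) * itrue2 + rng.normal(0, sigma, n)
    print('tf=%.2f  sigma of detwinned I ~ %6.0f  (measurement sigma %.0f)'
          % (tf, detwin(o1, o2, tf).std(), sigma))

At tf = 0.49 the noise on the detwinned intensities is some 35 times the measurement noise, which is the plummet described above.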
[ccp4bb] Postdoctoral position at Princeton University
Postdoctoral position at Princeton University A position is available in the laboratory of Prof. Fred Hughson to apply biochemical and structural approaches to the study of bacterial cell-cell communication, also known as quorum sensing. We are especially interested in the receptors bacteria use to detect small-molecule signals emitted by other cells, and in identifying and characterizing antagonists that block communication. A strong background in biochemistry and/or x-ray crystallography is essential. Please e-mail cover letter, c.v., and names of three references to hugh...@princeton.edu.
Re: [ccp4bb] image file extensions
find . \( -name '*.osc' -o -name '*.img' \) -type f -size +3000 -print -exec bzip2 '{}' \; is a personal favorite, along those lines, with ample opportunities for customization. (If the above command line wraps, it's all supposed to be on one line. The escaped parentheses matter: find's implicit -and binds more tightly than -o, so without them the size test and the bzip2 would apply only to the *.img files. Also, -size +3000 is in 512-byte blocks, i.e. files bigger than about 1.5 MB.) Phil Jeffrey Princeton On 2/9/11 2:46 PM, David Schuller wrote: /bin/ls -lR | sort -nk5 | tail -40 will list the largest files in the directory tree. Those are probably the ones you need to compress.
Re: [ccp4bb] Let's talk pseudotranslational symmetry (or maybe it's bad data).
Is there a program that does? I was under the impression that they were all equally good/bad at this, because any solution that agrees with the PTS has quite a high score and any solution that doesn't has a low score, irrespective of the correctness of the placement of the molecules. In one case that ritually defeats me, with quite strong pseudo-centering, this seems to be true for heavy atom searches also. Phil Jeffrey Princeton On 2/9/11 5:08 PM, Jon Schuermann wrote: I would NOT use Phaser for MR with PTS present. It doesn't handle it correctly yet, since the likelihood targets don't account for PTS. Others may be able to explain it better.
Re: [ccp4bb] Looking for the following values...
On 1/13/11 2:48 PM, J. Fleming wrote: Hi All, I'm about ready to deposit my structure and have used pdb_extract to aid in the process. Unfortunately the following values were not found and are required by ADIT: 1) Under Data Collection, Reflections section: Observed criterion sigma(F) and Observed criterion sigma(I) There is no criterion for sigma(F) applied in Denzo/HKL2000, not least because data processing programs like Denzo and Scalepack work with intensities and not structure factor moduli. The default sigma(I) cutoff is -3. See: http://www.hkl-xray.com/hkl_web1/hkl/Scalepack_Keywords.html (keyword SIGMA CUTOFF) 2) Under Refinement, Refinement Statistics section: Number unique reflections (all) If your refinement program does not write it into the header of the PDB file, and the description of the value does not make immediate sense to you, omit it. Some of the requested values are defined rather vaguely, and a field matching this name doesn't show up in the REMARK 3 refinement template for PHENIX-derived PDB files (http://www.wwpdb.org/docs.html). I haven't deposited lately, but if I were to hazard a *guess* it might approximate to the number of reflections you would have used in refinement if you hadn't applied magnitude or sigma(F) cutoffs, and prior to PHENIX rejecting reflections as gross statistical outliers. One straightforward way to get this number would be to use CAD to write a new MTZ file containing only reflections within the resolution limits used in refinement, and look in the log file to see what the output reflection count was - assuming, of course, that the cell dimensions defined in your MTZ file are the same ones that you used in refinement. Refinement programs vary in their policy about handling reflections with |F|=0. The loss of reflections would manifest as a difference between the completeness in data collection and the completeness in refinement. Phil Jeffrey Princeton
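If you'd rather count directly than read CAD logs, a minimal sketch with the gemmi Python library (a modern tool, not something available to the original poster; the file name and limits are hypothetical):

# Count reflections inside the refinement resolution limits.
import gemmi

mtz = gemmi.read_mtz_file('refinement_data.mtz')
d = mtz.make_d_array()          # d-spacing of every reflection in the file
dmin, dmax = 2.0, 30.0          # high- and low-resolution limits used in refinement
n = int(((d >= dmin) & (d <= dmax)).sum())
print(n, 'reflections between', dmax, 'and', dmin, 'Angstrom')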
Re: [ccp4bb] finding I/Sigma(I) from HKL Scalepack
On 11/1/10 4:18 PM, Radisky, Evette S., Ph.D. wrote: Two questions: (1) Is this I/Sigma(I) what is generally reported in the literature for data processed with the HKL suite? Possibly. I say possibly because nobody appears to footnote their I/sigma(I) rows in their data processing tables, so it's impossible to tell which statistic they are reporting - the mean of the individual I/sigma(I) values and the ratio of mean I to mean sigma(I) travel under the same name. The editorial/proof-reading staff at journals aren't catching this ambiguity. I personally report the mean I/sigma(I), but wrote my own program to do it from the .sca files. Phil Jeffrey Princeton
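That program isn't reproduced here, but a rough Python equivalent might look like this - assuming the common merged .sca layout of three header lines followed by fixed-width records of h, k, l in 4-character fields and then I and sigma(I) in 8-character fields (a guess at the format; check against your own files):

# Overall mean I/sigma(I) from a merged scalepack .sca file.
import sys

ratios = []
with open(sys.argv[1]) as f:
    for line in list(f)[3:]:            # skip the three header lines
        try:
            i = float(line[12:20])      # intensity field
            sig = float(line[20:28])    # sigma field
        except ValueError:
            continue                    # tolerate short or malformed lines
        if sig > 0:
            ratios.append(i / sig)

print('mean I/sigma(I) = %.2f over %d reflections'
      % (sum(ratios) / len(ratios), len(ratios)))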
[ccp4bb] Faculty Position, Dept of Molecular Biology, Princeton University
Faculty Position Department of Molecular Biology Princeton University The Department of Molecular Biology at Princeton University invites applications for a tenure-track faculty position at the assistant professor level. We are seeking an outstanding investigator in the area of biochemistry and structural biology. We are particularly interested in candidates whose plans to address fundamental biological questions include the use of X-ray crystallography. Applicants must have an excellent record of research productivity and demonstrate the ability to develop a rigorous research program. All applicants must have a Ph.D. or M.D. with postdoctoral research experience and a commitment to teaching at the undergraduate and graduate levels. Applications must be submitted online at http://jobs.princeton.edu, requisition #1000770, and should include a cover letter, curriculum vitae and a short summary of research interests. We also require three letters of recommendation. All materials must be submitted as PDF files. For full consideration, applications should be received by December 1, 2010. Princeton University is an Equal Opportunity Employer and complies with applicable EEO and affirmative action regulations.
Re: [ccp4bb] Lousy diffraction at home but fantastic at the synchrotron?
Often this reflects crystal size - a small crystal in a big beam (or one with a long path in air) on a home source would see the small diffraction signal drop below the noise level quite quickly, often at the low resolution intensity dip that sits very approximately around 6 Angstrom. On a synchrotron source, with a tight low-divergence beam that more closely matches the crystal dimensions, that same crystal will appear to do rather better. Also one is more likely to expose the crystal longer (in terms of total photon numbers) at a synchrotron, which itself begets better signal/noise. Alternatively: everyone tries harder before synchrotron trips. Phil Jeffrey Princeton On 9/28/10 1:27 PM, Francis E Reyes wrote: Hi all, I'm interested in the scenario where crystals were screened at home and gave lousy diffraction (say 8-10A) but when illuminated with synchrotron radiation gave reasonable diffraction (3A). Why the discrepancy? Thanks F
Re: [ccp4bb] Deposition of riding H: R-factor is overrated
On 9/15/10 3:54 PM, Ed Pozharski wrote: Don't you agree that using the riding model does not add additional refinable parameters? (snip) ...instance, when hydrogens are added, the average N-H distance is 1.1(5), but upon refinement the value is down to 0.85998(4). So the riding hydrogen model is imperfect. At least with phenix.refine you can measure it, unlike the default behavior of REFMAC (but you can tell it to write hydrogens out, I believe). Obviously this question is not one amenable to a simple answer. In some sense (as per George) riding hydrogens are merely a restraint. In some other sense they are fundamentally a part of the model - they have very directional properties via bumping restraints that most certainly alter the atomic model for the heavy atoms in a very direct way via collision. Since the nature of these atoms - locationally specific - differs from the more amorphous extended-atom restraints (CH3E for a methyl in CNS, etc.), it could make sense to include them in the model at deposition. As far as I know we do not delete atoms from the final model that contribute to scattering and geometric restraints under any other circumstances, except perhaps in the nearly-as-contentious "how do I model my disordered side-chain" case. Also not amenable to a simple answer. Both approaches (REFMAC-esque and PHENIX-esque) have their merits. I doubt I'm the only person here conflicted over what to do about it. However this thread appears to have reached the point where not much new ground is being broken. Phil Jeffrey Princeton
[ccp4bb] Question: Refmac5 stats reported in pdb REMARK 3
Compare these two lines from phenix.refine:

REMARK 3 NUMBER OF REFLECTIONS : 46001
REMARK 3 FREE R VALUE TEST SET COUNT : 2339

with those from refmac, ostensibly using the same data and start pdb:

REMARK 3 NUMBER OF REFLECTIONS : 43672
REMARK 3 FREE R VALUE TEST SET COUNT : 2339

I know there are 46011 reflections with |F| > 0 in the files I used. phenix.refine removes 10 of these as outliers. The 46001 remaining reported in REMARK 3 *include* the test set. With REFMAC, 43672+2339=46011, so it appears that Refmac reports just the *working* set count in that first line, excluding the test set. Is this a bug with one program or the other, or a bug in the PDB definition of REMARK 3? http://www.wwpdb.org/documentation/format23/remark3.html This appears to be a source of inconsistency. phenix.refine 1.6-289, refmac5 5.4.0077 (I'm apparently a Luddite). Phil Jeffrey Princeton
Re: [ccp4bb] problem in calculation of elbow angle.
Inherently you want to calculate: 1. the approximately two-fold relationship between VH and VL; 2. the approximately two-fold relationship between CL and CH1. You can use many programs for that (e.g. LSQMAN), but ideally you want a program that will report direction cosines for the rotation axis in this superimposition. Particularly wacky CDR conformations could conceivably confuse automatic alignment programs, so you could delete those. You should check the superimposed alignments for sanity (e.g. correspondence of the disulfide bonds). Then calculate the elbow angle for the Fab from the dot product of the direction cosines of the VL:VH and CL:CH1 axes. Phil Jeffrey Princeton tarique khan wrote: Dear all, I am trying to calculate the elbow angle of my Fab structure using the online software developed by Robyn L. Stanfield et al., but it gives a solution with the following errors: WARNING: rotation matrix contains significant additional contributions WARNING: and deviates significantly from a pseudo-twofold: 0.721 WARNING: rotation matrix contains significant additional contributions WARNING: and deviates significantly from a pseudo-twofold: 0.726 WARNING: there have been deviations from expected values - please read the log above! (No guarantee that the calculated elbow angle is meaningful.) The elbow angle is probably 174.5 deg. Kindly suggest some other way of accurately calculating the elbow angle. regards. Tarique khan
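The dot-product step is small enough to show. A sketch with numpy (not the Stanfield server's code; the two input matrices would come from whatever superposition program you used, and note the axis directions carry a sign ambiguity, so conventions for which arc to report vary between programs):

# Elbow angle from the two pseudo-twofold rotation matrices:
# R_v superposes VL onto VH, R_c superposes CL onto CH1.
import numpy as np

def rotation_axis(R):
    # the rotation axis is the eigenvector of R with eigenvalue +1
    w, v = np.linalg.eig(R)
    axis = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return axis / np.linalg.norm(axis)

def elbow_angle(R_v, R_c):
    cosang = np.dot(rotation_axis(R_v), rotation_axis(R_c))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))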
Re: [ccp4bb] superimposing Mtz maps
This sounds a little like multi-crystal averaging without the averaging. If you were using the Uppsala program suite, you could perhaps do the following [define one xtal as reference, the other as target]: 1. Make a mask on the map grid for the reference xtal (program MAMA). 2. Establish the operator for the reference-to-target transformation, e.g. from protein superimposition with LSQMAN. 3. Improve this operator for the two density maps (program MAVE) - do not average the maps, which would otherwise be the usual step. 4. Expand the reference map into the target xtal, by lying to MAVE that the reference map is the averaged map (program MAVE). There also appears to be an EZ skewing option in MAVE. Make sure you make the mask around the unknown density large enough so that MAVE/Improve has some density to work with for optimization that isn't just the unknown blob. You could always use a larger mask for this step and a smaller one for the MAVE/Expand step. Phil Jeffrey Princeton Rana Refaey wrote: Hi, I have two maps from two different crystals with the same space group; both show an unknown density in the same place. I wanted to superimpose the maps to see if it is the same/similar density. Any ideas how to do this? Thank you, Regards Rana
Re: [ccp4bb] MAD wavelength
Always take the scan results ahead of the typical values unless they are obviously wrong. Only use the predicted values if the scan is broken or too weak (e.g. very small crystals), and in that case I'd be tempted to add 10-20 eV to the typical peak energy to make sure you weren't actually collecting the inflection point, since the two are typically very close in SeMet. In my NSLS X29-dominated data collections, I find I end up using something like this for non-oxidized SeMet: Peak: 12664 eV, 0.9790 Angstrom (usually in range 12662-12664); Infl: 12662 eV, 0.9792 Angstrom (usually in range 12660-12662). I also typically use a high energy remote: 12860 eV, 0.964 Angstrom, give or take a few eV. This tends to translate well between the relatively small number of beamlines that I personally end up using. But I always prefer to take the results from the Chooch analysis of the scan from the actual crystal. Cheers (and good luck) Phil Jeffrey Princeton Jerry McCully wrote: Dear All: Next week we are going to try some seleno-Met labeled crystals. We checked the literature to try to find out the peak wavelength that has been used for SAD or MAD data collection, but the values are slightly different (maybe 50 eV) in different papers. I guess this is due to the discrepancy between the fluorescence scan and the theoretical values of f' and f''. When we collect the data, which wavelength should we use? Should we trust the scanning results?
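For converting between the two sets of units, wavelength in Angstrom = 12398.4 / energy in eV (the standard hc conversion), e.g.:

# photon energy (eV) to wavelength (Angstrom)
def ev_to_angstrom(ev):
    return 12398.4 / ev

print(ev_to_angstrom(12664))   # ~0.9790, the Se peak value quoted above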
Re: [ccp4bb] multi-domain protein with identical tertiary structure
The cadmium-utilizing marine diatom carbonic anhydrase (CA) protein has three consecutive CA domains that have very similar structures but non-identical sequences. See: Structure and metal exchange in the cadmium carbonic anhydrase of marine diatoms. Xu Y, Feng L, Jeffrey PD, Shi Y, Morel FM. Nature. 2008 Mar 6;452(7183):56-61. Shankar Prasad Kanaujia wrote: Dear CCP4 users, Is there any multi-domain protein (with at least two domains) which has an identical tertiary structure for each domain? Thanking you. -regards shankar
Re: [ccp4bb] phasing with se-met at low resolution
However, do not get too excited if this resolution limit is 6 A. Although 6 A phases are better than no phases at all, have you ever LOOKED at a 6 A map? It can be very hard to tell if it is protein or not, even with perfect phases and all the right hand choices, etc. If the map is a 6 Angstrom SeMet map you may well be right, since if the signal goes to 6 Angstrom the data at 7 Angstrom isn't that hot either. However if this was a Ta6Br12 6 Angstrom map then it can look quite pretty for the resolution, because the 7 Angstrom SAD data in that case can be pretty good. Case in point is the one we collected for the PP2A ABC holoenzyme: it cleared up all sorts of things about the partial molecular replacement solution, including some reassurance that the desperation WD40 ensemble MR solution was actually correct. At 6 A, the WD40 looked somewhat like a bagel (or a Bundt cake if one is familiar), but the helices in one of the other subunits (A) were actually nicely resolved. Excitement may be warranted, even at 6 Angstrom. Phil Jeffrey Princeton
Re: [ccp4bb] pointless question
It also has the same/analogous bug in space group P3 with Pointless v1.2.10 - I wasn't sure if I was missing something obvious and went back to using my default combination of REINDEX/SCALEIT for the tests and reindexing. Phil Jeffrey Princeton Robert Nolte wrote: An output file is created on hklout with or without the -copy flag. The problem is why it is picking the unity matrix (solution #2) for the reindexing operator rather than the first matrix (-h -k l) that it identifies under reindexing (which is clearly the correct answer). -----Original Message----- From: Jan Abendroth Sent: May 6, 2009 2:45 PM To: Robert Nolte Subject: Re: [ccp4bb] pointless question Hi Bob, including a -copy flag might not be totally pointless: pointless -copy hklin ... Jan 2009/5/6 Robert Nolte: I'm hoping someone can help me with a pointless problem. I am trying to reindex data into an orientation that I used to solve the structure initially. While I can get pointless to give me the reindexing needed to make the new data match the old data for the project, when I ask it to write the data to HKLOUT it does not carry out the reindexing. I was under the impression from the documentation that it would write out the reindexed solution. Am I doing something wrong or have I found a bug in my particular space group? I seem to recall getting this to work on a different project in the past. I have also tried a number of different versions of pointless, and all give me the same results. The output file is shown below. Thanks in advance for any help. Regards, Bob Nolte

pointless hklin input.mtz hklref reference.mtz hklout reindex.mtz > pointless.log

### contents of pointless.log ###

CCP4 6.1: POINTLESS version 1.2.23 : 26/09/08
User: unknown   Run date: 28/4/2009   Run time: 13:48:05
Please reference: Collaborative Computational Project, Number 4. 1994. The CCP4 Suite: Programs for Protein Crystallography. Acta Cryst. D50, 760-763. as well as any specific reference in the program write-up.
OS type: linux   Release Date: 26th September 2008
POINTLESS 1.2.23 - Determine Laue group from unmerged intensities. Phil Evans, MRC LMB, Cambridge. Uses cctbx routines by Ralf Grosse-Kunstleve et al.

Reading reference data set from file reference.mtz
Maximum resolution in file reference.mtz: 1.810
Columns for F, sigF (squared to I): F_881 SIGF_881
Number of valid observations read: 18733
Highest resolution: 1.81
Unit cell: 72.66 72.66 65.98 90.00 90.00 120.00
Space group: P 3 2 1
Spacegroup information obtained from library file:
Logical Name: SYMINFO   Filename: /apps/ccp4/ccp4-6.1.0/lib/data/syminfo.lib

Maximum resolution in file input.mtz: 1.870
Columns for F, sigF (squared to I): F_880 SIGF_880
Number of valid observations read: 17028
Highest resolution: 1.87
Unit cell: 72.58 72.58 66.20 90.00 90.00 120.00
Space group: P 3 2 1

Possible alternative indexing schemes. Operators labelled "exact" are exact by symmetry. For inexact options, deviations are from the original cell (root mean square deviation between base vectors). Maximum accepted RMS deviation between test and reference cells (TOLERANCE) = 2.0
[h,k,l]    exact
[-h,-k,l]  exact

Normalising reference dataset: Log(I) fit for intensity normalisation: B (slope) -18.82
Normalising test dataset: Log(I) fit for intensity normalisation: B (slope) -18.21
Alternative indexing relative to reference file reference.mtz $TEXT:Result
Re: [ccp4bb] NCBHT: severe warning
Posting private emails on a public email list is rarely considered good form; in fact, on some email lists it would get you thrown off fairly quickly, especially considering your intended purpose. (Who is the list admin here?) If you're going to post semi-humorous, way-off-topic posts, you should consider tolerating a few ill-humored replies - at least that particular responder didn't post to the entire list. Phil Jeffrey Marius Schmidt wrote: Interesting, isn't it? :-), nice person. [rest of content removed]
Re: [ccp4bb] Mac pro
The Mac Pro is what I use for all my crystallography calculations. The vast majority of programs run, the one major sticking point being the older version of HKL, but I believe that HKL2000 may well run on OSX now. I use XDS and/or MOSFLM if I want to reprocess on this machine. With most packages the bad old days of actually having to edit the Makefile to install programs are past - you can either install via Fink (especially with all the fine work from Bill Scott) or via .dmg files, and failing that most packages will compile with not too much pain. Coot did make the CPU almost glow on my MacBook when I installed all the dependencies via Fink back in early 2008, however. The principal problem with the Mac Pro is that it is difficult to get it to run stereo - the one supported configuration, last time I checked, was absurdly expensive. If you need stereo (and can actually find a CRT monitor) Linux supports a wider array of options, but I enjoy the relatively seamless integration of a conventional desktop environment with Unix on the Mac. There are also options for virtualization of Windoze and Linux via software such as Parallels, although I have yet to test this out. OSX has minor quirks, like the filesystem's default case-insensitivity, which makes OSX treat e.g. the filenames MyJunkData.sca and myjunkdata.sca as the same file rather than following the expected Unix behavior. But in practice I rarely find this to be an issue. Phil Jeffrey Princeton Sheemei wrote: Dear all, I am thinking of getting an Apple Mac Pro desktop computer. I was wondering, do all crystallography programs run on it? I think there are Mac OSX versions of CCP4, CNS, SHELX etc., but how about programs in the Uppsala Software Factory etc.? Also, is it difficult to install these programs - are there problems? Is Linux still a safer choice? sheemei
Re: [ccp4bb] interface
Which brings up something about PISA. If I run PISA on PDB entry 2IE3, which I'm familiar with, I get the following numbers from PISA and CCP4's AREAIMOL (surface areas in Angstrom^2) for the A:C interface.

PISA for 2IE3, automatic A:C interface selection: 907.9 (a crystal packing interface is larger than this, but this surface is the A:C interface)

AREAIMOL, with some editing of 2IE3 to separate the chains:
Chain A        25,604.4
Chain C        11,847.4
Total          37,451.8
Chains A+C     35,576.6
Difference      1,875.2
Difference/2      937.6

For buried S.A. I agree with Steve Darnell's definition. However PISA appears to be reporting half that value, or what it calls "interface area". Potentially confusing. Phil Jeffrey Princeton Steven Darnell wrote: Sorry, that equation should read: Buried_Surface_Area = ASA_unbound1 + ASA_unbound2 - ASA_bound (ASA = Accessible Surface Area). The way I wrote it before would give you a negative value. Regards, Steve Darnell
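The same arithmetic can be reproduced with the freesasa Python module (a later tool than this thread; file names are hypothetical, with chains A and C split out of 2IE3 beforehand, and the absolute numbers will differ a little from AREAIMOL's since parameters and algorithms differ):

# Buried surface area = ASA(A) + ASA(C) - ASA(A+C complex)
import freesasa

def asa(path):
    return freesasa.calc(freesasa.Structure(path)).totalArea()

buried = asa('2ie3_A.pdb') + asa('2ie3_C.pdb') - asa('2ie3_AC.pdb')
print('buried: %.1f A^2; PISA-style interface area ~ %.1f'
      % (buried, buried / 2))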
[ccp4bb] Bug/feature in Phaser 2.1.1 over solution scoring/culling (long)
This is on OSX Tiger 10.4.11 on a G5 machine, Phaser 2.1.1, CCP4 6.0.2. Is anyone else seeing the following? In a feature that seems new-ish in Phaser, intermediate solutions get culled after the translation function and before the packing tests:

(begin snippet)
Purge solutions according to highest LLG from TFs in other spacegroups
Best Space Group so far: P 61 2 2
Percent used for purge = 0.75
Top LLG value for purge = 28.8034
Mean LLG used for purge = 6.03998
Cutoff LLG used for purge = 23.1125
Number of solutions stored before purge = 33
Number of solutions stored (deleted) after purge = 0 (33)
Purging of the results of the translation function in this spacegroup using the highest LLG value so far (from searches in other spacegroups) deleted all solutions
(end snippet)

which is all well and good, except in this particular case the solutions for P6122 have LLGs in the 14-16 range. What Phaser appears to be doing is picking up the best-case LLGs (and therefore LLG cutoffs) for translation function peaks from another space group (P622), none of which passed the packing criteria, and then inappropriately applying them to trial solutions in subsequent space groups that have lower LLGs but may in fact be better candidate solutions and survive the packing test. In the normal course of events, you'd hope that the best LLG corresponds to the correct solution in the correct space group, but I'll gladly concede that I'm using a marginal model with unimpressive data, and it'll probably fail anyway. This feature in Phaser would seem to potentially speed up correct solutions with good models and data when using SGALTERNATIVE ALL, but may in fact make the performance worse with bad models and poor data. Unless I've missed something here, the LLG score/cutoff test needs to be based on trial solutions that have survived the packing test, not on peaks from the translation function before that test. I'm using a fairly conventional script (not the GUI) in this case:

#
phaser << EOF
MODE MR_AUTO
HKLIN ptcr1680_pk_truncate.mtz
LABIN F=F SIGF=SIGF
TITLE Just a phaser script
COMPOSITION PROTEIN MW 28000 NUMBER 3
RESOLUTION 10. 3.2
SGALTERNATIVE ALL
ENSEMBLE ensemble1 PDBFILE helix_16.pdb IDENT 75.0
SEARCH ENSEMBLE ensemble1 NUMBER 1
ROOT phaser1
END
EOF
Re: [ccp4bb] Finding NCS operator from one heavy atom site? (long)
This almost does what you want, but not quite. To quote from the NCS6D manual: "NCS6D uses a set of BONES or PDB atoms as input and tries to find a set of rotations and translations which maximise the correlation coefficient between the density at the (BONES) atoms and those at the same atoms after application of the operator." So you cannot use a mask in NCS6D - you can in IMP. In the case where I did something like this, I could see a single helix near the SeMet sites, so I built this helix, then used the following script to find the first NCS relationship:

#!/bin/csh -f
#
/usr/local/Uppsala/rave_osx/osx_ncs6d << EOF
eden_400.ext
P
ncs6d_probe.pdb
1
p21212.sym
30.5 6.5 23.0
Y
0 359 10
0 179 10
0 359 10
-10 10 2
-10 10 2
-10 10 2
L
rt_best.o
EOF

Then I wrote a little C program that broke out each of the 100-or-so NCS operators in rt_best.o into files called rt_test_NN.o (NN = integer) and ran each and every one of them through IMP:

#!/bin/csh -f
#
foreach file (rt_test_*.o)
\rm LOG
/usr/local/Uppsala/rave_osx/osx_imp MAPSIZE 3500 << EOF >! LOG
eden_400.ext
model.mask
p21212.sym
$file
Automatic
1. .02
2.0 .1 .01 .0001
2
Proper
Complete
Quit
rt_test_new.o
EOF
set cc = `grep "Correlation Coefficient" LOG`
echo $file
echo $cc
end
#

I guess you could create a fur ball of Calpha positions for the initial model to force NCS6D into a sort of volume average - or peak-pick the map around the Se sites - I have not tried this. I found that without the IMP step there were too many similar and unimpressive solutions for the NCS operator, and the top one was not in general the correct one. This approach has the potential to consume quite a lot of CPU, but the initial map was relatively ugly and ultimately it worked rather well. Others might have more elegant ideas. Phil Jeffrey Princeton Partha Chakrabarti wrote: Hi, Apologies for a non-CCP4 question in the strict sense. I am trying to work out the NCS operators for a three-wavelength Se-MAD dataset which has only one site. The map is hardly interpretable. I came across the USF Rave package, and what I am aiming to do is create a mask around the heavy atom site (found by SHELX or Solve) using mama or so (ideally from resolve.mtz but not necessarily), translate it to the other heavy atom site(s), do a 6D search with NCS6D and perhaps refine the best CC a bit with IMP. If it works, I could try to use the NCS operator in DM or Resolve etc. I was wondering if someone already has C-shell scripts for dealing with such a situation. Of course if there are other programs for such a task within CCP4, I could give it a try. Best Regards, Partha
Re: [ccp4bb] CNS 1.2.2 binary running out of memory
You need to use the syntax:

unlimit stacksize
unlimit datasize
unlimit memoryuse

and I have these in my .cshrc. I can get this under OSX 10.5 (albeit on an old G5 chip machine):

cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    65532 kbytes
coredumpsize unlimited
memoryuse    unlimited
descriptors  256
memorylocked unlimited
maxproc      266

In the above there's an undesirable unlimited core dump size because I have this account set up for debugging. On 10.4 on similar hardware I get:

cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    65536 kbytes
coredumpsize 0 kbytes
memoryuse    unlimited
descriptors  256
memorylocked unlimited
maxproc      100

Hope this helps, Phil Jeffrey Princeton hari jayaram wrote: Hi, since I am not on the cnsbb yet I am posting this here. I downloaded the cns 1.2.2 intel build and was trying to run a simulated annealing refinement on my MacBook Pro (Intel) running 10.5.2. However the annealing job crashes roughly 40 minutes into the refinement with the following message: "There is not enough memory available to the program. This may be because of too little physical memory (RAM) or too little swap space on the machine. It could also be the result of user or system limits. On most Unix systems the limit command can be used to check the current user limits. Please check that the datasize, memoryuse and vmemoryuse limits are set at a large enough value." Unfortunately on Leopard it seems that unlimit and limit are not available under bash. Further, when I use csh, I get the following values for the limits:

[mango:~/aps_04_21_2008/p10_2] hari% limit
cputime      unlimited
filesize     unlimited
datasize     6144 kbytes
stacksize    8192 kbytes
coredumpsize 0 kbytes
memoryuse    unlimited
descriptors  256
memorylocked unlimited
maxproc      266

In the same csh shell unlimit returns:

[mango:~/aps_04_21_2008/p10_2] hari% unlimit
unlimit: descriptors: Can't remove limit (Invalid argument)

[snip]
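For the bash side of the question, bash spells these limits with the ulimit builtin rather than csh's limit/unlimit (standard bash options; the exact hard caps vary by OS, and the Mac may refuse some of them):

# bash equivalents of the csh unlimit commands, e.g. in ~/.bashrc
ulimit -d unlimited   # datasize
ulimit -s unlimited   # stacksize (the OS may enforce a hard cap)
ulimit -m unlimited   # memoryuse
ulimit -a             # list the current limits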
Re: [ccp4bb] an over refined structure
Here I will disagree. R-free rewards you for putting an atom in density where an atom belongs. It doesn't necessarily reward you for putting the *right* atom in that density, but it does become difficult to do that under normal circumstances unless you have approximately the right structure. However in the case of multi-copy refinement at low resolution, the refinement is perfectly capable of shoving any old atom into density corresponding to any other old atom if you give it enough leeway. Remember that there's a big difference between R-free for a single copy (45%) and a 16-fold multicopy (38%) in MsbA's P1 form, and almost the same amount (41% vs 33%) with MsbA's P21 form. (These are E. coli and V. cholerae respectively.) Both single-copy and multicopy refinements were NCS-restrained, as far as I know. So there's evidence, without simulation, that the 12-fold or 16-fold multicopy refinements are worth 7-8% in R-free, and I'm doubtful that NCS can generate that sort of gain in either crystal form. I've certainly never seen that in my own experience at low resolution. I've been meaning to put online the Powerpoint from the CCP4 talk with all these numbers in it, but I regret it's sitting on my iBook at home as of writing. Phil Jeffrey Dean Madden wrote: It is true that multicopy refinement was essential for the suppression of Rwork. However, the whole point of the Rfree is that it is supposed to be independent of the number of parameters you're refining. Simply throwing multiple copies of the model into the refinement shouldn't have affected Rfree, IF IT WERE TRULY FREE. It was almost certainly NCS-mediated spillover that allowed the multicopy, parameter-driven reduction in Rwork to pull down the Rfree values as well. The experiment is probably not worth the time it would take to do, but I suspect that if the MsbA and EmrE test sets had been chosen in thin shells, then Rfree wouldn't have shown nearly the improvement it did. Dean
Re: [ccp4bb] an over refined structure
While NCS probably played a role in the first crystal form of MsbA (P1, 8 monomers), this is also the one that showed the greatest improvement in R-free once the structure was correctly redetermined (7% or 14% depending on which refinement protocols you compare). The other crystal form of MsbA and the crystal forms of EmrE didn't have particularly high-copy NCS (2 dimers, 4 monomers, dimer, 2 tetramers) and the R-frees were somewhat comparable in all cases (31-36% for the redetermined structures). The *major* source of the R-free suppression in all these cases was the inappropriate use of multi-copy refinement at low resolution. Phil Jeffrey Princeton Dean Madden wrote: Hi Dirk, I disagree with your final sentence. Even if you don't apply NCS restraints/constraints during refinement, there is a serious risk of NCS contaminating your Rfree. Consider the limiting case in which the NCS is produced simply by working in an artificially low symmetry space-group (e.g. P1, when the true symmetry is P2): in this case, putting one symmetry mate in the Rfree set, and one in the Rwork set will guarantee that Rfree tracks Rwork. The same effect applies to a large extent even if the NCS is not crystallographic. Bottom line: thin shells are not a perfect solution, but if NCS is present, choosing the free set randomly is *never* a better choice, and almost always significantly worse. Together with multicopy refinement, randomly chosen test sets were almost certainly a major contributor to the spuriously good Rfree values associated with the retracted MsbA and EmrE structures. Best wishes, Dean Dirk Kostrewa wrote: Dear CCP4ers, I'm not convinced that thin shells are sufficient: I think, in principle, one should omit thick shells (greater than the diameter of the G-function of the molecule/assembly that is used to describe NCS-interactions in reciprocal space), and use the inner thin layer of these thick shells, because only those should be completely independent of any working set reflections. But this would be too expensive given the low number of observed reflections that one usually has ... However, if you don't apply NCS restraints/constraints, there is no need for any such precautions. Best regards, Dirk. Am 07.02.2008 um 16:35 schrieb Doug Ohlendorf: It is important when using NCS that the Rfree reflections be selected in thin resolution shells. That way application of NCS should not mix Rwork and Rfree sets. Normal random selection of Rfree + NCS (especially 4x or higher) will drive Rfree down unfairly. Doug Ohlendorf -----Original Message----- From: CCP4 bulletin board [mailto:[EMAIL PROTECTED]] On Behalf Of Eleanor Dodson Sent: Tuesday, February 05, 2008 3:38 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] an over refined structure I agree that the difference in Rwork to Rfree is quite acceptable at your resolution. You cannot/should not use R-factors as a criterion for structure correctness. As Ian points out - choosing a different Rfree set of reflections can change Rfree a good deal. Certain NCS operators can relate reflections exactly, making it hard to get a truly independent free R set, and there are other reasons to make it a blunt-edged tool. The map is the best validator - are there blobs still not fitted? (maybe side chains you have placed wrongly..) Are there many positive or negative peaks in the difference map? How well does the NCS match the 2 molecules? etc etc. Eleanor George M.
Sheldrick wrote: Dear Sun, If we take Ian's formula for the ratio of R(free) to R(work) from his paper Acta D56 (2000) 442-450 and make some reasonable approximations, we can reformulate it as: R(free)/R(work) = sqrt[(1+Q)/(1-Q)] with Q = 0.025*p*d^3*(1-s), where s is the fractional solvent content, d is the resolution, p is the effective number of parameters refined per atom after allowing for the restraints applied, d^3 means d cubed and sqrt means square root. The difficult number to estimate is p. It would be 4 for an isotropic refinement without any restraints. I guess that p=1.5 might be an appropriate value for a typical protein refinement (giving an R-factor ratio of about 1.4 for s=0.6 and d=2.8). In that case, your R-factor ratio of 0.277/0.215 = 1.29 is well within the allowed range! However it should be added that this formula is almost a self-fulfilling prophecy. If we relax the geometric restraints we increase p, which then leads to a larger 'allowed' R-factor ratio! Best wishes, George
Re: [ccp4bb] an over refined structure
If you think about it, there is an analogy to relaxing geometrical constraints, which also allows the refinement to put atoms into density. The reason it usually doesn't help Rfree is that the density is spurious. At least some of the incorrect structure determinations of the early 90's (that spurred the introduction of Rfree etc.) had high rms deviations, suggesting that this is how the overfitting occurred. Nevertheless, once hit with a bit of simulated annealing, the Rfree values of such models deteriorated significantly. If memory serves, the incorrect structures of the 1990's would have had relaxed geometry precisely because they needed to do that to reduce R, and R used to be the primary indicator of structure quality in the days before R-free was introduced. There's quite a big difference between the latitude afforded by relaxing geometry and the degree of freedom allowed by multicopy refinement. Simply increasing the RMS bond length deviations from 0.012 to 0.035 Angstrom would move atoms on average by only a fraction of a bond length, which is not really enough to jump between different atom locations. In any event, the MsbA statistics can be simply explained from an expectation of what happens if you overfit your (wrong) structure using techniques inappropriate for the resolution: R-work goes down; R-free goes down less; (R-free - R-work) goes up. And this happens in general with the use of multicopy refinement at anything less than quite high resolution - I'm thinking in particular of a comment in Chen & Chapman (2001) Biophys J vol. 8, 1466-1472. So I see no reason to suggest NCS is having a particularly extreme, perhaps unprecedented, effect. Phil Jeffrey (still working on converting Micro$loth Powerpoint to html)
Re: [ccp4bb] Missing scatter deffinition in CNS
According to the error message your offending atom has type chemical=FPAF. scatter.lib assigns scattering factors based on chemical type, and there are entries for F and F-1 but of course not FPAF - this is likely the source of your problem. The quick fix is to make your own copy of scatter.lib, add an entry for FPAF, and edit the CNS input files that reference scatter.lib to pick up the local copy. Phil Jeffrey Princeton Jian Wu wrote: Dear all, I am refining a structure in which there is a fluorine atom in the inhibitor. When I run the energy minimization in CNS, an unusual error happens for this atom: Program version= 1.1 File version= 1.1 CONNECt: selected atoms form 9 covalently disconnected set(s) list of isolated (non-covalently bonded) atoms: --none-- list of isolated (non-covalently bonded) di-atomic molecules: --none-- %XRASSOC-ERR: missing SCATter definition for ( $RX4 300 FAF ) chemical=FPAF %XRASSOC error encountered: missing SCATter definition for SELEcted atoms. (CNS is in mode: SET ABORT=NORMal END) * ABORT mode will terminate program execution. * Program will stop immediately. I have checked the topology file, the parameter file, and the scatter.lib file, but found nothing unusual in these files. Has anyone ever encountered this problem before? Any suggestion would be welcome, and thank you in advance! Best Regards, Jian Wu -- Jian Wu Ph.D. Student Institute of Biochemistry and Cell Biology Shanghai Institutes for Biological Sciences Chinese Academy of Sciences (CAS) Tel: 0086-21-54921117
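Concretely, a sketch of the scatter.lib idiom (from memory, not verified against any particular CNS release - take the nine Cromer-Mann coefficients from the existing fluorine entry rather than trusting anything typed here):

SCATter ( chemical FPAF ) a1 b1 a2 b2 a3 b3 a4 b4 c
! where a1 ... c are copied verbatim from the existing
! SCATter ( chemical F ) line in scatter.lib

Alternatively, renaming the atom's chemical type to plain F in the topology/parameter files avoids touching scatter.lib at all.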
Re: [ccp4bb] B-factor Space gr questions!
Wouldn't the desirability of this depend on the extent to which the molecule has moved between the high-resolution and low-resolution datasets? I would have thought that there was an effective information transfer between R-work and R-free once the rigid body movements became too large, which might provide one with an over-optimistic idea of what the R-free would be with the high-resolution model against the low-resolution data. Phil Princeton NJ Edward A Berry wrote: Even if the free-R set is not preserved for the new crystal, R and R-free tend to diverge rapidly once any kind of fitting with a low data/parameter ratio is performed, so I think the new structure must not have been refined much beyond rigid body (and overall B, which is included in any kind of refinement). And that choice may be well justified. Ed cdekker wrote: Hi, Your reply to the ccp4bb has confused me a bit. I am currently refining a low-res structure and realise that I don't know what to expect for final R and Rfree - it is definitely not what most people would publish. So the absolute values of R and Rfree are not telling me much; the only gauge I have is that as long as both R and Rfree are decreasing I am improving the model (and yes, at the moment that is only rigid body refinement). In your email reply you suggest that a refinement to convergence, even though it will lead to an increased Rfree (and lower R - a classic case of overfitting!), would give a better model than the rigid-body-refined-only model. This is what confuses me. I can see your reasoning that starting with an atomic model to solve low-res data can lead to this behaviour, but then should the solution not be a modification of the starting model (maybe high B-factors?) to compensate for the difference in resolution of model and data? Carien On 4 Jun 2007, at 19:38, Edward A Berry wrote: Ibrahim M. Moustafa wrote: The last question: In the same paper, for the complex structure R and Rfree are equal (30%) - is that an indication of improper refinement in these published structures? I'd love to hear your comments on that too. Several times I solved low resolution structures using high resolution models, and noticed that R-free increased during atomic positional refinement. This could be expected from the assertion that after refinement to convergence, the final values should not depend on the starting point: if I had started with a crude model and refined against low resolution data, Rfree would not have gone as low as with the high-resolution model, so if I start with the high resolution model and refine, Rfree should worsen to the same value as the structure converges to the same point. Thinking about the main purpose of the Rfree statistic, in a very real way this tells me that the model was better before this step of refinement, and it would be better to omit the minimization step. Perhaps this is what the authors did. On the other hand it does not seem quite right to submit a model that has simply been rigid-body-refined against the data - I would prefer to refine to convergence and submit the best model that can be supported by the data alone, rather than a better model which is really the model from a better dataset repositioned in the new crystal. Ed
Re: [ccp4bb] Stop Refmac from refining B factors?
Harry M. Greenblatt wrote: You should be refining an overall temperature factor at that resolution. It's one of the choices in the list, instead of isotropic. I disagree with this. At that (3.2 Angstrom) resolution I've often found that a tightly restrained individual B-factor refinement gives a significantly lower R-free than a single overall B-factor. I also prefer it to grouped B-factors in CNS, because the latter are not geometrically restrained and show a lot of physically unreasonable waywardness (although often a similar R-free to individual B's). Individual B's can also be restrained by non-crystallographic symmetry, and as far as I can tell grouped B's are not. I think one has to explore all possibilities rather than take one fixed approach to working at modest resolutions, and the optimal solution is likely to be different for different structures. Phil Jeffrey Princeton, NJ Hi, I have a little problem with B-factor refinement. I'm using the CCP4i interface, Refmac 5.2.0019, a resolution of 30-3.2 A (I tried 8-3.2 A as well, it doesn't make a big difference for this problem), and a current Rfree of 30.4%. Refmac refines the B-factors so that they are nearly the same for main chain and side chain, and I don't like that (or could it make sense in any way?). Moreover, my structure is a protein complex, and Refmac is mainly doing this for one component of the complex. If I take the B-factors from the original uncomplexed protein (around 18, 1.75 A) and add 44 to them with moleman to get them in the range they are in the complex, Refmac flattens them remarkably in only 5 cycles of restrained refinement. Does anyone have an explanation for this? I am pretty sure that the complex components are in the right place; I see beautiful density and everything I should see at this resolution. Here is what I tried further: * I de-selected "Refine isotropic temperature factors" in the Refmac interface. There was no REFI BREF ISOT any more in the com file. But there was also no difference in the B-factors compared to when there _was_ REFI BREF ISOT in the com file... So does Refmac just _ignore_ my wish not to refine B-factors? (The REFI keywords were as follows: type REST - resi MLKF - meth CGMAT - is there any B-factor thing hidden in this?) * I played around with the geometric parameters. If I select the B-factor values there (the keywords are TEMP|BFAC wbskal sigb1 sigb2 sigb3 sigb4), it does not make _any_ difference what values I fill in there; the resulting B-factors are always the same (but different from when I don't use the TEMP keyword, and even flatter). Default for WBSCAL is 1.0; I tried 10, 1.0, 0.1, 0.01, and the equivalent numbers for the sigbs. Thanks for any thoughts on this, Eva - Harry M. Greenblatt Staff Scientist Dept of Structural Biology Weizmann Institute of Science Phone: 972-8-934-3625 Rehovot, 76100 Facsimile: 972-8-934-4159 Israel
Re: ccp4bb on new site
As far as the subject header line is concerned, ye olde ListServ command: SET CCP4BB SUBJECTHDR would probably work if one emailed it to the server (i.e. [EMAIL PROTECTED], *not* CCP4BB@JISCMAIL.AC.UK), or you can do it via the web interface. It appears that the mail/web command interface will not let you change the Reply-To feature. Phil Jeffrey Kjeldgaard Morten wrote: Unfortunately, it appears that JISCMAIL is using the outdated LISTSERV software to run its mailing lists, so there is not much hope of getting such things as the [ccp4bb] subject tag and reply-to-sender features back :-( Morten --Morten Kjeldgaard, asc. professor, MSc, PhD