Re: [ccp4bb] extracting PHENIX structures
On Sun, Jan 25, 2015 at 12:25 AM, Kay Diederichs kay.diederi...@uni-konstanz.de wrote: Here, I subtracted the Phenix and Refmac entries from the total Phenix count, because it seems likely that Phenix was used for other purposes than refinement. Actually, based on my skimming of methods sections in recent publications, it is relatively common for users to run multiple refinement programs over the course of a project. In fact it's far more common than the combinations in the software field indicate, because the PDB deposition process will automatically use whatever is indicated by the submitted PDB file - i.e. the last program run - and users usually won't update this to include previously run programs. One consequence of this is that CNS ends up being underrepresented in the software field, because it tends to be used earlier in refinement (DEN, for example), with Phenix or Refmac substituted when the model is closer to convergence. The other consequence is that you can't assume that entries tagged as just Phenix OR Refmac were really refined with only one program rather than a combination. The moral: when you deposit your structures, please indicate all software used, not just the very last program that you ran. (And, of course, include the relevant citations in the actual publication, which sadly doesn't always happen.) -Nat
Re: [ccp4bb] chloride or water
On Wed, Jan 21, 2015 at 10:17 AM, Keller, Jacob kell...@janelia.hhmi.org wrote: I see your point about not knowing that it's a chloride, but I think you would agree that it is certainly more likely a chloride than map-noise, and perhaps more likely than water as well. Would you agree that chloride is the best guess, at least? No, I think "I don't know" is the most honest and scientifically robust answer. For those who insist on annotating every density blob, UNX atoms are the PDB's officially supported method for doing so (unless this has changed recently), or UNK/UNL for unknown amino acids and ligands. These are not without their own problems, but they at least make both the presence of an atom and the uncertainty about its identity explicit. "Since the PDB is certainly tainted by structures modeled in accordance with the 'most likely' outlook, one now has to be cautious about all structures." This is true, but "everyone else is just as sloppy" is a poor excuse for further polluting the database. -Nat
Re: [ccp4bb] chloride or water
On Wed, Jan 21, 2015 at 12:16 AM, Engin Özkan eoz...@uchicago.edu wrote: Carbon in chloride's coordination sphere? To me, it looks like you have serious vdW violations, and neither water nor chloride could go there. Halides can interact with carbon too - discussed in Dauter & Dauter (2001) - although I think this is more common with iodide than chloride. But this instance is totally unconvincing without anomalous data. It would be better to leave it entirely empty than to put in something wildly speculative - there are far too many spurious chlorides in the PDB already, which of course makes it even more difficult to come up with general rules about binding patterns. -Nat
Re: [ccp4bb] chloride or water
On Wed, Jan 21, 2015 at 9:05 AM, Keller, Jacob kell...@janelia.hhmi.org wrote: Not sure why there is this level of suspicion about the poor halide when waters generally get assigned so haphazardly. I would say that there are probably more “wrong” waters in the PDB than wrong chlorides, but there’s not much fuss about that. Great, so leave it empty instead of just making something up. Perhaps future generations will figure out a more rigorous and quantitative method for handling such features than guessing based on screenshots posted to a mailing list. At this resolution water placement is difficult to justify anyway - and since neither the scattering properties nor the coordination distances are especially accurate, trying to assign chemical identity in the absence of any supporting information (for example anomalous data) is especially futile. (Although at least in this case the resolution is an obvious red flag - to a crystallographer, anyway - indicating that any lighter ions shouldn't be taken very seriously. Other biologists, of course, may be more trusting.) -Nat
Re: [ccp4bb] chloride or water
On Wed, Jan 21, 2015 at 12:24 PM, Keller, Jacob kell...@janelia.hhmi.org wrote: I think this will probably never happen, but maybe there could be a confidence value associated with each atom in structures a posteriori, although it might be difficult to find the right automatable criteria for this value. The element would be assigned by being the most likely one, and confidence assigned thereafter. Too much of a pain to implement, probably, and maybe not worth the trouble. Perhaps, though, Nat’s program could be used to do this. [ doi:10.1107/S1399004714001308 http://dx.doi.org/10.1107/S1399004714001308 ]. Unfortunately that implementation isn't really quantitative in the sense you describe - not for lack of interest on our part, but rather because coming up with a unified score based on many disparate and arguably incomplete criteria is quite difficult. The initial goal was to go after the low-hanging fruit of ions that are relatively obvious and/or whose identity is well-known, which would otherwise need to be placed manually. But as we tried to make clear in the paper, this isn't a substitute for common sense and prior knowledge. (By the way, for what it's worth, I think both Na and Cl are simultaneously over- and under-represented in the PDB - there are many spurious atoms, but at least as many that were overlooked and labeled as water. From the perspective of a methods developer, however, the false positives are much more of a pain to deal with in an automated workflow.) -Nat
Re: [ccp4bb] 3 letter code for pyridoxine
On Wed, Dec 31, 2014 at 10:11 AM, Faisal Tarique faisaltari...@gmail.com wrote: I request you to please tell me the three letter code for pyridoxine.. You can find the appropriate residue code for any molecule previously deposited in the PDB - which includes pyridoxine - by simply searching for the conventional name on the RCSB PDB web site (and probably the other PDB sites too, but I'm less familiar with them). -Nat
Re: [ccp4bb] unknown densities
On Mon, Dec 8, 2014 at 7:24 PM, Keller, Jacob kell...@janelia.hhmi.org wrote: FYI, sometimes native nucleotides can make it through protein purifications if binding is tight. This is especially true for G-proteins, since tight binding to GDP is an essential part of their function. I don't know what to expect from E. coli proteins, but human Ras co-purifies with GDP at close to 1:1. At low resolution, it might be easiest to superimpose the closest nucleotide-bound homolog and see how well the density aligns with the nucleotide. -Nat
Re: [ccp4bb] Consensus on modeling unidentifiable lipids and detergents in membrane protein structures?
On Thu, Nov 6, 2014 at 5:20 PM, Oliver Clarke oc2...@columbia.edu wrote: I wonder if a solution might be to create new residues containing alkyl chains of various lengths, named something like U01, U02, U(n), where N is the length of the alkyl chain that fits the density. Sort of similar to the way one might leave UNK residues in a peptide of unidentified sequence. Or maybe there is a better way of doing this? Would appreciate any suggestions. This would be UNL, which is a catch-all for unidentified ligands. The best example I can think of is PDB ID 3arc, which is a relatively high-resolution structure of Photosystem II containing a number of partially ordered mystery lipids. This is a bit of a mess because they're a mix of unbranched alkyl chains and branched esters, so the chemistry is not internally consistent, but at least the scattering types are defined and the UNL designation makes the ambiguous identity explicit. I think this is the generally accepted convention. I wouldn't worry too much about nomenclature; you can just generate restraints for a generic ligand of this type with as many carbon atoms as you need - for instance, using a SMILES string like this: CCC and hopefully that will be sufficient for all of the fragments in your structure. -Nat
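A footnote on the SMILES suggestion above: for an unbranched, saturated alkyl fragment the SMILES string is simply "C" repeated once per carbon, so you can generate it for whatever chain length fits the density. A minimal sketch in plain Python - the chain length of 8 and the idea of feeding the string to your preferred restraint generator are purely illustrative:

# Build a SMILES string for an unbranched alkyl fragment of n carbons.
# (Illustrative only; consult your restraint generator's documentation for
# how to turn the string into a restraint dictionary.)
def alkyl_smiles(n_carbons):
    return "C" * n_carbons

print(alkyl_smiles(8))  # prints CCCCCCCC, an eight-carbon fragment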
Re: [ccp4bb] water at the same exactly position
On Wed, Oct 29, 2014 at 8:53 PM, luzuok luzuo...@126.com wrote: I think it is better for COOT to solve this issue. Coot can already be used to solve this issue - I think the automation is somewhat lacking, but it's vastly preferable to anything involving a text editor or shell commands.
1. Load the molecule and electron density maps in Coot.
2. From the Validate menu, select Check/Delete waters...
3. Select for waters with very close distances, for example 0.2Å; I've attached a screenshot of what it should look like.
4. This will give you a list of overlapping waters - then you just need to delete one of each pair. (It doesn't matter which one - the waters will be renumbered later anyway.)
Alternatively, you can set Action to Delete, which is much less effort, but that will delete both copies. If you are just going to run a program (or Coot function) to place more waters automatically (my preference), this won't matter, but if they're atoms you really care about, you should delete them manually. -Nat [screenshot attached]
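If you want a quick, read-only double-check outside Coot (a sanity check only, not a substitute for doing the actual deletion in Coot), here is a minimal Python sketch that lists water pairs closer than a cutoff. It assumes standard fixed-column PDB formatting and HOH residue names; the file name comes from the command line.

# List pairs of waters closer than a cutoff distance in a PDB file.
import itertools, math, sys

def waters(path):
    for line in open(path):
        if line.startswith(("ATOM", "HETATM")) and line[17:20].strip() == "HOH":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield line[21], line[22:26].strip(), xyz

def report_close_pairs(path, cutoff=0.2):
    for (c1, r1, a), (c2, r2, b) in itertools.combinations(list(waters(path)), 2):
        d = math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
        if d < cutoff:
            print("HOH %s %s and HOH %s %s are %.2f A apart" % (c1, r1, c2, r2, d))

if __name__ == "__main__":
    report_close_pairs(sys.argv[1])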
Re: [ccp4bb] R/Rfree gap with pseudotranslational symmetry
On Fri, Sep 26, 2014 at 8:17 PM, Kimberly Stanek kas...@virginia.edu wrote: Before refinement in phenix the R/Rfree gap is rather small, however even after one round of refinement I am finding that this gap increases to almost 0.06. I have a feeling that the high symmetry present has something to do with this R/Rfree gap but was hoping some of you may have some helpful suggestions for how to deal with it. It's normal for the R/R-free gap to increase during the first round of refinement after molecular replacement - in fact, unless you are solving a near-identical crystal form and keeping the original R-free flags, this is almost guaranteed to happen. MR uses all reflections, and the limited refinement that Phaser performs uses very coarse parameterization (rigid-body and group B-factor), so the gap will usually be quite small at that stage, with R-free sometimes even lower than R-work. Restrained refinement will immediately start to open the gap, but if it's working properly, it won't keep expanding throughout refinement. At this resolution a gap in the range of 0.02-0.04 would be normal - less than this is unusual. My guess is you just need to change the relative weights of the X-ray target and geometry restraints so that the latter are stronger. Also, use NCS restraints if you aren't already. -Nat
Re: [ccp4bb] Problem regarding twin detection and refinement
The tests that output twinning fractions are *not* diagnostic for twinning; they merely estimate what the twin fraction would be if the data were in fact twinned, which can only be decided on the basis of abnormal intensity statistics. (Any version of Xtriage since July should state this more clearly, since we've seen so many users make this mistake in the past.) I'm not sure what you mean by the L-test plot being sigmoidal - usually this is a diagnostic feature of the NZ plot. If you could post images of these plots (Xtriage will let you save them, and probably ccp4i loggraph will too), that might help. Given that it scales in P622 despite the ASU being too small, it may be the case that it really is twinned (I'm not exactly sure how to interpret the Refmac results), but you need to absolutely rule out other possibilities before resorting to twinned refinement. It is certainly possible to solve such a structure, but very tricky to refine it without fooling yourself. At 4Å resolution you will already have a difficult time coping with model bias in the 2Fo-Fc map, and twin refinement will make this even worse (whether or not you actually have twinning). -Nat
On Wed, Sep 17, 2014 at 11:23 PM, Sudipta Bhattacharyya sudiptabhattacharyya.iit...@gmail.com wrote: Dear Community, Recently we were able to solve a structure of a DNA/protein complex through MR phasing. The data were initially indexed and scaled in space group P622; however, owing to the incompatibility of a single DNA/protein complex with the ASU (according to solvent content analysis), the structure could not be solved in that space group. Since this was an indication of a possible twinning event, we tried MR in the P321, P312 and P6 space groups and finally got a very good solution in P65. According to phenix.xtriage, the data may be nearly perfectly merohedrally twinned (twin fraction 0.425, Britton analysis; 0.468, H test; 0.478, ML method; with a possible twin operator h,-h-k,-l); however, the L test rather suggests no such twinning (although the observed acentric curve appeared slightly sigmoidal compared to the straight acentric theoretical curve in the L test). The same thing happened when we checked the data for possible twinning in Truncate (twin fraction: L test: no; H test, 0.42; Murray-Rust, 0.35; Britton ML, 0.47; possible twin operator: h+k,-k,-l). On the other hand, while refining the data in Refmac5 with the intensity-based twin option on, Refmac5 suggested perfect merohedral twinning with a fraction of 0.49/0.50. In the context of this confusing situation, my questions are:
1. Is the data twinned or not?
2. With such a high twinning fraction, is it solvable?
3. What refinement programs would be the best choices for refining such twinned data?
4. In one of the Refmac5 tutorials it has been suggested, in such twin refinement cases, to choose the Rfree set in the higher space group (in our case P6522) and then expand it to the lower space group (in our case P65) - could anybody please let me know how to do that in CCP4 or elsewhere?
5. The data extend to 4Å resolution - could anyone tell me what final R/Rfree one could expect from 4Å data (although it may sound a dumb question...)?
Any help will be highly appreciated. With my best regards, Sudipta. Sudipta Bhattacharyya, Postdoctoral Research Fellow, Colorado State University, Fort Collins, Colorado, USA.
Re: [ccp4bb] Coot - way to move the center pointer to a specific x,y,z coordinate?
In Python scripting (Calculate menu): set_rotation_centre(x, y, z) I assume there's a Scheme equivalent. -Nat On Tue, Sep 9, 2014 at 1:17 PM, Alejandro Virrueta alejandro.virru...@yale.edu wrote: Does anyone know how to move the center pointer to a specific x,y,z coordinate? Or to place some kind of marker at a specific x,y,z coordinate? Thanks, Alex
Re: [ccp4bb] Reliable criteria to tell Anomalous or not?
On Thu, Sep 4, 2014 at 4:05 PM, CPMAS Chen cpmas...@gmail.com wrote: Do you guys have some recommendation of the criteria? phenix reported anomalous measurability, CCP4/aimless has RCRanom. Sometimes, they are not consistent. The measurability isn't always useful - it's definitely correlated with how easy it will be to find sites and experimentally phase the data, but it's very dependent on the sigmas being estimated accurately. I'm not sure what RCRanom is, but both Aimless and the new version of Xtriage (if using unmerged data as input) will report CC(anom), which is like CC1/2 for anomalous differences, and that should be pretty reliable. -Nat
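For what it's worth, the idea behind CC(anom) is easy to sketch: split the observations into two random half-datasets, compute the anomalous differences in each half, and correlate them. A toy numpy illustration with synthetic numbers only - this is not how Aimless or Xtriage actually implement it:

import numpy as np

rng = np.random.default_rng(1)
true_danom = rng.normal(0, 5, 1000)           # underlying anomalous differences
half1 = true_danom + rng.normal(0, 10, 1000)  # noisy estimate from half-dataset 1
half2 = true_danom + rng.normal(0, 10, 1000)  # noisy estimate from half-dataset 2
print(round(np.corrcoef(half1, half2)[0, 1], 2))  # ~0.2: weak but non-zero signal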
Re: [ccp4bb] random half data sets
On Tue, Aug 12, 2014 at 10:28 PM, Keller, Jacob kell...@janelia.hhmi.org wrote: A somewhat similar question, with a quick answer I hope: when programs output CC's of 1/2 datasets, are several random halvings compared/averaged, and if not, does this make a difference, or are the scores so similar there's no point? The latter, I think. It probably only matters for data where you have a lot of erroneous observations (like ice rings) or at the fringes where there's almost no signal anyway. In my hands, a dataset with multiplicity of 3.9 has an outer shell with a CC1/2 between 0.497 and 0.503 depending on random seed, which isn't worth worrying about. -Nat
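A toy illustration of why the random seed barely matters once the multiplicity is reasonable - synthetic data only, and deliberately much simpler than what scaling programs actually do:

import numpy as np

rng = np.random.default_rng(0)
n_unique, multiplicity = 2000, 4
true_i = rng.gamma(2.0, 100.0, n_unique)                             # "true" intensities
obs = true_i[:, None] + rng.normal(0, 50, (n_unique, multiplicity))  # noisy observations

def cc_half(obs, seed):
    # Randomly split each reflection's observations into two halves,
    # average each half, and correlate the two merged half-datasets.
    rng = np.random.default_rng(seed)
    half1, half2 = [], []
    for refl in obs:
        order = rng.permutation(len(refl))
        half1.append(refl[order[: len(refl) // 2]].mean())
        half2.append(refl[order[len(refl) // 2 :]].mean())
    return np.corrcoef(half1, half2)[0, 1]

print([round(cc_half(obs, s), 3) for s in range(5)])  # nearly identical values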
Re: [ccp4bb] correlated alternate confs - validation?
On Wed, Jul 23, 2014 at 3:25 AM, MARTYN SYMMONS martainn_oshioma...@btinternet.com wrote: The practice at the PDB after deposition used to be to remove water alternate position indicators - although obviously to keep their partial occupancies. This has not been my experience - see for example: http://www.rcsb.org/pdb/files/2GKG.pdb which was deposited as a PDB file, not mmCIF. -Nat
Re: [ccp4bb] Proper detwinning?
On Wed, Jul 9, 2014 at 5:14 PM, Chris Fage cdf...@gmail.com wrote: Despite modelling completely into great electron density, Rwork/Rfree stalled at ~38%/44% during refinement of my 2.0-angstrom structure (P212121, 4 monomers per asymmetric unit). Xtriage suggested twinning, with |L| = 0.419, L^2 = 0.245, and twin fraction = 0.415-0.447. However, there are no twin laws in this space group. I reprocessed the dataset in P21 (8 monomers/AU), which did not alter Rwork/Rfree, and in P1 (16 monomers/AU), which dropped Rwork/Rfree to ~27%/32%. Xtriage reported the pseudo-merohedral twin laws below. ... Performing intensity-based twin refinement in Refmac5 dropped Rwork/Rfree to ~27%/34% (P21) and ~18%/22% (P1). Would it be appropriate to continue with twin refinement in space group P1? It sounds like you have pseudo-symmetry and over-merged your data in P212121. I would try different indexing for P21 before giving up and using P1 (you may be able to just re-scale without integrating again, but I'm very out of date); the choice of 'b' axis will be important. If none of the alternatives work P1 may be it, but I'm curious whether the intensity statistics still indicate twinning for P1. -Nat
Re: [ccp4bb] twin or untwinned
On Thu, Jul 3, 2014 at 7:50 AM, Nat Echols nathaniel.ech...@gmail.com wrote: On Thu, Jul 3, 2014 at 6:53 AM, Dirk Kostrewa kostr...@genzentrum.lmu.de wrote: yes - unfortunately, in my hands, phenix.xtriage reads the XDS_ASCII.HKL intensities as amplitudes, producing very different output statistics, compared both to the XDS statistics and to an mtz file with amplitudes created from that XDS file. This is incorrect. It does read it correctly as intensities - the confusion probably arises from the fact that Xtriage internally converts everything to amplitudes immediately, so that when it reports the summary of file information, it will say xray.amplitude no matter what the input type was (the same will also be true for Scalepack and MTZ formats). However, the data will be converted back to intensities as needed for the individual analyses. Obviously this isn't quite ideal either since the original intensities are preferable but for the purpose of detecting twinning I hope it will be okay. In any case the incorrect feedback confused several other users so it's gone as of a few weeks ago, and the current nightly builds will report the true input data type. (The actual results are unchanged.) Tim: I have no reason to think we handle unmerged data poorly; I'm not sure who would have told you that. In most cases they will be merged as needed upon reading the file. I'm a little concerned that you're getting such different results from Xtriage and pointless/aimless, however. Could you please send me the input and log files off-list? Dirk, same thing: if you have an example where XDS and Xtriage are significantly in disagreement, the inputs (and logs) would be very helpful. In both cases, I suspect the difference is in the use of resolution cutoffs and absolute-scaled intensities in Xtriage versus other programs, but I'd like to be certain that there's not something broken. I stand corrected: unmerged XDS files (but not other formats) were not being handled appropriately in Xtriage; this was fixed several weeks ago, so the nightly builds should behave as expected. -Nat
Re: [ccp4bb] Lysine coordinated ions
On Tue, Jul 1, 2014 at 3:10 PM, Katherine Sippel katherine.sip...@gmail.com wrote: My google-fu has failed me once again so I am turning to the collective knowledge of the bb. I'm working on a blobology challenge at the moment and have hit a wall. Is anyone aware of an ion that coordinates to lysine and prefers octahedral geometry? The mystery ion seems to have perfect octahedral geometry with bond distances of ~2.1 angstrom, but the only direct side chain interaction is to a lysine NZ; the rest are waters. Lysine can coordinate a cation if the chemical environment is favorable - usually this means a high-pH buffer (what was the pH of your crystals?). The same is true for N-termini; I may be able to dig up a published example of this. (I think it is effectively impossible for Arg, however.) These interactions are certainly exceedingly rare (and I doubt they are ever present in vivo), but if the nitrogen loses a proton, the lone pair will be able to coordinate a compatible ion. Since magnesium can be coordinated by the nitrogen of histidine, it seems like the most likely candidate - but I would still be very, very careful before assigning it, especially if the only other coordinating atoms are waters. -Nat
Re: [ccp4bb] PDB Storage of Diffraction Images
On Fri, May 16, 2014 at 7:12 AM, esse...@helix.nih.gov wrote: Short of storing images, which is the ultimate preservation of primary information, I have always been puzzled by the fact that the PDB only stores unique reflections, i.e. no Friedel pairs, even when provided. Is this outdated, perhaps? I remember that my deposited SFs in the past were reduced to not contain Friedel pairs. If there had been a concern about increasing the storage space - by actually less than twice the space for unique SFs - this may be invalid today, and it is still far less than the space required for images. However, it is possible that the information content in Friedel pairs is deemed insignificant compared to their extra costs. I for one would appreciate having access to Friedel pairs very much. They definitely store Friedel pairs! Maybe you're confused by the layout of the mmCIF file, which (like MTZ) usually lists just the unique (non-anomalous) indices, but with separate values for F+/F- when they are available. I've been making extensive use of anomalous data depositions - unfortunately there aren't as many as we would like, either because many people do not realize that this is useful information even when the experiment was not specifically looking for anomalous signal, or because the complexity of PDB deposition discourages providing the most complete data. An even more useful improvement would be to make deposition of unmerged intensities straightforward - the JCSG does this somehow, but it is non-trivial for the average user. Hopefully this will also change soon. -Nat
Re: [ccp4bb] PDB passes 100,000 structure milestone
On Thu, May 15, 2014 at 9:53 AM, Patrick Shaw Stewart patr...@douglas.co.uk wrote: It seems to me that the Wikipedia mechanism works wonderfully well. One rule is that you can't make assertions yourself, only report pre-existing material that is attributable to a reliable published source. This rule would be a little problematic for annotating the PDB. It requires a significant amount of effort to publish a peer-reviewed article or even just a letter to the editor, and none of us are being paid to write rebuttals to dodgy structures. -Nat
Re: [ccp4bb] PDB passes 100,000 structure milestone
That is an extraordinary case, and it certainly took a huge amount of work. What about structures that are obviously wrong based on inspection of the density, but no one has bothered to challenge yet? The TWILIGHT database helps some, if that counts, but it doesn't catch everything. -Nat On Thu, May 15, 2014 at 10:48 AM, Patrick Shaw Stewart patr...@douglas.co.uk wrote: I may be missing something here, but I don't think you have to rebut anything. You simply report that someone else has rebutted it. Along the lines of "Many scientists regard this published structure as unreliable since a misconduct investigation by the University of Alabama at Birmingham has concluded that it was, more likely than not, faked [1]". [1] http://www.nature.com/news/2009/091222/full/462970a.html
Re: [ccp4bb] PDB passes 100,000 structure milestone
On Wed, May 14, 2014 at 10:26 AM, Mark Wilson mwilso...@unl.edu wrote: Getting to Eric's point about an impasse, if the PDB will not claim the authority to safeguard the integrity of their holdings (as per their quoted statement in Bernhard's message below), then who can? I think this may in part boil down to a semantic dispute over the meaning of integrity. I interpreted it to mean integrity (and public availability) of the data as deposited by the authors, which by itself is quite a lot of work. Safeguarding the integrity of the peer-review process is supposed to be the job of the journals, some of which - unlike the PDB - are making a tidy profit from our efforts. Since they justify this profit based on the value they supposedly add as gatekeepers, I don't think it's unreasonable for us to expect them to do their job, rather than leave it to the PDB annotators, who surely have enough to deal with. I do share some of the concern about 2hr0, but I am curious where the line should be drawn. This is an extraordinary case where the researcher's institution requested retraction, but I think everyone who's been in this field for a while has a list of dodgy structures that they think should be retracted - not always with justification. -Nat
Re: [ccp4bb] TER in PDB file
On Tue, May 13, 2014 at 9:20 PM, Felix Frolow mbfro...@post.tau.ac.il wrote: Phenix does even more, it adds TER after ions and ligands, so again manual messing is needed. However they may have a jiffy to fix it. phenix.sort_hetatms will remove them for you, although why this problem was apparently beyond the capability of the PDB itself to handle is a mystery. When I encountered this a couple of years ago I spent longer arguing with the annotator than writing the code. Fortunately mmCIF deposition does indeed seem to work more smoothly. -Nat
Re: [ccp4bb] PyMol and Schrodinger
On Wed, Apr 23, 2014 at 8:43 AM, Cygler, Miroslaw miroslaw.cyg...@usask.ca wrote: I have inquired at Schrodinger about the licensing for PyMol. I was surprised by their answer. Access to PyMol is only through a yearly licence. They do not offer the option of purchasing the software and using the obtained version without time limitation. This policy is very different from many other software packages, which one can use without continuing licensing fees, with additional fees only when an upgrade is needed. At least I believe that Office, EndNote, Photoshop and others are distributed this way. I also remember very vividly Warren's reason for developing PyMol, and that was free access to the source code. He later implemented fees for downloading binary code specific to one's operating system, but there were no time restrictions on its use. As far as I recollect, Schrodinger took over PyMol distribution and development promising to continue in the same spirit. Please correct me if I am wrong. I find the constant yearly licensing policy disturbing and will be looking for alternatives. I would like to hear if you have had the same experience and what you think about the Schrodinger policy. This is no different from the licenses that for-profit companies are required to purchase for most crystallography software. In fact, it's actually considerably more liberal than most software*, because as Jim notes you can still obtain (and redistribute) most of the source code for free. From what I can tell Schrodinger has continued to make improvements to the open-source core; some of the newer features (and the native Mac GUI) are proprietary, but that was true ten years ago as well. -Nat (* although I believe ccp4mg is truly open-source like Coot, and unlike CCP4 etc., which still require a license for commercial use. Or am I misinformed?)
Re: [ccp4bb] metal ion coordination
On Wed, Apr 23, 2014 at 6:15 AM, World light bsub...@btk.fi wrote: This discussion is very informative to a fresher like me. Moreover, in most of the reading suggested in this discussion I read about positively charged metal ions like Na, Ca, Mg and many more. I am curious about Cl specifically, which could occur as a result of the salt used in different crystallization conditions. Any information on Cl ion coordination? From Dauter & Dauter (2001) (http://www.ncbi.nlm.nih.gov/pubmed/11250204): "The coordination geometry of halide ions is not specific. . . Halide ions usually accept hydrogen bonds from various donor groups from the protein and neighboring water molecules. In addition, they make van der Waals contacts with non-polar protein atoms. . . The halide anions are monoatomic and polarizable, and consequently able to engage in both polar and hydrophobic interactions. . . Of the sites that are best for phasing, most contain halide ions that are hydrogen-bonded to amide nitrogen atoms, either from the protein mainchain or asparagine and glutamine sidechains. In addition, good sites often make ionic pairs with arginine or lysine residues. Sometimes, hydrogen bonds to the hydroxyl groups of threonine or serine residues can also be observed. All halide ions are in contact with water molecules, which can be ordered or in the bulk solvent region." I believe chloride is a little more predictable in this respect than the other halides (especially iodine). Also worth quoting: "All halide ions share their sites with water molecules. Their coordination, appearance in electron-density maps and behavior during structure refinement is almost identical to that of fully occupied water molecules and only rarely is it possible to differentiate bromide or chloride ions from waters, especially if the sites are partially occupied. These ions can, however, be easily identified by their anomalous scattering signal." -Nat
Re: [ccp4bb] anomalous signal for Mg and Calcium
On Mon, Apr 21, 2014 at 3:36 PM, Faisal Tarique faisaltari...@gmail.com wrote: In continuation of my previous mail, I again want to ask a few questions about metalloproteins. Apart from factors like occupancy, B-factor, coordination sphere and metal ion-ligand distances used to distinguish Mg from calcium, can the anomalous signal tell the identity and the type of metal ion bound to the protein, specifically in the case of Mg and calcium? Short answer: if you see a peak in the anomalous difference map, it's almost certainly calcium, but if you don't see a peak, you still can't rule out calcium. Longer answer: magnesium almost never has observable anomalous signal at the wavelengths we normally use for data collection. The exception is if you collect extremely redundant data; Wayne Hendrickson has a very convincing example of this (I saw it in a talk, but I'll see if I can find a reference). Calcium anomalous signal depends on the data quality, but with good data and full occupancy it can show up in the anomalous difference map even at the SeMet K edge (~0.9794Å). However, this is not guaranteed, especially if it's not very tightly bound. At 2.6Å resolution it may be more difficult to distinguish, especially if you have other, stronger anomalous scatterers. Collecting very redundant data will help a lot. "An anomalous dataset analyzed through Xtriage (Phenix) gives a signal of 0.097 with magnesium while the same gives a signal of 0.1062 with calcium (both datasets showing the Anomalous flag as true). Can anybody shed some light on which is more true?" I don't understand this - what exactly is the difference between the datasets? Anyway, that number is really not intended to be interpreted this way. "The data have a maximum resolution of 2.6Å and I had placed an Mg atom at the active site (the protein was incubated with 5 mM MgCl2). Just because it does not match a typical octahedral geometry and the exact metal ion-oxygen distances given by the Cambridge Structural Database (CSD), my reviewer has asked me to check the anomalous signal for both Mg and Ca (he expects the scattering metal ion to be Ca) and give an appropriate reason for putting Mg there. Please give suggestions." In addition to the anomalous maps, check the difference map (Fo-Fc) and B-factors after refinement with either element at full occupancy. If it is correctly identified, the difference map should be relatively flat and the B-factor should be similar to that of the coordinating atoms. Negative difference map peaks and/or a high B-factor suggest that the element is too heavy; positive peaks and/or low B-factors indicate the opposite. -Nat
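If you have a cctbx/Phenix installation handy, it is easy to look up the theoretical f" values and see why Ca is plausible at this wavelength while Mg is essentially hopeless. A minimal sketch, assuming a cctbx Python environment; the wavelength is the ~0.9794Å value mentioned above:

# Tabulated anomalous scattering terms (Sasaki tables) for Mg and Ca.
from cctbx.eltbx import sasaki

wavelength = 0.9794  # Angstrom, approximately the SeMet K edge
for element in ("Mg", "Ca"):
    fdp = sasaki.table(element).at_angstrom(wavelength).fdp()
    print('%-2s  f" = %.2f electrons' % (element, fdp))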
Re: [ccp4bb] EDS server - R-value
On Fri, Apr 4, 2014 at 1:57 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote: A more up-to-date reason is that programs calculate R values very differently. If you take a PDB file refined with program X and put it into program Y you easily get discrepancies greater than 5%. This is actually pretty rare - usually it's only 1-2% at most. Discrepancies like 16.5% versus 30.9% usually indicate that there's something wrong or misleading in the annotation of the entry, and often mean that you can't even reproduce the R-factor with the specified program. -Nat
Re: [ccp4bb] EDS server - R-value
On Fri, Apr 4, 2014 at 9:36 AM, Alastair Fyfe af...@ucsc.edu wrote: The topic brings up a question that I've been wondering about for some time; perhaps someone can enlighten me. Why is it not standard practice to deposit map coefficients along with structure factors? Unlike image deposition there are no significant storage or file format issues. This would preserve a record of the final refinement used for publication, bypassing the impossible task of recording/reconstructing the program version and options used. There *are* file format issues, they're just very silly. I think the problem is that the PDB deposition service ignores most columns in MTZ files, even with standard labels that have not changed for years. If you deposit the reflections as mmCIF instead, and use the designated mmCIF dictionary items for your map coefficients (or Fcalc, phases, etc.), it will preserve them. For instance: http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=structfact&structureId=4OW3 I still don't think this solves the problem of faithfully recording the refinement protocol - how do you know what method was used to calculate the maps? -Nat
Re: [ccp4bb] EDS server - R-value
On Fri, Apr 4, 2014 at 10:39 AM, Alastair Fyfe af...@ucsc.edu wrote: Reconstructing the refinement may be necessary in some cases but there are other applications (pdb-wide map statistics, development of map analysis tools, quick model vs map checks) where access to the depositor's final map would be sufficient. I think these kinds of bulk analyses will be less effective if the maps are not calculated consistently. For instance, the question of how to handle missing reflections can make a big difference. Perhaps the coefficients are in fact included in many of the available mmCIF files? I should check.. No, because most of those mmCIF files were probably converted from MTZ format by the deposition server(s). -Nat
Re: [ccp4bb] Add an atom in Coot
On Tue, Mar 18, 2014 at 6:59 PM, Remie Fawaz-Touma remiefa...@gmail.com wrote: how do you place the pointer if there is no bond there? (just density) I am trying to connect 2 sugars creating 2 bonds to one oxygen that I have to add (oxygen does not exist now). On my Mac, I can change the pointer position by holding down the Control key and left-dragging the mouse. I would be surprised if this didn't work on Linux too - not sure about Windows. Editing the PDB file by hand is very risky and far too much work for what you want to accomplish. -Nat
Re: [ccp4bb] Validity of Ion Sites in PDB
On Thu, Mar 6, 2014 at 11:45 AM, Keller, Jacob kell...@janelia.hhmi.org wrote: I was curious whether there has been a rigorous evaluation of ion binding sites in the structures in the pdb, by PDB-REDO or otherwise. I imagine that there is a considerably broad spectrum of habits and rigor in assigning solute blobs to ion X or water, and in fact it would be difficult in many cases to determine which ion a given blob really is, but there should be at least some fraction of ions/waters which can be shown from the x-ray data and known geometry to be X and not Y. This could be by small anomalous signals (Cl and H2O for example), geometric considerations, or something else. Maybe this does not even matter in most cases, but it might be important in others... A couple of references: http://www.ncbi.nlm.nih.gov/pubmed/18614239 http://www.ncbi.nlm.nih.gov/pubmed/24356774 Anecdotally, it is not difficult to find incorrect structures; in fact, one of mine has magnesium ions at crystal contacts with big Fo-Fc and anomalous map peaks, and I know it's not the only such structure in the PDB. However, while there are plenty of examples that are clearly wrong, it is difficult to come up with strict rules that apply to typical MX data - at 3Å resolution (or even better), a native Zn-binding site might have bond valences that are just awful. This doesn't mean the metal assignment is wrong. The placement of waters alone has a huge impact on such calculations. -Nat
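As an illustration of the kind of calculation being discussed, here is a minimal bond-valence-sum sketch using the standard Brown-Altermatt expression s = exp((r0 - r)/b) with b = 0.37Å. The r0 value comes from published tables, and the distances below describe a hypothetical site, not any particular structure:

import math

def bond_valence_sum(distances, r0, b=0.37):
    # Sum the individual bond valences for a list of metal-ligand distances (in A).
    return sum(math.exp((r0 - r) / b) for r in distances)

# Hypothetical octahedral Zn site with six oxygen ligands:
distances = [2.05, 2.10, 2.12, 2.08, 2.15, 2.11]
print("BVS = %.2f (expect ~2 for a well-determined Zn2+ site)"
      % bond_valence_sum(distances, r0=1.704))  # r0(Zn-O) ~ 1.704 A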
Re: [ccp4bb] Table in NSMB
On Tue, Feb 18, 2014 at 8:19 AM, Jan van Agthoven janc...@gmail.com wrote: I'm filling out my table for NSMB, about a structure of a protein ligand bound to a receptor. They ask for 3 different lines regarding number of atoms & B-factor: 1) Protein 2) Ligand/Ion 3) Water. Does my protein ligand belong to Protein or Ligand/Ion? Why not list them each explicitly? In my experience the recommended table of crystallography statistics for most journals is just a suggestion, not a strict format. If you leave out information they might complain, but surely they won't object if you include additional details. (They usually just exile it to the unformatted supplementary materials anyway.) -Nat
Re: [ccp4bb] Sister CCPs
One comment (not a complaint) on all this: it seems like the same questions get asked over and over again. If there is a good place for a general crystallography FAQ list, it is well past time for one to be put together - or maybe it just needs to be better advertised? At a minimum, for instance:
- what cryoprotectant should I use?
- how do I get big single crystals?
- how do I improve diffraction?
- how can I tell if I've solved my structure?
- why is my R-free stuck?
- is [pick a random statistic] suitable for publication?
Some of the other common queries (name my blob!) still need to be handled on a case-by-case basis, but it would be much more efficient for everyone if the standard answers were collected somewhere permanent. -Nat On Thu, Feb 13, 2014 at 7:05 AM, Eugene Valkov eugene.val...@gmail.com wrote: I absolutely agree with Juergen. Leaving aside methods developers, who are a completely different breed, there is no such thing as a crystallographer sitting in a dark room solving structures all day. If there are, these are anachronisms destined for evolutionary demise. More and more cell biologists, immunologists and all other kinds of biologists are having a go at doing structural work with their molecules of interest themselves, without involving the professionals. Typically, they learn on the job and they need advice with all kinds of things, ranging from cloning and protein preps through to issues with tetartohedrally-twinned data and interpreting their structures. So, a modern structural biologist is one who is equipped for the wet lab and has some idea of how to go about solving structures. CCP4BB is a wonderful resource that is great both for the quality of the advice offered to those who seek it and for the variety of topics that are addressed within the scope of structural biology. I have learnt greatly from reading posts from very skilled and knowledgeable scientists on this forum and then implemented these insights in my own research. I am very grateful for this. In short, please do not discourage your colleagues, particularly very junior ones, from posting to the CCP4BB. Some of the questions may appear quaint or irrelevant, but it is easy to simply ignore topics that are of no interest! Eugene On 13 February 2014 14:41, Bosch, Juergen jubo...@jhsph.edu wrote: Let me pick up Eleanor's comment: is there something like a crystallographer today? I mean in the true sense? I think as a crystallographer you won't be able to survive the next decade; you need to diversify your toolset of techniques, as pointed out in this article http://www.nature.com/naturejobs/science/articles/10.1038/nj7485-711a And I'm not quite sure how software developers see themselves, as I would argue they are typically maybe not doing so much wet-lab stuff related to crystallography (I may be wrong here) but rather coding these days. What type of crystallographer is a software developer? I think, like our beloved crystals, we come in different flavors. And we need to train the next generation of students with that perspective in mind. Just my two cents on a snowy day (30 cm overnight) Jürgen ..
On Feb 13, 2014, at 6:41 AM, Eleanor Dodson eleanor.dod...@york.ac.uk wrote: I agree with Frank - it keeps crystallographers modest to know how challenging wet lab stuff still is. Eleanor On 12 February 2014 19:23, Robbie Joosten robbie_joos...@hotmail.com wrote: It's not an e-mail bulletin board, but ResearchGate seems to be quite popular for wet lab questions. IMO the Q&A section of the social network is a bit messy. That said, the quality seems to improve gradually. Cheers, Robbie From: Paul Emsley Sent: 12-2-2014 19:23 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] Sister CCPs On 12/02/14 15:59, George Sheldrick wrote: It would be so nice to have a 'sister CCP' for questions about wet-lab problems that have nothing to do with CCP4 or crystallographic computing. There is clearly a big need for it, and those of us who try to keep out of wet labs would not have to wade through it all. FWIW, the remit of CCP4BB, held at jiscmail-central, is described as: /The CCP4BB mailing list is for discussions on the use of the CCP4 suite, and macromolecular crystallography in general./ Thus wet-lab questions are not off-topic (not that anyone recently described them as such). Having said that, Jiscmail mailing lists are easy to set up (provided that you can reasonably expect that the mailing list will improve knowledge sharing within the UK
Re: [ccp4bb] Cryo solution for crystals grown in magnesium formate
On Mon, Dec 16, 2013 at 1:36 PM, Xiao, Junyu jx...@mail.ucsd.edu wrote: Dear all, sorry if this topic does not interest you. I wonder whether anyone has experience with freezing crystals grown in ~0.2 M Magnesium Formate. Garman and Mitchell suggested that A major anomaly is solution 44, 0.2 M magnesium formate, which requires 50% glycerol for cryoprotection in their 1996 paper (J Appl. Cryst. 29, 584-587). Since 50% glycerol is kind of harsh, I wonder whether anyone has tried alternative cryo protectant. Your kind help will be highly appreciated. Another good reference: http://journals.iucr.org/j/issues/2002/05/00/do0015/index.html It suggests 35% PEG 400, 30% ethylene glycol, or 30% of whatever PG means (based on the rest of the paper I suspect propanediol, but the abbreviation doesn't really make sense - perhaps Eddie Snell can clarify). There are of course many other good cryoprotectants beyond those evaluated in the paper; personally, I'm a big fan of xylitol (which I believe will work in lower concentrations - at least with some conditions), but what really matters is what the crystals can tolerate. Note that these estimates are using very strict criteria - you can often get away with less cryoprotection if you are very good at freezing crystals and/or willing to tolerate some increased background. But I wouldn't try this until you've determined that your crystals can't handle the recommended amounts. -Nat
Re: [ccp4bb] Comparison of Water Positions across PDBs
On Wed, Nov 6, 2013 at 12:39 AM, Bernhard Rupp hofkristall...@gmail.com wrote: Hmmm….does that mean that the journals are now the ultimate authority of what stays in the PDB? I find this slightly irritating and worthy of change. http://www.wwpdb.org/UAB.html It is the current wwPDB (Worldwide PDB) policy that entries can be made obsolete following a request from the people responsible for publishing it (be it the principal author or journal editors). I'm not sure I understand why things should be any different; the PDB is not advertising itself as anything other than an archival service, unlike the journals which are supposed to be our primary mechanism of quality control. -Nat
Re: [ccp4bb] Comparison of Water Positions across PDBs
On Tue, Nov 5, 2013 at 12:22 AM, Bernhard Rupp hofkristall...@gmail.com wrote: Given their otherwise almost paranoid sensitivity to ultimate author authority (resulting in things like still having 2hr0 etc in the bank because certain authors go AWOL or ignore major issues) In defense of the PDB, it's not just the authors who went AWOL in that case - it is ultimately the responsibility of the journals to retract clearly fraudulent publications. -Nat
Re: [ccp4bb] MacBook Pro graphics card options
On Wed, Oct 23, 2013 at 1:10 PM, Kristin Low kristin@queensu.ca wrote: I’m looking at upgrading my current laptop to a newer MacBook Pro. I’m torn as to whether I need integrated vs discrete graphics for structural biology, including molecular modelling, especially since the latest advances by Intel in terms of integrated graphics. Right now with the new releases, the options are between Intel Iris Pro (5200 series) and Intel Iris Pro + Nvidia GT 750M. It depends on how demanding your graphics needs are. I have no experience with the Iris Pro chips, but I've been using a MacBook Air from late 2011 almost exclusively for most of the last two years, including heavy use of Coot and PyMOL, and the only times I've been annoyed by slow graphics is when I'm trying to visualize a very large and/or high-resolution region of density. And even that doesn't happen too often. Most of the time it runs very smoothly. However, there are some persistent glitches with the graphical display in Coot - sometimes I get weird visual artifacts, or messed up depth perception. Whether this is the fault of Coot, XQuartz, or Intel is unknown. But speed is not an issue. The premium for the model with the Nvidia chip is quite steep at $600 (perhaps it's less with academic discount?). I don't think the graphics upgrade alone is worth it - but I'd be very tempted by the faster processor, doubled memory, and doubled SSD, all of which will come in handy when refining or rendering. Disclaimer: I know absolutely nothing about the availability of stereo options with any of these systems. It's possible the NVidia card has additional capabilities in that respect. -Nat
Re: [ccp4bb] Problematic PDBs
On Thu, Oct 17, 2013 at 6:51 AM, Lucas lucasbleic...@gmail.com wrote: I wonder if there's a list of problematic structures somewhere that I could use for that practice? Apart from a few ones I'm aware of because of (bad) publicity, what I usually do is an advanced search on PDB for entries with poor resolution and bound ligands, then checking then manually, hopefully finding some examples of creative map interpretation. But it would be nice to have specific examples for each thing that can go wrong in a PDB construction. This would be a good place to start: http://www.ncbi.nlm.nih.gov/pubmed/23385452 The retracted ABC transporter structures are also good, although less obvious to the untrained eye. I forget what the PDB IDs are but I'll see if I can dig them up. -Nat
Re: [ccp4bb] OT: Who's Afraid of Peer Review?
On Wed, Oct 9, 2013 at 6:56 PM, Marco Lolicato chimbio...@gmail.com wrote: Anyway, for those reasons and more, I was wondering whether it is perhaps necessary nowadays to revisit the peer-review process. Apologies for the lengthy response, but I really do think the current publication system is broken, just not for the same reasons as others. One of the interesting suggestions put forth by Michael Eisen (PLoS co-founder, and author of the previously-linked rant), among others, is that post-publication peer review should become more widely used and partially substitute for the current system. This has always been done informally at conferences, journal clubs, ccp4bb emails, and so on, but this is all very unstructured and not always public. The only formal outlets are writing a letter to the editor of a journal, which is a very time-consuming process, or writing a more thorough follow-up article dissecting the problems. There are some advantages to this - the discovery of fraudulent structures would probably not have been as widely noticed if the analysis took the form of blog comments. However, the overhead (in time and effort) is so massive as to deter all but the most determined scientists. At a minimum, I'd like to see a more structured but very lightweight way to discuss *and track* problems with the literature, ideally at the source(s). For instance, if I go to this PDB entry, there is absolutely no indication of anything suspicious: http://www.rcsb.org/pdb/explore/explore.do?structureId=2hr0 If I follow the publication links, I can see that there is a brief communication arising associated with the article: http://www.nature.com/nature/journal/v444/n7116/full/nature05258.html but since Nature's editors watered down the letter and accepted a response that did nothing to address any of the questions, it's difficult for a non-expert to reach any conclusions. Few of the multiple derived databases have anything either; Proteopedia is the big exception. One has to do a surprising amount of digging to find out that the senior author's university publicly disclaimed the structure as fraudulent. There is also a very large volume of comment on the ccp4bb, including some excellent specific (illustrated!) examples of problems with the structure. But none of this is centrally available, because the journal (and databases) do not provide any mechanism other than the lengthy formal route. It's true that a little Googling will quickly uncover problems with this particular paper. However, from what I've seen there are depressingly many scientists who are unable to use Google even for the questions they already know to ask. And this is an exceptional case; there are many other problematic structures (anyone working in methods development probably has a long list) for which no such information is available, because we don't have time to write a formal letter to the journal editors, especially since there's no guarantee that they'll even pay attention to us. It would be far more efficient if I could simply post a comment at the source saying "ligand density does not support binding - see attached image". In the long term, if there were a better system for this kind of peer review, the current system could mostly go away. Post a manuscript on arXiv (or equivalent), let the community comment on it and rate it, and the eventual consensus determines its credibility and importance.
Scientists would stop wasting months tailoring the paper to fit into arbitrary and obsolete length restrictions or to impress the editors of high-profile journals or please a handful of anonymous reviewers. There would be no disincentive to publish negative results or brief technical write-ups. Both publication and review would be immediate, inexpensive, and public. The scientific literature would become truly self-correcting over any time scale. Undoubtedly there are issues with this, and I'm sure there are other approaches that could work too. But the present system is both horribly inefficient and too permissive of outright junk, and I think it's really holding us back. -Nat
Re: [ccp4bb] Re: [ccp4bb] Why nobody comments about the Nobel committee decision?
Levitt also contributed to DEN refinement (Schroder et al. 2007, 2010). -Nat On Wed, Oct 9, 2013 at 2:29 PM, Boaz Shaanan bshaa...@bgu.ac.il wrote: Good point. Now, since you mentioned contributions of the recent Nobel laureates to crystallography: Mike Levitt also had a significant contribution through the by-now-forgotten Jack-Levitt refinement, which to the best of my knowledge was the first time that an X-ray term was added to the energy minimization algorithm. I think I'm right about this. This was later adapted by Axel Brunger in Xplor, and other programs followed. Cheers, Boaz Original message from Alexander Aleshin aales...@sanfordburnham.org Date: 10/10/2013 0:07 (GMT+02:00) To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] Why nobody comments about the Nobel committee decision? Sorry for a provocative question, but I am surprised that nobody comments on or congratulates the laureates with regard to the recently awarded Nobel prizes. However, one of the laureates in chemistry contributed to a popular method in computational crystallography: CHARMM - XPLOR - CNS - PHENIX-… Alex Aleshin
Re: [ccp4bb] Rmerge of the last shell is zero
On Wed, Aug 14, 2013 at 10:31 PM, Edward A. Berry ber...@upstate.edu wrote: If you refine once in phenix you can use phenix.cc_star to calculate cc* and compare with R and R-free, from the output mtz file and your unmerged .sca file. FYI, this should also work with structures refined in Refmac, assuming it can recalculate the R-factors to within a reasonable margin of error. -Nat
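For reference, the conversion from CC1/2 to CC* is just the Karplus & Diederichs (2012) formula, which is trivial to compute by hand if you only have the merging statistics - a one-function sketch:

import math

def cc_star(cc_half):
    # CC* = sqrt(2 * CC1/2 / (1 + CC1/2))
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))

print(round(cc_star(0.30), 2))  # a weak outer shell with CC1/2 = 0.30 gives CC* ~ 0.68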
Re: [ccp4bb] mmCIF as working format?
On Wed, Aug 7, 2013 at 12:54 PM, James Stroud xtald...@gmail.com wrote: All that needs to happen is that the community agree on 1. What is the finite set of essential/useful attributes of macromolecular structural data. 2. What is the syntax of (a) accessing and (b) modifying those attributes. 3. What is the syntax of selecting subsets of structural data based on those attributes. The resulting syntax (i.e. language) itself should be terse, easy to learn, easy to use, and preferably easy to implement. Ah, but the nice thing about mmCIF is that it isn't truly finite - the PDB may limit what tags are actually included in the distributed files, but there is nothing preventing other developers from including their own tags, and there is a community process for extending the officially defined tags. Item (2) is very well-established, unlike the current chaos of REMARK records. I think (3) will be left to the various libraries to deal with. -Nat
Re: [ccp4bb] mmCIF as working format?
On Wed, Aug 7, 2013 at 2:36 PM, James Stroud xtald...@gmail.com wrote: Although it is likely the best library for working with structural data, CCTBX requires a loop just to change a specific chain ID (to the best of my knowledge): ... I don't intend to pick on CCTBX specifically (because the CCTBX developers have specific needs to which they program), but loop/test mechanisms are awkward for selecting and modifying structural data, and get much more awkward as selections get more complex (e.g. selecting the C-alpha of every alanine of chain A, etc.). True - it's really an issue of what purpose the libraries were designed for. CCTBX wasn't intended to be a general-purpose tool for users to perform quick manipulations of a model; the goal was to build large, complex, and more-or-less automated crystallography applications on top of it. (The same applies to the CCP4 libraries, mmdb, clipper, etc.; BioPython I guess is designed for bioinformatics.) The design of CNS (for example) reflects an era where it was much more likely that the average crystallographer knew some programming, worked exclusively on the command line, built new models manually, and didn't have access to a large number of convenient tools for purposes like this. (Or so I've heard; I was still in high school.) Personally, if I need to change a chain ID, I can use Coot or pdbset or many other tools. Writing code for this should only be necessary if you're processing large numbers of models, or have a spectacularly misformatted PDB file. Again, I'll repeat what I said before: if it's truly necessary to view or edit a model by hand or with custom shell scripts, this often means that the available software is deficient. PLEASE tell the developers what you need to get your job done; we can't read minds. -Nat
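P.S. For concreteness, the kind of loop being discussed looks something like this with the iotbx.pdb hierarchy - a sketch from memory, so the names should be checked against the current cctbx documentation:

import iotbx.pdb

pdb_in = iotbx.pdb.input(file_name="model.pdb")
hierarchy = pdb_in.construct_hierarchy()
for model in hierarchy.models():
    for chain in model.chains():
        if chain.id == "A":  # the explicit loop/test that a selection syntax would hide
            chain.id = "B"
hierarchy.write_pdb_file(file_name="model_renamed.pdb",
                         crystal_symmetry=pdb_in.crystal_symmetry())

Verbose compared to a one-line selection, but at least it is unambiguous about what it does.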
Re: [ccp4bb] mmCIF as working format?
On Mon, Aug 5, 2013 at 11:11 AM, Phil Jeffrey pjeff...@princeton.eduwrote: While alternative programs exist to do almost everything I prefer something that works well, works quickly, and provides instant visual feedback. CCP4 and Phenix are stuck in a batch processing paradigm that I don't find useful for these manipulations. Speaking as a developer, it's probably much easier and faster for us to write software that *does* do what you want, instead of piling on hacks to keep the PDB format alive another 30+ years. While PDB is limited and has a lot of redundant information it's for the latter reason it's a rather useful format for quickly making changes in a text editor. It's certainly far faster than using any GUI, and it's also faster than the command line in many instances - and I have my own command line programs for hacking PDB files (and ultimately whatever formats come next) Most complaints of this sort seem to be based on an unrealistic expectation that your own experiences and skills are representative of the rest of the community. The vast majority of crystallographers don't have their own command-line programs, aren't familiar with the intricacies of PDB format, and as often as not botch the job when they attempt to edit their PDB files by hand. (I get a lot of bug reports like this.) They're not going to care whether they can use 'awk' on their structures. Using mmCIF as an archive format makes sense, but I doubt it's going to make building structures any easier except for particularly large structures where some extended-PDB format might work just as well or better. There is a lot of information that can't easily be stored simply by making the ATOM records wider. Right now some of this gets crammed into the REMARK section, but usually in an unstructured and/or poorly documented format. This isn't just problematic for archival - it limits what information can be transferred between programs. mmCIF has none of these limitations. I have some reservations about the current specification (for instance, the fact that the original R-free flags are not stored separately in deposited structure factor files, and are instead mixed into the status flag, which can have multiple other meanings), but at least there is a clear process for extending this in a way that does not (or should not, anyway) break existing parsers. -Nat
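P.S. Regarding the status flag: in deposited structure-factor mmCIF files the cross-validation set is marked in _refln.status, where 'f' denotes a free reflection and 'o' a working one (other codes cover unobserved or systematically absent data). A minimal sketch of what that loop looks like:

loop_
_refln.index_h
_refln.index_k
_refln.index_l
_refln.status
_refln.F_meas_au
_refln.F_meas_sigma_au
0 0 4 o 123.4 2.1
0 0 6 f  87.2 1.8

This is what I meant by the original flags being folded into a column that also carries other meanings.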
Re: [ccp4bb] mmCIF as working format?
On Mon, Aug 5, 2013 at 12:37 PM, Boaz Shaanan bshaa...@bgu.ac.il wrote: There seems to be some kind of a gap between users and developers as far the eagerness to abandon PDB in favour of mmCIF. I myself fully agree with Jeffrey about the ease of manipulating PDB's during work, particularly when encountering unusual circumstances (and there are many of those, as we all know). And how about non-crystallographers that are using PDB's for visualization and understanding how their proteins work? I teach many such students and it's fairly easy to explain to them where to look in the PDB for particular pieces of information relevant to the structure. I can't imagine how they'll cope with the cryptic mmCIF format. I think the only gap is between developers and *expert* users - most of the community simply wants tools and formats that work with a minimum of fiddling. Again, if users are having to examine the raw PDB records visually to find information, this is a failure of the software. -Nat
Re: [ccp4bb] Where to cut the data in this medium resolution dataset
On Mon, Jul 22, 2013 at 10:19 AM, Stefan Gajewski sgajew...@gmail.com wrote: The maps show signs of overfitting, the B-factors do not look correct in my opinion. What do correct B-factors look like? What refinement strategy did you use for them? Note that the R-free value in the 3.4A shell is lower than the R-work (and also the Rpim in that shell!) which clearly indicates this refinement was not stable. I don't think it indicates anything about the stability of refinement - my guess would be that the NCS is biasing R-free. I suppose it could also indicate that the data in the 3.6-3.4 range are basically noise, although if the maps look better then that would suggest the opposite. The structure contains no beta sheets and refinement also profits greatly from very rigid high-order NCS. The maps are very detailed, in fact better than some 2.8A maps I've seen before. The 0.2A in question here are actually quite helpful to increase the map quality, so I keep wondering if I should deposit the structure with them or keep them only for my own interpretation. I would deposit the data to 3.4Å in any case; what cutoff you refine the structure to is a separate decision. Before I continue optimizing the integration/refinement I would like to hear suggestions from the experts where to make the resolution cut-off in this case? Do I have all the information I need to make that decision? What arguments should I present when dealing with the reviewers? I mean, the Rrim/Rmerge values are really very high. Do what Karplus & Diederichs suggest: take the structure refined to 3.4Å, and recalculate the R-factors for that model with the data cut to 3.6Å. If the R-free calculated this way is below the R-free for the model refined to only 3.6Å, then the extra 0.2Å is contributing real information and improving the quality of your model, which is the best justification for extending to higher resolution. -Nat
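P.S. To spell out the comparison, with made-up numbers just to show the decision rule - the key point is that both R-free values are evaluated against the same set of reflections to 3.6Å:

# model refined against data to 3.6A, R-free evaluated to 3.6A
r_free_3p6 = 0.265
# model refined against data to 3.4A, R-free re-evaluated with data cut to 3.6A
r_free_3p4_at_3p6 = 0.259

if r_free_3p4_at_3p6 < r_free_3p6:
    print("the 3.4A data improved the model - keep them")
else:
    print("no gain on the common reflections - a 3.6A cutoff is defensible")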
Re: [ccp4bb] Refinement of partly occupied water molecules
On Fri, Jul 12, 2013 at 1:08 AM, Stefan Krimmer krim...@staff.uni-marburg.de wrote: in some of my macromolecular crystal structures with resolutions between 1.1 - 1.4 Å, several round positive Fo-Fc electron density blobs are detectable which show after assignment of a water molecule to these blobs and subsequent refinement with Phenix.refine a good-looking 2Fo-Fc electron density. However, there also occurs a small negative Fo-Fc electron density detectable inside the 2Fo-Fc density blob. The negative Fo-Fc electron density disappears if the occupancy of the water molecule is automatically refined by Phenix.refine (occupancy manually set to a value below 100% followed by refinement) or manually set to 50% and fixed for this value (Fix occupancy option in phenix.refine). Therefore, I think these positions are partly occupied by water molecules, but I am not sure how I should handle it/how it is generally handled. Which one of the two options described above is the better one? I would be thankful for any advice and/or literature about this topic. When I had to deal with this in the past, I followed this advice (from Thomas Schneider): http://www.embl-hamburg.de/~tschneider/shelxl/shelxl_faq/shelxlfaq.html#Q16 This is especially true at the resolutions you're working with; even with subatomic resolution data I believe that the observation in the FAQ (that refining the occupancies doesn't improve R-factors and may even make them worse) will be true in most cases - and regardless of program used, btw. (I can't remember if I ever tried comparing the outcomes myself, though.) -Nat
Re: [ccp4bb] ctruncate bug?
On Sat, Jun 22, 2013 at 3:18 PM, Frank von Delft frank.vonde...@sgc.ox.ac.uk wrote: In what scenarios would these improved estimates make a significant difference? Perhaps datasets where an unusually large number of reflections are very weak, for instance where TNCS is present, or where the intensity falls off quickly at lower resolution (but remains detectable much further)? -Nat
Re: [ccp4bb] Concerns about statistics
On Thu, Jun 13, 2013 at 8:15 AM, Andrea Edwards edwar...@stanford.edu wrote: I have some rather (embarrassingly) basic questions to ask. Mainly.. when deciding the resolution limit, which statistics are the most important? I have always been taught that the highest resolution bin should be chosen with I/sig no less than 2.0, Rmerg no less than 40%, and %Completeness should be as high as possible. However, I am currently faced with a set of statistics that are clearly outside these criteria. Is it acceptable to cut off resolution using I/sig as low as 1.5 as long as the completeness is greater than 75%? Another way to put this.. if % completeness is the new criterion for choosing your resolution limit (instead of Rmerg or I/sig), then what %completeness is too low to be considered? Also, I am aware that Rmerg increases with redundancy; is it acceptable to report Rmerg (or Rsym) at 66% and 98% with redundancy at 3.8 and 2.4 for the highest resolution bin of these crystals? I appreciate any comments. A (probably) better way: http://www.ncbi.nlm.nih.gov/pubmed/22628654 Short version: don't try to use simplistic rules, instead use all data that actually improve the model. In practice, what I've noticed in some recent articles is (paraphrasing) "data extend to 2.5Å with an I/sigma of 2 in the highest-resolution shell, but we used data to 2.2Å as suggested by Karplus & Diederichs". This allows you to actually use as much data as possible while still (hopefully) pleasing any pedantic reviewers. (Substitute 90% completeness or whatever R-merge value you prefer for the I/sigma cutoff; the end result will still be the same.) -Nat
Re: [ccp4bb] Extracting .pdb info with python
On Fri, Jun 7, 2013 at 8:37 AM, Pete Meyer pame...@mcw.edu wrote: On the other hand, programming an implementation of something is a good way to make sure that you really understand it - even if you end up using another program. I would argue that it's not really necessary to understand the column formatting of a PDB file, any more than it's necessary to understand how binary data is arranged in an MTZ file. (Especially since the long-term plan is to migrate to mmCIF, which is more flexible and can store far more information.) We're ultimately trying to answer questions of biology and chemistry, not informatics, and writing a parser that actually handles all of the variety in the PDB (let alone the garbage produced by some programs) is far more difficult than it sounds. -Nat
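P.S. To give a concrete example of letting a library do the work: with Biopython (iotbx.pdb, mmdb, and others would serve equally well), iterating over atoms takes a few lines and none of them involve column numbers. A sketch, assuming a file called model.pdb:

from Bio.PDB import PDBParser

parser = PDBParser(QUIET=True)
structure = parser.get_structure("model", "model.pdb")
for atom in structure.get_atoms():
    # the parser has already dealt with the fixed-column format,
    # altlocs, insertion codes, and so on
    print(atom.get_full_id(), atom.get_bfactor())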
Re: [ccp4bb] Off-topic: PDB statistics
On Mon, Apr 15, 2013 at 11:47 AM, James Holton jmhol...@lbl.gov wrote: However, I'm sure the day is not far off when phenix.refine or the like will check if the starting R factor is too high and just automatically invoke a run of MR to see if something clicks. I think the latest Phaser code actually does the reverse: if the R-factor is already relatively low, it just outputs the search model. The more problematic (and very common) situation is where the structures are nearly but not perfectly isomorphous and rigid-body plus restrained refinement alone could work, but MR might work better - I don't think anyone has comprehensively evaluated this. We usually just run Phaser because compared to rebuilding and refinement, it's simply not that much of a bottleneck. -Nat
Re: [ccp4bb] CCP4 Update victim of own success
On Fri, Apr 12, 2013 at 10:27 AM, James Holton jmhol...@lbl.gov wrote: But, when it comes to GUIs, I have always found them counterproductive. In my humble opinion, the purpose of computers and other machines is to DO work for me, not create work for me, and I already have enough buttons to push each day. This is a very defensible position with regards to your normal workflow (or mine) - but beamline scientists (or software developers) are not very representative of crystallographers as a group. I've seen a lot of reflexive anti-GUI mentality from users who don't fall into either category, presumably because a senior postdoc or PI told them real crystallographers use the command line, when in reality they'd be better served by figuring out on their own what workflow is most efficient for them. -Nat
Re: [ccp4bb] CCP4 Update victim of own success
On Fri, Apr 12, 2013 at 2:45 PM, Boaz Shaanan bshaa...@exchange.bgu.ac.il wrote: Whichever way the input file for the run is prepared (via GUI or command line), anybody who doesn't inspect the log file at the end of the run is doomed and bound to commit senseless errors. I was taught a long time ago that computers always do what you told them to do and not what you think you told them, which is why inspecting the log file helps. I agree in principle - I would not advocate that anyone (*especially* novices) run crystallography software as a black box. However, whether or not a program constitutes a black box has nothing to do with whether it runs in a GUI or not. The one advantage a GUI has is the ability to convey inherently graphical information (plots, etc.). That it is still necessary to inspect the log file(s) carefully reflects the design of the underlying programs; ideally any and all essential feedback should also be displayed in the GUI (if one exists). Obviously there is still much work to be done here. -Nat
Re: [ccp4bb] delete subject
On Thu, Mar 28, 2013 at 11:28 AM, mjvdwo...@netscape.net wrote: Although it is hard to imagine, there could be a mechanism by which you make all your data public, immediately when you get it and this public record shows who owns it. http://deposit.rcsb.org (or international equivalent) The advantage (in my mind) of such a system would be that you would also make public the data that does not make sense to you (it does not fit your scientific model) and this could (and has) lead to great discoveries. The disadvantage to the method is that you will sometimes post experiments that are just completely wrong There is a further problem: since as Frank pointed out, structures are increasingly less valuable without accompanying non-crystallographic experiments, there is a risk of other groups taking advantage of the availability of data and performing the experiments that *you* had hoped to do. Or, similarly, a group who already has compelling biochemical data lacking a structural explanation would immediately have everything they needed to publish. Either way, you would be deprived of what might have been a thorough and genuinely novel publication. Since most employment and funding decisions in the academic world are made on the basis of original and high-profile research and not simply number of structures deposited in the PDB, this puts the crystallographer at a distinct disadvantage. This isn't purely hypothetical - a grad school classmate who worked on genome sequences complained about the same problem (in her case, the problem was bioinformatics groups analyzing the data - freely available on the NCBI site, as mandated by the funding agencies - before the sequencing was even complete). Of course the same argument has been used in the past against immediate release of PDB entries upon publication - and the community (quite appropriately, IMHO) rejected it as nonsense. I actually like the idea of releasing data ASAP without waiting to publish, but it has a lot of practical difficulties. -Nat
Re: [ccp4bb] Need specific molecular replacement test cases
On Fri, Mar 8, 2013 at 11:38 AM, Raji Edayathumangalam r...@brandeis.edu wrote: I am looking for two specific test cases (below) and appreciate anyone pointing me to known structures/examples for the same. (1) For a successful case of molecular replacement in which the search model has an overall sequence identity to the target in the twilight zone or worse (25% or less). There are some good examples here: http://journals.iucr.org/d/issues/2011/04/00/ba5163/index.html http://journals.iucr.org/d/issues/2004/07/00/gx5015/index.html -Nat
Re: [ccp4bb] How to slow down crystallization? Need hep!
On Mon, Feb 25, 2013 at 8:02 AM, lei feng spartanfeng...@hotmail.com wrote: I need your suggestion for slowing down crystallization for my protein my protein got hit in PEG/ION #5 ( 0.2 M MgCl2, 20% PEG 3350, pH 5.9), but it crystallize too fast. In 1 hr I can see tons of tiny needles. Can anyone give me some suggestion on how to slow down the process? I used lower conc. of potein, lower conc. of PEG ( 10%), it helped a little bit, giving me small rod crystal. but no improvement after that. Sometimes you can do this by adding a tiny amount of glycerol to your protein solution - I've seen 0.5% make the difference between awful plate clusters and nice individual crystals. I think it's also possible to use a combination of low concentrations and micro-seeding, but I've never done this personally. -Nat
Re: [ccp4bb] Improving Homology Models
On Wed, Feb 20, 2013 at 12:39 PM, Jacob Keller j-kell...@fsm.northwestern.edu wrote: it has been my experience that homology modelling programs get folds pretty well, but sometimes the details are pretty obviously bad, like too-close contacts. One might think that the modelling software would put in a sort of polishing step, but they don't seem to. Is there any way to trick the CCP4 or other software to fix these things, such as by simulated annealing or otherwise, I guess without any weight on the [non-existent] structure factors? What software were you using? There must be dozens of papers (at least) on this subject, and assessment of refinement and model quality is a major part of the CASP competition. The Rosetta relax protocol is one of the best known, but there are many other approaches (including MD), some of which are definitely integrated into modeling pipelines. I'd start here: https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d6/d41/relax_commands.html and also: https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d5/d4e/comparative_modeling.html Of course if the model is too awful, there isn't much that can be done to relieve gross errors without completely rebuilding. I don't know what the radius of convergence of the various protocols is; Rosetta relax certainly can't fix some of the truly awful models in the PDB (but it's by no means the only option). -Nat
Re: [ccp4bb] protein crystals or salt crystals
If SPG buffer is what I think it is, that means you have a significant concentration of inorganic phosphate, which forms salt crystals when mixed with divalent metal ions. -Nat On Thu, Feb 7, 2013 at 2:24 PM, amro selem amro_selem2...@yahoo.com wrote: Hallo my colleagues. i hope every one doing ok . i did screening since two weeks . i noticed today this crystals. i don`t know either it salt or protein crystal . my protein has zero tryptophan so i could distinguish by UV camera. the condition was conditions: 0.1M SPG buffer pH 8 and 25%PEG 1500. in addition to Nickle chlorid 1mM. best regards Amr
Re: [ccp4bb] Fwd: Strange Density
On Mon, Feb 4, 2013 at 12:24 PM, Roger Rowlett rrowl...@colgate.edu wrote: It's possibly a transition metal ion. Zinc is a common adventitious contaminant of solutions. Typical Zn-O distances (tetrahedral or pseudo-tetrahedral coordination) are 2.0 A. ICP-OES or ICP-MS of the protein solution might offer a clue to the possible identity of the metal ion, since it appears to be nearly stoichiometric with your protein. Zinc tends not to bind carbonyl oxygens, however, but calcium does quite frequently. Also, the presence of calcium acetate in the crystallization solution strongly suggests that this is what is actually bound (especially if it's at a concentration around 200mM, as is common in many crystallization screens). Jared: what do you mean by proper coordination? Surface ions bound non-specifically frequently don't have recognizable coordination, and they become even more vague as resolution decreases. As always, it would be worth looking at the anomalous difference map, although whether you'll actually see anything depends on the element and on wavelength, data quality, anomalous completeness, etc. -Nat
Re: [ccp4bb] off topic: DSSP
On Mon, Jan 28, 2013 at 8:04 AM, Antony Oliver antony.oli...@sussex.ac.uk wrote: If you don't mind using the ksDSSP implementation, it is already installed with the phenix suite if you have it. Correct, but although the method is supposed to be the same, the output is not, and there are bugs in how it presents helix annotations. So I'm not sure it's a reliable substitute for the original DSSP - we use it in Phenix to calculate secondary structure restraints, with some extra filtering to catch the buggy annotations. (Unfortunately it was the only open-source program I could find for this purpose.) -Nat
Re: [ccp4bb] off topic: DSSP
On Mon, Jan 28, 2013 at 8:39 AM, Robbie Joosten robbie_joos...@hotmail.com wrote: DSSP recently went open source with a very liberal license. So you can consider using the real DSSP now. This may also be the moment to integrate DSSP in CCP4. Based on the info here: http://swift.cmbi.ru.nl/gv/dssp/ the license isn't very liberal - it looks more like the proprietary-with-source-code licenses used by CCP4 and Phenix (among others), which preclude redistribution or anything resembling commercial use without permission. I also saw this: The new DSSP, is distributed as executable only. You can get the new DSSP source code only if you can convince us that it is really needed for a worthy scientific cause. Or am I looking in the wrong place? -Nat
Re: [ccp4bb] refmac5 vs phenix refine mixed up
On Fri, Jan 25, 2013 at 2:24 AM, Robbie Joosten robbie_joos...@hotmail.com wrote: Phenix however needs to deal with the CCP4 type reflection binning. Now the size of the sets cannot be used which means that you have find a smarter solution. So I wonder how this is implemented. Does Phenix use the (reasonable) assumption that the test set is labeled 1.00 or 0.00? Or does it also check the sets with other labels? I forget the exact rules, but the general assumption is that if you have multiple flag values (such as 0 through 19), the test set is marked by the lowest value. If you have just two values, the test set is whichever is less common. (For SHELX files this would typically be -1, for CNS files it would be 1, but you could just as easily swap flag values and it would still pick the correct set.) I'm sure someone can figure out a way to break this (for instance, by assigning the flags with CCP4, but using 7 instead of 0 as the test set), but in practice nearly every file we've seen obeys these rules, and it can of course be overridden by the user. Anyway this is all open-source, so you can check (and re-use!) the logic for yourself here: http://cctbx.svn.sourceforge.net/viewvc/cctbx/trunk/iotbx/reflection_file_utils.py?revision=16491view=markup -Nat
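P.S. The heuristic itself is only a few lines. A sketch of the logic as described above - not the actual cctbx implementation, see the linked reflection_file_utils.py for that:

from collections import Counter

def guess_test_flag_value(flags):
    counts = Counter(flags)
    values = sorted(counts)
    if len(values) > 2:
        # CCP4-style flags (e.g. 0..19): the test set is the lowest value
        return values[0]
    elif len(values) == 2:
        # CNS/SHELX-style two-value flags: the test set is the rarer value
        return min(values, key=lambda v: counts[v])
    raise ValueError("could not guess a test set from the flag values")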
Re: [ccp4bb] refmac5 vs phenix refine mixed up
On Thu, Jan 24, 2013 at 10:34 AM, Leonid Sazanov saza...@mrc-mbu.cam.ac.uk wrote: Most likely scenario is that Phenix by default assigns Rfree flag as 1, while ccp4/refmac - as 0. That would explain your Rfree going down - because your Rfree reflections were refined by refmac. According to Garib, the current version of Refmac will automatically switch to the proper flags, so this problem should go away. -Nat
Re: [ccp4bb] B-factors
On Thu, Jan 24, 2013 at 3:52 PM, Urmi Dhagat udha...@svi.edu.au wrote: If Rfree reflections are refined by refmac upon switching from phenix to refmac, then does this contaminate the Rfree set? Should switching between the refinement programs Phenix and Refmac be avoided? Repeating what was said earlier today: if you use the newest version of Refmac, either in CCP4 6.3 or downloaded from Garib's homepage, you will not have any problem switching back and forth with Phenix (any version). -Nat
Re: [ccp4bb] Mac mini advice
On Tue, Jan 22, 2013 at 9:59 AM, Cara Vaughan c.vaug...@mail.cryst.bbk.ac.uk wrote: I've seen from the archive that some people do use the Mac Mini for crystallography and I've got two questions: 1. Do I need the Quad core or is a Dual core processor enough? You can survive with the dual, but I would definitely spring for the quad if you can afford it - though I suppose it depends on how you work. I like to run multiple jobs at once if possible, and still have a core left for the web browser, etc. Some programs will also take advantage of multiple processors, for that matter. I would definitely recommend maxing out the memory, but don't buy it from Apple - we were able to get 16GB from CDW for less than $100. 2. Is the integrated Intel HD graphics card OK for crystallography requirements? It depends on your requirements, but I've been using Coot and PyMOL frequently on a MacBook Air for the last year, and usually the graphics chip (also Intel HD) isn't the bottleneck. -Nat
Re: [ccp4bb] Mac mini advice
On Tue, Jan 22, 2013 at 10:05 PM, James Stroud xtald...@gmail.com wrote: On Mac v. Linux where calculations come secondary to office-type calculations, you have to weigh your level of vendor lock-in. Do you run Libreoffice or Microsoft Office? Inkscape or Illustrator? Gimp or Photoshop? Etc. If you are locked-in to commercial products and haven't migrated to open source, then you may want to think twice about a Linux box. Macs are very seamless for an office environment, but I don't know if they are appropriate for heavy-duty calculations given that you'll trade horsepower for the Mac experience. In my experience, yes they are (depending on your definition of heavy-duty - everything I work with is either small or low-resolution). The real difficulty is integrating Macs into a Linux-centric environment, for example configuring NFS, NIS, etc. Far, far more painful than it needs to be, and for this reason I would avoid Macs for shared workstations or (even worse) servers. But they make excellent standalone systems, are very easy to maintain, and while they may be relatively pricey, some of the premium features (like SSDs) really do make a big difference, and the performance is quite adequate even for the low-end laptops like the Air. A $400 Celeron PC laptop, on the other hand, is probably large, heavy, and a piece of junk. -Nat
Re: [ccp4bb] how many metal sites
On Wed, Jan 16, 2013 at 2:53 PM, Roger Rowlett rrowl...@colgate.edu wrote: When you are a building a metalloenzyme model you should really have some solid evidence that a metal ion is present by (1) inclusion in the crystallization medium, (2) direct determination by an analytical technique, (3) UV-visible spectroscopy (when appropriate--obviously Zn(II) is d10 and silent in the visible d-d transition wavelength range) and/or (4) appropriate coordination geometry and bond lengths. What about: (5) anomalous scattering (i.e. anomalous difference map)? Even on a home source I suspect Zn should still be visible, and at shorter wavelengths this should certainly be the case if the anomalous data are reasonably good and complete. The coordination geometry and bond lengths aren't necessarily going to be definitive at this resolution, although I agree that it should be approximately tetrahedral. -Nat
Re: [ccp4bb] a challenge
On Mon, Jan 14, 2013 at 11:18 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote: I admit not having read all contributions to this thread. I understand the John Henry Challenge as whether there is an 'automated way of producing a model from impossible.mtz'. From looking at it and without having gone all the way to a PDB-file my feeling is one could without too much effort from the baton mode in e.g. coot. This should be even more possible if one also uses existing knowledge about the expected structure of the protein: a kinase domain is quite distinctive. So, James, how much external information from homologous structures are we allowed to use? Running Phaser would certainly be cheating, but if I take (for instance) a 25% identical kinase structure, manually align it to the map and/or a partial model, and use that as a guide to manually rebuild the target model, does that meet the terms of the challenge? -Nat
Re: [ccp4bb] Fwd: Re: [ccp4bb] Convert cbf to png/tiff?
I think the help message refers to another program. Anyway, it's an extremely simple script - having examined the code, the command-line invocation is: labelit.png input_file [output_file] and that's it - no other options available. But Nick or I will fix it so it prints something more useful if run without arguments. -Nat On Fri, Jan 11, 2013 at 9:33 AM, Frank von Delft frank.vonde...@sgc.ox.ac.uk wrote: I got that error blurb too when I run without an image on the commandline. Not very elegant. Try: labelit.png --help On 11/01/2013 16:34, Tim Gruene wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi Nat! How recent is recent? From today's 'phenix-online.org': New Phenix version 1.8.1 now available, but tg@slartibartfast:~/uni/datasets/nk/xds_run3$ labelit.png_1.8.1-1168 DX-CORRECTIONS.cbf Traceback (most recent call last): File /xtal/Suites/Phenix/phenix-1.8.1-1168/build/intel-linux-2.6-x86_64/../../labelit/command_line/png.py, line 27, in module OV = overlay_plain(infile,graphics_bin) File /xtal/Suites/Phenix/phenix-1.8.1-1168/build/intel-linux-2.6-x86_64/../../labelit/command_line/png.py, line 6, in __init__ OverlayDriverClass.__init__(self,infile,graphics_bin) File /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/command_line/overlay_distl.py, line 19, in __init__ self.I = GenericImageWorker(infile,binning=graphics_bin) File /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/graphics/support.py, line 12, in __init__ images = ImageFiles(imagenames,labelit_commands) File /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/command_line/imagefiles.py, line 184, in __init__ self.filenames = FileNames(arg_module,phil_params) File /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/command_line/imagefiles.py, line 70, in __init__ self.interface3_parse_command() File /xtal/Suites/Phenix/phenix-1.8.1-1168/cctbx_project/spotfinder/diffraction/imagefiles.py, line 138, in interface3_parse_command self.interface3_FN_factory(os.path.abspath(file),error_message=File name not accepted) File /xtal/Suites/Phenix/phenix-1.8.1-1168/cctbx_project/spotfinder/diffraction/imagefiles.py, line 127, in interface3_FN_factory raise Exception(Input error: +error_message) Exception: Input error: File name not accepted Compared to: tg@slartibartfast:~/uni/datasets/nk/xds_run3$ adxv -sa DX-CORRECTIONS.cbf DX-CORRECTIONS.tiff Adxv Version 1.9.8 Copyright (C) 1994-2011 by Andrew Arvai, Area Detector Systems Corporation Recognized CBF format data. Warning: Could not find diffrn_frame_data or diffrn_data_frame tg@slartibartfast:~/uni/datasets/nk/xds_run3$ identify DX-CORRECTIONS.tiff DX-CORRECTIONS.tiff TIFF 768x768 768x768+0+0 8-bit PseudoClass 256c 592KB 0.000u 0:00.000 Cheers, Tim On 01/10/2013 09:59 PM, Frank von Delft wrote: Brilliant - thanks Nat!! Easy to work around that feature. And thanks Nick!! Original Message Subject: Re: [ccp4bb] Convert cbf to png/tiff? Date: Thu, 10 Jan 2013 12:47:21 -0800 From: Nat Echols nathaniel.ech...@gmail.com To: Frank von Delft frank.vonde...@sgc.ox.ac.uk Using any recent Phenix distribution: labelit.png file_name For reasons unknown to me, the output is named plain.png - I will bug Nick about this. On Thu, Jan 10, 2013 at 12:36 PM, Frank von Delft frank.vonde...@sgc.ox.ac.uk wrote: Hello all - anybody know an easy way to convert CBF images (Pilatus) into something lossless like tiff or png? Ideally *easy* as in r e a l l y e a s y and not requiring extensive installation of dependencies and stuff. Because then I might as well write my own stuff using cbflib and PIL in python. Thanks! 
phx
Re: [ccp4bb] About NCS and inhibitors
On Mon, Jan 7, 2013 at 1:28 AM, Xiaopeng Hu huxp...@mail.sysu.edu.cn wrote: We recently resolved an enzyme/inhibitor complex structure. The enzyme contains two NCS related active site and we did find extra density in both of them.However we observed that the two inhbitor moleculors are not NCS related, but partly overlaped if make a NCS moleculor. Has anyone else observed this before? Thanks for any help and suggestion! If I'm not misunderstanding the question, some HIV protease inhibitor complexes do this too. PDB ID 2fxe is a good example. -Nat
Re: [ccp4bb] Acceptable Clash Score
On Thu, Nov 8, 2012 at 12:20 AM, Mark J van Raaij mjvanra...@cnb.csic.es wrote: Depends on what you call a solved structure. For deposition to the pdb ideally there should be very few clashes, like Nat writes. But perhaps you are referring to the clash score just after molecular replacement, like that output by Phaser or Molrep? A word of caution: if he's referring specifically to the MolProbity clash score, this won't take crystal symmetry into account, so depending on how the placed copies of the search model are arranged with respect to NCS and crystal symmetry operators, it may actually undercount the clashes after MR. (Of course the packing score from Phaser will properly account for symmetry, and I assume Molrep displays something similar.) -Nat
Re: [ccp4bb] Acceptable Clash Score
On Wed, Nov 7, 2012 at 4:02 PM, Meisam Nosrati meisam.nosr...@gmail.com wrote: I want to know what is considered an acceptable Clash Score for a solved structure. The recommendation from MolProbity is less than 10. If you have low-resolution data and don't have a high-resolution starting model, it could be a little higher, but I would really put 20 as the maximum, and I think it should still be possible to lower it without too much effort. -Nat
Re: [ccp4bb] how to find and add water molecules in electron density map in coot??
On Tue, Nov 6, 2012 at 12:06 PM, saleem raza mysaleemr...@hotmail.com wrote: I have to put water molecules in my model but it's difficult to judge whether the electron density is for water or something else. How to differentiate? What does the electron density look like for metal ions like Ca and Na? Sodium can't be distinguished from water on the basis of electron density. You can use the bond valence method (e.g. a program like WASP) to identify likely sodium ions, but this requires good data and relatively high resolution. Calcium has nearly twice as many electrons as water, and should be very obvious in the Fo-Fc map (assuming you've built in a water to start with) unless it's only present at half occupancy. Depending on the wavelength you collected at and the quality of the data, you may also be able to see a peak in the anomalous difference map. The chemical environment also tends to be more distinctive than sodium's. -Nat
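P.S. For anyone curious what the bond valence method actually does: each metal-ligand distance contributes exp((R0 - R)/b) to a sum that should come out near the formal charge of the ion. A rough sketch - the R0 below is the value I remember for Na+-O from the Brown & Altermatt tables, so check it before relying on the numbers:

import math

def bond_valence_sum(distances, r0=1.80, b=0.37):
    return sum(math.exp((r0 - d) / b) for d in distances)

# six oxygen ligands at typical Na+ distances: the sum comes out close to +1
print(bond_valence_sum([2.4, 2.4, 2.4, 2.4, 2.5, 2.6]))

A water misassigned as Na+ will usually show fewer, longer contacts and a sum well below 1.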
Re: [ccp4bb] Ca or Zn
On Tue, Oct 30, 2012 at 12:12 PM, Jim Pflugrath jim.pflugr...@rigaku.com wrote: How would you distinguish between a mixture of Ca and Zn in the same locations? How often would they be likely to bind in the same place? Some of the other transition metals are difficult to tell apart, but Ca and Zn have very different coordination preferences. -Nat
Re: [ccp4bb] Convention on residue numbering of fusion proteins?
On Tue, Oct 23, 2012 at 9:55 AM, Meindert Lamers mlam...@mrc-lmb.cam.ac.uk wrote: Is there any convention on the numbering of residues in a fusion protein? I have a structure of two domains fused together but would like to keep the biological numbering intact. 1st domain: residue 200-300 (protein A). 2nd domain: residue 170-350 (protein B). The fusion is between A300 and B170. Is it OK to label them chain A and B and create a LINK between the two (thus keeping the biological residue number intact)? Or do I have to start the 2nd domain with residue number 301 (and lose all biological information)? You could use the insertion code: the first domain could be residues 200A - 300A, the second domain would be residues 170B - 350B, e.g. ATOM 2743 CA THR A 300A -9.899 6.476 21.720 1.00 27.53 C ATOM 2750 CA VAL A 170B -6.589 4.599 21.939 1.00 32.82 C but the chain ID stays the same, with no BREAK or TER record (and no LINK required). The insertion code can be a pain to deal with from a programmer's perspective, and it makes it more difficult to specify residue ranges, but I think this is exactly what it's supposed to be used for. -Nat
Re: [ccp4bb] Etiquette on publishing if there is a crystallization report from someone else.
On Tue, Sep 25, 2012 at 6:51 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote: I would assume that someone who publishes crystallisation conditions has given up solving the structure or some other reason to encourage others to pick up the project, i.e., no, I don't see much point NOT publishing your data. I always assumed that the point of publishing crystallization conditions was to establish priority, and apparently there was once such an unspoken rule about publishing the structure. Or so I'm told; from what I've seen it's long abandoned. A bit of historical perspective (about a very high-profile project): http://www.sciencemag.org/content/285/5436/2048.full -Nat
Re: [ccp4bb] Off-topic: Best Scripting Language
On Wed, Sep 12, 2012 at 7:32 AM, Jacob Keller j-kell...@fsm.northwestern.edu wrote: since this probably comes up a lot in manipulation of pdb/reflection files and so on, I was curious what people thought would be the best language for the following: I have some huge (100s MB) tables of tab-delimited data on which I would like to do some math (averaging, sigmas, simple arithmetic, etc) as well as some sorting and rejecting. It can be done in Excel, but this is exceedingly slow even in 64-bit, so I am looking to do it through some scripting. Just as an example, a sort which takes 10 min in Excel takes ~10 sec max with the unix command sort (seems crazy, no?). Any suggestions? Anything but Fortran. Seriously, there are probably a dozen (or more) good solutions, and it depends on whose syntax you prefer, what external libraries you need, whether you want to someday apply your new programming skills to another project, and whether you want anyone else to be able to read your code. For me, Python wins easily, but the suggestions of Octave or R are probably just as good for a one-time script of the sort you describe. -Nat
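P.S. Just to show how little code the task you describe actually needs in Python, here is a sketch that streams a large tab-delimited file, rejects bad rows, and accumulates mean/sigma for one column (the column index and rejection cutoff are made-up examples):

import csv, math

n, s, ss = 0, 0.0, 0.0
with open("big_table.tsv") as f:
    for row in csv.reader(f, delimiter="\t"):
        try:
            value = float(row[2])  # third column, as an example
        except (IndexError, ValueError):
            continue  # skip malformed rows
        if value < 0:  # example rejection criterion
            continue
        n += 1
        s += value
        ss += value * value

mean = s / n
sigma = math.sqrt(ss / n - mean * mean)
print("n=%d mean=%.3f sigma=%.3f" % (n, mean, sigma))

Because it never holds the whole table in memory, it will happily chew through files far larger than Excel can open.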
Re: [ccp4bb] Off-topic: Best Scripting Language
On Wed, Sep 12, 2012 at 12:49 PM, James Stroud xtald...@gmail.com wrote: Also, python (aka python 2) and python 3000 (aka python 3) are considered two different languages. It's not reasonable to consider them one language and then complain that they are incompatible. Python 3 was created as a new language (and should be treated as such) precisely because it breaks compatibility with python 2. That was the intent of the language authors. Actually, despite having endorsed Python, I have to agree with the complaints about Python 3, for several reasons: 1) It doesn't actually introduce many fundamentally new features that would have changed how we code for it. (Like getting rid of self or the Global Interpreter Lock, or writing the interpreter in C++ and improving the API for writing extensions.) The only really huge change is Unicode support, which is probably good but doesn't really make it a different programming language. 2) The changes that really break code compatibility - like getting rid of the print statement - seem to have been done on a whim rather than because of any pressing need. Maybe this was done to try to force everyone to migrate immediately (since module developers couldn't easily maintain code that works with 2.x and 3.x), but it has had the opposite effect. 3) Development on Python 2 is being shut down. Despite all this, I would still choose Python over nearly anything else for scripting (and most other purposes, but eventually C++ will be necessary too). You blame the authors for recognizing limitations of a language and inventing a new one to overcome those limitations. If the FORTRAN authors would have done that about 30 years ago, we all might be programming in FORTRAN. I think this is what Fortran 90 was supposed to do (unsuccessfully, at least in the world of crystallography) - but F77 code is still valid F90 code, just like ANSI C is still valid C++. -Nat
Re: [ccp4bb] Calculating I/sig when sig = 0
On Thu, Aug 23, 2012 at 10:44 AM, Jim Pflugrath jim.pflugr...@rigaku.com wrote: Singly-measured reflections should have a sigma from Poisson counting statistics, so that should not be a problem. A problem might occur if the X-ray background is exactly zero and the observed (sic) intensity is exactly zero. Or the data-processing program truncates the sigma to 0.0 because it writes it out in %.1f format... but maybe this has been fixed since the last time it happened to me? (For what it's worth, Xtriage throws out reflections where sigma=0; I imagine other programs do the same thing.) -Nat
Re: [ccp4bb] Various OSes and Crystallography
On Thu, Aug 9, 2012 at 6:55 AM, Jacob Keller j-kell...@fsm.northwestern.edu wrote: one. Are there any really reasonable arguments for preferring Mac over windows (or linux) with regard to crystallography? What can Mac/Linux do that windows cannot (especially considering that there is Cygwin)? What wonderful features am I missing? Mac vs. Linux: mostly a matter of personal preference, but I agree with Graeme. Most programs run equally well on either - with Coot a partial exception, apparently due to problems with the X11 implementation (but once you get used to these, it's not a big deal). Windows, on the other hand, simply doesn't support the full range of modern crystallography software. And in my experience, it has crippling flaws that mean some programs will always work better on Mac/Linux. I wouldn't ever endorse trying to use Windows for serious scientific computing unless you need to run an application that won't work on any other OS, and as far as I know there isn't a single (macromolecular) crystallography program that falls into this category. -Nat
Re: [ccp4bb] Various OSes and Crystallography
On Thu, Aug 9, 2012 at 8:14 AM, Quentin Delettre q...@hotmail.fr wrote: I have seen that in the last Mac Os, X11 have been removed... But can still be used with some package installation. I guess it isn't distributed with the OS any more - but it is still available: http://xquartz.macosforge.org/landing/ -Nat
Re: [ccp4bb] Mac or PC?
On Thu, Aug 9, 2012 at 12:52 PM, Lee, Ting Wai twlee.scie...@gmail.com wrote: May I ask a very general question? I am going to buy a laptop. I am going to do a lot of structural biology work on it using programs such as CCP4, Phenix, Coot and Pymol. Mac or PC, which is better? See this morning's thread. Short answer: either works, just avoid Windows. I have never installed this kind of programs and done structural biology work on laptops except using Pymol. Will these programs cause any problems when they are run on laptops? I mean, will they slow down very much or even freeze the laptops? Can the programs finish the jobs at an OK speed? I mean, maybe not as fast as desktops, but not taking too long like days or weeks. It depends on how big the structures you work with are, and what you're trying to run. I have a MacBook Air and it is quite adequate for crystallography, but I've only worked with small and/or low-resolution structures where there's no danger of exceeding the 4GB memory limit. (Of course, one can buy far more powerful laptops, but the price goes up steeply.) The important thing is *not* to buy the cheapest PC laptop you can find, because the really low-end hardware probably won't work very well. -Nat
Re: [ccp4bb] MR with Phaser
On Wed, Aug 1, 2012 at 11:27 AM, Uma Ratu rosiso2...@gmail.com wrote: The protein is in tetramer form. I define this by using the residue number (1332) which is 4 x monomer. After run, Phaser only gave 9 partial solutions, and no solution with all components. The resulted PDB contains only dimer form of the protein, not the tetramer. And the first TFZ score is around 2.5, which is too low for MR. I have the report file of data processing and the summary of Phaser attached. Could you please advice which part is wrong, why can I get the tetramer form of the protein? Your data are processed as P2, which is much less common (for proteins) than P21, and it looks like you haven't told Phaser to try P21 too. There are many other reasons why MR might not work, but I think it's very likely that the space group is wrong. -Nat
Re: [ccp4bb] How to identify unknow heavy atom??
On Tue, Jul 24, 2012 at 10:14 AM, Haytham Wahba haytham_wa...@yahoo.com wrote: 1- if i have anomalous peak of unknown heavy atom, How can i identify this heavy atom in general. (different methods) 2- in my case, i see anomalous peak in heavy atom binding site (without any soaking). preliminary i did mass spec. i got Zn++ and Cu, How can i know which one give the anomalous peak in my protein. 3- there is way to know if i have Cu+ or Cu++. You may be able to identify the element based on the coordination geometry - I'm assuming (perhaps incorrectly) that it is actually different for Cu and Zn. Marjorie Harding has written extensively on the geometry of ion binding: http://tanna.bch.ed.ac.uk/ The only way to be certain crystallographically, if you have easy access to a synchrotron, is to collect data above and below the K edge of any candidate element, and compare the difference maps. (For monovalent ions it is more complicated, since they don't have accessible K edges.) On a home source, Cu should have a larger anomalous map peak, but I'm not sure if this will be enough to identify it conclusively. -Nat
Re: [ccp4bb] How to identify unknow heavy atom??
On Tue, Jul 24, 2012 at 10:33 AM, Ethan Merritt merr...@u.washington.edu wrote: As to the home source - no. Neither Cu nor Zn has appreciable anomalous signal when excited with a Cu K-alpha home source. http://www.bmsc.washington.edu/scatter An element's emission edge (Cu K-alpha in this case) is about 1 keV below the corresponding absorption edge. This makes sense, because after absorbing a photon it can only emit at an equal or lower energy, not a higher energy. So you can't reach the Cu absorption edge, where the anomalous signal is, by exciting with Cu K-alpha. Oops, sorry, I was of course comparing the wrong numbers. -Nat
Re: [ccp4bb] Structure Refinement Program
On Mon, Jul 23, 2012 at 9:50 AM, Scott Foy s...@mail.umkc.edu wrote: We are computationally averaging several homologous protein structures into a single structure. This of course will lead to a single protein structure that possesses poor biophysical characteristics of bond lengths, bond angles, steric hindrance, etc. Therefore, we will need a refinement program that is very rapid and that will restore optimal protein parameters upon input of a single PDB coordinate file. We are considering several programs such as Phenix and CNS and would appreciate any comments or opinions as to recommendations, advantages, and disadvantage for these, or other, programs. We will need to refine thousands of PDB files so speed is a significant consideration. Can you clarify whether you intended to refine against experimental data, or just clean up the model geometry? -Nat
Re: [ccp4bb] harvesting in cold room (was: cryo for high salt crystal)
On Fri, Jul 13, 2012 at 2:19 PM, Radisky, Evette S., Ph.D. radisky.eve...@mayo.edu wrote: Several have mentioned harvesting in the cold room to reduce evaporation. I used to do this also as a postdoc, but I worried whether I risked nitrogen gas poisoning from liquid N2 boil-off, since the cold room did not seem very well-ventilated. I’ve also hesitated to recommend it to trainees in my current lab for the same reason. Does anyone have solid information on this? I would like to be convinced that such fears are unfounded … Aside from safety concerns, won't this reduce the solubility? I hated harvesting high-salt conditions in the cold room for exactly this reason. -Nat
Re: [ccp4bb] Rfactors stuck very high
On Sun, Jul 8, 2012 at 2:11 PM, James Garnett j.garn...@imperial.ac.uk wrote: I have found a molecular replacement solution in I212121 using an NMR structure of the same protein and MR-ROSETTA/PHENIX (PHASER LLG=128 TFZ=12.3), although I can not refine this below R ~45% and Rfree ~50%. The maps look OK in parts but in other regions the connectivity is much reduced. In case of model bias I have used density modification and also used simulated annealing etc in case it is stuck in a local minima - these did not help. This protein is an Ig-like fold (potential for pseudo-internal symmetry) and so I have also played around with rotations of the structure but this has not helped. Although twinning analysis in all spacegroups suggest there is no twinning I have tried refinement in PHENIX and REFMAC using twin laws but this does not help. Several questions: 1) Are you certain you crystallized the protein you're interested in and not a contaminant? This is unlikely to be the culprit, but it's always good to check. (An R-free of 50% does not guarantee that the model is correct.) 2) What happens if you delete the regions of the model where the map connectivity is poor, and refine the partial model? 3) Did you try rebuilding the model completely from scratch, i.e. starting from the map without an input PDB file? I'm pretty sure MR-Rosetta will only do this if the sequence is significantly different than the template. I'd recommend trying several different programs to do this (AutoBuild, ARP/wARP, Buccaneer), as the methods involved are quite different, and you may be able to combine different fragments together afterwards. -Nat
Re: [ccp4bb] help regarding structure solution
On Wed, Jun 20, 2012 at 11:13 AM, sonali dhindwal sonali11dhind...@yahoo.co.in wrote: I am working on a protein for last so many years and for which i have got crystal now in a tray which i kept 1 years ago. It diffracts well and resolution is 2.2A, which is good. I indexed in HKL2000, mosflm and automar and it shows P21 space group in all data reduction packages. But when I tried using molrep or phaser then I do not get any solution. The sequence of my protein is having 46% identity with other available crystal structure. Also when I tried to get matthews coffecient, it calculates its molecular mass less ( about 35 kDa) than which should be (original 54kDa) with solvent content 47%. I have also run the silver staining gel of the protein which contained crystal that shows about 45 kD protein band which is 10 less than the original. Also I tried to run gel on crystal but it did not give anything as it was a small crystal. I have tried all combinations of the search model and tried to break available pdb many ways to make different search models but have not got any good solution. Molrep gives contrast even 10 or more but no good electron density map yet. Free R and figure of merit becomes 52% and 42% respectively in Refmac with all the solutions. Have you tried using an automated building program on the best solutions you have so far? Refinement programs will often get stuck quickly if the MR solution is poor, but rebuilding from scratch can sometimes do a much better job. Other things to try in cases like this are DEN refinement or MR-Rosetta - both require significant computational resources but also have a wider radius of convergence. The other question to ask yourself in situations like this is did I really crystallize the protein I'm interested in, or something else? It's surprisingly easy to crystallize minor contaminants; at last count, I've met at least four different people who've done this. (One of them actually ended up with a decent paper describing a structure he'd never intended to solve.) I suspect there are dozens if not hundreds of datasets lying abandoned because they couldn't be phased or reproduced because they weren't what the researcher thought they were. If there is any way to obtain the mass spec of the crystallized protein, this will be the most useful confirmation either way. The next thing to try is searching for similar unit cells in the PDB, although this doesn't take into account changes in space group that result in a different unit cell without actually changing the lattice. (There are probably multiple tools that can account for this; I can point to one if you're interested.) As a last resort, I would recommend a brute-force approach: https://portal.sbgrid.org/d/apps/wsmr/ -Nat
Re: [ccp4bb] Model submission
On Tue, Jun 19, 2012 at 8:35 AM, RHYS GRINTER r.grinte...@research.gla.ac.uk wrote: There's no significant difference between the high res and low res proteins in the shared region (amino acid 38+) (r.m.s.d 0.46 A), and the while there is broken density for the first 38aa from the full length data it's too poor to model into. I want to present a figure which shows the density corresponding to the first 38aa and where that fits with the rest if protein molecule. What I'm unsure of it whether I will be required by the journal to submit a model from the lower resolution data to the PDB in order to present this figure. Bearing in mind the density doesn't allow any additional residues to be modelled compared to the high res. structure. This may be true today, but there is no guarantee that it will still be the case in five years, or ten, or however long it takes for the software to improve. I'd argue that anything you illustrate in the paper should end up in the PDB anyway, but if there is any chance that someone could improve on your structure in the future and possibly learn something new as a result, it's worth depositing for that reason as well. Otherwise the data will probably be lost, and we'll never know if those extra residues could have been modeled. (Although I suspect that it would be more helpful if the images were also available, instead of having to start from the processed data.) -Nat
Re: [ccp4bb] how to get phase of huge complex
On Tue, Jun 12, 2012 at 8:53 PM, Frank von Delft frank.vonde...@sgc.ox.ac.uk wrote: Finding 111 sites should be feasible without other tricks than very careful data collection (see below); if you have two or more copies in the ASU, you may find you need to do what the ribosome guys did, namely use other derivatives (e.g TaBr clusters) to locate your seleniums, and then phase. With 40% of the complex having homologues in the PDB, you may be able to place those subunits by MR, then use the phases from the incomplete model to locate the seleniums. -Nat
Re: [ccp4bb] metal modelling in coot
On Sat, May 5, 2012 at 2:23 PM, Pavel Afonine pafon...@gmail.com wrote: may be I'm missing something but I think all you need to do is to place (add to PDB file) a Zn2+ into a blob of density that you believe that Zn belongs to, and then most of refinement tools will take care of it automatically. So I'm not seeing why you need files for a Zn atom I guess the task is as simple as I just wrote, isn't it? Not quite - as Roger noted, the charge would need to be set separately. (Actually, having Coot do this automatically for ions would be a very nice feature, and hopefully not difficult to add - or alternately, make this an option in the pointer atoms dialog.) This probably won't make a huge difference at most resolutions since the B-factor will soak up some of the discrepancy, but it may result in cleaner maps. -Nat
Re: [ccp4bb] Refmac executables - win vs linux in RHEL VM
On Sat, Apr 7, 2012 at 9:50 AM, Roger Rowlett rrowl...@colgate.edu wrote: I don't know the state of current software, because I haven't tried recently, but when I set up my student crystallography workstations a few years back I noticed many packages (e.g. EPMR, Phaser) that had potentially long run times (where it is really noticeable) would run on the identical hardware about 2-3 times faster in Linux than in Windows XP. Memory swapping wasn't the issue. I was astounded there could be that much overhead in Windows. A Linux VM on a windows machine being faster than native Win7 is pretty weird, though. Different compiler implementations will often have a huge effect on runtimes. I recently spent some time trying to get a large amount of C++ code (converted from F77) to compile under Visual C++ 9.0, and I had to disable optimization of at least ten different functions to prevent cl.exe from crashing. This was not especially complex code (and g++ never complains) - just nested 'for' loops over three dimensions. I did not attempt to compare runtimes since I was running Windows in a virtual machine on a Mac, but I would be surprised if the resulting Windows binaries were not slower on identical hardware. And even if the compiler isn't broken, the math libraries may be; one of my colleagues found (on Linux) that the exp() function provided by g77 was 20-fold slower than the equivalent in the Intel math library. So I suspect it is related to the compilers (and optimization flags) used by CCP4 for these platforms. Another good reason to avoid Windows! -Nat
Re: [ccp4bb] very informative - Trends in Data Fabrication
On Mon, Apr 2, 2012 at 11:00 AM, Maria Sola i Vilarrubias msv...@ibmb.csic.es wrote: For a wrongly fit compound, the reviewer can ask for images of the model in a map contoured at a specific sigma level and in different orientations. This will often be insufficient, I'm afraid. We generally assume good faith on the part of the authors: if the caption says the 2mFo-DFc map is shown contoured at 1.5 sigma, we assume that this is an honest statement, but we have no way of verifying it until the experimental data are available. I know of at least one case offhand where the maps could not possibly have been contoured at that level - the ligands are not misfit, they are simply not present in the crystals, and the paper is misleading (deliberately or not, I don't know). Most reviewers do not have the patience to spend weeks pursuing these issues. (Although it would certainly help if reviewers insisted that the density around ligands not be shown in isolation.) That aside, I completely understand why someone would be reluctant to share their data with potential competitors. Someone once suggested making the model and maps viewable via a web applet (AstexViewer or similar), but even that sounds like it could be prone to abuse. -Nat
Re: [ccp4bb] Crystal Structures as Snapshots
On Fri, Feb 10, 2012 at 12:29 PM, James Stroud xtald...@gmail.com wrote: How could they not be snapshots of conformations adopted in solution? Packing billions of copies of an irregularly-shaped protein into a compact lattice and freezing it to 100K isn't necessarily representative of solution, especially when your solution contains non-physiological amounts of salt and various organics (and possibly non-physiological pH too). -Nat
Re: [ccp4bb] Crystal Structures as Snapshots
Just to clarify - I actually think the original assumption that Jacob posted is generally reasonable. But it needn't necessarily follow that the conformation we see in crystal structures is always representative of the solution state; given the extreme range of conditions in which crystals grow, I would be surprised if there weren't counter-examples. I'm not familiar enough with the literature on domain swapping (e.g. diphtheria toxin) to know whether any of those structures are crystal packing artifacts. On Fri, Feb 10, 2012 at 1:04 PM, George gkontopi...@vet.uth.gr wrote: "Packing billions of copies into a compact lattice" - not so compact; there is 40-80% water. "Freezing it to 100K" - we have frozen protein solutions in liquid nitrogen many times, thawed them, and they were working fine. "Non-physiological amounts of salt and various organics" - what is the amount of salt, and the osmotic pressure, inside the cell? "Non-physiological pH too" - what counts as non-physiological pH? I am sure that some enzymes do not work at pH 7, and most proteins have been crystallized at pH close to 7, so I would not say non-physiological. George. PS: There are also lots of solution NMR structures supporting the physiological relevance of crystal structures. -Nat
Re: [ccp4bb] Soaking Kinase Crystals with ATP analogues
On Wed, Feb 1, 2012 at 11:17 AM, Dianfan Li l...@tcd.ie wrote: I am working on a kinase and would like to get an ATP analogue into the crystals. When soaked with AMP-PCP, the kinase crystals crack in about 15 min at 4 C. This isn't too surprising; most kinases undergo global conformational changes (domain closure) when binding ATP. I could try other analogues like AMP-PNP etc., but those would probably behave in the same way as AMP-PCP. Is it a good idea to try quick soaks at high concentrations of AMP-PCP? Co-crystallization is another option, but AMP-PCP is a substrate of the kinase (with a low rate). What are other ways of getting ATP analogues into a crystal? I'd recommend trying ATP-gammaS - it could also be a substrate, but it's worth a look. (Is there any reason to believe that AMP-PNP is a substrate?) I've noticed that the various analogues can result in different conformations in the crystal structure, so it may be a good idea to try more than one anyway. -Nat
Re: [ccp4bb] New Faster-than-fast Fourier transform
On Tue, Jan 24, 2012 at 1:38 AM, Adam Ralph adam.ra...@nuim.ie wrote: CUDA is a set of extensions for C which allows you to access hardware accelerators (certain NVidia cards in this case). CUDA has been around for a while, and there are CUDA libraries for FFT and BLAS. I have not used cuFFT myself, but I know that its APIs are based on those of FFTW. The capabilities and ease of use of these cards are improving with each generation. If you are in the game of speeding up your FFTs, I recommend you take a look. Unfortunately this isn't going to make refinement programs much faster either. I found that cuFFT was about 20x faster on a state-of-the-art NVidia accelerator versus a single Intel Xeon core - but the memory transfer knocks it down to 4x. OpenMP parallelization can give a similar speedup without spending $2500 extra on the GPU, and with much less butchering of the code. (And even that doesn't help much, because FFTs still take up less than half the runtime during refinement, at least in Phenix - I would be surprised if other programs were significantly different in this respect.) -Nat
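P.S. To put rough numbers on that last point: if FFTs account for, say, 40% of total refinement runtime, Amdahl's law caps the overall gain no matter how fast the FFT itself becomes. A back-of-the-envelope sketch in Python (the 40% figure and the speedup factors are illustrative assumptions, not measurements from any particular program):
# Amdahl's law: overall speedup when a fraction f of the runtime
# is accelerated by a factor s and the remainder is unchanged.
def amdahl(f, s):
    return 1.0 / ((1.0 - f) + f / s)

for s in (4, 20, float("inf")):
    print(s, round(amdahl(0.4, s), 2))
# prints ~1.43 for a 4x FFT speedup, ~1.61 for 20x, and only ~1.67
# even for an infinitely fast FFT.
Which is why accelerating the FFT alone, by whatever means, buys so little.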
Re: [ccp4bb] writing scripts-off topic
On Tue, Jan 24, 2012 at 10:24 AM, Ian Tickle ianj...@gmail.com wrote: reassuring air of finality! Maybe a Python expert will answer this, but I've often wondered: what happens if, as can easily occur when you use different editors at different times depending on where you are working (as I do - Windows when working remotely from home, Linux at work), you end up with a mixture of space and tab characters in the file? Yes, this can happen. In practice, one learns very quickly to configure the text editors to prevent this kind of mix-up, i.e. by always inserting spaces instead of tab characters. In vim, for instance, you can add "set expandtab" and "set tabstop=2" to your .vimrc. For CCTBX, the rule is two spaces of indentation (tabs strictly forbidden), and despite having multiple active contributors, this is almost never a problem. (The rule for most other Python modules appears to be four spaces, which I personally find too wide.) It becomes second nature after a while, just like adding a semicolon at the end of each statement in C/C++/Java/etc. I agree that it seems annoying and confusing at first, but if you've ever tried to edit someone else's C or Perl code where the indentation was totally inconsistent, you'll quickly learn to appreciate Python's style. So does Python automatically expand the tabs to the equivalent number of spaces, or (as in data input) are they treated as single characters? And anyway, how does Python know what tab stops my editors are set to, and how exactly my editors treat tabs? The answers are no and it doesn't - at least I don't think so. The safest thing (especially if you need to copy and paste code from any other module) is to never use literal tabs. -Nat
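P.S. For what it's worth, Python 3 is stricter about this than Python 2 was: mixing tabs and spaces in a way that makes the indentation ambiguous raises a TabError at compile time, rather than being silently interpreted with a tab worth eight columns. If you inherit a file of unknown provenance, the standard library ships a checker for exactly this (the file name is just a placeholder):
python -m tabnanny -v suspect_module.py
and within vim, :retab will convert any existing tabs according to your expandtab/tabstop settings.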