[ccp4bb] Web service for the calculation of maximum entropy maps
Dear All, We have prepared a fully automated web-based service for the calculation of maximum entropy (2mFo-DFc) protein density maps. The service is currently in its beta-testing stage and you are welcome to use it at: http://orion.mbg.duth.gr/graphent The only input required is an mtz file containing either the FWT-PHWT or 2FOFCWT-PH2FOFCWT columns, as produced, for example, by refmac. The output is an mtz file (containing the maxent coefficients) suitable for direct input to, say, coot. To maintain privacy and data safety, no e-mail address (or any other type of registration) is necessary, and all files (including the results) are deleted from the server upon job completion. Comments, suggestions, flames and bug reports should be directed to me and not, for example, to the innocent mailing lists that I've misused for advertising this service. Kind regards, Nicholas -- Nicholas M. Glykos, Department of Molecular Biology and Genetics, Democritus University of Thrace, University Campus, Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620, Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/
[ccp4bb] Naturally occurring alpha-Methyl-L-Proline
Dear All, We have a structure at its final stages of refinement (R~0.215, Rfree~0.235) whose 1.9A 2mFo-DFc and mFo-DFc maps are a pleasure to have and to share, with the notable exception of a 'feature' present on one of the prolines: https://dl.dropbox.com/u/11392126/alpha-Me-proline.jpg [the positive (green) contour of the difference map stands at a hefty 5 sigma above the mean]. It looks as if this is an alpha-methyl-L-proline, but my attempts to locate natural occurrences of this modification in proteins have all failed. Any ideas, suggestions or pointers? All the best, Nicholas
Re: [ccp4bb] Naturally occurring alpha-Methyl-L-Proline
N-methyl-Pro is a known protein modification. This is not an N-methyl-Pro. This is a Ca-methyl-Pro (with loads of synonyms, see http://www.chemicalbook.com/ChemicalProductProperty_EN_CB9127522.htm ).
Re: [ccp4bb] Historic books [Was: Fun Question ...]
The book (actually the proceedings of a 1984 symposium) entitled 'Patterson and Pattersons: Fifty Years of the Patterson Function' had a historical section that made excellent reading. On Wed, 6 Jun 2012, aaleshin wrote: I wonder if anyone has attempted to write a historical book on the development of crystallography. That generation of crystallographers is leaving this world and soon nobody will be able to say how the protein and non-protein structures were solved in those days. Alex
[ccp4bb] Bug in sftools 'write' command ?
Dear Developers, Recent versions of sftools appear to ignore the order of columns given in the 'write' command (and keep the order of the input mtz file in the output as well). The problem persists even with the executable distributed with v.6.2.0. Example:

__
# sftools
..
read test.mtz
selected: READ
..
TYPE LABEL
==== =====
 P   6_SOLOMON_PHIB
 W   6_SOLOMON_FOM
 F   6_SOLOMON_F
 Q   6_SOLOMON_SIGF
now sorting the reflections
now merging the reflections
   10737 reflections read from file
       0 reflections appended to existing data
   10737 reflections newly created
   10737 reflections now stored in memory
give your option (or hit return to list options)
write test2.mtz column 3 4 1 2
selected: WRITE
Writing file : test2.mtz
With format  : MTZ
Columns used : 3 4 1 2
The following columns will be written :
TYPE LABEL
==== =====
 F   6_SOLOMON_F
 Q   6_SOLOMON_SIGF
 P   6_SOLOMON_PHIB
 W   6_SOLOMON_FOM
WRITTEN OUTPUT MTZ FILE
Logical Name: test2.mtz   Filename: test2.mtz
give your option (or hit return to list options)
quit
selected: QUIT
Normal end program sftools
__

# mtzdump hklin test2.mtz
CCP4 6.1: MTZDUMP version 6.1 : 03/03/08
User: glykos    Run date: 2/6/2012    Run time: 17:47:49
..
OVERALL FILE STATISTICS for resolution range 0.002 - 0.095

Col  Sort   Min     Max     Num      %         Mean    Mean    Resolution   Type  Column
num  order                  Missing  complete          abs.    Low    High        label
 1   ASC     0       21     0        100.00      6.9     6.9   24.86  3.25   H    H
 2   NONE    0       22     0        100.00      7.7     7.7   24.86  3.25   H    K
 3   NONE    0       52     0        100.00     19.4    19.4   24.86  3.25   H    L
 4   NONE    0.0    359.9   0        100.00    178.05  178.05  24.86  3.25   P    6_SOLOMON_PHIB
 5   NONE    0.010    1.000 0        100.00      0.760   0.760 24.86  3.25   W    6_SOLOMON_FOM
 6   NONE   18.0   2006.7   0        100.00    215.72  215.72  24.86  3.25   F    6_SOLOMON_F
 7   NONE    0.9     22.3   0        100.00      6.02    6.02  24.86  3.25   Q    6_SOLOMON_SIGF

No. of reflections used in FILE STATISTICS  10737
__

Regards, Nicholas ps. Please do keep my mail address in the CCs, I'm subscribed to daily digests.
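For reference, the behaviour being requested (and, per the report above, ignored) is a plain column permutation. Here is a minimal sketch of what 'write ... column 3 4 1 2' should produce; the function name and the shortened labels are invented for illustration only:

```python
def reorder_columns(labels, rows, order):
    """Return (labels, rows) with the data columns permuted into 'order' (1-based),
    i.e. the semantics one would expect from sftools' 'write ... column 3 4 1 2'."""
    idx = [i - 1 for i in order]
    return ([labels[i] for i in idx],
            [[row[i] for i in idx] for row in rows])

labels = ["PHIB", "FOM", "F", "SIGF"]       # input-file order, as in the report
rows = [[120.0, 0.76, 215.7, 6.0]]          # one toy reflection record
new_labels, new_rows = reorder_columns(labels, rows, [3, 4, 1, 2])
# new_labels is ['F', 'SIGF', 'PHIB', 'FOM'], which is what the 'write'
# command announces but apparently does not deliver in the output file.
```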
Re: [ccp4bb] Calculating ED Maps from structure factor files with no sigma
Hi Francisco, I'll play devil's advocate: a measurement without an estimate of its error is closer to theology than to science. The fact that the standard deviations are not used for calculating an electron density map via FFT is only due to the hidden assumption that you have a 100% complete, error-free data set extending to sufficiently high (infinite) resolution. When these assumptions do not apply (as is usually the case with physical reality), the simple-minded FFT is not the correct inversion procedure (and the data do not unambiguously define a single map). Under these conditions other inversion methods are needed (such as maximum entropy), for which the standard deviations are actively used in calculating the map. My two cents, Nicholas On Tue, 22 May 2012, Francisco Hernandez-Guzman wrote: Hello everyone, My apologies if this comes across as basic, but I wanted to get the experts' take on whether or not the sigmaF values are required in the calculation of an electron density map. If I look at the standard ED equation, sigmas don't appear to be a requirement, but all the scripts that I've looked at do require sigma values. I wanted to calculate the electron density for PDB id 1HFS, but the structure factor file only lists the Fo's, Fc's and phases, but no sigmas. Would such a structure factor file be considered incomplete? Thank you for your kind explanation. Francisco
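As a toy illustration of the point above, here is a minimal pure-Python sketch (all numbers invented) of why the plain Fourier synthesis has no place for sigmas and silently treats an absent reflection as F = 0:

```python
import cmath

def synthesis(F, N=64):
    """Plain 1-D Fourier synthesis rho(x_j) = (1/N) * sum_h F_h * exp(2*pi*i*h*j/N).
    Note that only amplitudes and phases enter -- sigmas appear nowhere."""
    return [sum(Fh * cmath.exp(2j * cmath.pi * h * j / N)
                for h, Fh in F.items()).real / N
            for j in range(N)]

# Toy 'data set': a few complex structure factors (amplitude and phase only).
F_full = {0: 10.0,
          1: 4.0 * cmath.exp(1j * 0.5), -1: 4.0 * cmath.exp(-1j * 0.5),
          2: 2.0, -2: 2.0}

# A plain FFT treats an unmeasured reflection exactly as if F were zero:
F_missing = dict(F_full)
F_missing[2] = F_missing[-2] = 0.0

rho_full = synthesis(F_full)
rho_miss = synthesis(F_missing)
max_err = max(abs(a - b) for a, b in zip(rho_full, rho_miss))
# max_err is 4/64 = 0.0625: the map is measurably distorted, and no amount of
# sigma information can reach the synthesis through this inversion route.
```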
Re: [ccp4bb] Calculating ED Maps from structure factor files with no sigma
Hi Ed, I may be wrong here (and please by all means correct me), but I think it's not entirely true that experimental errors are not used in modern map calculation algorithms. At the very least, the 2mFo-DFc maps are calibrated to the model error (which can be ideologically seen as the error of the experiment, if you include model inaccuracies in that). This is an amplitude modification. It does not change the fact that the sigmas are not used in the inversion procedure [and also does not change the (non-)treatment of missing data]. A more direct and relevant example to discuss (with respect to Francisco's question) would be the calculation of a Patterson synthesis (where the phases are known and fixed). I have not done extensive (or, for that matter, any) testing, but my evidence-devoid gut feeling is that maps not using experimental errors (which in REFMAC can be done either via a gui button or by excluding SIGFP from LABIN in a script) will, for a practicing crystallographer, be essentially indistinguishable. It seems that although you are not doubting the importance of maximum likelihood for refinement, you do seem to doubt the importance of closely related probabilistic methods (such as maximum entropy methods) for map calculation. I think you can't have it both ways ... :-) The reason for this is that model errors as estimated by various maximum likelihood algorithms tend to exceed experimental errors. It may be that these estimates are inflated (a heretical thought, but when you think about it, uniform inflation of the SIGMA_wc may have only a proportional impact on the log-likelihood, or even less so when they correlate with experimental errors). Or it may be that the experimental errors are underestimated (another heretical thought).
My experience from comparing conventional (FFT-based) and maximum-entropy-related maps is that the main source of differences between the two has more to do with missing data (especially low resolution overloaded reflections) and putative outliers (for difference Patterson maps), but in certain cases (with very accurate or inaccurate data) the standard deviations do matter. All the best, Nicholas
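For concreteness, the 'amplitude modification' mentioned in this exchange can be written down in one line. This is a sketch of the standard 2mFo-DFc coefficient only; the function name, argument order and toy numbers are mine:

```python
import cmath

def two_mfo_dfc(fo, fc, phi_c, m, d):
    """Sketch of the usual 2mFo-DFc map coefficient: (2*m*Fo - D*Fc) * exp(i*phi_c).
    The figure of merit m and the scale D modify the amplitudes *before* the FFT;
    the sigmas themselves enter, at best, only indirectly through m and D."""
    return (2.0 * m * fo - d * fc) * cmath.exp(1j * phi_c)

# Toy values: Fo = 100, Fc = 90, phi_c = 0, perfect m = D = 1
coef = two_mfo_dfc(100.0, 90.0, 0.0, 1.0, 1.0)   # (2*100 - 90) = 110
```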
Re: [ccp4bb] Refmac executables - win vs linux in RHEL VM
Hi Bernhard, Maybe the paranoia-checkers in windows slow everything down, although I did not see any resources being overwhelmed... I wonder whether the windoze refmac binaries can be used through wine in a GNU/Linux environment. If yes, then you could possibly differentiate between the operating-system-dependent and compiler-specific hypotheses. Nicholas
Re: [ccp4bb] Refmac executables - win vs linux in RHEL VM
Hi Nat, one of my colleagues found (on Linux) that the exp() function provided by g77 was 20-fold slower than the equivalent in the Intel math library. I do not know whether this has changed recently, but the license for icc-produced executables used to be rather restrictive. If I remember correctly, you were not allowed to distribute the binaries, full stop. This, together with the fact that until recently (icc v.11.0.074) icc-produced executables would not run on certain AMD-based hardware, made me return to the safety of gcc. My two cents, Nicholas
Re: [ccp4bb] Refining Against Reflections?
Hi Jacob, Therefore, I was thinking an exposure-dependent parameter might be introduced into the atomic models, as an exposure-dependent occupancy of sorts. However, this would require refinement programs to use individual observations as data rather than combined reflections, effectively integrating scaling into refinement. It seems to me that this approach would only be valid if the atomic (pdb) model were a valid representation of the crystal structure irrespective of how much radiation damage it has suffered. To put it differently, the suggested approach would only be valid if the data remained strictly isomorphous for the length of the experiment (within a scale and overall B factor). But if the data are indeed isomorphous throughout the data collection procedure, then the current treatment (which -through scaling- essentially interpolates to zero radiation damage) would be equivalent to your suggested procedure. If, on the other hand, radiation damage causes non-isomorphism (but you still deposit one atomic model), you would be absorbing unknown model errors into yet another set of adjustable parameters. My two cents, Nicholas
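The 'interpolates to zero radiation damage' remark can be sketched numerically. The function below is a hypothetical illustration (not what any scaling program literally does) of recovering a zero-dose intensity from smoothly decaying observations, under the assumption that the decay is exponential and isomorphism is preserved:

```python
import math

def zero_dose_intensity(doses, intensities):
    """Least-squares fit of ln(I) = a + b*dose, returning exp(a): the intensity
    extrapolated back to zero dose. A cartoon of what scaling effectively does
    when the fall-off is smooth and the crystal stays isomorphous."""
    n = len(doses)
    y = [math.log(i) for i in intensities]
    sx, sy = sum(doses), sum(y)
    sxx = sum(x * x for x in doses)
    sxy = sum(x * yi for x, yi in zip(doses, y))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # decay rate
    a = (sy - b * sx) / n                           # intercept = ln(I at dose 0)
    return math.exp(a)

# Perfect exponential decay from a true zero-dose intensity of 1000 (toy data):
doses = [1.0, 2.0, 3.0, 4.0]
obs = [1000.0 * math.exp(-0.1 * d) for d in doses]
i0 = zero_dose_intensity(doses, obs)   # recovers ~1000
```

The point of the sketch is the assumption it makes explicit: the extrapolation is only meaningful if a single decay law (scale and B) describes all reflections, i.e. exactly the isomorphism condition discussed above.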
Re: [ccp4bb] Movements of domains
Hi Filip, Would it be a worthwhile exercise to make a histogram of the absolute values of the atomic displacements? If the distribution is bimodal (as you indicated it may be), then establishing statistical significance should be much easier (and more convincing?). My two cents, Nicholas On Mon, 21 Nov 2011, Filip Van Petegem wrote: Dear crystallographers, I have a general question concerning the comparison of different structures. Suppose you have a crystal structure containing a few domains. You also have another structure of the same, but in a different condition (with a bound ligand, a mutation, or simply a different crystallization condition, ...). After careful superpositions, you notice that one of the domains has shifted over a particular distance compared to the other domains, say 1-1.5 Angstrom. This is a shift of the entire domain. Now how can you know that this is a 'significant' change? Say the overall resolution of the structures is lower than the observed distance (2.5A for example). Now saying that a 1.5 Angstrom movement of an entire domain is not relevant at this resolution would seem wrong: we're not talking about some electron density protruding a bit more in one structure versus another, but all of the density has moved in a concerted fashion. So this would seem 'real', and not due to noise. I'm not talking about the fact that this movement was artificially caused by crystal packing or something similar. Just for whatever the reason (whether packing, pH, ligand binding, ...), you simply observe the movement. So the question is: how can you state that a particular movement was 'significantly large' compared to the resolution limit? In particular, what is the theoretical framework that allows you to state that some movement is significant? This type of question of course also applies to other methods such as cryo-EM. Is a 7A movement of an entire domain 'significant' in a 10A map? If it is, how do we quantify the significance?
If anybody has a great reference or just an individual opinion, I'd like to hear about it. Regards, Filip Van Petegem
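The suggested histogram is only a few lines of code. A minimal sketch, with the function name, input format (two equal-length lists of superposed coordinates) and bin width being arbitrary choices of mine:

```python
import math
from collections import Counter

def displacement_histogram(xyz_a, xyz_b, bin_width=0.25):
    """Histogram (bin centre -> count) of per-atom displacement magnitudes
    |r_a - r_b| between two already-superposed models. A clearly bimodal
    distribution supports a concerted rigid-body domain shift over uniform
    coordinate noise."""
    hist = Counter()
    for a, b in zip(xyz_a, xyz_b):
        d = math.dist(a, b)
        hist[round(d / bin_width) * bin_width] += 1
    return dict(hist)

# Two toy atoms: one essentially unmoved (0.1 A), one shifted by 1.5 A.
hist = displacement_histogram([(0, 0, 0), (10, 0, 0)],
                              [(0, 0, 0.1), (11.5, 0, 0)])
# hist is {0.0: 1, 1.5: 1} -- two separated populations, the germ of bimodality.
```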
Re: [ccp4bb] should the final model be refined against full datset
Dear Ethan, List, Surely someone must have done this! But I can't recall ever reading an analysis of such a refinement protocol. Does anyone know of relevant reports in the literature? Total statistical cross-validation is indeed what we should be doing, but for large structures the computational cost may be significant. In the absence of total statistical cross-validation, the reported Rfree may be an 'outlier' (with respect to the distribution of the Rfree values that would have been obtained from all disjoint sets). To tackle this, we usually resort to the following ad hoc procedure at an early stage of the positional refinement: a shell script (a) uses Phil's PDBSET with the NOISE keyword to randomly shift the atomic positions, (b) refines the resulting models to completion with each of the different free sets, (c) calculates the mean of the resulting free R values, and (d) selects (once and for all) the free set whose Rfree is closest to the mean obtained above. For structures with a small number of reflections, the statistical noise in the 5% sets can be very significant indeed. We have seen differences between Rfree values obtained from different sets reaching up to 4%. Ideally, instead of PDBSET+REFMAC we should have been using simulated annealing (without positional refinement), but moving continuously between CNS-XPLOR and CCP4 was too much for my laziness. All the best, Nicholas
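Step (d) of the ad hoc procedure above amounts to something like the following sketch (set labels and Rfree values are invented; the real work is of course in the refinements of steps (a)-(c)):

```python
def pick_representative_free_set(rfree_by_set):
    """Given the final Rfree obtained with each of the disjoint free sets,
    return the label of the set whose Rfree is closest to the mean -- i.e.
    the least 'outlier-ish' choice of free set."""
    mean = sum(rfree_by_set.values()) / len(rfree_by_set)
    return min(rfree_by_set, key=lambda s: abs(rfree_by_set[s] - mean))

# Toy Rfree values for four disjoint 5% sets (mean = 0.272):
rfree = {1: 0.285, 2: 0.240, 3: 0.262, 4: 0.301}
best = pick_representative_free_set(rfree)   # set 3, at |0.262 - 0.272| = 0.010
```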
Re: [ccp4bb] should the final model be refined against full datset
For structures with a small number of reflections, the statistical noise in the 5% sets can be very significant indeed. We have seen differences between Rfree values obtained from different sets reaching up to 4%. This is very intriguing indeed! Is there something specific in these structures that makes the Rfree differences between sets reach 4%? NCS? Or the 5% set having fewer than ~1000-1500 reflections? Tassos, by your standards these structures should have been described as 'tiny', not small ... ;-) [Yes, significantly fewer than 1000. In one case the _total_ number of reflections was 5132 (which were, nevertheless, slowly and meticulously measured, one by one, on a CAD4. Those were the days ... :-)) ].
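The size effect can be illustrated with a crude simulation. Entirely toy numbers: real per-reflection R contributions are not i.i.d. Gaussians, so treat this only as a cartoon of why the scatter of Rfree across free sets grows as the sets shrink:

```python
import random

def rfree_spread(n_free, n_sets=200, seed=1):
    """Spread (max - min) of the mean 'R contribution' across n_sets simulated
    free sets of n_free reflections each. Purely illustrative sampling noise."""
    rng = random.Random(seed)
    means = [sum(rng.gauss(0.25, 0.15) for _ in range(n_free)) / n_free
             for _ in range(n_sets)]
    return max(means) - min(means)

spread_tiny = rfree_spread(250)     # a 5% set of a 'tiny' (5000-reflection) data set
spread_ok = rfree_spread(4000)      # a 5% set of a decent-sized data set
# spread_tiny comfortably exceeds spread_ok: with a few hundred reflections per
# set, which free set you happened to pick visibly moves the reported Rfree.
```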
Re: [ccp4bb] Linux vs MacOS for crystallographic software
Hi William, It is essentially BSD unix. So it should be fine, unless they continue to lobotomize everything and make it into an iPod on a stick. :-)) +1 Thankfully, it is possible to gcc-cross-compile for MacOSX (both i686 and ppc) from a GNU/Linux machine (the procedure for getting it to work was so convoluted that I keep three separate backups of all the required directory trees, knowing that I won't be able to repeat it ;-). Having said that, the resulting executables work fine on a Snow Leopard machine (even Ygl-based graphics!), but I bet my luck won't last long ... All the best, Nicholas
Re: [ccp4bb] How to evaluate Fourier transform ripples
Hi Conan, Specifically, the reviewers question whether ripples may affect the electron density around the heavy metal centre, which has a Mo-S-As connection. From which angle, or in which way, should this problem be addressed most convincingly? The maximum entropy estimate of the map should be insensitive to series termination errors, and would thus answer the referee's concerns. You could give it a try with the program 'graphent', which can read mtz files and produce ccp4 map files. Whether the referee considers maximum entropy methods 'convincing' remains to be seen ... ;-) My two cents, Nicholas
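In case a numerical illustration of series termination helps the argument, here is a toy 1-D demonstration (all parameters invented): a sharp, heavy-atom-like peak synthesised from a truncated set of Fourier coefficients develops ripples away from the peak, which die out as the truncation limit is pushed up:

```python
import math

def truncated_synthesis(x, h_max):
    """Fourier synthesis of a sharp periodic Gaussian 'heavy atom' peak at x = 0,
    truncated at |h| <= h_max. A Gaussian of width sigma contributes cosine
    coefficients exp(-2*(pi*h*sigma)**2)."""
    sigma = 0.02
    return sum(math.exp(-2.0 * (math.pi * h * sigma) ** 2)
               * math.cos(2.0 * math.pi * h * x)
               for h in range(-h_max, h_max + 1))

# Far from the peak (x = 0.25) the true density is essentially zero, yet the
# severely truncated series is not -- those are the termination ripples:
ripple_low = abs(truncated_synthesis(0.25, 10))    # noticeable ripple
ripple_high = abs(truncated_synthesis(0.25, 60))   # ripple has all but vanished
```

A maximum entropy map avoids these artefacts by construction, since it never sets the unmeasured high-order terms to zero, which is what the truncation above implicitly does.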
Re: [ccp4bb] contour map
Hi Leila, What feels like two centuries ago, we wrote a program that reads a ccp4 map and produces a postscript plot containing both contours and a dithered grayscale representation of the map. This is useful for complex projections like the one shown in [1]. The program (mcps) is still available via our pages ([2]), and it even has documentation ([3]), but the ready-made executables are really old (the old timers will certainly smile on hearing that the tarball contains executables for irix, osf and solaris, but sorry, no VMS :-)). Having said that, the circa 2001 linux executable appears to run flawlessly under ubuntu 10.04, but as always YMMV. Nicholas [1] http://utopia.duth.gr/~glykos/mcps_graph.html [2] http://utopia.duth.gr/~glykos/other.html [3] http://utopia.duth.gr/~glykos/mcps.html
Re: [ccp4bb] generating 2D projections
Hi Jeff, I'm currently working on a project where we are attempting to compare a single crystal X-ray structure with 2D crystals we've imaged with TEM. I have nice 2D crystals and low dose cryo images. My question is whether there is a program in ccp4 that will let you input a model and then generate a 2D projection using different plane groups. We are pretty sure, based on electron diffraction, that the 2D crystals contain a screw axis, as did the X-ray structure. Also, after generating the images, is there another program that will calculate structure factors and a diffraction pattern? Since you have an X-ray structure, the easiest (?) way would be to obtain structure factors for the X-ray structure, select the zone-axis reflections for the sought projection, Fourier transform them, and then use maprot to prepare a map with the projection down the sectioning axis. You can then directly compare your projections. Nicholas
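The 'select zone-axis reflections and Fourier transform them' step can be sketched as follows. This is a toy synthesis, not a replacement for the sfall/maprot route; the reflection tuple format, the choice of the h = 0 zone (projection down the a axis, by the projection theorem) and the grid size are all my assumptions:

```python
import cmath

def projection_map(reflections, grid=8):
    """Projection of the density down the a axis: keep only the h = 0 zone
    reflections (k, l, F, phi) and Fourier-synthesise them on a (y, z) grid."""
    zone = [(k, l, f, phi) for (h, k, l, f, phi) in reflections if h == 0]
    return [[sum(f * cmath.exp(1j * (phi + 2 * cmath.pi
                                     * (k * y / grid + l * z / grid)))
                 for (k, l, f, phi) in zone).real
             for z in range(grid)]
            for y in range(grid)]

# Toy data: only F(000) lies in the h = 0 zone, so the projection is flat:
m = projection_map([(0, 0, 0, 1.0, 0.0), (1, 2, 3, 5.0, 0.3)])
```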
Re: [ccp4bb] Deposition of riding H: R-factor is overrated
Hi Ethan, mainly because (a) the calculation of likelihood is only based on a subset of the 'data' that are obtained from an X-ray diffraction experiment (for example, we ignore diffuse scattering as Ian pointed out), I do not think that is a valid criticism. In any field of science one might hypothesize that conducting a different kind of experiment and fitting it in accordance with a different theory would produce a different model. But that is only a hypothetical; it does not invalidate the analysis of the experiment you did do based on the data you did collect. For the example I mentioned (diffuse scattering), the experiment would be identical. Although using only a subset of the available information may not invalidate the analysis performed, it is still not the best that can be done with the data in hand. (b) we consciously avoid the 'prior' because this would make the models 'subjective', meaning that better informed people would deposit (for the same data) different models than the less well informed, I don't know of anyone who consciously avoids using their prior knowledge to inform their current work. But yes, people with more experience may in the end deposit better models than people with little experience. That's why it is valuable to have automated tools like Molprobity to check a proposed model against established prior expectations. It's also one way this bulletin board is valuable, because it allows those with less experience to ask advice from those with more. Most people would like to think that the models they deposit correspond to an 'objective' representation of the experimentally accessible physical reality. The validation tools, mainly by enforcing a uniformity of interpretation, discourage (rather than encourage) the incorporation into the model of prior knowledge about the problem at hand, and thus offer their users the safety of an 'objectively validated model'.
(c) the format of the PDB does not offer much room for 'creative interpretations' of the electron density maps [for example, you can't have discrete disorder on the backbone (or has this changed?)]. Could you expand on this point? I am not aware of any restriction on multiple backbone conformations, now or ever. It is true that our refinement programs have not always been very well suited to refining such a model, but that is not a fault of the PDB format. I stand corrected on that. It was probably just me :-) I sense that what is being deposited is not the 'best model' in any conceivable way, but the model that 'best' accounts for the final 2mFo-DFc map within the limitations of the program used for the final refinement. That would be true if the refinement were conducted in real space. However, it is nearly universal to do the final refinement in reciprocal space. The emphasis of what I said was clearly on model building, not on the refinement methodology. The reference to the refinement program was again model-centric (ranging from the treatment of hydrogens to the bulk solvent model used). Best regards, Nicholas
Re: [ccp4bb] Deposition of riding H: R-factor is overrated
snip it seems that we are trying to deposit one model to satisfy two different purposes - one for model validation and the other for model interpretation (use in docking etc), and what's good for one purpose might not necessarily be good for the other. /snip This has been discussed before on this list, but allow me to repeat it: You would have expected the crystallographers' aim to be to deposit the model that maximises the product (likelihood * prior). Clearly, this is not what we do, mainly because (a) the calculation of likelihood is only based on a subset of the 'data' that are obtained from an X-ray diffraction experiment (for example, we ignore diffuse scattering as Ian pointed out), (b) we consciously avoid the 'prior' because this would make the models 'subjective', meaning that better informed people would deposit (for the same data) different models than the less well informed, (c) the format of the PDB does not offer much room for 'creative interpretations' of the electron density maps [for example, you can't have discrete disorder on the backbone (or has this changed?)]. I sense that what is being deposited is not the 'best model' in any conceivable way, but the model that 'best' accounts for the final 2mFo-DFc map within the limitations of the program used for the final refinement. My two cents, Nicholas ps. May I say parenthetically that making the deposited models dependent on their intended usage would possibly qualify as 'fraud' ;-)
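For the simplest (Gaussian) case, 'maximising the product (likelihood * prior)' has a closed form that makes the point concrete. This is a textbook sketch, not crystallographic code, and the numbers are invented:

```python
def posterior_mode(x_obs, s2, mu0, t2):
    """MAP estimate for a Gaussian likelihood N(x_obs, s2) and a Gaussian prior
    N(mu0, t2): the precision-weighted average of the observation and the prior.
    The prior pulls the estimate away from the raw measurement -- which is
    exactly the 'subjectivity' being debated above."""
    w_data, w_prior = 1.0 / s2, 1.0 / t2
    return (w_data * x_obs + w_prior * mu0) / (w_data + w_prior)

# Observation at 2.0, prior centred at 0.0, equal variances:
# the MAP estimate sits halfway, at 1.0 -- neither pure 'data' nor pure 'prior'.
```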
Re: [ccp4bb] software to represent raw reflections in hkl zones
... only in the [0kl] plane. ... I'm sure you've already checked, but if during data collection the [0kl] zone was nearly perpendicular to the rotation axis, then you may only have to superimpose (with ipdisp) a few suitably selected images to obtain a small (low resolution) portion of what you are after.
Re: [ccp4bb] Why Do Phases Dominate?
snip ... and on the other side, laws like Moore's law seem completely descriptive and not at all causal or essential. /snip http://dx.doi.org/10.1088/0143-0807/16/4/005 ;-)
Re: [ccp4bb] FW: pdb-l: Retraction of 12 Structures...
Dear Robbie, List, This thread is steadily diverging. Apologies for my contribution to its diversification. snip Who knows what they did to the maps in terms of (unwarranted) density modification to make them look cleaner? The advantage of the EDS is that it is impartial and uniform. The maps are generated in a clear and well-described way. /snip I agree with you that map deposition is probably a waste of resources. I strongly disagree, though, with the existence of validation tools that have strong views about how best I should do science. For example, your sentence above implies that the validation tool is more fit than I am to decide which maps I should be looking at. Which means that if I choose to calculate (and view) not the simple FFT-derived map but its maximum entropy estimate, I am in danger of being accused of having 'done something to the maps to make them look cleaner', when in fact I'm just doing a better job with the existing data than the validation tool (which probably generates maps in a clear, well-described and wrong way :-). The take-home message of what I'm saying is this: we should not be deterred from practising our craft as best we can, even if that implies that our models contain information that a validation tool cannot reproduce. It is only fair that a well-informed and well-educated human being can do a better job than a fixed, frozen, automated procedure. Fraud is a moral issue, and cannot (and should not) be used as an excuse for converting validation tools into the sacred keepers of scientific standards. My two cents, Nicholas
Re: [ccp4bb] video that explains, very simply, what Structural Molecular Biology is about
Apologies for the free editing: snip Molecular models are the result of numbers emerging from computer programs. The results of such computations do not reflect anything in nature. There's no experimental evidence whatsoever, making modelling a very theoretical -- in my eyes uninteresting -- exercise. /snip snip Because computational models don't have direct experimental data behind them, they need an even stronger external validation. /snip I do not understand this argument about the lack of experimental data behind 'computational models': even the so-called empirical force fields are parametrised and optimised against solid experimental data and ab initio quantum mechanical calculations (which makes me wonder: are quantum mechanical calculations also devoid of any physical meaning according to the views presented above?). To make this crystal clear: I do not understand why, for example, a pure physics-based folding simulation of a small protein which results in the recording of a folding event should be treated with anything less than pure enthusiasm, for it proves the level of detailed understanding of the physical world encoded in these models. Lastly, may I add that a significant portion of the PDB (the NMR structures) is very heavily dependent on the same molecular mechanics force fields that are used for molecular modelling and simulation. My two cents, Nicholas
Re: [ccp4bb] Blast from the past
Hi James, Too many reflections to store These errors are so 1993. ;-)) I adore both 1993 and my statically declared arrays, especially those that include otherwise arbitrary arithmetic constants hidden in #defines. They make the code impossible to read, and give me the opportunity to say printf("Increase MAX_NOF_REFL and recompile (if you can).\n"); exit(1); Nicholas ps. I know it wasn't me. I never say 'Too many reflections to store' .-)
Re: [ccp4bb] taming structural biology
The CCP4 proceedings are a great source of valuable experience. For data collection and processing you would probably want to have a look at: Data Collection and Analysis, Acta Cryst. D, Volume 62, Part 1 (January 2006); Data Collection and Processing, Acta Cryst. D, Volume 55, Part 10 (October 1999) (for details start from http://www.ccp4.ac.uk/ccp4course.php ). Nicholas
Re: [ccp4bb] attachments
Hi Ian, snip I think the fundamental problem is that some people are still very much attached (pun not intended) And none taken .-) to their text-based e-mail client (Pine, Pico or whatever), and I completely agree that on this BB we have to cater for the lowest common denominator. If communicating legible equations by e-mail was a priority for them, you can be sure they would move to a MIME-compliant e-mail client. /snip It is not just attachments, and it is not just hating to be forced to herd the mouse around the desktop; it is also the sensible use of hardware resources: on my ancient Linux box 'pine' fits in 2251 KB of physical memory, whereas 'firefox' needs 49736 KB (and several seconds) just to get going. Needing 50 MB of physical memory to read 20 lines of text sounds wasteful. I agree with your point concerning equations, but I have my doubts whether we should aim to fit all possible communication media inside a good old SMTP-based mailing list (when it comes to communicating legible equations we normally use other media, like pdf files produced from pdflatex). My twopence, Nicholas
Re: [ccp4bb] Patterson Map missing a peak?
Hi Engin, Two thoughts: * Is the Patterson map consistent? (with different resolution ranges, exclusions, ...). You could possibly give 'GraphEnt' a try [assuming it still works with current-generation mtz files]. * The number of heavy atoms may be much larger and may deviate from the known NCS (and so what you interpret as a self vector may not be a self vector at all). My twopence, Nicholas

On Sun, 13 Sep 2009, Engin Ozkan wrote: Hi everybody, I have a little weekend puzzle in my hands. I have (probably) two heavy atom sites and pseudo-translation in P2(1) (i.e. an NCS 2-fold parallel to the unique crystallographic axis). Doing a little algebra, I would expect two self vectors and the pseudo-translation in the Harker section (y=0.5) of the Anomalous Difference Patterson map (plus a cross vector at y=0). Unfortunately, I am seeing only one of the self vectors and the vector corresponding to NCS+CS (pseudo-translation), but not the second self vector, which does not make sense to me. Drawing predicted Pattersons confirms my expectations but not the real-world Patterson map. I have already considered one site being weak as a possibility, but then the pseudo-translation would not be stronger than the only self vector I'm seeing. I would appreciate anyone with a fresh brain pointing out what I might be missing here. Engin

Anomalous Difference Patterson:
CELL 95.4120 132.0710 99.0230 90. 109.0950 90.
ATOM1 Ano 0.     0.     0.     53.95 0.0 BFAC 20.0
ATOM2 Ano 0.4784 0.5000 0.0647  8.63 0.0 BFAC 20.0
ATOM3 Ano 0.2043 0.5000 0.7631  7.23 0.0 BFAC 20.0
ATOM4 Ano 0.1977 0.     0.1689  3.55 0.0 BFAC 20.0
ATOM5 Ano 0.4099 0.5000 0.1794  3.01 0.0 BFAC 20.0

Native Patterson:
CELL 95.4120 132.0710 99.0230 90. 109.0950 90.
ATOM1 Ano 0.     0.     0.     49.32 0.0 BFAC 20.0
ATOM2 Ano 0.4686 0.5000 0.0465  8.99 0.0 BFAC 20.0
ATOM3 Ano 0.5000 0.5000 0.9404  4.92 0.0 BFAC 20.0
ATOM4 Ano 0.3074 0.     0.2447  3.49 0.0 BFAC 20.0
ATOM5 Ano 0.1803 0.5000 0.7710  3.44 0.0 BFAC 20.0
ATOM6 Ano 0.0479 0.     0.9072  3.35 0.0 BFAC 20.0
Re: [ccp4bb] Molecular Replacement of a small peptide
Hi Allen, With such low solvent content (and correspondingly tight packing), you may want to give 'Queen of Spades (Qs)' a try. The reason is that Qs is not (at least directly) a Patterson-based method (and does not assume that intra-molecular vectors are topologically segregated). If you expect that your model is good, do not hesitate to try it with even 2A data (but also try 3A ;-). If you have access to a dual- or quad-core machine, do compile the MPI version (see http://norma.mbg.duth.gr/index.php?id=about:benchmarks:qsmpi for results on relatively recent hardware). My two cents, Nicholas On Fri, 19 Jun 2009, Sickmier, Allen wrote: I am trying to do molecular replacement on a small peptide (less than 40 AA) and have not had any success using phaser. Are there any tricks or better programs for really small peptides? The data is great: 1.1 A, ~35% solvent, and two molecules in the ASU. I have tried all the standard stuff, changing the resolution cut-off etc. I may move to sulfur phasing at this resolution but I would like to get the MR to work. Allen
Re: [ccp4bb] Computer hardware and OS survey
"Everything generally just works from the installers, so I would say you are more likely to be able to just get on with your structure-related science on this OS than Linux." I tried to recall when was the last time that we determined a structure without having to write a single line of code, or without having to modify sources (and recompile). Guess what: it never happened (and, hopefully, never will ;-) Nicholas
Re: [ccp4bb] Computer hardware and OS survey
These Greeks ... "However, the inhabitants of this planet seem to be quite fond of this OS" That's because the inhabitants of this planet use computing machines mainly to play Doom (sorry, World of Warcraft ;-) But when it comes to using computers for actual computation, things change. The representation of the various operating-system families in the latest Top500 list is²:
___
            Count   Share (%)
Linux         439      87.8
Windoze         5       1.0
Unix           23       4.6
BSD Based       1       0.2
Mixed          31       6.2
Mac OS          1       0.2
Totals        500     100.0
___
Nicholas ² http://www.top500.org/stats/list/32/osfam
Re: [ccp4bb] Computer hardware and OS survey
Dear All, We confuse scientific computing with the individual scientists' computing needs: just because a scientist has to write a grant application using Word does not make Windows a platform suitable for scientific computing (or anything else for that matter). Using computing machines for doing science boils down to actually using computing machines to compute things, and for that you need a proper open-source, production-oriented, stable programming environment, i.e. GNU/Linux. What individual scientists prefer for satisfying their desktop needs is interesting but, at least to my mind, largely irrelevant. My twopence, Nicholas
Re: [ccp4bb] Rigid body refinement as last refinement?
"I am annoyed at the fact that Refmac doesn't seem to have much respect for the electron density map" Other people are much better qualified than me to point this out, but maximum likelihood methods have the utmost respect for your _data_ (and especially their standard deviations). What you think you see in the map is not what your data really 'say'.
Re: [ccp4bb] brute force molecular replacement
Hi Xie, "Many thanks to all who responded to my earlier query (Drs. Paul Swepston, Randy Read and Nicholas Glykos). I am trying to determine the structure of a very long coiled-coil dimer (roughly 150 residues long) by molecular replacement. I don't know if it forms a canonical coiled coil from end to end (it probably doesn't). I seem to get the right solution using different lengths of polyala (and polygly and modeled ones with side-chains) coiled coils as my search models (maps show some density for unmodeled parts of the coiled-coil structure)." If your current best phase set gives you back correct information that was not part of your model, you are doing fine. You are getting there (but it may take a while). Starting from your current best polyAla model, you can try very many runs of torsion-angle simulated annealing (starting from a high temperature), or rigid-body simulated annealing (both from within CNS or XPLOR). Don't do positional refinement. With an almost perfect polyAla model you should expect R-free and R in the low 40s (%) for ~2A data. "However, building a model accounting for the entire sequence, getting the correct chain direction and placing residue side-chains have proved difficult. And my R-factors have never improved beyond the low 50s during different refinement attempts. At the moment, I have 3 more or less contiguous 30-residue stretches of coiled coil found by 3 independent molecular replacement attempts using epmr (this was done to account for possible bends and stutters in the dimer). I am trying to get the rest of the model by new rounds of molecular replacement using epmr. I have carried out exhaustive runs using epmr, but I haven't found the rest of the model. What are the options available to me with molecular replacement? I have tried molrep, phaser and amore from the ccp4 suite. My diffraction data is quite good (data between 25 and 2 A with an Rsym of 8%)."
I'm not sure whether what you really need is more molecular replacement. Maybe you should invest more time with the maps (combined with 'safe' simulated-annealing-based optimisation methods). What I've just said only applies if these 90 residues you have can indeed produce correct information that was not part of the initial model (as you said). "Can I run Qs when a partial structure is available?" No. Sorry. Nicholas
Re: [ccp4bb] brute force molecular replacement
Dear Jacob, "Why is it called Queen of Spades?" As it happened, I was reading Alexander Pushkin's Queen of Spades while writing Qs. The book touches upon the subject of how close you can get to winning everything and still lose it all. I thought that this was relevant for a stochastic search, and so the program got baptised. Nicholas
Re: [ccp4bb] brute force molecular replacement
Dear Xie, "What are the other brute-force programs for molecular replacement out there?" Qs (available via http://www.mbg.duth.gr/~glykos/Qs.html) can be as brutal with your CPU(s) as they can take. Nicholas