Re: [ccp4bb] first use of synchrotron radiation in PX
Jean Witz (now deceased) once told me that the following paper was the first to mention data collection at a synchrotron. The journal is not really obscure and the paper should be easy to find. The work was done in Germany, if I remember correctly. G. Rosenbaum, K.C. Holmes and J. Witz, Synchrotron radiation as a source for X-ray diffraction, Nature, 230, 434-437 (1971). Philippe Dumas
[ccp4bb] Very sad news
I learnt today that Roger Fourme passed away on December 24. He was Professeur Emérite at Paris-Sud University and former Directeur Scientifique of the SOLEIL synchrotron. Along with Richard Kahn (also deceased recently), he was deeply involved in the development of the MAD technique. Until his sudden death, he remained very active in the field of high-pressure crystallography. I think I may say that, after decades of commitment to macromolecular crystallography and to Science, he was highly appreciated throughout our community. His funeral will take place at Palaiseau cemetery (near Paris) on January 2nd at 11:45. Philippe Dumas IBMC-CNRS, 15 rue René Descartes F67084 Strasbourg, France
Re: [ccp4bb] refining against weak data and Table I stats
On Friday, 7 December 2012 at 18:48 CET, Gerard Bricogne g...@globalphasing.com wrote: May I add something to Gerard's comment. In the same vein, provided one considers two sets of terms with zero mean (which corresponds to the proviso mentioned by Gerard), one can define an R-factor R as the sine of the same angle whose cosine gives the correlation coefficient C, and one then has R^2 + C^2 = 1. Thus, in practical terms, an R-factor is a sensitive criterion for highly correlated data, whereas a correlation coefficient is better suited to poorly correlated data. Most likely I have just rephrased here ideas that were written down long ago in well-known papers. Did I? Philippe Dumas Dear Zbyszek, That is a useful point. Another way of making it is to notice that the correlation coefficient between two random variables is the cosine of the angle between two vectors of paired values for these, with the proviso that the sums of the component values for each vector add up to zero. The fact that an angle is involved means that the CC is independent of scale, while the fact that it is the cosine of that angle makes it rather insensitive to small-ish angles: a cosine remains close to 1.0 for quite a range of angles. This is presumably the nature of correlation coefficients you were referring to. With best wishes, Gerard. -- On Fri, Dec 07, 2012 at 11:14:50AM -0600, Zbyszek Otwinowski wrote: The difference between one and the correlation coefficient is a square function of the differences between the datapoints. So a rather large 6% relative error with 8-fold data multiplicity (redundancy) can lead to CC1/2 values of about 99.9%. It is just the nature of correlation coefficients. Zbyszek Otwinowski Related to this, I've always wondered what CC1/2 values mean at low resolution. Not being mathematically inclined, I'm sure this is a naive question, but I'll ask anyway - what does CC1/2=100 (or 99.9) mean? Does it mean the data is as good as it gets? 
Alan On 07/12/2012 17:15, Douglas Theobald wrote: Hi Boaz, I read the KK paper as primarily a justification for including extremely weak data in refinement (and of course introducing a new single statistic that can judge data *and* model quality comparably). Using CC1/2 to gauge resolution seems like a good option, but I never got from the paper exactly how to do that. The resolution bin where CC1/2=0.5 seems natural, but in my (limited) experience that gives almost the same answer as I/sigI=2 (see also KK fig 3). On Dec 7, 2012, at 6:21 AM, Boaz Shaanan bshaa...@exchange.bgu.ac.il wrote: Hi, I'm sure Kay will have something to say about this, but I think the idea of the KK paper was to introduce new (more objective) standards for deciding on the resolution, so I don't see why another table is needed. Cheers, Boaz Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University of the Negev Beer-Sheva 84105 Israel E-mail: bshaa...@bgu.ac.il Phone: 972-8-647-2220 Skype: boaz.shaanan Fax: 972-8-647-2992 or 972-8-646-1710 From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Douglas Theobald [dtheob...@brandeis.edu] Sent: Friday, December 07, 2012 1:05 AM To: CCP4BB@JISCMAIL.AC.UK Subject: [ccp4bb] refining against weak data and Table I stats Hello all, I've followed with interest the discussions here about how we should be refining against weak data, e.g. data with I/sigI < 2 (perhaps using all bins that have a significant CC1/2, per Karplus and Diederichs 2012). This all makes statistical sense to me, but now I am wondering how I should report data and model stats in Table I. Here's what I've come up with: report two Table I's. For comparability to legacy structure stats, report a classic Table I, where I call the resolution whatever bin has I/sigI=2. Use that as my high-res bin, with high-res-bin stats reported in parentheses after global stats. Then have another table (maybe Table I* in supplementary material?) 
where I report stats for the whole dataset, including the weak data I used in refinement. In both tables report CC1/2 and Rmeas. This way, I don't redefine the (mostly) conventional usage of resolution, my Table I can be compared to precedent, I report stats for all the data and for the model against all data, and I take advantage of the information in the weak data during refinement. Thoughts? Douglas ^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^` Douglas L. Theobald Assistant Professor Department of Biochemistry Brandeis University Waltham, MA 02454-9110 dtheob...@brandeis.edu http://theobald.brandeis.edu/
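[Editorial aside: the geometric picture in this thread (CC as the cosine of the angle between two mean-centered vectors, and an R-factor-like quantity as the sine of the same angle, so that R^2 + CC^2 = 1) is easy to check numerically. A minimal sketch, not from the thread; the data and variable names are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, 1000)
y = x + rng.normal(0.0, 0.5, 1000)  # noisy copy of x

# Mean-center both vectors (the "zero mean" proviso in the thread).
xc, yc = x - x.mean(), y - y.mean()

# CC is the cosine of the angle between the centered vectors...
cc = np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))
assert np.isclose(cc, np.corrcoef(x, y)[0, 1])

# ...and the sine of that same angle behaves like an R-factor.
r = np.sqrt(1.0 - cc**2)

# Pythagoras: R^2 + CC^2 = 1 by construction.
print(cc, r, cc**2 + r**2)
```

For instance, CC = 0.999 corresponds to R = sqrt(1 - 0.999^2) ≈ 0.045, which illustrates Philippe's point: near 1.0 the cosine barely moves while the sine (the R-factor-like quantity) is still changing rapidly.]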
Re: [ccp4bb] PNAS on fraud
On Thursday, 18 October 2012 at 19:16 CEST, Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com wrote: I had a look at this PNAS paper by Fang et al. I am a bit surprised by their interpretation of their Fig. 3: they claim that there exists a highly significant correlation between impact factor and number of retractions. Personally, I would have concluded there was a complete lack of correlation... Should I retract this judgment? Philippe Dumas Dear CCP4 followers, Maybe you are already aware of this interesting study in PNAS regarding the prevalence of fraud vs. 'real' error in paper retractions: Fang FC, Steen RG and Casadevall A (2012) Misconduct accounts for the majority of retracted scientific publications. Proc Natl Acad Sci U S A 109(42): 17028-33. http://www.pnas.org/content/109/42/17028.abstract There were also a few comments on related stuff such as fake peer review in the Chronicle of Higher Education. As not all may have access to that journal, I have put the 3 relevant pdf links on my web http://www.ruppweb.org/CHE_Misconduct_PNAS_Stuft_Oct_2012.pdf http://www.ruppweb.org/CHE_DYI_reviews_Sept_30_2012.pdf http://www.ruppweb.org/CHE_The-Great-Pretender_Oct_8_2012.pdf Best regards, BR - Bernhard Rupp 001 (925) 209-7429 +43 (676) 571-0536 b...@ruppweb.org hofkristall...@gmail.com http://www.ruppweb.org/ -
Re: [ccp4bb] Series termination effect calculation.
On Monday, 17 September 2012 at 08:32 CEST, James Holton jmhol...@lbl.gov wrote: Hello. May I add a few words after the thorough comments by James. It may be easier to consider series termination in real space, as follows. The effect of series termination in 3D on rho(r) is that of convolving the exact rho(r) with the approximation of a delta function that results from the limit in resolution. In 3D, this approximation is given exactly by the function G[X] = 3*[sin(X) - X*cos(X)]/X^3, where X = 2*Pi*r/d (r in Angstrom and d the resolution, also in Angstrom). This is the same function that appears in the rotation function (for exactly the same reason: truncation of the resolution). If you consider the iron atom to be point-like (i.e. its Fourier transform would be merely constant), then the approximation resulting from series termination is just given by G[X] (apart from a scaling factor). And if you convolve the exact, ideal rho(r) with G[X], you will obtain the exact form of rho(r) affected by series termination. Note that, considering the Gaussian approximation of the structure factors, this amounts to convolving Gaussians with G[X] (see James's comments). I attach a figure corresponding to the simplification of a point-like iron atom. I only put on this figure the curves corresponding to resolution limits of 1.3, 2 and 2.5 Angstrom, because at a resolution of 1 Angstrom the iron atom is definitely not point-like. I used the same color codes as in Fig. 1 of the paper. One can see that the ripples in my approximate figure are essentially the same as in Fig. 1 of the paper. Of course, it cannot reproduce the features of rho(r) for r -> 0, since the iron atom is definitely not point-like. Practical comment: it is quite useful to remember the following rule of thumb: the first minimum of G[X] appears at a distance equal to 0.92*d (d = resolution) and the first maximum at 1.45*d. 
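[Editorial aside: this rule of thumb is easy to verify by locating the first stationary points of G on a fine grid. A quick numerical sketch, not from the original post:

```python
import numpy as np

def G(X):
    # Fourier transform of a solid resolution sphere: 3*[sin(X) - X*cos(X)]/X^3
    return 3.0 * (np.sin(X) - X * np.cos(X)) / X**3

# X = 2*pi*r/d, so r/d = X/(2*pi); scan well past the first two extrema.
X = np.linspace(1e-3, 12.0, 200_000)
g = G(X)

# The ripple amplitudes decay with X, so the global minimum on this range
# is the first minimum, and the largest value after it is the first maximum.
i_min = np.argmin(g)
i_max = i_min + np.argmax(g[i_min:])

print(X[i_min] / (2 * np.pi))  # ~0.92: first minimum at r = 0.92*d
print(X[i_max] / (2 * np.pi))  # ~1.45: first maximum at r = 1.45*d
```

Both figures agree with the stated rule of thumb to two decimal places.]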
Therefore, if one suspects that series termination effects might cause a spurious trough, or peak, it may be enough to recalculate the e.d. map at different resolutions and check whether these features move or not. Philippe Dumas PS: it is instructive to make a comparison with the Airy function in astronomy. Airy calculated this function to take into account the distortion brought by the limited optical resolution of a telescope to the point-like image of a star. Nothing other than our problem, with an iron atom replacing a star... Plus ça change, plus c'est la même chose. Yes, the constant term in the 5-Gaussian structure-factor tables does become annoying when you try to plot electron density in real space, but only if you try to make the B factor zero. If the B factors are ~12 (as they are in 1m1n), then the electron density 2.0 A from an Fe atom is not -0.2 e-/A^3, it is 0.025 e-/A^3. This is only 1% of the electron density at the center of a nitrogen atom with the same B factor. But if you do set the B factor to zero, then the electron density at the center of any atom (using the 5-Gaussian model) is infinity. 
To put it in gnuplot-ish, the structure factor of Fe (in reciprocal space) can be plotted with this function:

Fe_sf(s) = Fe_a1*exp(-Fe_b1*s*s) + Fe_a2*exp(-Fe_b2*s*s) + Fe_a3*exp(-Fe_b3*s*s) + Fe_a4*exp(-Fe_b4*s*s) + Fe_c

where:

Fe_c = 1.036900; Fe_a1 = 11.769500; Fe_a2 = 7.357300; Fe_a3 = 3.522200; Fe_a4 = 2.304500;
Fe_b1 = 4.761100; Fe_b2 = 0.307200; Fe_b3 = 15.353500; Fe_b4 = 76.880501;

and s is sin(theta)/lambda. Applying a B factor is then just multiplication by exp(-B*s*s). Since the terms are all Gaussians, the inverse Fourier transform can actually be done analytically, giving the real-space version, i.e. the expression for electron density vs distance from the nucleus (r):

Fe_ff(r,B) = \
 +Fe_a1*(4*pi/(Fe_b1+B))**1.5*safexp(-4*pi**2/(Fe_b1+B)*r*r) \
 +Fe_a2*(4*pi/(Fe_b2+B))**1.5*safexp(-4*pi**2/(Fe_b2+B)*r*r) \
 +Fe_a3*(4*pi/(Fe_b3+B))**1.5*safexp(-4*pi**2/(Fe_b3+B)*r*r) \
 +Fe_a4*(4*pi/(Fe_b4+B))**1.5*safexp(-4*pi**2/(Fe_b4+B)*r*r) \
 +Fe_c *(4*pi/(B))**1.5*safexp(-4*pi**2/(B)*r*r);

where applying a B factor requires folding it into each Gaussian term. Notice how the Fe_c term blows up as B -> 0? This is where most of the series-termination effects come from. If you want the above equations for other atoms, you can get them from here: http://bl831.als.lbl.gov/~jamesh/pickup/all_atomsf.gnuplot http://bl831.als.lbl.gov/~jamesh/pickup/all_atomff.gnuplot This infinitely-sharp-spike problem seems to have led some people to conclude that a zero B factor is non-physical, but nothing could be further from the truth! The scattering from monatomic gases is an excellent example of how one can observe the B=0 structure factor. In fact, gas scattering is how the quantum-mechanical self-consistent-field calculations of electron clouds around atoms were experimentally verified. Does this mean that there really is an infinitely sharp spike in the
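[Editorial aside: James's gnuplot expression translates directly into Python. This sketch (my transcription, using the coefficients quoted above) reproduces the figure of ~0.025 e-/A^3 at r = 2.0 A for B = 12 stated earlier in the message:

```python
import numpy as np

# 4-Gaussian + constant fit for Fe (coefficients quoted in the post above)
A = np.array([11.769500, 7.357300, 3.522200, 2.304500])
Bcoef = np.array([4.761100, 0.307200, 15.353500, 76.880501])
C = 1.036900

def Fe_ff(r, B):
    """Electron density (e-/A^3) at distance r (A) from an Fe nucleus with
    B factor B. The B factor is folded into each Gaussian; the constant term C
    behaves like a Gaussian of width B alone, which is why it blows up as B -> 0."""
    b = np.append(Bcoef, 0.0) + B      # Gaussian widths, including the C term
    a = np.append(A, C)
    return float(np.sum(a * (4 * np.pi / b) ** 1.5
                          * np.exp(-4 * np.pi**2 / b * r * r)))

print(Fe_ff(2.0, 12.0))  # ~0.025, as stated in the post
```

Note that calling Fe_ff with B = 0 divides by zero in the C term, which is exactly the series-termination singularity James describes.]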
Re: [ccp4bb] Off-topic: Best Scripting Language
On Wednesday, 12 September 2012 at 16:40 CEST, George M. Sheldrick gshe...@shelx.uni-ac.gwdg.de wrote: May I add a little personal joke to the serious remark by George. This reminds me of a discussion I had with Jorge Navaza, let's say 15 years ago, about the programming language of the future. (To a good approximation, 15 years ago, the future was now.) The answer by Jorge was: I don't know what it will be, but I know its name will be FORTRAN. I hope he will confirm the statement... Philippe Dumas I always use FORTRAN for such tasks, especially if speed is important. George On 09/12/2012 04:32 PM, Jacob Keller wrote: Dear List, since this probably comes up a lot in manipulation of pdb/reflection files and so on, I was curious what people thought would be the best language for the following: I have some huge (100s MB) tables of tab-delimited data on which I would like to do some math (averaging, sigmas, simple arithmetic, etc.) as well as some sorting and rejecting. It can be done in Excel, but this is exceedingly slow even in 64-bit, so I am looking to do it through some scripting. Just as an example, a sort which takes 10 min in Excel takes ~10 sec max with the unix command sort (seems crazy, no?). Any suggestions? Thanks, and sorry for being off-topic, Jacob -- *** Jacob Pearson Keller Northwestern University Medical Scientist Training Program email: j-kell...@northwestern.edu mailto:j-kell...@northwestern.edu *** -- Prof. George M. Sheldrick FRS Dept. Structural Chemistry, University of Goettingen, Tammannstr. 4, D37077 Goettingen, Germany Tel. +49-551-39-3021 or -3068 Fax. +49-551-39-22582
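[Editorial aside: for the kind of job Jacob describes (column means, sigmas and simple row rejection over a multi-hundred-MB tab-delimited file), a one-pass streaming approach in any scripting language sidesteps Excel's memory problems entirely. A minimal Python sketch; the column index and cutoff are invented examples, and Welford's online algorithm is used so nothing but running totals stays in memory:

```python
import csv
import math

def column_stats(path, col=2, reject_above=None):
    """One pass over a tab-delimited file: count, mean and sigma of one
    0-based column, optionally rejecting rows whose value exceeds a cutoff.
    Uses Welford's online algorithm, so memory use is constant."""
    n, mean, m2 = 0, 0.0, 0.0
    with open(path, newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            x = float(row[col])
            if reject_above is not None and x > reject_above:
                continue
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
    sigma = math.sqrt(m2 / (n - 1)) if n > 1 else 0.0
    return n, mean, sigma
```

Sorting, as Jacob noticed, is best left to the external unix sort (e.g. sort -t$'\t' -k3,3g file.tsv), which spills to disk instead of holding the whole table in memory.]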
Re: [ccp4bb] off topic: ITC or Biacore
On Thursday, 9 August 2012 at 09:55 CEST, rashmi panigrahi rashmi.panigrah...@gmail.com wrote: Hello Rashmi. There is no problem with ITC at low temperature, apart from likely slower binding and, hence, a lower heat-power signal. Do not conclude that there is no binding just because deltaH is close to 0 around 20-25 °C. You may well have DeltaH = 0 at a given temperature and yet, by the Van't Hoff equation, the affinity constant is maximum at that temperature. However, if you have no signal (DeltaH = 0) at several temperatures, then this peptide does not bind. It is very well established that DeltaCp = dDeltaH/dT is often large, which means that a 15 °C variation is sufficient to produce a significant change in deltaH. Conclusion: do ITC at low temperature and increase the concentrations of the protein and of the peptide as much as possible. Also, if the peptide is too hydrophobic, you can put it in the cell and the protein in the syringe. Finally, you may also try repeating the CD in the presence of various amounts of peptide and see whether this results in a Tm increase. Philippe Dumas Hi All, I am working on a protein that has a Tm of 30 degrees, and by CD I have observed that the secondary structure is intact at 10 degrees and slowly starts unfolding at 20 degrees. The literature suggests that it binds to a 5-residue peptide. I tried doing ITC at 25 and 20 degrees; there was no binding observed. Does anyone have experience doing ITC or Biacore (SPR) at 10 or 15 degrees? I am wondering whether temperature is a problem, as suggested by CD. Thanks for your suggestions -- rashmi
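[Editorial aside: Philippe's point that DeltaH = 0 can coincide with a maximum of the affinity constant follows directly from the van't Hoff equation. A standard textbook derivation, sketched here for clarity:

```latex
% van't Hoff equation for the association constant K:
\frac{d\,\ln K}{dT} = \frac{\Delta H}{R T^{2}}
% If \Delta H(T_0) = 0 at some temperature T_0, then d\ln K/dT = 0 there,
% so K is extremal at T_0. With \Delta C_p = d\Delta H/dT < 0 (common for
% binding), \Delta H goes from positive to negative through T_0, so \ln K
% rises then falls: K is maximal precisely where the ITC heat signal vanishes.
% A large |\Delta C_p| also means \Delta H can change substantially over
% ~15 K, which is why a null signal at one temperature proves nothing.
```
]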
Re: [ccp4bb] Dennis Ritchie
On Tuesday, 18 October 2011 at 16:36 CEST, Sabuj Pattanayek sab...@gmail.com wrote: Should I understand that Gérard Bricogne really meant that Ritchie's achievements were peanuts? Yet, after so many years in England, I thought Gerard had mastered British humour rather well... One more effort, Gérard. Philippe Dumas The silence on this list was deafening, as if the group were saying: OK, so he discovered fire and invented the wheel - but what has he done since? 1983 Turing Award 1990 IEEE Hamming Medal 1999 National Medal of Technology 2011 Japan Prize for Information and Communications (awarded while he was still alive, I think) Even if he hadn't done anything technologically innovative that was made public since C and UNIX, his C book (which was the first programming book I read in its entirety) and the foundations of his OS have helped countless millions of people. I find it sad that in many undergrad computer science curricula, C is no longer taught. If you want to understand how software works under the hood, you learn either assembly or C.