Re: [ccp4bb] delete subject
http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0053374

On Fri, Mar 29, 2013 at 4:09 AM, Eric Bennett er...@pobox.com wrote:

Scott,

I'm not sure I understand your last paragraph. Once researchers have had their data pass peer review (which I interpret as meaning a journal has accepted it), how often do you think it happens that it does not immediately get published? Just depositing data in the PDB, or posting it on a public web site, is not "meet[ing] the veracity of peer review".

There is something to be said for giving credit to the first people who have subjected their data to peer review and had the data pass that step; otherwise people will be tempted to just post data of dubious quality to stake a public claim before the quality of the data has been independently checked. In a case where this initial public non-peer-reviewed posting is of unacceptable data quality, that would dilute the credit granted to another person who later obtained good data. An unfortunate number of problematic structures still sneak through peer review. Relaxing the quality review standards that must be passed before a scientist gets to claim credit for a discovery is a step backwards IMO.

Cheers,
Eric

On Mar 28, 2013, at 5:06 PM, Scott Pegan wrote:

Hey everyone,

Both Mark and Fred make some good points. I totally agree with Nat (beat me to the send button). In an ideal world, with all the advancements in crowd sourcing and electronic media, one might think that posting data on a bulletin board would be considered marking one's turf and would protect the scientist's place in the pathway towards discoveries. Regrettably, the current reality doesn't support this.

As structural biologists, we are still in the mode where the first to publish gets the bulk of the glory and potentially future funding on the topic. For instance, when I was in graduate school, the lab I was in had KcsA crystals at the same time as a couple of competing groups. Several groups, including the one I belonged to, had initial diffraction data. One group was able to solve KcsA, the first K+ channel transmembrane protein structure, first. That group was led by Roderick MacKinnon, now a Nobel Laureate partly because of this work. Now imagine if one of MacKinnon's students had put their initial diffraction data up on the web and another group had used it to assist in the interpretation of their own data and either solved the structure before MacKinnon, or at least published it first. Even if they acknowledged MacKinnon for the assistance of his data (as they should), MacKinnon and the other scientists in his lab would likely not have received the broad acclaim that they received and justly deserved. Also, ask Rosalind Franklin how data sharing worked out for her.

Times haven't changed that much since ~10 years ago. Actually, as many have mentioned, things have potentially gotten worse. Worse in the respect that the scientific impact of a structure is increasingly tied to the biochemical/biological studies that accompany it; in other words, to the discoveries based on the insights the structure provides. Understandably, this increasing emphasis on follow-up experiments to get into high-impact journals in many cases increases the time between solving the structure and publishing it. During this gap, the group who solved the structure first is vulnerable to being scooped. Once scooped, unless the interpretation of the structure and the conclusions of the follow-up experiments are largely and justifiably divergent from the initial publications, there is usually significant difficulty getting the article published in a top-tier journal. Many might argue that they deposited it first, but I haven't seen anyone win that argument either, because follow-up articles will cite the publication describing the structure, not the PDB entry.

Naturally, many could and should argue that this isn't the way it should be. We could rapidly move science ahead in many cases if research groups were entirely transparent and made their discoveries available as soon as they could meet the veracity of peer review. However, this is not the current reality or the model we operate in. So, until this changes, one might be cautious about tipping off the competition, whether they be another structural biology group looking to publish their already solved structure, or a biology group that could use insights gathered from your structural information for a publication that might limit your own ability to publish.

Fortunately for Tom, his structure sounds like it is only important to a pretty specific scientific question that many folks might not be working on exactly.

Scott

--
Toufic El Arnaout
Trinity Biomedical Science Institute (TCD)
152-160 Pearse Street, Dublin 2
Re: [ccp4bb] High Rmerge and I/sigma values....?
Without seeing the raw data and watching the integration of frames, I would be suspicious of an incorrect space group assignment with an Rmerge > 0.10. Are image spots well-predicted by the integration program (all spots have predictions, all predictions have spots)? Are the spots well-resolved (no or few overlaps) at the camera distance used?

The lack of falloff of I/sig(I) with resolution bin also looks atypical. Typically, you would see a larger spread between the overall I/sig(I) and the I/sig(I) in the highest resolution bin (assuming you have collected data out to the resolution limit of the crystal). And with 10-fold redundancy, I would expect the overall I/sig(I) to be much higher, if your integration program factors redundancy into the I/sig(I) calculation.

I see results similar to this when doing automated processing of frames in real time during collection, when the program goes off the rails and chooses the wrong space group.

Cheers,

Roger S. Rowlett
Gordon & Dorothy Kline Professor
Department of Chemistry
Colgate University
13 Oak Drive
Hamilton, NY 13346
tel: (315)-228-7245
ofc: (315)-228-7395
fax: (315)-228-7935
email: rrowl...@colgate.edu

On 3/29/2013 9:19 AM, hamid khan wrote:

Dear CCP4BB Members,

I am interested in your expert comments/opinions about two values in a set of protein crystal diffraction data. Basically I am new to this field and do not have much idea about diffraction data interpretation and the use of crystallography software.

1) What could be the possible reasons for a high Rmerge value, say like 0.185?

2) A value of 6.2 for the average I/sigma(I) in the highest-resolution shell means that the crystal diffracts to a much higher resolution than was actually measured; what could be the possible reasons for this?

For your ease I would like to paste the table here. Values in parentheses are for the last resolution shell.

Space group                     P2221
Unit-cell parameters (Å)        a = 58.08, b = 101.32, c = 103.47
Molecules in ASU                1
Resolution range (Å)            38.63 - 2.50 (2.59 - 2.50)
Total number of reflections     228902
Number of unique reflections    21600
Completeness (%)                99.1 (98.0)
Rmerge                          0.185 (0.373)
Reduced χ2                      0.94 (1.01)
Average I/σ(I)                  9.8 (6.2)

Thanks for the tips,
Hamid Khan
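
As a rough check on the "10-fold redundancy" mentioned in the reply, the average multiplicity can be computed from the two reflection counts in the table. The sketch below is only illustrative: the function name is made up for this example, and the square-root gain is an idealized assumption (equally precise, independent observations), not something stated in the thread.

```python
from math import sqrt

# Values copied from the data-collection table quoted in the message.
total_reflections = 228902
unique_reflections = 21600


def average_multiplicity(total: int, unique: int) -> float:
    """Average number of observations per unique reflection."""
    if unique <= 0:
        raise ValueError("number of unique reflections must be positive")
    return total / unique


multiplicity = average_multiplicity(total_reflections, unique_reflections)

# Assumption (not from the thread): averaging n equally precise, independent
# observations improves the signal-to-noise ratio of the mean by sqrt(n).
idealized_snr_gain = sqrt(multiplicity)

print(f"average multiplicity: {multiplicity:.1f}")
print(f"idealized I/sigma gain from merging: {idealized_snr_gain:.1f}x")
```

Under these assumptions the multiplicity comes out a little above 10, which is consistent with the expectation that a merged overall I/sig(I) should be noticeably higher than the per-observation value.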
Re: [ccp4bb] High Rmerge and I/sigma values....?
Dear Hamid,

the statistics for I/sigI and the R-value per resolution shell would shed more light than the overall values. Judging from the Rmerge in the high resolution shell, the data may have been processed by somebody who still thinks Rmerge = 30% is a good criterion for resolution cut-off.

The high overall Rmerge might indicate that a wrong space group with too high symmetry was picked. If you have a copy of the unmerged data, run it through pointless; if you even have a copy of the frames, reprocess them in P1 and run the data through pointless!

If these data are from an article you are refereeing, please point out that Rmerge should not be published anymore and be replaced by Rmeas (alias Rrim)!

Best,
Tim Gruene

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
GPG Key ID = A46BEE1A

In reply to: hamid khan, 03/29/2013 02:19 PM
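
For readers unfamiliar with the statistics named in this message, here is a minimal sketch of how Rmerge, Rmeas (Rrim) and Rpim are commonly defined from unmerged intensities. The input format (`groups`, one list of observations per unique reflection) and the function name are hypothetical, chosen only for this illustration; real data-reduction programs differ in details such as how singly measured reflections enter the sums.

```python
from math import sqrt


def merging_r_factors(groups):
    """Return (Rmerge, Rmeas, Rpim) for symmetry-equivalent intensities.

    `groups` is a hypothetical input: one list of observed intensities per
    unique reflection. Reflections measured only once are skipped here.
    """
    num_merge = num_meas = num_pim = denom = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue  # no deviation from the mean can be computed
        mean = sum(obs) / n
        abs_dev = sum(abs(i - mean) for i in obs)
        num_merge += abs_dev                        # plain Rmerge
        num_meas += sqrt(n / (n - 1)) * abs_dev     # multiplicity-corrected
        num_pim += sqrt(1 / (n - 1)) * abs_dev      # precision of the mean
        denom += sum(obs)
    if denom == 0:
        raise ValueError("no multiply measured reflections with signal")
    return num_merge / denom, num_meas / denom, num_pim / denom


# Toy intensities, invented for illustration only.
toy = [[100.0, 104.0, 97.0, 99.0], [52.0, 47.0, 50.0], [10.0, 12.0]]
rmerge, rmeas, rpim = merging_r_factors(toy)
print(f"Rmerge={rmerge:.4f}  Rmeas={rmeas:.4f}  Rpim={rpim:.4f}")
```

The only difference between the three numerators is the per-reflection weight, which is why Rmeas is less sensitive to multiplicity than Rmerge.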
Re: [ccp4bb] delete subject
There's a second side to that: reviewers who can't get enough data and request even more when you submit a decent paper with 18 pages of supplement, for example.

Jürgen

On Mar 29, 2013, at 7:32 AM, Toufic El Arnaout wrote:

http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0053374
Re: [ccp4bb] High Rmerge and I/sigma values....?
As mentioned, there are lots of reasons for this.

a. Poor crystal
b. Poor mount of the crystal
c. Poor equipment or non-working equipment
d. Poor maintenance of good equipment
e. Improper cryoprotection
f. Vibration or movement of goniometer, goniometer head, mounting pin, mounting loop, magnet, etc.
g. Temperature fluctuation of the environment during the data collection
h. Not enough exposure time or poor signal to noise (improper experimental design)
i. Improper data processing (too many things to mention here)
j. etc.
k. et al.

Jim

In reply to: hamid khan [hamid...@yahoo.com], Friday, March 29, 2013 8:19 AM, Subject: [ccp4bb] High Rmerge and I/sigma values?
Re: [ccp4bb] High Rmerge and I/sigma values....?
I must disagree with Tim on the statement "Rmerge should not be published anymore". That would be a shame. Perhaps even a crime.

When Uli Arndt introduced Rmerge he was in no way, shape or form proposing that it be used for resolution cutoffs, or anything else about the quality of the structure. He was, however, trying to define a good statistic to evaluate a diffractometer system, and Rmerge is still VERY useful for that! Any halfway decent modern detector/shutter/beam system should be able to measure reasonably strong spots to within 5% of their true intensity. Note that this is the _overall_ Rmerge value. The Rmerge divided up in resolution bins is pretty useless for this, especially the outermost bin, where you are basically dividing by zero. The only useful Rmerge bin is actually the lowest-angle one, where the spots tend to all be strong. Remember, Rmerge is defined as the _sum_ of all the variations in spot intensity divided by the _sum_ of all the intensity. This should never be much more than 5% for strong spots. If it is, then something is wrong with either your detector, or your shutter, or perhaps your assumptions about symmetry.

Yes, I know multiplicity makes Rmerge higher, but in actual fact multiplicity makes Rmerge more honest. It is better to say that low multiplicity makes your Rmerge appear too low. Basically, if you actually do have RMS 5% error per spot, and you only measure each hkl twice, then you expect to see Rmerge=2.8%, even though the actual error is 5%. And of course, if you measure 1e6 photons in one spot you might fool yourself into thinking the error is only 0.1%. It's not. On the other hand, if all your spots are weak, then you do expect the variation to be dominated by photon-counting error, and you will get Rmerge values much greater than 5% on a perfectly good detector. It is only at high multiplicities with strong spots that Rmerge truly shows you how bad your equipment is. This is why it's always good to check Rmerge in your lowest-angle bin.

Yes, I know we probably all take our local well-maintained and finely-tuned beamline for granted, but that does not mean we should stop using the only statistic that tells us something might be wrong with the machine we used to measure our data. That is definitely worth the ~20 extra bytes it takes up in your paper.

-James Holton
MAD Scientist

In reply to: Tim Gruene t...@shelx.uni-ac.gwdg.de, Fri, Mar 29, 2013 at 6:48 AM
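
The 2.8% figure in the message above can be reproduced with a short calculation. The sketch below rests on assumptions that are not spelled out in the thread: every observation carries independent Gaussian error with the same relative RMS size, and there are enough reflections that the ratio of sums can be replaced by a ratio of expectations. The function names are invented for this example.

```python
from math import pi, sqrt


def expected_rmerge(rel_rms_error: float, multiplicity: int) -> float:
    """Expected Rmerge for Gaussian errors of equal relative size.

    Each deviation from the mean of n observations has standard deviation
    sigma * sqrt((n - 1) / n), and the mean absolute value of a Gaussian
    is sqrt(2 / pi) times its standard deviation.
    """
    if multiplicity < 2:
        raise ValueError("Rmerge needs at least two observations per hkl")
    n = multiplicity
    return rel_rms_error * sqrt(2 / pi) * sqrt((n - 1) / n)


def photon_counting_error(photons: float) -> float:
    """Relative error from Poisson counting statistics alone."""
    return 1 / sqrt(photons)


for n in (2, 4, 10, 100):
    print(f"multiplicity {n:3d}: expected Rmerge = {expected_rmerge(0.05, n):.4f}")

print(f"high-multiplicity limit: {0.05 * sqrt(2 / pi):.4f}")
print(f"counting error for 1e6 photons: {photon_counting_error(1e6):.4f}")
```

With a 5% RMS error this gives roughly 2.8% at multiplicity 2 and approaches roughly 4% as multiplicity grows, which illustrates the point that low multiplicity makes Rmerge look better than the underlying error.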
Re: [ccp4bb] High Rmerge and I/sigma values....?
Hi James,

you misquote me: I was saying that Rmeas should be replacing Rmerge, and I guess everything you say holds for Rmeas, too, but still it is a better statistical quantity than Rmerge. So please replace Rmerge with Rmeas.

Best,
Tim

In reply to: James Holton, 03/29/2013 06:08 PM
Re: [ccp4bb] High Rmerge and I/sigma values....?
Ahh. But what I'm saying is that Rmeas is not a replacement for Rmerge because Rmeas is _always_ lower than Rmerge. It is even less useful than a low-multiplicity Rmerge for evaluating the diffractometer. I fully realize that Rmeas does have the desirable property of being flatter with respect to multiplicity, but being equally too low for all multiplicities is not better than being too low for some multiplicities. IMHO.

Yes, I know, we all like R statistics that are lower. Indeed, by using the mean absolute deviation |I-<I>|, Uli was able to come up with a definition of Rmerge that will always be lower than the RMS error (for infinite multiplicity and RMS 5% error you actually get Rmerge=3.99%). No doubt, this must have contributed to the acceptance of Rmerge at the time. But we can't just keep re-defining our metric of error every ~20 years so that the same crappy data keeps looking better and better. That's a slippery slope I'd rather not be on. I think it is important to remember what it is we are trying to measure, and to be honest and consistent about what the error bars really are.

But that's just my opinion. I could be wrong.

-James Holton
MAD Scientist

In reply to: Tim Gruene t...@shelx.uni-ac.gwdg.de, Fri, Mar 29, 2013 at 10:28 AM
Re: [ccp4bb] High Rmerge and I/sigma values....?
To support James Holton, MAD Scientist From the very far past: In 60's we have estimated X-ray diffraction data by eyes, using strips of intensities made with propagating exposure factor increase of 2 (I guess in F it corresponds to sqrt(2)~=1.4). We have used to estimate our Weissenberg diffraction data 3 times (we used X-ray films). After 1 time, logbooks were locked in the deposit safe box by our advisers. The second round of estimation we recorded in a different logbook, and it was again locked in the safe deposit box. After 3 time we have punched (binary code) our data on cards, translated to ASCII, checked, corrected, and finally run averaging programs. Our best data were of about 30% Rmerge. However, as for the structure refinement (and maybe some flavours of structure determination) absolute error in a measurement of a single set of symmetry related reflections (precision) is much less important than relative error in measuring of other sets (accuracy), we were able to refine anisotropically small molecule structures to R=5%, because our data were not precise, but accurate. BTW some erroneous 'discoveries' in metalo-organic complexes with non-centrosymmetric space groups were made, based on our inability to measure anomalous signal (it is in the single set of symmetry related reflections), differences were averaged and absolute structure information was lost introducing artificial asymmetry of the coordination sphere. About 20 year later in H. D. Flack (1983). On Enantiomorph-Polarity Estimation, Acta Cryst A39: 876–881, all become clearly explained. To conclude : I am not impressed by very low Rmerge. Once ALL reflections were overexposed on XENTRONIX (which was the area detector with not very good dynamic range), as a result of running after low Rmerge without understanding procedures and instruments to measure diffraction, and Rmerge of 1% was seen, but data were useless. 
I am not depressed by diffraction data of very high (but correct) Rmerge of 13% with which a structure of important for us protein complex was swiftly solved by SAD (450 residues, 5 Se). The crystals were small; even 10 sec exposure produced relatively weak (but accurate) data. And long live a bending magnet that almost never burn down cryo-maintained crystals! All depends on circumstances My 2 cents... Dr Felix Frolow Professor of Structural Biology and Biotechnology, Department of Molecular Microbiology and Biotechnology Tel Aviv University 69978, Israel Acta Crystallographica F, co-editor e-mail: mbfro...@post.tau.ac.il Tel: ++972-3640-8723 Fax: ++972-3640-9407 Cellular: 0547 459 608 On Mar 29, 2013, at 20:52 , James Holton jmhol...@lbl.gov wrote: Ahh. But what I'm saying is that Rmeas is not a replacement for Rmerge because Rmeas is _always_ lower than Rmerge. It is even less useful that a low-multiplicity Rmerge for evaluating the diffractometer. I fully realize that Rmeas does have the desirable property of being flatter with respect to multiplicity, but being equally too low for all multiplicity is not better than being too low for some multiplicities. IMHO. Yes, I know, we all like R statistics that are lower. Indeed, by using the mean absolute deviation |I-I|, Uli was able to come up with a definition of Rmerge that will always be lower than the RMS error (for infinite multiplicity and RMS 5% error you actually get Rmerge=3.99%). No doubt, this must have contributed to the acceptance of Rmerge at the time. But we can't just keep re-defining our metric of error every ~20 years so that the same crappy data keeps looking better and better. That's a slippery slope I'd rather not be on. I think it is important to remember what it is we are trying to measure, and to be honest and consistent about what the error bars really are. But that's just my opinion. I could be wrong. 
-James Holton
MAD Scientist

On Fri, Mar 29, 2013 at 10:28 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote:

Hi James, you misquote me: I was saying that Rmeas should be replacing Rmerge, and I guess everything you say holds for Rmeas, too, but still it is a better statistical quantity than Rmerge. So please replace Rmerge with Rmeas.

Best, Tim

On 03/29/2013 06:08 PM, James Holton wrote:

I must disagree with Tim on the statement "Rmerge should not be published anymore". That would be a shame. Perhaps even a crime. When Uli Arndt introduced Rmerge he was in no way, shape or form proposing that it be used for resolution cutoffs, or anything else about the quality of the structure. He was, however, trying to define a good statistic to evaluate a diffractometer system, and Rmerge is still VERY useful for that! Any halfway decent modern detector/shutter/beam system should be able to measure reasonably strong spots to within 5% of their true
Re: [ccp4bb] High Rmerge and I/sigma values....?
Woops! Sorry. I was thinking of Rpim, which is always lower than Rmerge. Rmeas is always higher, and more correctly estimates the infinite-multiplicity Rmerge. Sorry for the confusion, and thanks for the many reminders I just got about the definition!

-James Holton
MAD Scientist

On 3/29/2013 10:28 AM, Tim Gruene wrote:

Hi James, you misquote me: I was saying that Rmeas should be replacing Rmerge, and I guess everything you say holds for Rmeas, too, but still it is a better statistical quantity than Rmerge. So please replace Rmerge with Rmeas.

Best, Tim

On 03/29/2013 06:08 PM, James Holton wrote:

I must disagree with Tim on the statement "Rmerge should not be published anymore". That would be a shame. Perhaps even a crime. When Uli Arndt introduced Rmerge he was in no way, shape or form proposing that it be used for resolution cutoffs, or anything else about the quality of the structure. He was, however, trying to define a good statistic to evaluate a diffractometer system, and Rmerge is still VERY useful for that! Any halfway decent modern detector/shutter/beam system should be able to measure reasonably strong spots to within 5% of their true intensity.

Note that this is the _overall_ Rmerge value. The Rmerge divided up in resolution bins is pretty useless for this, especially the outermost bin, where you are basically dividing by zero. The only useful Rmerge bin is actually the lowest-angle one, where the spots tend to all be strong. Remember, Rmerge is defined as the _sum_ of all the variations in spot intensity divided by the _sum_ of all the intensity. This should never be much more than 5% for strong spots. If it is, then something is wrong with either your detector, or your shutter, or perhaps your assumptions about symmetry.

Yes, I know multiplicity makes Rmerge higher, but in actual fact multiplicity makes Rmerge more honest. It is better to say that low multiplicity makes your Rmerge appear too low.
Basically, if you actually do have RMS 5% error per spot, and you only measure each hkl twice, then you expect to see Rmerge=2.8%, even though the actual error is 5%. And of course, if you measure 1e6 photons in one spot you might fool yourself into thinking the error is only 0.1%. It's not. On the other hand, if all your spots are weak, then you do expect the variation to be dominated by photon-counting error, and you will get Rmerge values much greater than 5% on a perfectly good detector. It is only at high multiplicities with strong spots that Rmerge truly shows you how bad your equipment is. This is why it's always good to check Rmerge in your lowest-angle bin.

Yes, I know we probably all take our local well-maintained and finely-tuned beamline for granted, but that does not mean we should stop using the only statistic that tells us something might be wrong with the machine we used to measure our data. That is definitely worth the ~20 extra bytes it takes up in your paper.

-James Holton
MAD Scientist

On Fri, Mar 29, 2013 at 6:48 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote:

Dear Hamid, the statistics for I/sigI and the R-value per resolution shell would shed more light than the overall values. Judging from the Rmerge in the high resolution shell, the data may have been processed by somebody who still thinks Rmerge = 30% is a good criterion for resolution cut-off. The high overall Rmerge might indicate that a wrong space group with too high symmetry was picked. If you have a copy of the unmerged data, run it through pointless; if you even have a copy of the frames, reprocess them in P1 and run the data through pointless! If these data are from an article you are refereeing, please point out that Rmerge should not be published anymore and should be replaced by Rmeas (alias Rrim)!

Best, Tim Gruene

On 03/29/2013 02:19 PM, hamid khan wrote:

Dear CCP4BB Members, I am interested in your expert comments/opinions about two values of a protein crystal diffraction data.
Basically I am new to this field and do not have much idea about diffraction data interpretation and the use of crystallography software.

1) What could be the possible reasons for a high Rmerge value, say 0.185?

2) A value of 6.2 for the average I/sigma(I) in the highest shell means that the crystal diffracts to a much higher resolution than was actually measured; what could be the possible reasons for this?

For your ease I would like to paste the table here:

Values in parentheses are for the last resolution shell

Space group                     P2221
Unit-cell parameters (Å)        a = 58.08, b = 101.32, c = 103.47
Molecules in ASU                1
Resolution range (Å)            38.63 - 2.50 (2.59 - 2.50)
Total number of reflections     228902
Number of unique reflections    21600
Completeness (%)                99.1 (98.0)
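
The figures quoted in this thread (Rmerge=2.8% for two measurements with 5% RMS error, and 3.99% at infinite multiplicity) can be reproduced with a short sketch. It assumes purely Gaussian errors of equal size on every observation, which is a simplification and not something the posts state; the function names are made up for this illustration. Under that assumption the mean absolute deviation from the sample mean is rms * sqrt(2/pi) * sqrt((n-1)/n), and Rmeas and Rpim follow by applying the usual sqrt(n/(n-1)) and sqrt(1/(n-1)) factors.

```python
import math
import random


def expected_r_factors(rms_error, multiplicity):
    """Expected (Rmerge, Rmeas, Rpim) for Gaussian errors.

    Assumption for this sketch: every observation of a reflection has the
    same fractional RMS error and the true intensity is the same for all hkl.
    """
    n = multiplicity
    # Mean absolute deviation of a Gaussian, i.e. the infinite-multiplicity limit.
    r_infinite = rms_error * math.sqrt(2.0 / math.pi)
    r_merge = r_infinite * math.sqrt((n - 1) / n)
    r_meas = r_merge * math.sqrt(n / (n - 1))
    r_pim = r_merge * math.sqrt(1.0 / (n - 1))
    return r_merge, r_meas, r_pim


def simulated_r_merge(rms_error, multiplicity, n_hkl=20000, seed=0):
    """Monte Carlo check: sum of |I - <I>| divided by sum of I."""
    rng = random.Random(seed)
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_hkl):
        obs = [rng.gauss(1.0, rms_error) for _ in range(multiplicity)]
        mean = sum(obs) / multiplicity
        numerator += sum(abs(i - mean) for i in obs)
        denominator += sum(obs)
    return numerator / denominator


if __name__ == "__main__":
    for n in (2, 4, 10, 100):
        r_merge, r_meas, r_pim = expected_r_factors(0.05, n)
        print(f"n={n:4d}  Rmerge={r_merge:.4f}  Rmeas={r_meas:.4f}  Rpim={r_pim:.4f}")
    # With n=2 the analytic Rmerge is about 0.028, i.e. the 2.8% quoted above.
    print("simulated Rmerge, n=2:", round(simulated_r_merge(0.05, 2), 4))
```

In this idealised model Rmeas stays at the infinite-multiplicity value for every n, while Rmerge only approaches it as multiplicity grows.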
Re: [ccp4bb] Twinning problem
Hello everyone,

Sorry for intervening with some basic questions regarding twinning. Continuing the discussion with Liang, I would like to ask about a question I faced. I have also solved a structure, and the statistics for the twin law as reported by Phenix xtriage are as follows:

operator         k,h,-l
type             pseudo-merohedral
Britton alpha    0.019
H alpha          0.023
m alpha          0.22

It seems the probable twin fraction in my case is 0.2. Now the question is: does it mean that for the other twin domain, i.e. twin operator h,k,l, the twin fraction will be 0.8?

On Tue, Mar 26, 2013 at 9:07 PM, vellieux frederic.velli...@ibs.fr wrote:

Hello, I would suggest using several tools (in addition to Phenix's) - CCP4's detwin, the plots generated by truncate before detwinning, the Yeates twinning server and there might be others - to get a good idea of what the twinning fraction is. Here we've had success using CCP4's detwin to detwin diffraction data. The resulting mtz file is not equivalent to an mtz file containing data recorded from an untwinned crystal - this detwinning operation is not a perfectly accurate operation... In our case we used the estimate of the twinning fraction obtained from Phenix (which was lower).

HTH, Fred.

On 26/03/13 15:45, Liang Zhang wrote:

Hi, All, I got a set of P2 (or P21) data for MR. However, Phenix-Xtriage indicated that it could be a case of pseudo-merohedral twinning. Does anyone know how to deal with this kind of twinning problem? Thanks.

Best, Liang

--
Fred. Vellieux (B.Sc., Ph.D., hdr)
ouvrier de la recherche
IBS / ELMA
41 rue Jules Horowitz
F-38027 Grenoble Cedex 01
Tel: +33 438789605
Fax: +33 438785494

--
Regards
Faisal
School of Life Sciences
JNU
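
As a rough illustration of the twin-fraction question above, here is a minimal sketch of the textbook two-domain (hemihedral) model, in which one domain contributes a fraction alpha and the other 1 - alpha of each observed intensity. It is only the simple algebraic relation; it is not claimed to be what CCP4's detwin or Phenix do internally, and the function names are invented for this example.

```python
import math


def twin(i1, i2, alpha):
    """Observed intensities of two twin-related reflections.

    i1 = I(h), i2 = I(T h) are the untwinned intensities; alpha is the
    fraction of the minor domain, so the major domain has 1 - alpha.
    """
    j1 = (1.0 - alpha) * i1 + alpha * i2
    j2 = alpha * i1 + (1.0 - alpha) * i2
    return j1, j2


def detwin(j1, j2, alpha):
    """Invert the two-domain model; undefined for a perfect twin (alpha = 0.5)."""
    d = 1.0 - 2.0 * alpha
    if abs(d) < 1e-9:
        raise ValueError("cannot detwin algebraically when alpha is 0.5")
    i1 = ((1.0 - alpha) * j1 - alpha * j2) / d
    i2 = ((1.0 - alpha) * j2 - alpha * j1) / d
    return i1, i2


def error_amplification(alpha):
    """Factor by which independent, equal errors on j1 and j2 grow in i1."""
    return math.sqrt((1.0 - alpha) ** 2 + alpha ** 2) / abs(1.0 - 2.0 * alpha)


if __name__ == "__main__":
    alpha = 0.2                      # minor domain 0.2, major domain 0.8
    j1, j2 = twin(100.0, 20.0, alpha)
    print("twinned:", j1, j2)
    print("detwinned:", detwin(j1, j2, alpha))
    for a in (0.02, 0.2, 0.4, 0.45):
        print(f"alpha={a:.2f}  error amplification={error_amplification(a):.2f}")
```

In this model the two fractions always sum to 1, and the division by 1 - 2*alpha shows why measurement errors grow as alpha approaches 0.5, which is consistent with the remark that detwinned data are not equivalent to data from an untwinned crystal.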