Re: [ccp4bb] very informative - Trends in Data Fabrication
On 8 Apr 2012, at 21:18, aaleshin wrote:

> What I suggested with respect to PDB data validation was adding some additional information that would allow independent validation of such parameters as the resolution and data quality (catching model fabrications would be a byproduct of this process). Does the current system allow those parameters to be overestimated? I believe so (but I might be wrong, correct me!). Periodically, people ask on ccp4bb how to determine the resolution of their data, but some idiots may decide to do it on their own and add 30% of noise to their structure factors. As James mentioned, one does not need to be extremely smart to do so; moreover, such an idiot would have fewer restraints than an educated crystallographer, because the idiot believes that nobody would notice his cheating. His moral principles are not corrupted, because he thinks that the model is correct and no harm is done. But the harm is still there, because people are forced to believe the model more than it deserves. The question is still open to me: what percentage of PDB structures overestimate data quality in terms of resolution? Is it possible to make it less dependent on the opinion of the persons submitting the data? We all have such different opinions about everything...
>
> Regards, Alex Aleshin

Using the weak high resolution data in a structure determination is not cheating. We should use data out to the point where they are no longer significant, as long as they help the structure determination and refinement, provided that we are using appropriate statistical treatment of the errors. We have become addicted to the idea that resolution is a single indicator of quality, and that is a gross over-simplification. Resolution tells us how many data were used, not their quality nor the quality of the model.

Phil
Re: [ccp4bb] very informative - Trends in Data Fabrication
Thank you, Phil, for clarifying my point, but it appears as cheating in the current situation, when an author has to fit three-dimensional statistics into a one-dimensional table. Moreover, many journal reviewers may never have worked with low-resolution data and may not appreciate that every A^3 counts. It is not clear to me how to report the resolution of data when it is 3 Å in one direction, 3.5 Å in another and 5 Å in the third.

Alex
Re: [ccp4bb] very informative - Trends in Data Fabrication
How about such a footnote to Table 1: "The resolution of the data is 3 Å in the a direction, 3.5 Å in the b direction and 5 Å in the c direction." Wouldn't this do the trick?

Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105 Israel
E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220
Skype: boaz.shaanan
Fax: 972-8-647-2992 or 972-8-646-1710
Re: [ccp4bb] very informative - Trends in Data Fabrication
I've done that in papers.

The more fundamental problem is that, in the end, what we want to know are things like "is residue 43 close to residue 146?", "which side chains interact with the ligand?", etc., and resolution is only a very rough guide to the correctness of such conclusions.

Phil
Re: [ccp4bb] very informative - Trends in Data Fabrication
On 04/09/12 12:32, Boaz Shaanan wrote:

> How about such a footnote to Table 1: "The resolution of the data is 3 Å in the a direction, 3.5 Å in the b direction and 5 Å in the c direction." Wouldn't this do the trick?

Usually there's a requirement for a table of statistics, including completeness and R in the outer shell. In the case of anisotropic data, what constitutes the outer shell? This is not a rhetorical question: I have some anisotropic data myself and will be facing these questions when it comes time to publish.

This looks like a good place to plug the UCLA MBI Diffraction Anisotropy Server, which I found to be useful: http://services.mbi.ucla.edu/anisoscale/

Cheers,

--
=== All Things Serve the Beam ===
David J. Schuller
modern man in a post-modern world
MacCHESS, Cornell University
schul...@cornell.edu
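One possible way to make the "outer shell" question concrete is to treat the three directional limits from the thread (3, 3.5 and 5 Å) as the semi-axes of an ellipsoid in reciprocal space. The sketch below is only an illustration under stated assumptions: the cell is orthogonal (all angles 90 degrees), the cell dimensions are made up, and the function names are hypothetical. It is not a description of what the UCLA server or any other program does.

```python
import math


def scaled_radius(hkl, cell, limits):
    """Distance of a reflection from the origin, in units of the local limit.

    Assumes an orthogonal cell, so the reciprocal axes are orthogonal
    with lengths 1/a, 1/b, 1/c (hypothetical helper, for illustration).
    cell   = (a, b, c) in Å
    limits = (d_a, d_b, d_c), resolution limits in Å along a*, b*, c*
    A value of 1.0 means the reflection sits exactly on the ellipsoid.
    """
    # Each scattering-vector component (index / edge, in 1/Å) is scaled
    # by the resolution limit along that axis.
    scaled = [index / edge * d for index, edge, d in zip(hkl, cell, limits)]
    return math.sqrt(sum(x * x for x in scaled))


def inside_ellipsoid(hkl, cell, limits):
    """True if the reflection lies within the anisotropic limits."""
    return scaled_radius(hkl, cell, limits) <= 1.0


def directional_limit(direction, limits):
    """Resolution limit (Å) of the ellipsoid along a reciprocal-space direction."""
    norm = math.sqrt(sum(x * x for x in direction))
    return math.sqrt(sum((x / norm * d) ** 2 for x, d in zip(direction, limits)))


cell = (60.0, 70.0, 80.0)  # hypothetical orthorhombic cell, Å
limits = (3.0, 3.5, 5.0)   # the limits quoted in the thread, Å

print(inside_ellipsoid((19, 0, 0), cell, limits))  # d = 3.16 Å along a*
print(inside_ellipsoid((0, 0, 17), cell, limits))  # d = 4.71 Å along c*
print(directional_limit((1, 1, 0), limits))        # limit between a* and b*
```

Under this assumption an "outer shell" could be defined as the reflections whose scaled radius falls in some band just below 1.0 (say 0.9 to 1.0), so that the shell follows the ellipsoid rather than a sphere. Whether reviewers would accept such a definition is exactly the open question raised above.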
Re: [ccp4bb] very informative - Trends in Data Fabrication
It is a wonderful server indeed, but its default setting cuts the resolution at 3 sigma (if I remember correctly), which is too stringent in my opinion. Also, it is not clear to me whether to submit all data out to the highest resolution point, or the data that come from the server. But then again, the question remains: at what sigma level should they be cut?

Alex
Re: [ccp4bb] very informative - Trends in Data Fabrication
On Apr 9, 2012, at 11:47 AM, aaleshin wrote:

> Thank you, Phil, for clarifying my point, but it appears as cheating in the current situation, when an author has to fit three-dimensional statistics into a one-dimensional table. Moreover, many journal reviewers may never have worked with low-resolution data and may not appreciate that every A^3 counts. It is not clear to me how to report the resolution of data when it is 3 Å in one direction, 3.5 Å in another and 5 Å in the third.
>
> Alex

In the very low resolution world of SAXS, the whole idea of resolution is problematic. One can quote the minimum d-spacing (maximum angle) measured, but it is not a useful number to report. People are much more concerned about the quality of the data at maximum d-spacing (lowest angle). Perhaps very low-resolution crystallography is starting to enter this regime as well, in which resolution concerns are turned upside down.

Granted, SAXS is a heavily averaged experiment which can densely sample q space, but which does not even attempt to produce density. But the point that I think is appreciated in the SAXS community is that the connection between the extent of data in reciprocal space and model features is not simple.

Richard Gillilan
MacCHESS
Re: [ccp4bb] very informative - Trends in Data Fabrication
Hi Alex,

> It is not clear to me how to report the resolution of data when it is 3 Å in one direction, 3.5 Å in another and 5 Å in the third.

It can't be easier, I guess: just switch from characterizing data sets with one single number (which is suboptimal, at least, as Phil pointed out earlier) and show statistics by resolution instead. For example, R-factors, data completeness and Fobs shown in resolution bins are obviously much more informative metrics than a single number.

If you want to be even more sophisticated, you can. See for example:

"A program to analyze the distributions of unmeasured reflections", J. Appl. Cryst. (2011). 44, 865-872, L. Urzhumtseva and A. Urzhumtsev

Pavel
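As a rough illustration of "statistics by resolution", the sketch below tabulates completeness per resolution shell instead of as one overall number. It assumes you already have two plain lists of d-spacings in Å (all unique reflections that could have been measured, and those actually measured); the function name and the choice of equal-volume shells are assumptions made for this example, not the output format of any particular program.

```python
def completeness_by_shell(possible_d, observed_d, n_shells=4):
    """Return one (d_max, d_min, n_observed, n_possible, completeness) row per shell.

    Shells have equal reciprocal-space volume, i.e. equal steps in 1/d^3.
    Observed reflections outside the range of possible_d are counted in
    the nearest (first or last) shell.
    """
    inv_cubed = [d ** -3 for d in possible_d]
    lo, hi = min(inv_cubed), max(inv_cubed)
    step = (hi - lo) / n_shells

    def shell_index(d):
        # Map a d-spacing to its shell, clamping to the valid range.
        i = int((d ** -3 - lo) / step) if step > 0 else 0
        return min(max(i, 0), n_shells - 1)

    n_possible = [0] * n_shells
    n_observed = [0] * n_shells
    for d in possible_d:
        n_possible[shell_index(d)] += 1
    for d in observed_d:
        n_observed[shell_index(d)] += 1

    rows = []
    for i in range(n_shells):
        d_max = (lo + i * step) ** (-1.0 / 3.0)        # low-resolution edge
        d_min = (lo + (i + 1) * step) ** (-1.0 / 3.0)  # high-resolution edge
        if n_possible[i]:
            fraction = n_observed[i] / n_possible[i]
        else:
            fraction = float("nan")
        rows.append((d_max, d_min, n_observed[i], n_possible[i], fraction))
    return rows


# Toy input: four possible reflections, only the two lowest-resolution ones measured.
for row in completeness_by_shell([10.0, 5.0, 4.0, 3.0], [10.0, 5.0], n_shells=2):
    print("%6.2f - %5.2f Å   %d / %d   %.2f" % row)
```

With real data the same table would typically carry further per-shell columns (R-factors, mean I/sigma), which is where a sharp drop in the outer shells becomes visible even when the overall numbers look fine.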
Re: [ccp4bb] very informative - Trends in Data Fabrication
Hi Pavel,

Reporting the table that you suggested would create more red flags for the reviewers and readers than explaining how to understand the resolution of my data. We need more studies of this issue (the correlation between the resolution of anisotropic data and model quality). And there should be a common rule for how to report and interpret such data (IMHO).

Regards, Alex
Re: [ccp4bb] very informative - Trends in Data Fabrication
Alex,

I think you are mixing two things here: presenting statistics that characterize the data, and its interpretation. Looking at data completeness as a single number tells you something but not a lot, while looking at these metrics per resolution bin reveals a whole lot more information (for example, the distribution of missing data in reciprocal space may tell you why your maps look funny).

Pavel
Re: [ccp4bb] very informative - Trends in Data Fabrication
Or as a tensor; see the classic:

Z. Shakked, "Anisotropic scaling of 3-dimensional intensity data", Acta Crystallographica Section A, Volume 39, Issue MAY, Pages 278-279 (1983), DOI: 10.1107/S0108767383000665

I guess this or something similar is implemented in shelxl. Look also in: J. F. Nye, Physical Properties of Crystals: Their Representation by Tensors and Matrices.

Dr Felix Frolow
Professor of Structural Biology and Biotechnology
Department of Molecular Microbiology and Biotechnology
Tel Aviv University 69978, Israel
Acta Crystallographica F, co-editor
e-mail: mbfro...@post.tau.ac.il
Tel: ++972-3640-8723
Fax: ++972-3640-9407
Cellular: 0547 459 608
Re: [ccp4bb] very informative - Trends in Data Fabrication
On 4/2/2012 6:03 AM, herman.schreu...@sanofi.com wrote:

> If James Holton had been involved, the fabrication would not have been discovered.
>
> Herman

Uhh. Thanks. I think?

Apologies for remaining uncharacteristically quiet. I have been keeping up with the discussion, but was not sure how much difference one more vote would make on the various issues, especially since most of this has come up before. I agree that fraud is sick and wrong. I think backing up your data is a good idea, etc. etc. However, I seem to have been declared a leading expert on fake data, so I suppose I ought to say something about that. Not quite sure I want to volunteer to be the Defense Against The Dark Arts Teacher (they always seem to end badly). But, here goes:

I think the core of the fraud problem lies in our need for models, and I mean models in the general scientific sense, not just PDB files. Fundamental to the practice of science is coming up with a model that explains the observations you made, preferably to within experimental error. One is also generally expected to estimate what the experimental error was. That is, if you plot a bunch of points on a graph, you need to fit some sort of curve to them, and that curve had better fit to within the error bars, or you have some explaining to do. Protein structures are really nothing more than a ~50,000 parameter curve fit to ~50,000 data points.

So, given that the technology for constructing models is widely available (be it gnuplot or refmac), as is the technology for estimating errors and generating random numbers, all the hard work a would-be fraud needs to make a plausible forgery has already been done. This is not something unique to crystallography! It is a general property of any mature science. Indeed, fake data is not only a common tool in science but an inextricable part of it.
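The "curve had better fit to within the error bars" criterion, and the use of simulated data as a control, can both be shown on a toy problem. The sketch below is an assumption-laden stand-in: a two-parameter straight line replaces the ~50,000 parameter model, the noise is Gaussian with a known sigma, and all function names are made up for this example. It generates fake data from a known line, fits it by weighted least squares, and reports the reduced chi-square, which should come out near 1 when the model and the error estimates are both right.

```python
import random


def simulate(n, slope, intercept, noise, seed=0):
    """Fake data as a positive control: a known line plus Gaussian noise."""
    rng = random.Random(seed)
    x = [float(i) for i in range(n)]
    y = [slope * xi + intercept + rng.gauss(0.0, noise) for xi in x]
    return x, y


def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = slope * x + intercept."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    slope = (sw * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    return slope, intercept


def reduced_chi_square(x, y, sigma, slope, intercept, n_params=2):
    """Sum of squared, sigma-normalised residuals per degree of freedom."""
    chi2 = sum(((yi - (slope * xi + intercept)) / si) ** 2
               for xi, yi, si in zip(x, y, sigma))
    return chi2 / (len(x) - n_params)


x, y = simulate(200, slope=2.0, intercept=1.0, noise=1.0)
sigma = [1.0] * len(x)
slope, intercept = fit_line(x, y, sigma)
print(slope, intercept, reduced_chi_square(x, y, sigma, slope, intercept))
```

A reduced chi-square far above 1 means the curve misses the points by more than the error bars allow; a value far below 1 means the fit is better than the stated errors can justify. The second case is worth noticing in this thread's context, since data that agree with a model too well are themselves a reason to ask questions.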
Simulated diffraction images appear in the literature at least as early as Arndt and Wonacott (1976), and I'm sure even Moseley and Darwin (1913) made some fake data when trying to figure out all the sources of systematic error they were dealing with in measuring reflected x-ray beams. At its heart, fake data is a control. Remember controls from science class? They come in two flavors: positive and negative, and you are supposed to have both. In fact, all a fraud really is is someone who in some way, shape or form takes a positive control and calls it their experiment. Pasting gel lanes together is an example of this. I think this is why fraud is so hard to prevent in science. You can't do science without controls, but anyone who has access to the technology for doing a control can also use it for evil. The labels are everything.

Personally, I classify fraud as an intentionally incorrect result. This separates it from unintentionally incorrect results (mistakes), which are far more common. Validation is meant to catch the "incorrect" part, but can never be expected to establish intent! In fact, I expect a mildly clever fraud might actually plan to hide behind the "we made a mistake in the deposition/figure/paper but now can't find the original data" defense.

The case at hand (Zaborsky et al. 2010) may be a very good example of this. A new validation procedure (Rupp 2012) drew attention to the fabricated 3k78 structure as well as to real structures where Fcalc was accidentally deposited instead of Fobs (there are a number of these). Rupp's follow-up on 3k78 found troubling irregularities, but could it still be a mistake? If there is a combination of buttons in some GUI somewhere that lets you do this, then I imagine at least one idiot may have discovered it, and perhaps even been pleased with themselves for finding a new way to get their R factor down. The best evidence that Fobs simply does not exist for 3k78 was in the response (Zaborsky et al. 2012).
The same validation procedure also drew attention to other cases. Two of them, 1n0r and 1n0q (Mosavi et al. 2002), were from my beamline (ALS 8.3.1), so finding the original images was simply a matter of flipping through the books of old DVDs I have in my office. They cost us $0.25 each in 2002. Yes, I do back up every image, primarily because figuring out which ones were worth backing up was actually a more expensive proposition. Even in adjusted dollars, I think the cost of the whole archive is still cheaper than what it would have cost Dan to re-grow his crystals and collect the data again in 2012.

It is also nice to be able to say that the data for 1n0r were collected on Jan 30 2002 from 9:47 pm to 11:48 pm and 1n0q was collected on Mar 15 2002 from 12:52 pm until 3:48 pm. I was there! I saw the whole thing! Yes, I know, since I am the guy who can fake images I am not the best witness (the Defense Against the Dark Arts Teacher never is), but for whatever it is worth I DO recommend keeping your old images around. You never know when a forgotten slip of
Re: [ccp4bb] very informative - Trends in Data Fabrication
Since I was the person who started a public outcry to "do something", I shall explain myself to my critics.

Like all of you, I do not care much about those few instances of structure fabrication. I might have put too much emphasis on them to initiate the discussion, but they are, indeed, only tiny blips on the ocean of science. But could they be the tips of a huge iceberg? That was my concern. I believe that the enormous competition in science that we experience nowadays makes many of us desperate, and desperation forces people to cheat. Is the current validation system at the PDB good enough to catch various aspects of data cheating? Is there a simple but efficient way to make cheating more difficult and, hence, less desirable?

Good sportsmen (in terms of sporting ability) sometimes get caught taking performance enhancers. I bet everyone would do it if drug control did not exist. Many sportsmen would do it against their will, just because there was no other way to win. Don't you think a similar situation can develop in science?

> I suppose as social animals we like to think we can trust and be trusted

Well, I suppose that these two antagonistic abilities of social animals (trust and cheating) developed in parallel as means to promote evolution. In a very hierarchical society with no legal means to change one's social status, cheating has been an important tool for contributing one's genes to a society. Socially unjust societies still exist, and their members may have a slightly different view on the morality of cheating than those from just societies. Moreover, the ability to cheat often correlates with intellect. Could it not be called cheating when someone is told to do something one way, but he does it his own way, because he believes it is more efficient?

When a scientist feels that he is right about the validity of his results, but they do not look good enough to be sold to validators, he is supposed to do more research.
But he is out of time, so why not hide the weak spots of the work if he knows that the major conclusions are RIGHT? Even if someone redoes the work later, the conclusions will be reproduced, right? In my opinion, this is the major motive for cheating in science.

What I suggested with respect to PDB data validation was adding some additional information that would allow independent validation of such parameters as the resolution and data quality (catching model fabrications would be a byproduct of this process). Does the current system allow those parameters to be overestimated? I believe so (but I might be wrong, correct me!). Periodically, people ask on ccp4bb how to determine the resolution of their data, but some idiots may decide to do it on their own and add 30% of noise to their structure factors. As James mentioned, one does not need to be extremely smart to do so; moreover, such an idiot would have fewer restraints than an educated crystallographer, because the idiot believes that nobody would notice his cheating. His moral principles are not corrupted, because he thinks that the model is correct and no harm is done. But the harm is still there, because people are forced to believe the model more than it deserves.

The question is still open to me: what percentage of PDB structures overestimate data quality in terms of resolution? Is it possible to make it less dependent on the opinion of the persons submitting the data? We all have such different opinions about everything...

People invented laws to create conditions in which they can trust each other. Sociopaths who do not follow the rules get caught and excluded from society, which maintains the trust. But when the trust is abused, it quickly disappears. Many of those who wrote on the matter expressed a strong opinion that the system is not broken and we should continue trusting each other. Great! I do not mind the status quo.
Regards, Alex Aleshin
Re: [ccp4bb] very informative - Trends in Data Fabrication
> You never know when a forgotten slip of the mouse when using AutoDep ten years ago will come back to haunt you.

Regarding the paper James refers to, for which he found the data: the added mystery was that the postdoc who may have slipped disappeared without much of a trace, and the PI died. Dan was the only survivor. Still, they found the data.

BR
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Ron,

Quite so, and who cannot laugh at the "Yes Minister" perfect hospital ward operating theatre sketch (thank you, James W).

Anyway: let's not get too hung up on one detail of your point 3. Your various points, including point 3, added several missing elements to this CCP4bb thread.

Overall, what I am saying is that to me it is good that my University at least is gearing up to provide a local Data Archive service. Since I wish to link my raw data sets to my publications in future via DOI registrations, this will give them a longevity that I cannot guarantee with a 'my raw data are in my desk drawer' approach. These could be useful for future reuse, i.e. I see various improvements to understanding diffuse scattering and, secondly, to squeezing more diffraction resolution out of the Bragg data as computer hardware and software both improve.

Once in my career I nearly made a mistake on a space group choice (I wrote it up as an educational story in 1996 in Acta D); if I had made that mistake, the literature would finally have caught up, I suppose, and said: where are the raw data, let's check that space group choice. This latter type of challenge is of course catchable via depositing processed Bragg data as triclinic; it probably doesn't need raw images.

Finally, I have a project that I have worked on for some years now to solve the structure; there are two, possibly several, crystal lattices and diffuse streaks. If I finally have to give up, I will make the data available via DOI on my university raw data archive; meanwhile, of course, we make new protein and recrystallise etc., the other approach!

Greetings,
John

Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol.
Chair School of Chemistry, University of Manchester, Athena Swan Team.
http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html

On 6 Apr 2012, at 17:23, Ronald E Stenkamp stenk...@u.washington.edu wrote:

> Dear John,
>
> Your points are well taken and they're consistent with policies and practices in the US as well.
>
> I wonder about the nature of the employer's responsibility though. I sit on some university committees, and the impression I get is that much of the time, the employers are interested in reducing their legal liabilities, not protecting the integrity of science. The end result is the same though, in that the employers get involved and oversee the handling of scientific misconduct.
>
> What is unclear to me is whether the system for dealing with misconduct is broken. It seems to work pretty well from my viewpoint. No system is perfect for identifying fraud, errors, etc, and I understand the idea that improvements might be possible. However, too many improvements might break the system as well.
>
> Ron
>
> On Fri, 6 Apr 2012, John R Helliwell wrote:
>
>> Dear Ron,
>>
>> Re (3): Yes, of course the investigator has that responsibility. The additional point I would make is that the employer has a share in that responsibility. Indeed, in such cases the employer university convenes a research fraud investigating committee to form the final judgement on continued employment. A research fraud policy, at least ours, also includes the need for avoiding inadvertent loss of raw data, which is also deemed to be research malpractice. Thus the local data repository, with DOI registration for data sets that underpin publication, seems to me and many others, i.e. in other research fields, a practical way forward for these data sets. It also allows the employer to properly serve the research investigations of its employees and be duly diligent to the research sponsors whose grants it accepts. That said, there is variation in the funding that at least our UK agencies will commit to 'Data management plans'.
Greetings, John 2012/4/5 Ronald E Stenkamp stenk...@u.washington.edu: This discussion has been interesting, and it's provided an interesting forum for those interested in dealing with fraud in science. I've not contributed anything to this thread, but the message from Alexander Aleshin prodded me to say some things that I haven't heard expressed before. 1. The sky is not falling! The errors in the birch pollen antigen pointed out by Bernhard are interesting, and the reasons behind them might be troubling. However, the self-correcting functions of scientific research found the errors, and current publication methods permitted an airing of the problem. It took some effort, but the scientific method prevailed. 2. Depositing raw data frames will make little difference in identifying and correcting structural problems like this one. Nor will new requirements for deposition of this or that detail. What's needed for finding the problems is time and interest on the part of someone who's able to look at a structure critically. Deposition of additional information could be important for that critical look, but deposition alone (at
Re: [ccp4bb] very informative - Trends in Data Fabrication
I doubt many people completely fail to archive data but maintaining data archives can be a pain so I'm not sure what the useful age of the average archive is. Do people who archived to tape keep their tapes in a format that can be read by modern tape drives? Do people who archived data to a hard drive 10 years ago have something that can still read an Irix EFS-formatted SCSI hard drive today and, if not, did they bother to move the data to some other storage medium? -Eric On Apr 5, 2012, at 9:08 AM, Roger Rowlett wrote: FYI, every NSF grant proposal now must have a data management plan that describes how all experimental data will be archived and in what formats. I'm not sure how seriously these plans are monitored, but a plan must be provided nevertheless. Is anyone NOT archiving their original data in some way? Roger Rowlett
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear John, Your points are well taken and they're consistent with policies and practices in the US as well. I wonder about the nature of the employer's responsibility though. I sit on some university committees, and the impression I get is that much of the time, the employers are interested in reducing their legal liabilities, not protecting the integrity of science. The end result is the same though in that the employers get involved and oversee the handling of scientific misconduct. What is unclear to me is whether the system for dealing with misconduct is broken. It seems to work pretty well from my viewpoint. No system is perfect for identifying fraud, errors, etc, and I understand the idea that improvements might be possible. However, too many improvements might break the system as well. Ron On Fri, 6 Apr 2012, John R Helliwell wrote: Dear Ron, Re (3):- Yes of course the investigator has that responsibility. The additional point I would make is that the employer has a share in that responsibility. Indeed in such cases the employer university convenes a research fraud investigating committee to form the final judgement on continued employment. A research fraud policy, at least ours, also includes the need for avoiding inadvertent loss of raw data, which is also deemed to be research malpractice. Thus the local data repository, with doi registration for data sets that underpin publication, seems to me and many others, i.e. in other research fields, a practical way forward for these data sets. It also allows the employer to properly serve the research investigations of its employees and be duly diligent to the research sponsors whose grants it accepts. That said there is a variation of funding that at least our UK agencies will commit to 'Data management plans'. Greetings, John 2012/4/5 Ronald E Stenkamp stenk...@u.washington.edu: This discussion has been interesting, and it's provided an interesting forum for those interested in dealing with fraud in science.
I've not contributed anything to this thread, but the message from Alexander Aleshin prodded me to say some things that I haven't heard expressed before. 1. The sky is not falling! The errors in the birch pollen antigen pointed out by Bernhard are interesting, and the reasons behind them might be troubling. However, the self-correcting functions of scientific research found the errors, and current publication methods permitted an airing of the problem. It took some effort, but the scientific method prevailed. 2. Depositing raw data frames will make little difference in identifying and correcting structural problems like this one. Nor will new requirements for deposition of this or that detail. What's needed for finding the problems is time and interest on the part of someone who's able to look at a structure critically. Deposition of additional information could be important for that critical look, but deposition alone (at least with today's software) will not be sufficient to find incorrect structures. 3. The responsibility for a fraudulent or wrong or poorly-determined structure lies with the investigator, not the society of crystallographers. My political leanings are left-of-central, but I still believe in individual responsibility for behavior and actions. If someone messes up a structure, they're accountable for the results. 4. Adding to the deposition requirements will not make our science more efficient. Perhaps it's different in other countries, but the administrative burden for doing research in the United States is growing. It would be interesting to know the balance between the waste that comes from a wrong structure and the waste that comes from having each of us deal with additional deposition requirements. 5. The real danger that arises from cases of wrong or fraudulent science is that it erodes the trust we have in each other's results. No one has time or resources to check everything, so science is based on trust.
There are efforts underway outside crystallographic circles to address this larger threat to all science, and we should be participating in those discussions as much as possible. Ron On Thu, 5 Apr 2012, aaleshin wrote: Dear John, Thank you for a very informative letter about the IUCr activities towards archiving the experimental data. I feel that I did not explain myself properly. I do not object to archiving the raw data, I just believe that the current methodology of validating data at the PDB is insufficiently robust and requires modification. Implementation of raw image storage and validation will take considerable time, while the recent incidents of presumed data fraud demonstrate that the issue is urgent. Moreover, presenting the calculated structure factors in place of the experimental data is not the only abuse that the current validation procedure encourages
Re: [ccp4bb] very informative - Trends in Data Fabrication
Ron makes an excellent point. Many institutions devote far more energy to limiting risk than to doing the right thing. This leads administrators to a frightening, but logical conclusion: The less science we do, the less chance of our doing something that could invite a penalty on the university. This translates into rules intended to head off bad behavior, but which in fact make it more difficult to do honest science, and increase the administrative burden (our IT group has already made great strides in this direction--if you can't connect to the network, then you can't use it to violate HIPAA!). So I agree that we should be cautious about improvements. Pat On 6 Apr 2012, at 12:23 PM, Ronald E Stenkamp wrote: Dear John, Your points are well taken and they're consistent with policies and practices in the US as well. I wonder about the nature of the employer's responsibility though. I sit on some university committees, and the impression I get is that much of the time, the employers are interested in reducing their legal liabilities, not protecting the integrity of science. The end result is the same though in that the employers get involved and oversee the handling of scientific misconduct. What is unclear to me is whether the system for dealing with misconduct is broken. It seems to work pretty well from my viewpoint. No system is perfect for identifying fraud, errors, etc, and I understand the idea that improvements might be possible. However, too many improvements might break the system as well. Ron --- Patrick J. Loll, Ph. D. Professor of Biochemistry Molecular Biology Director, Biochemistry Graduate Program Drexel University College of Medicine Room 10-102 New College Building 245 N. 15th St., Mailstop 497 Philadelphia, PA 19102-1192 USA (215) 762-7706 pat.l...@drexelmed.edu
Re: [ccp4bb] very informative - Trends in Data Fabrication
As famously observed by the Yes Minister team, the dream outcome for any organization: http://www.youtube.com/watch?v=x-5zEb1oS9A&feature=youtube_gdata_player J
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear 'aales...@burnham.org',

Re the pixel detector; yes this is an acknowledged raw data archiving challenge; possible technical solutions include:- summing to make coarser images, i.e. in angular range, lossless compression (nicely described on this CCP4bb by James Holton) or preserving a sufficient sample of data (but nb this debate is certainly not yet concluded).

Re "And all this hassle is for the only real purpose of preventing data fraud?" Well. Why publish data? Please let me offer some reasons:
• To enhance the reproducibility of a scientific experiment
• To verify or support the validity of deductions from an experiment
• To safeguard against error
• To allow other scholars to conduct further research based on experiments already conducted
• To allow reanalysis at a later date, especially to extract 'new' science as new techniques are developed
• To provide example materials for teaching and learning
• To provide long-term preservation of experimental results and future access to them
• To permit systematic collection for comparative studies
• And, yes, to better safeguard against fraud than is apparently the case at present

Also to (probably) comply with your funding agency's grant conditions:- Increasingly, funding agencies are requesting or requiring data management policies (including provision for retention and access) to be taken into account when awarding grants. See e.g. the Research Councils UK Common Principles on Data Policy (http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital Curation Centre overview of funding policies in the UK (http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies). See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion on policies relevant to crystallography in other countries. Nb these policies extend over derived, processed and raw data, i.e. without really an adequate clarity of policy from one to the other stages of the 'data pyramid' (see http://www.stm-assoc.org/integration-of-data-and-publications).

And just to mention IUCr Journals Notes for Authors for biological macromolecular structures, where we have our, i.e. macromolecular crystallography's, version of the 'data pyramid':-

(1) Derived data
• Atomic coordinates, anisotropic or isotropic displacement parameters, space group information, secondary structure and information about biological functionality must be deposited with the Protein Data Bank before or in concert with article publication; the article will link to the PDB deposition using the PDB reference code.
• Relevant experimental parameters, unit-cell dimensions are required as an integral part of article submission and are published within the article.

(2) Processed experimental data
• Structure factors must be deposited with the Protein Data Bank before or in concert with article publication; the article will link to the PDB deposition using the PDB reference code.

(3) Primary experimental data (here I give small and macromolecule Notes for Authors details):-
For small-unit-cell crystal/molecular structures and macromolecular structures IUCr journals have no current binding policy regarding publication of diffraction images or similar raw data entities. However, the journals welcome efforts made to preserve and provide primary experimental data sets. Authors are encouraged to make arrangements for the diffraction data images for their structure to be archived and available on request. For articles that present the results of powder diffraction profile fitting or refinement (Rietveld) methods, the primary diffraction data, i.e. the numerical intensity of each measured point on the profile as a function of scattering angle, should be deposited. Fibre data should contain appropriate information such as a photograph of the data. As primary diffraction data cannot be satisfactorily extracted from such figures, the basic digital diffraction data should be deposited.

Finally to mention that many IUCr Commissions are interested in the possibility of establishing community practices for the orderly retention and referencing of raw data sets, and the IUCr would like to see such data sets become part of the routine record of scientific research in the future, to the extent that this proves feasible and cost-effective. I draw your attention therefore to the IUCr Forum on such matters at:- http://forums.iucr.org/ Within this Forum you can find for example the fairly recent report of the ICSU-convened Strategic Coordinating Committee on Information and Data; within this we learn of many other areas of science's efforts on data archiving and e.g. that the radio astronomy Square Kilometre Array will pose the biggest raw data archiving challenge on the planet. [Our needs are thereby relatively modest.] The IUCr Diffraction Data Deposition Working Group is actively addressing all these various issues. We welcome your input at the IUCr Forum, which will thereby be most timely. Thank you. Best wishes, Yours
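As a side note on two of the technical options listed in the message above (summing images over a coarser angular range, and lossless compression), a minimal Python sketch may help make the distinction concrete. Everything in it is an assumption for illustration only: the toy frames, the function names `sum_frames`, `pack` and `unpack`, and the choice of zlib as the compressor. It is not how any real detector format or the scheme described on the list works. The point it shows is that summing discards the fine angular slicing (it cannot be undone), whereas the compression step returns exactly the counts that went in.

```python
import zlib
from array import array


def sum_frames(frames, n):
    """Sum each run of n consecutive frames pixel by pixel.

    This gives fewer, coarser frames (each covers n times the angular
    range); the fine slicing of the original frames is lost.
    """
    summed = []
    for start in range(0, len(frames), n):
        group = frames[start:start + n]
        summed.append([sum(pixels) for pixels in zip(*group)])
    return summed


def pack(frame):
    """Losslessly compress one frame of non-negative integer counts."""
    return zlib.compress(array("I", frame).tobytes(), 9)


def unpack(blob):
    """Recover the exact counts from a blob produced by pack()."""
    frame = array("I")
    frame.frombytes(zlib.decompress(blob))
    return list(frame)


# Four hypothetical thin-sliced frames of six pixels each (toy counts).
frames = [
    [0, 1, 0, 5, 0, 0],
    [0, 2, 0, 9, 0, 0],
    [1, 0, 0, 7, 0, 0],
    [0, 0, 0, 3, 1, 0],
]

coarse = sum_frames(frames, 2)                 # two frames, each twice as wide in angle
restored = [unpack(pack(f)) for f in coarse]   # compression round trip
print(coarse)
print(restored == coarse)
```

The two steps are independent: the lossless step can equally be applied to the original thin-sliced frames, which is the option that keeps everything.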
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Colleagues, Clearly, no system will be able to perfectly preserve every pixel of every dataset collected at a cost that can be afforded. Resources are finite and we must set priorities. I would suggest that, in order of declining priority, we try our best to retain: 1. raw data that might tend to refute published results 2. raw data that might tend to support published results 3. raw data that may be of significant use in currently ongoing studies either in refutation or support 4. raw data that may be of significant use in future studies While no archiving system can be perfect, we should not let the search for a perfect solution prevent us from working with currently available good solutions, and even in this era of tight budgets, there are good solutions. Regards, Herbert
Re: [ccp4bb] very informative - Trends in Data Fabrication
FYI, every NSF grant proposal now must have a data management plan that describes how all experimental data will be archived and in what formats. I'm not sure how seriously these plans are monitored, but a plan must be provided nevertheless. Is anyone NOT archiving their original data in some way? Roger Rowlett
Re: [ccp4bb] very informative - Trends in Data Fabrication
I would say everybody keeps probably too many junk datasets around - at least I do. And I run into the trouble of having to buy new TB plates every now and then. I think on average per year my group acquires currently ~700 GB of raw images (compressed), now if we were to only keep the useful datasets we probably would be down to 10% of that. But as always you hope for the best and keep some data considered junk in 2009 which might be useful in 2015. Jürgen
[ccp4bb] Category 4 Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Herbert, Category 4, in Manchester, we find is tricky, for want of a better word. Needless to say that we have collaborators on our Crystallography Research Service who request data sets from eg ten years ago, that are now urgent for publication writing up. So we are keeping everything, although only recent years the raw diffraction images, and nb soon to be assisted by the Univ Manchester centralised Data Repository for its researchers. (Incidentally I have kept all of my film oscillation, and inc later Laue data, back to approx 1977, which fills a whole wall shelf worth, ~ 10 metres.) Greetings, John Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol. Chair School of Chemistry, University of Manchester, Athena Swan Team. http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html On 5 Apr 2012, at 13:50, Herbert J. Bernstein y...@bernstein-plus-sons.com wrote: Dear Colleagues, Clearly, no system will be able to perfectly preserve every pixel of every dataset collected at a cost that can be afforded. Resources are finite and we must set priorities. I would suggest that, in order of declining priority, we try our best to retain: 1. raw data that might tend to refute published results 2. raw data that might tend to support published results 3. raw data that may be of significant use in currently ongoing studies either in refutation or support 4. raw data that may be of significant use in future studies While no archiving system can be perfect, we should not let the search for a perfect solution prevent us from working with currently available good solutions, and even in this era of tight budgets, there are good solutions. 
Regards, Herbert

On 4/5/12 7:16 AM, John R Helliwell wrote: Dear 'aales...@burnham.org', Re the pixel detector: yes, this is an acknowledged raw data archiving challenge; possible technical solutions include summing to make coarser images, i.e. in angular range, lossless compression (nicely described on this CCP4bb by James Holton) or preserving a sufficient sample of data (but nb this debate is certainly not yet concluded). Re "And all this hassle is for the only real purpose of preventing data fraud?" Well, why publish data? Please let me offer some reasons:
• To enhance the reproducibility of a scientific experiment
• To verify or support the validity of deductions from an experiment
• To safeguard against error
• To allow other scholars to conduct further research based on experiments already conducted
• To allow reanalysis at a later date, especially to extract 'new' science as new techniques are developed
• To provide example materials for teaching and learning
• To provide long-term preservation of experimental results and future access to them
• To permit systematic collection for comparative studies
• And, yes, to better safeguard against fraud than is apparently the case at present
Also to (probably) comply with your funding agency's grant conditions: increasingly, funding agencies are requesting or requiring data management policies (including provision for retention and access) to be taken into account when awarding grants. See e.g. the Research Councils UK Common Principles on Data Policy (http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital Curation Centre overview of funding policies in the UK (http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies). See also http://forums.iucr.org/viewtopic.php?f=21t=58 for discussion on policies relevant to crystallography in other countries.
Nb these policies extend over derived, processed and raw data, ie without really an adequate clarity of policy from one to the other stages of the 'data pyramid' (see http://www.stm-assoc.org/integration-of-data-and-publications).
[ccp4bb] Via Annual Reports...Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Roger, At the recent ICSTI Workshop on Delivering Data in science the NSF presenter, when I asked about monitoring, replied that the PIs' annual reports should include data management aspects. See http://www.icsti.org/spip.php?rubrique42 Best wishes, John Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol. Chair School of Chemistry, University of Manchester, Athena Swan Team. http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html On 5 Apr 2012, at 14:08, Roger Rowlett rrowl...@colgate.edu wrote: FYI, every NSF grant proposal now must have a data management plan that describes how all experimental data will be archived and in what formats. I'm not sure how seriously these plans are monitored, but a plan must be provided nevertheless. Is anyone NOT archiving their original data in some way? Roger Rowlett
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear John, Thank you for a very informative letter about the IUCr activities towards archiving the experimental data. I feel that I did not explain myself properly. I do not object to archiving the raw data; I just believe that the current methodology of validating data at the PDB is insufficiently robust and requires modification. Implementation of raw image storage and validation will take considerable time, while the recent incidents of presumed data fraud demonstrate that the issue is urgent. Moreover, presenting calculated structure factors in place of the experimental data is not the only abuse that the current validation procedure permits. There may be more numerous occurrences of data massaging, such as overestimation of the resolution or data quality, and the system does not allow them to be verified. The IUCr and PDB follow the American taxation policy, where the responsibility for fraud is placed on people and the agency does not take sufficient action to prevent it. I believe this is inefficient and inhumane. Making a routine check of submitted data at a somewhat lower level would reduce the temptation to overestimate the unclearly defined quality statistics and make model fabrication more difficult to accomplish. Many people do it unknowingly, and catching them afterwards does no good. I suggested turning the current incident, which might be too complex for burning heretics, into something productive that is done as soon as possible, something that will prevent fraud from occurring. Since my persistent trolling at ccp4bb did not have any effect (until now), I wrote a bad-English letter to the PDB administration, encouraging them to take urgent action. Those who are willing to count the grammar mistakes in it can read the message below.
With best regards, Alexander Aleshin, staff scientist, Sanford-Burnham Medical Research Institute, 10901 North Torrey Pines Road, La Jolla, California 92037

Dear PDB administrators, I am writing to you regarding the recently publicized story about the submission of calculated structure factors to the PDB entry 3k79 (http://journals.iucr.org/f/issues/2012/04/00/issconts.html). This presumed fraud (or mistake) occurred just several years after another, more massive fabrication of PDB structures (Acta Cryst. (2010). D66, 115) that affected many scientists, including myself. The recurrence of these events indicates that the current mechanism of structure validation by the PDB is not sufficiently robust. Moreover, it is completely incapable of detecting smaller mischief such as overestimation of the data resolution and quality. There are two approaches to handling fraud problems: (1) increasing policing and punishment, or (2) making fraud too difficult to carry out. Obviously, the second approach is more humane and efficient. This issue has been discussed on several occasions by the ccp4bb community, and some members began promoting the idea of submitting raw crystallographic images as a fraud repellent. However, this validation approach is neither easy nor cheap; moreover, it requires considerable manpower to conduct on a day-to-day basis. Indeed, indexing data sets is sometimes a nontrivial problem and cannot be accomplished automatically. For this reason, submitting the indexed and partially integrated data (such as .x files from HKL2000 or the output .mtz file from Mosflm) appears to be a cheaper substitute for image storage and validation. Analysis of the partially integrated data provides almost the same means of fraud prevention as the images. Indeed, the observed cases of data fraud suggest that fraud would most likely be attempted by a biochemist-crystallographer who is insufficiently educated to fabricate partially processed data.
A method developer, on the contrary, does not have a reasonable incentive to forge a particular structure, unless he teams up with a similarly minded biologist. But the latter scenario is very improbable and has not been detected yet. The most valuable benefit of using partially processed data as a validation tool would be the standardization of the definition of data resolution and the detection of inappropriate massaging of experimental data. Implementation of this approach requires only a minuscule adaptation of the current system, which most practicing crystallographers would accept (in my humble opinion). The data storage requirement would be only ~1000-fold higher than the current one, and transferring the new data to the PDB could still be done over the Internet. Moreover, storing the raw data is not required after the validation is done. A program such as Scala from CCP4 could easily be adapted to process the validation data and compare them with a conventional set of structure factors. Precise consistency of the two sets is not necessary. They only need to agree within statistically
Re: [ccp4bb] very informative - Trends in Data Fabrication
This discussion has been interesting, and it's provided an interesting forum for those interested in dealing with fraud in science. I've not contributed anything to this thread, but the message from Alexander Aleshin prodded me to say some things that I haven't heard expressed before. 1. The sky is not falling! The errors in the birch pollen antigen pointed out by Bernhard are interesting, and the reasons behind them might be troubling. However, the self-correcting functions of scientific research found the errors, and current publication methods permitted an airing of the problem. It took some effort, but the scientific method prevailed. 2. Depositing raw data frames will make little difference in identifying and correcting structural problems like this one. Nor will new requirements for deposition of this or that detail. What's needed for finding the problems is time and interest on the part of someone who's able to look at a structure critically. Deposition of additional information could be important for that critical look, but deposition alone (at least with today's software) will not be sufficient to find incorrect structures. 3. The responsibility for a fraudulent or wrong or poorly-determined structure lies with the investigator, not the society of crystallographers. My political leanings are left-of-central, but I still believe in individual responsibility for behavior and actions. If someone messes up a structure, they're accountable for the results. 4. Adding to the deposition requirements will not make our science more efficient. Perhaps it's different in other countries, but the administrative burden for doing research in the United States is growing. It would be interesting to know the balance between the waste that comes from a wrong structure and the waste that comes from having each of us deal with additional deposition requirements. 5. 
The real danger that arises from cases of wrong or fraudulent science is that it erodes the trust we have in each other's results. No one has time or resources to check everything, so science is based on trust. There are efforts underway outside crystallographic circles to address this larger threat to all science, and we should be participating in those discussions as much as possible. Ron
Re: [ccp4bb] very informative - Trends in Data Fabrication
I also don't really worry about the images as a primary means of fraud prevention, although that may be a useful side effect. These cases are spectacular but so rare that fraud prevention alone would indeed not justify the effort. That argument may be a useful political instrument for getting funding, but it is a bit of a double-edged sword and harm can be done; see (5). The real point to me seems: a) Is there something in the images, and in between the casually indexed main reflections, that we do not use right now and that would allow us to ultimately get better structures? I think there is, and it has been said before: superstructures, modulation, diffuse contributions, etc. A processed data file does not help here. But do we need the old image data for that, or should we rather use new ones from modern detectors? Where is the cost/benefit cutoff here? b) Looking at how some structures are refined, there is little reason to believe that data processing would be done more competently by untrained casual users (except that much of the data processing is done with the help of beamline personnel who know how to do it). Had we images, the next step could then be PDB_reprocess. A processed data file does not help much there either. c) Discarding your primary data is generally considered bad form... @AlexA: Arguing with the PDB is not really useful. They did not generate the bad data. Best, BR
Re: [ccp4bb] very informative - Trends in Data Fabrication
Well, it looks like my opinion about the importance of validating data at the moment of their submission does not attract much support; it is sad but understandable. Automatic redoing of PDB structures by professionals is a good idea. I myself suggested a similar thing 10 years ago at Accelrys (we were developing a tool that allowed detecting and remodeling changes in protein-ligand structures due to ligand binding), but there was not much financial interest. How much the raw images would enhance the remodeling process is an open question, but good luck in getting it funded. "c) Discarding your primary data is generally considered bad form..." Agreed, but it is a big burden on labs to maintain archives of their raw data indefinitely. Even the IRS allows them to be discarded after some time. What is wrong with partially integrated data in terms of structure validation? "@AlexA: Arguing with the PDB is not really useful." I have not argued yet, but I'll take your advice. "They did not generate the bad data." This is genuine American thinking! But they might create conditions that would prevent their deposition. I think I should stop heating up this discussion. Regards, Alex
Re: [ccp4bb] very informative - Trends in Data Fabrication
Ojweh. "c) Discarding your primary data is generally considered bad form... Agreed, but it is a big burden on labs to maintain archives of their raw data indefinitely. Even IRS allows to discard them after some time." But you DO have to file in the first place, right? How long to keep is an entirely different question. "What is wrong with partially integrated data in terms of structure validation?" Who thinks something is wrong with that idea? Section 3.1, under figure 3 of said incendiary pamphlet, states: '...yada yada when unmerged data or images for proper reprocessing are not available owing to the unfortunate absence of a formal obligation to deposit unmerged intensity data or diffraction images.' "They did not generate the bad data. This is a genuine American thinking!" OK, the US citizens on the BB might take this one up on my behalf, gospodin ;-) See you at the Lubyanka. "But they might create conditions that would prevent their deposition." Sure. We are back to the 2001 Reid shoe bomber argument. If you make PDB deposition a total pain for everybody, you don't get compliance, you get defiance. Ever seen any happy faces in a TSA check line? Anyhow, image deposition will come. Over and out, BR
Re: [ccp4bb] very informative - Trends in Data Fabrication
Alright, if the image deposition is the only way out, then I am for it, but please make sure that synchrotrons will do it for me...
Re: [ccp4bb] very informative - Trends in Data Fabrication
How should they? They have no clue which of the 20 datasets was actually useful to solve your structure. If you ask James Holton, he has suggested going back to the archived data after a certain time and trying to solve the undeposited structures then :-) [Where is James anyhow? Haven't seen a post from him recently.] Seriously, I think it is in our own interest to submit the corresponding images which led to a structure solution somewhere. And as others mentioned, bad data or good data can always serve educational purposes. Just as an example: http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/1Y13 Jürgen

Jürgen Bosch, Johns Hopkins University, Bloomberg School of Public Health, Department of Biochemistry & Molecular Biology, Johns Hopkins Malaria Research Institute, 615 North Wolfe Street, W8708, Baltimore, MD 21205. Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
Re: [ccp4bb] very informative - Trends in Data Fabrication
Did you play a game called broken telephone as a child? It is when someone quickly tells something to a neighbor, and so on until the words come back to the author. Very funny game. My original thesis was that downloading/depositing the raw images would be a pain in the neck for crystallographers, so why not begin with the partially processed data, like .x files from HKL2000? People should be trained to hardships gradually...
Re: [ccp4bb] very informative - Trends in Data Fabrication
No James, you're not alone - astonishing petty pile-on (bullying?) on this board the last few days. Wikipedia says: "In Internet slang, a troll is someone who posts inflammatory, extraneous, or off-topic messages in an online community, such as an online discussion forum, chat room, or blog, with the primary intent of provoking readers into an emotional response or of otherwise disrupting normal on-topic discussion." (http://en.wikipedia.org/wiki/Troll_%28Internet%29) The emotional and disruptive response certainly fits the definition, but that's about all. And while Kevin's tiny blips in my inbox were trivial to delete and ignore, the resulting email hurricane of pompous indignation was not. Yuk. phx

On 04/04/2012 00:29, James Stroud wrote: I read the first part of the page you linked to. I'm not sure what the descent into troll etymology says about the CCP4BB community, especially in response to your seemingly innocent post. My understanding is that the goal of the CCP4BB is to educate and not to belittle the naivety of other members of the community. I hope I am not alone. James

On Apr 3, 2012, at 4:33 PM, Kevin Jin wrote: Dear All, Here may be another example of the importance of image storage. http://www.jinkai.org/DERA/DERA_1O0Y_3R12.html Regards, Kevin
Re: [ccp4bb] very informative - Trends in Data Fabrication
Then everyone's data can be lost at once in the next cloud failure. Progress! "The hardware failed in such a way that we could not forensically restore the data. What we were able to recover has been made available via a snapshot, although the data is in such a state that it may have little to no utility..." -Amazon to some of its cloud customers following their major crash last year http://articles.businessinsider.com/2011-04-28/tech/29958976_1_amazon-customer-customers-data-data-loss -Eric

On Apr 3, 2012, at 9:22 PM, Zhijie Li wrote: Hi, Regarding the online image file storage issue, I just googled cloud storage and had a look at the current pricing of such services. To my surprise, some companies are offering unlimited storage for as low as $5 a month. So that's $600 for 10 years. I am afraid that these companies will feel really sorry to learn that there are some monsters called crystallographers living on our planet. In our lab, some pre-21st century data sets were stored on tapes, newer ones on DVD discs and IDE hard drives. All these media have become or will become obsolete pretty soon. Not to mention the positive correlation between CRC errors and the medium's age. Admittedly, it may become quite a job to upload all the image files that the whole crystallographic community generates per year. But for individual labs, I think storing data in the cloud might be worth considering. Zhijie
Re: [ccp4bb] very informative - Trends in Data Fabrication
People who raise their voices for prolonged storage of raw images miss the simple fact that the volume of collected data grows in proportion to, if not faster than, the drop in the cost of storage space. I just had an opportunity to collect data with the PILATUS detector at SSRL, and I can tell you that this monster allows slicing the data 4-5 times thinner than other detectors do. Some people also like collecting very redundant data sets. Even now, transferring and storing raw data from a synchrotron is a pain in the neck, but in a few years it may become simply impractical. And all this hassle is for the only real purpose of preventing data fraud? Aren't there cheaper and more adequate solutions to the problem? I also wonder why, after the first occurrence of data fraud several years ago, the PDB did not take any action to prevent it from happening again. Or are administrative actions simply impossible nowadays without a mega-dollar grant?
Re: [ccp4bb] very informative - Trends in Data Fabrication
Hi Eric, My previous email may have been a little misleading, but I do not recommend deleting the originals from the hard drives/discs/tapes. Data in the cloud is better viewed as an extra copy (considering that our labs/offices are quite prone to fire, and to theft too), or as a copy that can be easily accessed from anywhere. A disaster on the cloud servers certainly would not be accepted as a valid excuse for not being able to provide the raw images when their very existence is in question. Zhijie
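An extra copy, in the cloud or on an aging disc, is only useful if it can be shown to still match the original images. A minimal sketch in Python, assuming the images sit in one local directory and that a manifest of SHA-256 digests was recorded when the data were collected. The function names and the manifest layout are hypothetical and do not belong to any existing archiving or deposition tool:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(image_dir):
    """Map each file name in image_dir to its digest (hypothetical layout)."""
    files = sorted(p for p in Path(image_dir).iterdir() if p.is_file())
    return {p.name: sha256_of(p) for p in files}


def find_damaged(image_dir, manifest):
    """Return names of files that are missing or no longer match the manifest."""
    damaged = []
    for name, expected in manifest.items():
        candidate = Path(image_dir) / name
        if not candidate.is_file() or sha256_of(candidate) != expected:
            damaged.append(name)
    return damaged
```

Running `find_damaged` against a restored copy lists the frames that were lost or silently corrupted, which is the kind of check an old tape or DVD archive would need before anyone relies on it.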
Re: [ccp4bb] very informative - Trends in Data Fabrication
In fact, I would put it even more strongly: if we know a referee is being dishonest, it is our duty to make sure he is removed from science, blacklisted from the journal, etc. Mark J van Raaij Laboratorio M-4 Dpto de Estructura de Macromoleculas Centro Nacional de Biotecnologia - CSIC c/Darwin 3 E-28049 Madrid, Spain tel. (+34) 91 585 4616 http://www.cnb.csic.es/~mjvanraaij

On 3 Apr 2012, at 19:13, Maria Sola i Vilarrubias wrote: Mark, I know some stories (which of course I'll not post here) from the crystallography field and from other fields where reviewers profit from the fact that suddenly they have new, interpreted data which fits very well with their own results. Stories like blocking a manuscript or asking for more results so that the reviewer can submit his or her own paper (with new ideas) in time, or copying a structure from the figures, or asking for experiments that only the reviewer can do so that he/she is included in the paper, or submitting as fast as possible to another journal with an extremely short delay of acceptance (e.g. 10 days, without revision? talking to the editorial board?), things like this. Well, it is not a question of making a full list here! The whole problem comes from publishing first, from competition. The hope with fraud in X-ray data is that it seems to be detectable, thanks to the valuable people who develop methods to detect it. But it is very difficult to demonstrate that your work, ideas or results have been copied. How do you defend yourself from this? And how, after giving them the valuable PDB? Finally, how many crystallographers are there in the world? 5000? The concept of ethics can change from one place to another and, more than this, there is the fact that the reviewer is anonymous. I try to respond to my reviewers the best I can and I really trust their criteria, sometimes a bit too much, indeed. I think they have all done a very nice job. But some of the stories above happened to me or close to me, and I feel really insecure with the idea of sending a manuscript, the X-ray data and the PDB, all together, to a reviewer shielded by anonymity. It's too risky: with an easy molecular replacement someone can solve a difficult structure and publish it first. And then the only thing left for the bad reviewer to do is to change the author list! (And what is left for the true author is to feel like an idiot.) In my humble opinion, we must be strict but not kill ourselves. Trust authors as we trust reviewers. Otherwise, the whole effort might be useless. Maria Dep. Structural Biology IBMB-CSIC Baldiri Reixach 10-12 08028 BARCELONA Spain Tel: (+34) 93 403 4950 Fax: (+34) 93 403 4979 e-mail: maria.s...@ibmb.csic.es

On 3 April 2012 16:58, Mark J van Raaij mjvanra...@cnb.csic.es wrote: The remedy for the fact that some reviewers act unethically is not withholding coordinates and structure factors, but a more active role for the authors in denouncing these possible violations and more effective investigations by the journals whose reviewers are suspected by the authors of committing these violations. I have witnessed authors being hesitant to complain about possible violations and journals not always taking complaints seriously enough. Mark J van Raaij

On 3 Apr 2012, at 16:45, Bosch, Juergen wrote: Hi Fred, I'll go public on this one. This happened to me. I will not reveal who reviewed my paper and which paper it was, only that your naive assumption might not always be correct. I have learned my lesson and exclude people with overlapping interests (even though they actually might be the best critical reviewers for your work). Unfortunately you don't really have control if the journal still decides to pick those excluded reviewers. As a suggestion to people out there: make sure not to encrypt your comments as a PDF and password-protect them. That's how I found out the identity of the reviewer, as the file couldn't be changed by the journal. I agree though that it shouldn't happen and I hope it only happens in very few cases. Jürgen

On Apr 3, 2012, at 9:10 AM, Dyda wrote: I think the argument that this may give a competitive advantage to the referee who him or herself may be working on the same thing should be moot, as I thought article refereeing was supposed to be a confidential process. Breaching this would be a serious ethical violation. In my experience, before agreeing to review, we see the abstract, and I was always taught that I was supposed to decline if there is a potential conflict with my own work. Perhaps naively, but I always assumed that everyone acts like this.

.. Jürgen Bosch Johns Hopkins University Bloomberg
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Colleagues, One thing that would help in avoiding misappropriated priority of research results would be to join the math and physics community in their robust use of open-access preprints in arXiv. Such public preprints establish reliable timelines for research credit and help to ensure timely access to new results by the entire community. Fully peer-reviewed publications in real journals are still desirable, but to make this work, our journals would have to be willing to accept papers for which such a preprint system has been used. To understand the complexity of the issue, see http://nanoscale.blogspot.com/2008/01/arxiv-and-publishing.html I believe the IUCr is willing to accept papers that are posted on a preprint server (somebody correct me if I am wrong). It works for the math and physics community. Perhaps it would work for the crystallographic community.
Re: [ccp4bb] very informative - Trends in Data Fabrication
I agree with Herbert that a pre-print setup is one way to establish priority and get useful comments for an author. And I know this has been discussed before, but another way is to remove the anonymous aspect of the review, as this would achieve the same as the community pre-print distribution (at least in many ways). I would be happy to give my name when reviewing, as I feel it is my job to improve the paper, and I can still face my colleagues after the exercise. cheers, tom Tom Peat Biophysics Group CSIRO, CMSE 343 Royal Parade Parkville, VIC, 3052 +613 9662 7304 +614 57 539 419 tom.p...@csiro.au
Re: [ccp4bb] very informative - Trends in Data Fabrication
On the topic of MX fraud: could not an encryption algorithm be applied to answer the question of truth or falsity of a pdb/wwpdb/pdbe entry? Has anyone proposed such an idea before? For example (admittedly this is a mess):
* a detector parameter - perhaps the serial number - is used as a public key. the detector parameter is shared among beamlines/companies/*pdb. specifically, the experimenter requests it at beamtime.
* the experimenter voluntarily encrypts something small but essential to the deposition materials, like the R-free set indices (or please suggest something better), using GPLv3 programs and their private key. maybe a symmetric cipher would work better for this. or the Free R set indices are used to generate a key.
* at deposition time, the *pdb decrypts the relevant entry components using their private key related to the detector used. existing deposition methods pass or fail based on this (so maybe not the Free R set).
* why do this: at deposition time, *pdb will have a yes-or-no result from a single string of characters. can be a stop-gap measure until images can be archived easily. all elements of the chain are required to be free and unencumbered by proprietary interests. importantly, it is voluntary. this will prevent entries such as Schwarzenbacher or Ajees getting past deposition - so admittedly, not many.
references: http://en.wikipedia.org/wiki/RSA_(algorithm) http://en.wikipedia.org/wiki/Diffie-Hellman_key_exchange
-Bryan
Re: [ccp4bb] very informative - Trends in Data Fabrication
I'm not sure how encryption can solve a problem of truth or falsity. Public key encryption only says that a message that can be decrypted using the public key must have been encrypted by someone who knows the private key. A person can use their private key to encrypt a lie as well as the truth. I don't quite follow your prescription, but if you are saying that the beamline gives the depositor a code when they collect data that proves data were collected, how do the beamline personnel know the contents of the crystal? Couldn't one simply collect HEWL and then deposit any model they like? The beamline could encrypt all images with their private key, and the data integration program could decrypt the images using the public key. That way, when a depositor presents a set of images, it could be proved that those images came, unmodified, from that beamline. The images would still have to be deposited, however. (And this provides no protection against forgeries of home source data sets.) Dale Tronrud
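The private-key/public-key idea discussed here is usually realized as a digital signature over a hash of the file rather than by encrypting the whole image. A toy sketch in Python of that sign-and-verify step, using textbook RSA with no padding and a key built from two well-known Mersenne primes. The key, the function names and the choice of SHA-256 are illustrative assumptions only; a real beamline would use a vetted cryptographic library and a properly generated key:

```python
import hashlib

# Toy key built from two well-known Mersenne primes. Far too small and too
# structured for real use; chosen only so the example needs no extra library.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(image_bytes):
    """Hash the image and reduce the digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(image_bytes).digest(), "big") % N


def sign_image(image_bytes):
    """Beamline side: sign the image digest with the private exponent."""
    return pow(digest_as_int(image_bytes), D, N)


def verify_image(image_bytes, signature):
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(image_bytes)
```

A signature that verifies shows only that the bytes are the ones the key holder signed. It says nothing about what crystal was in the beam, and it does not cover home-source data.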
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear All, Here may be another example of the importance of image storage. http://www.jinkai.org/DERA/DERA_1O0Y_3R12.html Regards, Kevin
Re: [ccp4bb] very informative - Trends in Data Fabrication
On Tue, Apr 3, 2012 at 5:16 PM, Dale Tronrud det...@uoxray.uoregon.edu wrote: "I'm not sure how encryption can solve a problem of truth or falsity."

AFAIU any given checksum will tell you if a file is corrupted or not. My brain decided to interpret that as true or false.

"A person can use their private key to encrypt a lie as well as the truth. [...] I don't quite follow your prescription, ..."

I admitted it is a mess - and sorry to mix up the various algorithms. Also I must emphasize I do not have a clear picture of how encryption would work here. Can I step back - it *seems* that the following facts point to a checksum of sorts for a *pdb entry:
* random number generator seed
* randomly chosen Free R set
* integer indices of the Free R set
* detector things - serial number, or fingerprint of sorts - known to *pdb only.

By "checksum of sorts for a *pdb entry" I mean an easy way to verify that all parts of the entry originated with diffraction images. "Detector things" indicates that I am wondering if something besides an SN on a detector would be useful. So a scenario that comes to mind is: the deposition team runs the checksum (or whatever) and gets the Free R set (for example). They run the battery of tests. They find that refinement is a disaster. They go check the detector specs they have, etc., etc., and conclude there were no images used.

"The beamline could encrypt all images with their private key, and [...] it could be proved that those images came, unmodified, from that beamline."

Would encryption of images significantly increase the integration time? Also, I am not following the image deposition forum elsewhere. Anyway, this sounds like it was just an exercise. Thanks anyway. Regards, -Bryan
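The "checksum of sorts" can be made concrete for one ingredient, the Free R set. A minimal sketch in Python, assuming the set is drawn by a seeded random number generator over reflection indices 0..n-1 and then condensed into a single string. The function names, the 5% fraction and the use of SHA-256 are illustrative assumptions, not a description of how any refinement program selects or records its test set:

```python
import hashlib
import random


def pick_free_set(n_reflections, seed, fraction=0.05):
    """Choose a reproducible test set of reflection indices from a seed."""
    rng = random.Random(seed)
    size = max(1, int(n_reflections * fraction))
    return sorted(rng.sample(range(n_reflections), size))


def fingerprint(indices):
    """Condense the chosen indices into one short hexadecimal string."""
    text = ",".join(str(i) for i in indices)
    return hashlib.sha256(text.encode("ascii")).hexdigest()


def matches_deposit(n_reflections, seed, deposited_fingerprint):
    """Regenerate the set from the seed and compare fingerprints."""
    regenerated = fingerprint(pick_free_set(n_reflections, seed))
    return regenerated == deposited_fingerprint
```

Such a check gives the yes-or-no answer from a single string, but it shows only that a deposited set is consistent with a seed. Anyone who knows the recipe can compute the same fingerprint without ever collecting an image, so the seed or some other ingredient would have to be known to the *pdb only.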
Re: [ccp4bb] very informative - Trends in Data Fabrication
Orcus, if you put yourself persistently into the face of guys who play hard, you need to learn to take a few hits and shake it off. Maybe a little retrospection on why your postings might perhaps possibly maybe be perceived as somewhat self-promoting and ungracious could be helpful. The skill of presentation is at least as important in Science as being right. Best, BR
Re: [ccp4bb] very informative - Trends in Data Fabrication
Thanks for your education. I got it. By the way, what does Orcus mean here? Regards, Kevin -- Kevin Jin Sharing knowledge each other is always very joyful.. Website: http://www.jinkai.org/
Re: [ccp4bb] very informative - Trends in Data Fabrication
Trollus maximus perhaps? But it could have different meanings, e.g. in German something is going south if it went down the orcus :-) Don't worry too much and relax. Jürgen .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
Re: [ccp4bb] very informative - Trends in Data Fabrication
Might I suggest looking to Sean Seaver and P212121.com as an example of a successful crystallographer science blogger, though the site has shifted more towards a consumable supplier in recent years. I would also consider looking into adding an RSS feed to your site so that those people interested in your articles can be informed without spamming the boards. The gods of the interwebz have blessed us with the gift of RSS so that we may be made aware of when someone might be yelling something potentially interesting into the void (that and to bring us silly pictures of cats covered in phonetically spelled captions when we have a failed experiment). It is my hope that this will not discourage you from taking every opportunity to improve your writing skills but help you find a more appropriate means of disseminating your product. Cheers, Katherine
Re: [ccp4bb] very informative - Trends in Data Fabrication
My intent with the troll joke was to give a humorous reminder that a little self promotion is ok, but a couple times a day is annoying. Orcus means troll, as in Internet troll, meaning one who subverts the intended use of the site and is annoying people. You have made a number of on topic posts that were very nice, but also a number that were clearly off topic and viewed as self promotion, with links to your consulting service. A couple times a day is a bit much. No one wants to be rude, so we try to humor you into toning it down a little. Compared to many Internet forums, this is likely one of the nicer responses you could expect. all the best, Kendall
Re: [ccp4bb] very informative - Trends in Data Fabrication
Hi, Regarding the online image file storage issue, I just googled cloud storage and had a look at the current pricing of such services. To my surprise, some companies are offering unlimited storage for as low as $5 a month. So that's $600 for 10 years. I am afraid that these companies will feel really sorry to learn that there are some monsters called crystallographers living on our planet. In our lab, some pre-21st century data sets were stored on tapes, newer ones on DVD discs and IDE hard drives. All these media have become or will become obsolete pretty soon. Not to mention that CRC errors become more likely as the medium ages. Admittedly, it may become quite a job to upload all image files that the whole crystallographic community generates per year. But for individual labs, I think storing data in the cloud might become something worth thinking of. Zhijie
Re: [ccp4bb] very informative - Trends in Data Fabrication
On Apr 3, 2012, at 7:19 PM, Katherine Sippel wrote: I would also consider looking into adding an RSS feed to your site so that those people interested in your articles can be informed without spamming the boards. Why continue to punish him? Adding an RSS feed means installing and configuring an RSS server. Aren't there rules against cruel and inhumane punishment? There are many free newsfeed disseminators. Twitter is the most famous. There are others, maybe better, so I'm not being a twittervangelist here. My point is this: free and easy is better than difficult. James
Re: [ccp4bb] very informative - Trends in Data Fabrication
James makes an important point. I've come to regret my joke as showing poor manners. I hesitate to add more email that no one cares about, but I do think it is important to contribute the idea that the positive tone of this forum needs to be protected. I apologize, and suggest my comments should have been offered directly and off-line in order to be constructive and not off-putting to others who would want to contribute or ask questions. Kendall
Re: [ccp4bb] very informative - Trends in Data Fabrication
The sad situation is that more and more scientists are becoming desperate (for funding or tenure or both) and are told 'publish or perish'; they become obsessed with impact factors, sensationalise the data in the process (be it complete fabrication or 'massaging' the results) and rush to publish to be the 'first' to do so. This was recently highlighted in the following article: http://www.nature.com/nature/journal/v483/n7391/full/483531a.html I personally think that whole review process should be open and transparent where the coordinates are available for everyone to see (after deposition and with authors' consent) along with the names and comments of the reviewers. If sloppy mistakes are made (deliberately or otherwise), they will be picked up by the wider scientific community if not the reviewers. Regards Ravi On 02/04/2012 19:00, Maria Sola i Vilarrubias wrote: Dear Phoebe, I cannot imagine myself delivering maps and coordinates (after years of work... I insist: after years of work) to a reviewer that could be, for whatever chance, my best competitor (even if I suggested to the editor not to include him/her as a reviewer... but decisions from editors are of all kind). I simply prefer not imagine this after two publications fuelled by clear, direct and strong competition. That was stressful enough, already. If I have to add to this stress the thought that my coordinates can go to the wrong hands, then I think I would just give up or, alternatively, send the work to a lower impact, fast-publishing journal and make my life easier while sending my scientific future to the low-impact bin, killing future opportunities. Competition is there. I see that data to be deposited is strictly confidential. I support the PDB to make the quality check work at the level you mention, but not a reviewer: People are nice but the world is big and competition is crazy... at least enough to make fraud or copy other's work. 
The latter is less difficult; by copying (simply copy and paste to my computer this nice structure that I was looking for!), there is no need to invent anything. About a wrongly fit compound, the reviewer can ask images about the model in a map calculated at a specific sigma and in different orientations. Maria On 2 April 2012 18:43, Phoebe Rice pr...@uchicago.edu wrote: Can we leverage this to push journals to routinely allow reviewers access coordinates and maps? Outright fraud is outrageous, but I'm actually more worried about ligands fit to marginal density and other issues of under-supervised model building. Phoebe A. Rice Dept. of Biochemistry Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Original message Date: Mon, 2 Apr 2012 08:41:02 -0700 From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com) Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication To: CCP4BB@JISCMAIL.AC.UK Robbie has restored the PDB_REDO of 3k78 It is at http://www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 and Louise Jones from the IUCr office has kindly made the article open access. http://journals.iucr.org/f/issues/2012/04/00/issconts.html BR From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) Sent: Sunday, April 01, 2012 06:06 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication Hofkristallrat außer Dienst, is written as Bernhard - unless you are referring to some other guy with a french name Bernard.
As one may extrapolate given my recent paper, I have been called names a lot worse. And the book indeed is a bible of xtallography. Enough of this - it is becoming embarrassing. I wish I had done a more careful job proofing, as over 500 errata attest to, and we all are only seeing further because we are standing on the shoulders of giants. So once again thanks to all the contributors I have pestered with my questions on BB and then some, and to all those who actually read BMC
Re: [ccp4bb] very informative - Trends in Data Fabrication
If journals would require that not only coordinates, but also structure factors be made publicly available immediately AFTER publication, any sloppy author would be caught within days by the Rupps, redo people and Bricognes. Anyone who would then still submit and publish questionable data has chosen the wrong metier and, as has been mentioned before, should probably look for a job in the financial sector. my 2 cents, Herman
Re: [ccp4bb] very informative - Trends in Data Fabrication
I think that to review a paper containing a structure derived from crystallographic data should indeed involve the referee having access to coordinates and to the electron density. Without this access it is not possible to judge the quality and very often even the soundness of statements in the paper. I think the argument that this may give a competitive advantage to the referee who him or herself may be working on the same thing should be moot, as I thought article refereeing was supposed to be a confidential process. Breaching this would be a serious ethical violation. In my experience, we see the abstract before agreeing to review, and I was always taught that I was supposed to decline if there is a potential conflict with my own work. Perhaps naively, but I always assumed that everyone acts like this. Unfortunately however, there is another serious issue. After a very troubling experience with a paper I reviewed, I discussed this issue with journal editors. What they said was that they already have a hell of a time finding people who agree to referee; by raising the task level (asking refs to look at coords and density) they feared that no one would agree. Actually, perhaps many have noticed the large number of 5 liner referee reports saying really not much about a full length research article. People simply don't have the time to put the effort in. So I am not sure how realistic it is to ask even more, for something that at some level is pro bono work. Fred Fred Dyda, Ph.D. Phone: 301-402-4496 Laboratory of Molecular Biology Fax: 301-496-0201 DHHS/NIH/NIDDK e-mail: fred.d...@nih.gov Bldg. 5. Room 303 Bethesda, MD 20892-0560 URGENT message e-mail: 2022476...@mms.att.net Google maps coords: 39.000597, -77.102102 http://www2.niddk.nih.gov/NIDDKLabs/IntramuralFaculty/DydaFred
Re: [ccp4bb] very informative - Trends in Data Fabrication
Hi, I was thinking about the last statement in the Acta editorial: "It is important to note, however, that in neither of these cases was a single frame of data collected. Not one." This brought me back to the images. To date there is no global acceptance that original diffraction images must be deposited (though I personally think there should be). Many of the arguments around this issue relate to the time and space required to house such data. However (and apologies if this has already been raised and I have missed it), if our sole intent is to ascertain that there's no trouble at t'mill then deposition of a modest wedge of data and / or a 0 and 90, while not ideal, may be sufficient to provide a decent additional check and balance, particularly if such images, headers etc were automatically analysed as part of the already excellent validation tools in development. I'm sure there are a number of clever ways (that could be unadvertised or kept confidential to the pdb) that could be used to check off sufficient variables within such data such that it should (?) be very difficult to falsify images without triggering alarm bells. Of course this would probably then drive those that are truly bonkers to attempt to fabricate realistically noisy false diffraction images, however I would hope that such a scheme might make things just a little more difficult for those with fraudulent intent, particularly if no one (apart from the developers) knows precisely how and what the checking software checks! While it seems sad that it's come to this, cell biologists and biochemists have had to deal with more and more sophisticated versions of the photoshopped western for years. Accordingly, most high profile journals run figures through commercial software that does a reasonable job of detection of such issues.
J
Re: [ccp4bb] very informative - Trends in Data Fabrication
Hi Fred, I'll go public on this one. This happened to me. I will not reveal who reviewed my paper and which paper it was only that your naive assumption might not always be correct. I have learned my lesson and exclude people with overlapping interests (even though they actually might be the best critical reviewers for your work). Unfortunately you don't really have control if the journal still decides to pick those excluded reviewers. As a suggestion to people out there, make sure to not encrypt your comments as pdf and PW protect them - that's how I found out about the identity of the reviewer - as it couldn't be changed by the journal. I agree though that it shouldn't happen and I hope it only happens in very few cases. Jürgen On Apr 3, 2012, at 9:10 AM, Dyda wrote: I think the argument that this may give a competitive advantage to the referee who him or herself maybe working on the same thing should be mute, as I thought article refereeing was supposed to be a confidential process. Breaching this would be a serious ethical violation. In my experience, before agreeing to review, we see the abstract, I was always thought that I was supposed to decline if there is a potential conflict with my own work. Perhaps naively, but I always assumed that everyone acts like this. .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
Re: [ccp4bb] very informative - Trends in Data Fabrication
The remedy for the fact that some reviewers act unethically is not withholding coordinates and structure factors, but a more active role for the authors to denounce these possible violations and more effective investigations by the journals whose reviewers are suspected by the authors of committing these violations. I have witnessed authors being hesitant to complain about possible violations and journals not always taking complaints seriously enough. Mark J van Raaij Laboratorio M-4 Dpto de Estructura de Macromoleculas Centro Nacional de Biotecnologia - CSIC c/Darwin 3 E-28049 Madrid, Spain tel. (+34) 91 585 4616 http://www.cnb.csic.es/~mjvanraaij
Re: [ccp4bb] very informative - Trends in Data Fabrication
Mark, I know some stories (which of course I'll not post here) from the Crystallography field and from other fields where reviewers profit from the fact that suddenly they have new, interpreted data which fits very well with their own results. Stories like blocking a manuscript or asking for more results so that the reviewer is able to submit his or her own paper (with new ideas) in time, or copying a structure from the figures, or asking for experiments that only the reviewer can do so he/she is included in the paper, or submitting as fast as possible to another journal with an extremely short delay of acceptance (e.g. 10 days, without revision?, talking to the editorial board?), things like this. Well, it is not a question of making a full list here! The whole problem comes from publishing first, from competition. The hope with fraud with X-ray data is that it seems to be detectable, thanks to valuable people that develop methods to detect it. But it is very difficult to demonstrate that your work, ideas or results have been copied. How do you defend against this? And how, after giving them the valuable PDB? Finally, how many crystallographers are in the world? 5000? The concept of ethics can change from one place to another and, more than this, there is the fact that the reviewer is anonymous. I try to respond to my reviewers the best I can and I really trust their criteria, sometimes a bit too much, indeed. I think they all have done a very nice job. But some of the stories from above happened to me or close to me and I feel really insecure with the idea of sending a manuscript, the X-ray data and the PDB, altogether, to a reviewer shielded by anonymity. It's too risky: with an easy molecular replacement someone can solve a difficult structure and publish it first. And then the only thing left to the bad reviewer is to change the author's list! (and for the true author what is left is to feel like an idiot). In my humble opinion, we must be strict but not kill ourselves. Trust authors as we trust reviewers. Otherwise, the whole effort might be useless. Maria Dep. Structural Biology IBMB-CSIC Baldiri Reixach 10-12 08028 BARCELONA Spain Tel: (+34) 93 403 4950 Fax: (+34) 93 403 4979 e-mail: maria.s...@ibmb.csic.es
Re: [ccp4bb] very informative - Trends in Data Fabrication
I don't agree: if we know a referee is dishonest we should try and ruin his whole career, not just prevent him from scooping us in this one case. Mark J van Raaij Laboratorio M-4 Dpto de Estructura de Macromoleculas Centro Nacional de Biotecnologia - CSIC c/Darwin 3 E-28049 Madrid, Spain tel. (+34) 91 585 4616 http://www.cnb.csic.es/~mjvanraaij On 3 Apr 2012, at 19:13, Maria Sola i Vilarrubias wrote: Mark, I know some stories (which of course I'll not post here) from the crystallography field and from other fields where reviewers profit from the fact that suddenly they have new, interpreted data which fit very well with their own results. Stories such as blocking a manuscript or asking for more results so that the reviewer can submit his or her own paper (with new ideas) in time, or copying a structure from the figures, or asking for experiments that only the reviewer can do so that he/she is included in the paper, or submitting as fast as possible to another journal with an extremely short delay of acceptance (e.g. 10 days, without revision? talking to the editorial board?), things like this. Well, it is not a question of making a full list here! The whole problem comes from publishing first, from competition. The hope with fraud in X-ray data is that it seems to be detectable, thanks to valuable people who develop methods to detect it. But it is very difficult to demonstrate that your work, ideas or results have been copied. How do you defend yourself against this? And how, after giving them the valuable PDB? Finally, how many crystallographers are there in the world? 5000? The concept of ethics can change from one place to another and, more than this, there is the fact that the reviewer is anonymous. I try to respond to my reviewers the best I can and I really trust their criteria, sometimes a bit too much, indeed. I think they all have done a very nice job. But some of the stories above happened to me or close to me, and I feel really insecure with the idea of sending a manuscript, the X-ray data and the PDB, all together, to a reviewer shielded by anonymity. It's too risky: with an easy molecular replacement someone can solve a difficult structure and publish it first. And then the only thing left to the bad reviewer is to change the author list! (And for the true author what is left is to feel like an idiot.) In my humble opinion, we must be strict but not kill ourselves. Trust authors as we trust reviewers. Otherwise, the whole effort might be useless. Maria Dep. Structural Biology IBMB-CSIC Baldiri Reixach 10-12 08028 BARCELONA Spain Tel: (+34) 93 403 4950 Fax: (+34) 93 403 4979 e-mail: maria.s...@ibmb.csic.es
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Dear all, I find this discussion most amazing. Here, we are dealing with the most serious issue that has happened to Macromolecular Crystallography since the Alabama case, and the whole discussion is centered around singular and plural and Greek and Latin words and what not. In psychology such a phenomenon is referred to as displacement activity. If you are interested, here is the Macmillan definition of it: http://www.macmillandictionary.com/dictionary/british/displacement-activity Cheers, Manfred On 01.04.2012 19:35, Gerard Bricogne wrote: On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote: On 04/01/12 10:18, Gerard Bricogne wrote: Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing "data" treated as singular when it is plural. When it are plural? Good nit-picking :-) . In my mind the quotes around "data" would have had the same effect as writing 'the word data', and referring to that word by the 'it'. So there is only one word, while its grammatical number is plural. At any rate, I heard a Nobel laureate use it incorrectly just two days ago. We shouldn't learn to write by imitating Nobel laureates, then. With best wishes, Gerard. -- === All Things Serve the Beam === David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu -- Dr. Manfred S. Weiss Helmholtz-Zentrum Berlin für Materialien und Energie Macromolecular Crystallography (HZB-MX) Albert-Einstein-Str. 15 D-12489 Berlin GERMANY Fon: +49-30-806213149 Fax: +49-30-806214975 Web: http://www.helmholtz-berlin.de/bessy-mx Email: mswe...@helmholtz-berlin.de
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
To my mind it just points to the fact that many scientists are generally unable to focus on one task or 'thing' at a time, i.e. very short attention spans... [before the flamers start: this is meant as a joke] Tony. --- Dr Antony W Oliver Senior Research Fellow CR-UK DNA Repair Enzymes Group Genome Damage and Stability Centre Science Park Road University of Sussex Falmer, Brighton, BN1 9RQ email: antony.oli...@sussex.ac.uk tel (office): +44 (0)1273 678349 tel (lab): +44 (0)1273 677512
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Dear Colleagues, This is a further instance of likely scientific fraud in macromolecular crystallography, i.e. one under formal investigation at the relevant university. Both Bernhard and the Acta D and F Editors further document, in their written pieces, aspects related to the need for the availability of diffraction data images. The call for a 'universal system' by the Editors, in their Editorial, is also what the IUCr Forum on these matters has been discussing. A possible convergence on local raw data repositories, with each data set doi-registered where it underpins a publication, as detailed by the IUCr DDD WG thus far, is unlikely to be 'universal' in its global coverage. But setting standards by encouraging raw data archives in our field will afford a much needed clarity in favour of retaining raw data wherever possible. A separate issue will be, in my view, the certain expansion of current validation checks. Indeed it is the standard practice in chemical crystallography submissions to IUCr journals for Co-Editors to validate the structure determination and refinement, including omit map calculations where appropriate. Of course this is most often a much easier task in chemical crystallography, per crystal structure checked, than would be the case for macromolecular crystallography. Again I encourage colleagues to lodge their inputs at the IUCr Forum on any aspect of principle or practice in achieving diffraction raw data archiving. Best wishes, John -- Professor John R Helliwell DSc
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Dear Manfred, I understand your surprise and indignation, but for the sake of fairness you might also remember that I argued rather insistently at the end of last year in favour of the deposition of raw diffraction images, which is the crux of this problem. With best wishes, Gerard. -- Gerard Bricogne g...@globalphasing.com Global Phasing Ltd. Sheraton House, Castle Park, Cambridge CB3 0AX, UK Tel: +44-(0)1223-353033 Fax: +44-(0)1223-366889
Re: [ccp4bb] very informative - Trends in Data Fabrication
For the latest documentary on trolls in Norway see http://www.imdb.com/title/tt1740707/ The documentary describes both the classification system of Norwegian Trolls and why they are sensitive to sunlight, i.e. turn to stone. Depending on the species, some Trolls apparently prefer bridges and others caves. They are all attracted to Christian blood though. cheers Preben On 4/1/12 10:42 PM, Ethan Merritt wrote: On Sunday, 01 April 2012, Kendall Nettles wrote: What is the single Latin word for troll? Kendall According to Google Translate, it is Troglodytarum. But I'm dubious. I thought trolls lived under bridges rather than in caves. Except for the ones who inhabit the internet, of course. Ethan -- J. Preben Morth, Ph.D Group Leader Membrane Transport Group Nordic EMBL Partnership Centre for Molecular Medicine Norway (NCMM) University of Oslo P.O.Box 1137 Blindern 0318 Oslo, Norway Email: j.p.mo...@ncmm.uio.no Tel: +47 2284 0794 http://www.jpmorth.dk
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Kevin et al., At the risk of being flamed as well, I could not resist this opportunity for shameless self-promotion. During my Ph.D. I worked on a flavoprotein as well and found flavin bending angles of 10 and 19°. I even published pictures of the electron density of the flavin (J.Mol.Biol.(1989), 208:679-696) and cited a reference from 1987 reporting a flavin bending angle of 20° for another flavoprotein. At that time, one had to personally modify the PROLSQ restraints by hand, since if something was defined as being flat, it would become flat, no matter what the electron density was trying to say. "Trying", since there was no Rfree, no maximum likelihood refinement and no CCP4BB, so the maps were heavily biased. Although this period is commonly referred to as the stone age of protein crystallography, many crystal structures were solved in this time that are still valid today. Before reinventing wheels, one could look a little further back in the literature than the last 7 years. The question remains how this incorrect FMN definition could stay in the CCP4 package for so long. We need more people like Kevin, who loudly complain about errors in the CCP4 definitions instead of just fixing their personal definition. Cheers! Herman From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Kevin Jin Sent: Sunday, April 01, 2012 9:06 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication I hope and believe that this is not the case. Even basically-trained crystallographers should be able to calculate and interpret difference maps of the kind described by Bernhard. And with the EDS and PDB_REDO server, one does not even need to know how to generate a difference map... You are right! Actually, I am not an experienced protein crystallographer. I have learnt a lot from CCP4BB. I may have paid too much attention to bond angles and bond lengths, as in small-molecule work. This may be an example to share with you.
When I worked on those nitroreductases complexed with FMN in 2009 (?), I always observed that the flavin ring presented a strange geometry after refinement. Indeed, I had used the definition of FMN from the CCP4 library all the time. In some cases, the methyl group at position 7a or 8a was bent off the aromatic ring if the rest of the flavin was restrained to a flat plane. According to my limited knowledge of organic chemistry, carbons 7 and 8 on the flavin ring are sp2 hybridized and coplanar. How could those methyl groups be bent as in sp3 hybridization? Is there any chemistry behind it? With increased resolution (1.6 ~ 1.8 Ang), I observed that the electron density map was bent along the N5-N10 axis. The bend angle was around 16 degrees. Again, I asked myself why it was bent. Could this be correct? According to my limited knowledge of chemistry, N10 should be in an sp3 configuration even if FMN is in its oxidized form, in which case the flavin ring should be bent. A quick Google search immediately gave me a link to a very nice paper published by David W. Rodgers in 2002. http://www.jbc.org/content/277/13/11513.full.pdf+html According to this paper, yes! "In the oxidized enzyme, the flavin ring system adopts a strongly bent (16°) conformation, and the bend increases (25°) in the reduced form of the enzyme,..." When I reported this in the group meeting, I was laughed at and told that this is just model bias. It was over-interpreted. Nobody has such sharp vision on an electron density map. If this was correct, why could nobody find this and report it to CCP4 within the last 7 years? Eventually, a senior team member emailed CCP4 about this issue. Since then, the definition of FMN has been updated, according to my suggestion. I was asked "how did you find it?"... "why did you believe you were so right?" I really don't know how to answer.
Je pense donc je suis Kevin On Sun, Apr 1, 2012 at 8:09 AM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote: On 31/03/12 23:08, Kevin Jin wrote: I really wish PDB could have some people to review those important structures, like paper reviewers. So do the wwPDB, I would imagine. But they can't just magic funding and positions into existence... If the coordinates are downloaded for modeling and docking, people may not check the density and model by themselves. However this is not the worst case, since the original data was fabricated. 1. All of data was correct and real
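For readers who want to check a flavin bend like the one discussed in the message above (a fold along the N5-N10 axis), here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used in any of the cited papers: the function name butterfly_bend_deg is hypothetical, the caller chooses which ring atoms go into side_a and side_b, and each half of the ring system is represented only by the centroid of those atoms rather than by a least-squares plane. The bend is taken as 180° minus the opening angle between the two halves, measured around the N5-N10 axis, so a flat ring gives 0°.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def butterfly_bend_deg(n5, n10, side_a, side_b):
    """Estimate the fold of a tricyclic ring about the N5-N10 axis, in degrees.

    n5, n10  -- (x, y, z) coordinates of the two axis atoms
    side_a   -- coordinates of atoms on one side of the axis (caller's choice)
    side_b   -- coordinates of atoms on the other side (caller's choice)
    Assumption: each side is summarised by the centroid of its atoms.
    """
    axis = _sub(n10, n5)
    axis_len2 = _dot(axis, axis)

    def perpendicular(points):
        # Component of (centroid - N5) that is perpendicular to the axis.
        v = _sub(_centroid(points), n5)
        t = _dot(v, axis) / axis_len2
        return _sub(v, tuple(t * x for x in axis))

    pa = perpendicular(side_a)
    pb = perpendicular(side_b)
    cos_open = _dot(pa, pb) / math.sqrt(_dot(pa, pa) * _dot(pb, pb))
    cos_open = max(-1.0, min(1.0, cos_open))  # guard against rounding
    return 180.0 - math.degrees(math.acos(cos_open))


# Synthetic check: axis along y, one side flat, the other folded up by 16 degrees.
fold = math.radians(16.0)
bend = butterfly_bend_deg(
    (0.0, 0.0, 0.0),
    (0.0, 2.0, 0.0),
    [(-1.0, 0.0, 0.0), (-1.0, 2.0, 0.0)],
    [(math.cos(fold), 0.0, math.sin(fold)), (math.cos(fold), 2.0, math.sin(fold))],
)
print(round(bend, 1))  # prints 16.0
```

Because the two sides are reduced to centroids, the value will differ somewhat from an angle between fitted planes; it is meant only as a quick sanity check on refined coordinates before looking at the density itself.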
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
OK, following on our psychological displacement: The examples Phoebe gave are mostly of collective nouns http://en.wikipedia.org/wiki/Collective_noun to be distinguished from mass nouns: http://en.wikipedia.org/wiki/Mass_noun Strictly speaking, "data" is not a collective noun and is the plural of "datum". Use of the singular form is accepted nowadays but it doesn't mean that it's correct. To quote Merriam-Webster: "...Data leads its own life independent of datum..." See: http://www.merriam-webster.com/dictionary/data And by the way, what do you answer to "how much data did you collect?" A lot? Just a little? Had we asked: how complete is your data? How many frames did you collect? How many data sets? Wouldn't we have got a much more informative answer? Boaz Most crystallographers use the word "data" as a mass noun - that is, the syntax of "data" follows that of "gravel" or "mud", not that of "pebble/pebbles". People who pounce on the phrase "data is" routinely say "data collection" and "data processing". But note that the proper way to construct compound nouns such as those is to use the singular form - one would never say "rocks collection" or "apples picking". So if we have to say "data are" then we should be discussing how (not) to fabricate a datum set. Also note that when people come back from the synchrotron, we ask "how much data did you collect", not "how many". "Much" is generally used with mass nouns. That doesn't mean we can't ALSO use the word as one with discrete singular and plural forms, especially when we have a few, individual observations rather than a huge pile that blurs into an aggregate. In that case, I see nothing incorrect about discussing an individual datum and using "data" as the plural form. Sometimes it is the artificial, over-simplified rule that is stupid, not the native speakers of a language. Phoebe A. Rice Dept. of Biochemistry & Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University of the Negev Beer-Sheva 84105 Israel E-mail: bshaa...@bgu.ac.il Phone: 972-8-647-2220 Skype: boaz.shaanan Fax: 972-8-647-2992 or 972-8-646-1710
Re: [ccp4bb] very informative - Trends in Data Fabrication
I am surprised that James Holton was not listed as a co-author, I understand that he has been expending a great deal of effort into how to accurately fabricate data. -- === All Things Serve the Beam === David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu
Re: [ccp4bb] very informative - Trends in Data Fabrication
If James Holton had been involved, the fabrication would not have been discovered. Herman
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Dear Manfred, Outside Germany, such excursions are called humour. If you are interested, here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour --Gerard PS: It was on a Sunday so all levity was perpetrated in people's own time. Today we'll all be serious again and frown and tut-tut appropriately. Best wishes, --Gerard ** Gerard J. Kleywegt http://xray.bmc.uu.se/gerard mailto:ger...@xray.bmc.uu.se ** The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental. ** Little known gastromathematical curiosity: let z be the radius and a the thickness of a pizza. Then the volume of that pizza is equal to pi*z*z*a ! **
Re: [ccp4bb] very informative - Trends in Data Fabrication
I thought Ethan was looking for the verb -- you know, fishing!!! -- Robert M. Sweet, Group Leader, PXRR: Macromolecular Crystallography Research Resource at NSLS http://px.nsls.bnl.gov/ Biology Dept, Brookhaven Nat'l Lab., Upton, NY 11973, U.S.A. E-Dress: sw...@bnl.gov (that's L, not 1) Phones: 631 344 3401 (Office), 631 344 2741 (Facsimile)
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Guys, http://www.youtube.com/watch?v=CobZuaPMQHw second 9 in this 22 sec video
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
I'm now preparing for the flood of 'unsubscribe ccp4bb' requests.
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Dear Gerard, inside Germany it's apparently called German Humour. There's a Wikipedia entry for that as well. Go figure: http://en.wikipedia.org/wiki/German_humor Andreas (still living on Sunday time)
Re: [ccp4bb] very informative - Trends in Data Fabrication
Robbie has restored the PDB_REDO of 3k78. It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 and Louise Jones from the IUCr office has kindly made the article open access. http://journals.iucr.org/f/issues/2012/04/00/issconts.html BR From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) Sent: Sunday, April 01, 2012 06:06 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication Hofkristallrat außer Dienst, is written as Bernhard - unless you are referring to some other guy with a French name Bernard. As one may extrapolate given my recent paper, I have been called names a lot worse. > And the book indeed is a bible of xtallography. Enough of this - it is becoming embarrassing. I wish I had done a more careful job proofing, as over 500 errata attest to, and we all are only seeing further because we are standing on the shoulders of giants. So once again thanks to all the contributors I have pestered with my questions on BB and then some, and to all those who actually read BMC and submitted errata. Best regards, BR - Bernhard Hieronimus Rupp, Hofkristallrat a.D. 001 (925) 209-7429 +43 (676) 571-0536 hofkristall...@gmail.com b...@hofkristallamt.org http://www.ruppweb.org/ -- Once the sun of science is standing low, even dwarfs cast tall shadows --
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Dear Andreas, That page confirms the old adage: German humour is no laughing matter. --Gerard On Mon, 2 Apr 2012, Andreas Förster wrote: Dear Gerard, inside Germany it's apparently called German Humour. There's a Wikipedia entry for that as well. Go figure: http://en.wikipedia.org/wiki/German_humor Andreas (still living on Sunday time) On 02/04/2012 4:03, Gerard DVD Kleywegt wrote: Dear Manfred, Outside Germany, such excursions are called humour. If you are interested, here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour --Gerard PS: It was on a Sunday so all levity was perpetrated in people's own time. Today we'll all be serious again and frown and tut-tut appropriately. Best wishes, --Gerard ** Gerard J. Kleywegt http://xray.bmc.uu.se/gerard mailto:ger...@xray.bmc.uu.se ** The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental. ** Little known gastromathematical curiosity: let z be the radius and a the thickness of a pizza. Then the volume of that pizza is equal to pi*z*z*a ! **
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
And please consider the date of Sunday's posts. We take this stuff seriously. That's what's nice about science. We ferret out mischief and bring it to the public. Nothing up my sleeve - all tricks will be exposed and dealt with harshly. A Buffalo view. Sent via BlackBerry by ATT -Original Message- From: Gerard DVD Kleywegt ger...@xray.bmc.uu.se Sender: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK Date: Mon, 2 Apr 2012 17:03:42 To: CCP4BB@JISCMAIL.AC.UK Reply-To: Gerard DVD Kleywegt ger...@xray.bmc.uu.se Subject: Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication] Dear Manfred, Outside Germany, such excursions are called humour. If you are interested, here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour --Gerard PS: It was on a Sunday so all levity was perpetrated in people's own time. Today we'll all be serious again and frown and tut-tut appropriately. On Mon, 2 Apr 2012, Manfred S. Weiss wrote: Dear all, I find this discussion most amazing. Here, we are dealing with the most serious issue that happened to Macromolecular Crystallography since the Alabama case, and the whole discussion is centered around singular and plural and Greek and Latin words and what not. In psychology such a phenomenon is referred to as displacement activity. If you are interested, here is the Macmillan definition of it: http://www.macmillandictionary.com/dictionary/british/displacement-activity Cheers, Manfred On 01.04.2012 19:35, Gerard Bricogne wrote: On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote: On 04/01/12 10:18, Gerard Bricogne wrote: Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. When it are plural? Good nit-picking :-) . In my mind the quotes around data would have had the same effect as writing 'the word data', and referring to that word by the 'it'. 
So there is only one word, while its grammatical number is plural. At any rate, I heard a Nobel laureate use it incorrectly just two days ago. We shouldn't learn to write by imitating Nobel laureates, then. With best wishes, Gerard. -- === All Things Serve the Beam === David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu -- Dr. Manfred. S. Weiss Helmholtz-Zentrum Berlin für Materialien und Energie Macromolecular Crystallography (HZB-MX) Albert-Einstein-Str. 15 D-12489 Berlin GERMANY Fon: +49-30-806213149 Fax: +49-30-806214975 Web: http://www.helmholtz-berlin.de/bessy-mx Email: mswe...@helmholtz-berlin.de Helmholtz-Zentrum Berlin für Materialien und Energie GmbH Mitglied der Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V. Aufsichtsrat: Vorsitzender Prof. Dr. Dr. h.c. mult. Joachim Treusch, stv. Vorsitzende Dr. Beatrix Vierkorn-Rudolph Geschäftsführerin: Prof. Dr. Anke Rita Kaysser-Pyzalla Sitz Berlin, AG Charlottenburg, 89 HRB 5583 Postadresse: Hahn-Meitner-Platz 1 D-14109 Berlin http://www.helmholtz-berlin.de Best wishes, --Gerard ** Gerard J. Kleywegt http://xray.bmc.uu.se/gerard mailto:ger...@xray.bmc.uu.se ** The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental. ** Little known gastromathematical curiosity: let z be the radius and a the thickness of a pizza. Then the volume of that pizza is equal to pi*z*z*a ! **
Re: [ccp4bb] very informative - Trends in Data Fabrication
Can we leverage this to push journals to routinely allow reviewers access to coordinates and maps? Outright fraud is outrageous, but I'm actually more worried about ligands fit to marginal density and other issues of under-supervised model building. = Phoebe A. Rice Dept. of Biochemistry Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Original message Date: Mon, 2 Apr 2012 08:41:02 -0700 From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com) Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication To: CCP4BB@JISCMAIL.AC.UK Robbie has restored the PDB_REDO of 3k78. It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 and Louise Jones from the IUCr office has kindly made the article open access. http://journals.iucr.org/f/issues/2012/04/00/issconts.html BR From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) Sent: Sunday, April 01, 2012 06:06 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication Hofkristallrat außer Dienst, is written as Bernhard - unless you are referring to some other guy with a French name Bernard. As one may extrapolate given my recent paper, I have been called names a lot worse. And the book indeed is a bible of xtallography. Enough of this - it is becoming embarrassing. I wish I had done a more careful job proofing, as over 500 errata attest to, and we all are only seeing further because we are standing on the shoulders of giants. So once again thanks to all the contributors I have pestered with my questions on BB and then some, and to all those who actually read BMC and submitted errata. Best regards, BR - Bernhard Hieronimus Rupp, Hofkristallrat a.D. 
001 (925) 209-7429 +43 (676) 571-0536 hofkristall...@gmail.com b...@hofkristallamt.org http://www.ruppweb.org/ -- Once the sun of science is standing low, even dwarfs cast tall shadows --
Re: [ccp4bb] very informative - Trends in Data Fabrication
I like your point--somehow we should enlist the evil inclination to power our science, a la Faust. How is it that those hackers are so innovative for so little reward? I remember a Smithsonian article years ago which quoted the calculated mean $/hr rate of money counterfeiters as being ~pennies/hr, and I assume hackers would fit right in there... JPK On Sun, Apr 1, 2012 at 11:45 PM, Artem Evdokimov artem.evdoki...@gmail.com wrote: I can't resist asking: If we assume that the data fabrication techniques and the techniques for discovery of such activities should have the same sort of arms race as the development of viruses and anti-malware software (but of course on a much more modest scale since structural biology is a relatively niche discipline) - can we then speculate further that eventually the most sophisticated fabrication techniques would be equivalent to de novo structure prediction :) It's really too bad that there's no real money in this (again, relatively speaking - not as much money as there is in software development), because if there was then the structural biology equivalent of 'virus hackers' would in reality approximate the same development trajectory as the most successful (and legitimate) protein modelers. Given the ingenuity of hackers and like-minded people in general, I sometimes wonder if this isn't a better way to develop structure prediction tools... Artem On Sun, Apr 1, 2012 at 10:09 AM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote: On 31/03/12 23:08, Kevin Jin wrote: I really wish PDB could have some people to review those important structures, like paper reviewer. So do the wwPDB, I would imagine. But they can't just magic funding and positions into existence... If the coordinate is downloaded for modeling and docking, people may not check the density and model by themselves. However this is not the worst case, since the original data was fabricated. 1. All of data was correct and real, Hmmm... 
It will be very difficult for people to check the density and coordinates if he/she is not a well-trained crystallographer. I hope and believe that this is not the case. Even basically-trained crystallographers should be able to calculate and interpret difference maps of the kind described by Bernhard. And with the EDS and PDB_REDO server, one does not even need to know how to generate a difference map... Paul. -- *** Jacob Pearson Keller Northwestern University Medical Scientist Training Program email: j-kell...@northwestern.edu ***
Re: [ccp4bb] very informative - Trends in Data Fabrication
I thought we had evidence for hackers doing this already. :-) http://www.nature.com/nature/journal/v477/n7365/full/477373e.html (no flames, please-'tis intended to be funny, not factual) Bryan From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Jacob Keller Sent: Monday, April 02, 2012 1:25 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication I like your point--somehow we should enlist the evil inclination to power our science, a la Faust. How is it that those hackers are so innovative for so little reward? I remember a Smithsonian article years ago which quoted the calculated mean $/hr rate of money counterfeiters as being ~pennies/hr, and I assume hackers would fit right in there... JPK On Sun, Apr 1, 2012 at 11:45 PM, Artem Evdokimov artem.evdoki...@gmail.com wrote: I can't resist asking: If we assume that the data fabrication techniques and the techniques for discovery of such activities should have the same sort of arms race as the development of viruses and anti-malware software (but of course on a much more modest scale since structural biology is a relatively niche discipline) - can we then speculate further that eventually the most sophisticated fabrication techniques would be equivalent to de novo structure prediction :) It's really too bad that there's no real money in this (again, relatively speaking - not as much money as there is in software development), because if there was then the structural biology equivalent of 'virus hackers' would in reality approximate the same development trajectory as the most successful (and legitimate) protein modelers. Given the ingenuity of hackers and like-minded people in general, I sometimes wonder if this isn't a better way to develop structure prediction tools... 
Artem On Sun, Apr 1, 2012 at 10:09 AM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote: On 31/03/12 23:08, Kevin Jin wrote: I really wish PDB could have some people to review those important structures, like paper reviewer. So do the wwPDB, I would imagine. But they can't just magic funding and positions into existence... If the coordinate is downloaded for modeling and docking, people may not check the density and model by themselves. However this is not the worst case, since the original data was fabricated. 1. All of data was correct and real, Hmmm... It will be very difficult for people to check the density and coordinates if he/she is not a well-trained crystallographer. I hope and believe that this is not the case. Even basically-trained crystallographers should be able to calculate and interpret difference maps of the kind described by Bernhard. And with the EDS and PDB_REDO server, one does not even need to know how to generate a difference map... Paul. -- *** Jacob Pearson Keller Northwestern University Medical Scientist Training Program email: j-kell...@northwestern.edu *** -- Confidentiality Notice: This message is private and may contain confidential and proprietary information. If you have received this message in error, please notify us and remove it from your system and note that you must not copy, distribute or take any action in reliance on it. Any unauthorized use or disclosure of the contents of this message is not permitted and may be unlawful.
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Hm, last I checked my passport said German - still think I can make lots of fun of myself. Some Germans are epigenetically marked with humor-suppressor genes others not. Jürgen On Apr 2, 2012, at 11:03 AM, Gerard DVD Kleywegt wrote: Dear Manfred, Outside Germany, such excursions are called humour. If you are interested, here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour --Gerard PS: It was on a Sunday so all levity was perpetrated in people's own time. Today we'll all be serious again and frown and tut-tut appropriately. On Mon, 2 Apr 2012, Manfred S. Weiss wrote: Dear all, I find this discussion most amazing. Here, we are dealing with the most serious issue that happened to Macromolecular Crystallography since the Alabama case, and the whole discussion is centered around singular and plural and Greek and Latin words and what not. In psychology such a phenomenon is referred to as displacement activity. If you are interested, here is the Macmillan definition of it: http://www.macmillandictionary.com/dictionary/british/displacement-activity Cheers, Manfred On 01.04.2012 19:35, Gerard Bricogne wrote: On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote: On 04/01/12 10:18, Gerard Bricogne wrote: Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. When it are plural? Good nit-picking :-) . In my mind the quotes around data would have had the same effect as writing 'the word data', and referring to that word by the 'it'. So there is only one word, while its grammatical number is plural. At any rate, I heard a Nobel laureate use it incorrectly just two days ago. We shouldn't learn to write by imitating Nobel laureates, then. With best wishes, Gerard. -- === All Things Serve the Beam === David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu -- Dr. Manfred. S. 
Weiss Helmholtz-Zentrum Berlin für Materialien und Energie Macromolecular Crystallography (HZB-MX) Albert-Einstein-Str. 15 D-12489 Berlin GERMANY Fon: +49-30-806213149 Fax: +49-30-806214975 Web: http://www.helmholtz-berlin.de/bessy-mx Email: mswe...@helmholtz-berlin.de Helmholtz-Zentrum Berlin für Materialien und Energie GmbH Mitglied der Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V. Aufsichtsrat: Vorsitzender Prof. Dr. Dr. h.c. mult. Joachim Treusch, stv. Vorsitzende Dr. Beatrix Vierkorn-Rudolph Geschäftsführerin: Prof. Dr. Anke Rita Kaysser-Pyzalla Sitz Berlin, AG Charlottenburg, 89 HRB 5583 Postadresse: Hahn-Meitner-Platz 1 D-14109 Berlin http://www.helmholtz-berlin.de Best wishes, --Gerard ** Gerard J. Kleywegt http://xray.bmc.uu.se/gerard mailto:ger...@xray.bmc.uu.se ** The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental. ** Little known gastromathematical curiosity: let z be the radius and a the thickness of a pizza. Then the volume of that pizza is equal to pi*z*z*a ! ** .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
My favorite part of the German humor link: Some German humorists such as Loriot (http://en.wikipedia.org/wiki/Vicco_von_B%C3%BClow) use seriousness as means of humor. On Apr 2, 2012, at 1:38 PM, Bosch, Juergen wrote: Hm, last I checked my passport said German - still think I can make lots of fun of myself. Some Germans are epigenetically marked with humor-suppressor genes others not. Jürgen On Apr 2, 2012, at 11:03 AM, Gerard DVD Kleywegt wrote: Dear Manfred, Outside Germany, such excursions are called humour. If you are interested, here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour --Gerard PS: It was on a Sunday so all levity was perpetrated in people's own time. Today we'll all be serious again and frown and tut-tut appropriately. On Mon, 2 Apr 2012, Manfred S. Weiss wrote: Dear all, I find this discussion most amazing. Here, we are dealing with the most serious issue that happened to Macromolecular Crystallography since the Alabama case, and the whole discussion is centered around singular and plural and Greek and Latin words and what not. In psychology such a phenomenon is referred to as displacement activity. If you are interested, here is the Macmillan definition of it: http://www.macmillandictionary.com/dictionary/british/displacement-activity Cheers, Manfred On 01.04.2012 19:35, Gerard Bricogne wrote: On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote: On 04/01/12 10:18, Gerard Bricogne wrote: Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. When it are plural? Good nit-picking :-) . In my mind the quotes around data would have had the same effect as writing 'the word data', and referring to that word by the 'it'. So there is only one word, while its grammatical number is plural. At any rate, I heard a Nobel laureate use it incorrectly just two days ago. We shouldn't learn to write by imitating Nobel laureates, then. 
With best wishes, Gerard. -- === All Things Serve the Beam === David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu -- Dr. Manfred. S. Weiss Helmholtz-Zentrum Berlin für Materialien und Energie Macromolecular Crystallography (HZB-MX) Albert-Einstein-Str. 15 D-12489 Berlin GERMANY Fon: +49-30-806213149 Fax: +49-30-806214975 Web: http://www.helmholtz-berlin.de/bessy-mx Email: mswe...@helmholtz-berlin.de Helmholtz-Zentrum Berlin für Materialien und Energie GmbH Mitglied der Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V. Aufsichtsrat: Vorsitzender Prof. Dr. Dr. h.c. mult. Joachim Treusch, stv. Vorsitzende Dr. Beatrix Vierkorn-Rudolph Geschäftsführerin: Prof. Dr. Anke Rita Kaysser-Pyzalla Sitz Berlin, AG Charlottenburg, 89 HRB 5583 Postadresse: Hahn-Meitner-Platz 1 D-14109 Berlin http://www.helmholtz-berlin.de/ Best wishes, --Gerard ** Gerard J. Kleywegt http://xray.bmc.uu.se/gerard mailto:ger...@xray.bmc.uu.se ** The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental. ** Little known gastromathematical curiosity: let z be the radius and a the thickness of a pizza. Then the volume of that pizza is equal to pi*z*z*a ! ** .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
And the summary indicates that outside Germany = English speaking world - which probably reveals its author as American ;-) On 04/02/12 18:25, Gerard DVD Kleywegt wrote: Dear Andreas, That page confirms the old adage: German humour is no laughing matter. --Gerard On Mon, 2 Apr 2012, Andreas Förster wrote: Dear Gerard, inside Germany it's apparently called German Humour. There's a Wikipedia entry for that as well. Go figure: http://en.wikipedia.org/wiki/German_humor Andreas (still living on Sunday time) On 02/04/2012 4:03, Gerard DVD Kleywegt wrote: Dear Manfred, Outside Germany, such excursions are called humour. If you are interested, here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour --Gerard PS: It was on a Sunday so all levity was perpetrated in people's own time. Today we'll all be serious again and frown and tut-tut appropriately. Best wishes, --Gerard ** Gerard J. Kleywegt http://xray.bmc.uu.se/gerard mailto:ger...@xray.bmc.uu.se ** The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental. ** Little known gastromathematical curiosity: let z be the radius and a the thickness of a pizza. Then the volume of that pizza is equal to pi*z*z*a ! ** -- Dr Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Phoebe, I cannot imagine myself delivering maps and coordinates (after years of work... I insist: after years of work) to a reviewer that could be, for whatever chance, my best competitor (even if I suggested to the editor not to include him/her as a reviewer... but decisions from editors are of all kinds). I simply prefer not to imagine this after two publications fuelled by clear, direct and strong competition. That was stressful enough, already. If I have to add to this stress the thought that my coordinates can go to the wrong hands, then I think I would just give up or, alternatively, send the work to a lower impact, fast-publishing journal and make my life easier while sending my scientific future to the low-impact bin, killing future opportunities. Competition is there. I see that data to be deposited is strictly confidential. I support the PDB to make the quality check work at the level you mention, but not a reviewer: People are nice but the world is big and competition is crazy… at least enough to make fraud or copy others' work. The latter is less difficult; by copying (simply copy and paste to my computer this nice structure that I was looking for!), there is no need to invent anything. About a wrongly fit compound, the reviewer can ask for images of the model in a map calculated at a specific sigma and in different orientations. Maria On 2 April 2012 18:43, Phoebe Rice pr...@uchicago.edu wrote: Can we leverage this to push journals to routinely allow reviewers access to coordinates and maps? Outright fraud is outrageous, but I'm actually more worried about ligands fit to marginal density and other issues of under-supervised model building. = Phoebe A. Rice Dept. 
of Biochemistry Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Original message Date: Mon, 2 Apr 2012 08:41:02 -0700 From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com) Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication To: CCP4BB@JISCMAIL.AC.UK Robbie has restored the PDB_REDO of 3k78. It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 and Louise Jones from the IUCr office has kindly made the article open access. http://journals.iucr.org/f/issues/2012/04/00/issconts.html BR From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) Sent: Sunday, April 01, 2012 06:06 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication Hofkristallrat außer Dienst, is written as Bernhard - unless you are referring to some other guy with a French name Bernard. As one may extrapolate given my recent paper, I have been called names a lot worse. And the book indeed is a bible of xtallography. Enough of this - it is becoming embarrassing. I wish I had done a more careful job proofing, as over 500 errata attest to, and we all are only seeing further because we are standing on the shoulders of giants. So once again thanks to all the contributors I have pestered with my questions on BB and then some, and to all those who actually read BMC and submitted errata. Best regards, BR - Bernhard Hieronimus Rupp, Hofkristallrat a.D. 001 (925) 209-7429 +43 (676) 571-0536 hofkristall...@gmail.com b...@hofkristallamt.org http://www.ruppweb.org/ -- Once the sun of science is standing low, even dwarfs cast tall shadows -- -- Maria Solà Dep. 
Structural Biology IBMB-CSIC Baldiri Reixach 10-12 08028 BARCELONA Spain Tel: (+34) 93 403 4950 Fax: (+34) 93 403 4979 e-mail: maria.s...@ibmb.csic.es
Re: [ccp4bb] very informative - Trends in Data Fabrication
On Mon, Apr 2, 2012 at 11:00 AM, Maria Sola i Vilarrubias msv...@ibmb.csic.es wrote: About a wrongly fit compound, the reviewer can ask images about the model in a map calculated at a specific sigma and in different orientations. This will often be insufficient, I'm afraid. We generally assume good faith on the part of the authors: if the caption says the 2mFo-DFc map is shown contoured at 1.5sigma, we assume that this is an honest statement, but we also have no way of verifying it until the experimental data are available. I know of at least one case offhand where the maps could not possibly have been contoured at that level - the ligands are not misfit, they are simply not present in the crystals, and the paper is misleading (deliberately or not, I don't know). Most reviewers do not have the patience to spend weeks pursuing these issues. (Although it would certainly help if reviewers insisted that the density around ligands not be shown in isolation.) That aside, I completely understand why someone would be reluctant to share their data with potential competitors. Someone once suggested making the model and maps viewable via a web applet (AstexViewer or similar), but even that sounds like it could be prone to abuse. -Nat
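For readers less familiar with the "contoured at 1.5sigma" convention discussed above, here is a minimal sketch of what the number usually means once the map itself is in hand. It assumes the common convention that sigma is the r.m.s. deviation of the density from its mean over the whole grid; individual programs may compute it over a different region, so treat this as an illustration rather than any particular package's method. The function names and the toy density values are hypothetical.

```python
import math

def sigma_contour_level(density, n_sigma):
    """Absolute density value corresponding to an n-sigma contour.

    Assumption: sigma is the r.m.s. deviation from the mean, taken over
    every grid point passed in (here a flat list of floats).
    """
    n = len(density)
    mean = sum(density) / n
    rms = math.sqrt(sum((d - mean) ** 2 for d in density) / n)
    return mean + n_sigma * rms

def fraction_above(density, level):
    """Fraction of grid points whose density exceeds the contour level."""
    return sum(1 for d in density if d > level) / len(density)

# Hypothetical toy "map": mostly flat solvent with one strong peak.
toy_map = [0.0, 0.0, 0.0, 4.0]
level = sigma_contour_level(toy_map, 1.5)
print("absolute level for 1.5 sigma:", round(level, 3))
print("fraction of grid above it:", fraction_above(toy_map, level))
```

A reviewer given only a figure cannot repeat this calculation, which is the point made above: the stated sigma level can be checked only when the map or the experimental data are available.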
Re: [ccp4bb] very informative - Trends in Data Fabrication
Artem Evdokimov wrote: I can't resist asking: If we assume that the data fabrication techniques and the techniques for discovery of such activities should have the same sort of arms race as the development of viruses and anti-malware software (but of course on a much more modest scale since structural biology is a relatively niche discipline) - can we then I don't think this assumption holds for structure prediction, except in the extreme asymptotic limit. All of the cases of fabricated data that I've heard of were detected because the fabricated data didn't look like actual experimental data - because our models for calculating data are missing a variety of things that occur experimentally. So a hypothetical arms race might result in a better model of the various components (and potential sources) of errors during data collection and processing. But this would be a much more interesting development in itself than any use for fabricating data. Pete speculate further that eventually the most sophisticated fabrication techniques would be equivalent to de novo structure prediction :) It's really too bad that there's no real money in this (again, relatively speaking - not as much money as there is in software development), because if there was then the structural biology equivalent of 'virus hackers' would in reality approximate the same development trajectory as the most successful (and legitimate) protein modelers. Given the ingenuity of hackers and like-minded people in general, I sometimes wonder if this isn't a better way to develop structure prediction tools... Artem On Sun, Apr 1, 2012 at 10:09 AM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote: On 31/03/12 23:08, Kevin Jin wrote: I really wish PDB could have some people to review those important structures, like paper reviewer. So do the wwPDB, I would imagine. But they can't just magic funding and positions into existence... 
If the coordinate is downloaded for modeling and docking, people may not check the density and model by themselves. However this is not the worst case, since the original data was fabricated. 1. All of data was correct and real, Hmmm... It will be very difficult for people to check the density and coordinates if he/she is not a well-trained crystallographer. I hope and believe that this is not the case. Even basically-trained crystallographers should be able to calculate and interpret difference maps of the kind described by Bernhard. And with the EDS and PDB_REDO server, one does not even need to know how to generate a difference map... Paul.
Re: [ccp4bb] very informative - Trends in Data Fabrication
That's very sad, but a good point. I may be a bit naive because I haven't had to worry as much about direct competition. However, I do find it very frustrating as a reviewer to try to pass judgement on a crystal structure based only on the standard table 1. Sometimes I'm tempted to write "based on the information presented, darned if I know!" Maybe 3rd-party validation through the pdb (with a report sent to the reviewers) is more appropriate? Phoebe = Phoebe A. Rice Dept. of Biochemistry Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Original message Date: Mon, 2 Apr 2012 20:00:48 +0200 From: Maria Sola i Vilarrubias msv...@ibmb.csic.es Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication To: pr...@uchicago.edu Cc: CCP4BB@jiscmail.ac.uk Dear Phoebe, I cannot imagine myself delivering maps and coordinates (after years of work... I insist: after years of work) to a reviewer that could be, for whatever chance, my best competitor (even if I suggested to the editor not to include him/her as a reviewer... but decisions from editors are of all kinds). I simply prefer not to imagine this after two publications fuelled by clear, direct and strong competition. That was stressful enough, already. If I have to add to this stress the thought that my coordinates can go to the wrong hands, then I think I would just give up or, alternatively, send the work to a lower impact, fast-publishing journal and make my life easier while sending my scientific future to the low-impact bin, killing future opportunities. Competition is there. I see that data to be deposited is strictly confidential. I support the PDB to make the quality check work at the level you mention, but not a reviewer: People are nice but the world is big and competition is crazy… at least enough to make fraud or copy others' work. 
The latter is less difficult; by copying (simply copy and paste to my computer this nice structure that I was looking for!), there is no need to invent anything. About a wrongly fit compound, the reviewer can ask for images of the model in a map calculated at a specific sigma and in different orientations. Maria On 2 April 2012 18:43, Phoebe Rice pr...@uchicago.edu wrote: Can we leverage this to push journals to routinely allow reviewers access to coordinates and maps? Outright fraud is outrageous, but I'm actually more worried about ligands fit to marginal density and other issues of under-supervised model building. = Phoebe A. Rice Dept. of Biochemistry Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Original message Date: Mon, 2 Apr 2012 08:41:02 -0700 From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com) Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication To: CCP4BB@JISCMAIL.AC.UK Robbie has restored the PDB_REDO of 3k78. It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 and Louise Jones from the IUCr office has kindly made the article open access. http://journals.iucr.org/f/issues/2012/04/00/issconts.html BR From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) Sent: Sunday, April 01, 2012 06:06 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication Hofkristallrat außer Dienst, is written as Bernhard - unless you are referring to some other guy with a French name Bernard. As one may extrapolate given my recent paper, I have been called names a lot worse. And the book indeed is a bible of xtallography. Enough of this - it is becoming embarrassing. 
I wish I had done a more careful job proofing, as over 500 errata attest to, and we all are only seeing further because we are standing on the shoulders of giants. So once again thanks to all the contributors I have pestered with my questions
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear Phoebe, As it happens, validation through the PDB is exactly what the X-ray Validation Task Force proposed (to be honest, it was a suggestion made by George Sheldrick the last time there was a debate like this on the CCP4-BB!), and the wwPDB is currently implementing the pipeline needed to automatically produce a good validation report. A preliminary version of such a report is already available when you deposit a structure now, the IUCr journals already require this for papers describing structures, and there seems to be interest from some other journals. In the meantime, if you're refereeing a paper from a journal that doesn't require the validation report to be submitted with the paper, you can always ask them to get it from the author. Best wishes, Randy - Randy J. Read Department of Haematology, University of Cambridge Cambridge Institute for Medical Research Tel: +44 1223 336500 Wellcome Trust/MRC Building Fax: +44 1223 336827 Hills Road E-mail: rj...@cam.ac.uk Cambridge CB2 0XY, U.K. www-structmed.cimr.cam.ac.uk On 2 Apr 2012, at 20:01, Phoebe Rice wrote: That's very sad, but a good point. I may be a bit naive because I haven't had to worry as much about direct competition. However, I do find it very frustrating as a reviewer to try to pass judgement on a crystal structure based only on the standard table 1. Sometimes I'm tempted to write "based on the information presented, darned if I know!" Maybe 3rd-party validation through the pdb (with a report sent to the reviewers) is more appropriate? Phoebe = Phoebe A. Rice Dept. 
of Biochemistry Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Original message Date: Mon, 2 Apr 2012 20:00:48 +0200 From: Maria Sola i Vilarrubias msv...@ibmb.csic.es Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication To: pr...@uchicago.edu Cc: CCP4BB@jiscmail.ac.uk Dear Phoebe, I cannot imagine myself delivering maps and coordinates (after years of work... I insist: after years of work) to a reviewer who could be, by whatever chance, my best competitor (even if I suggested to the editor not to include him/her as a reviewer... but decisions from editors are of all kinds). I simply prefer not to imagine this after two publications fuelled by clear, direct and strong competition. That was stressful enough, already. If I have to add to this stress the thought that my coordinates can go to the wrong hands, then I think I would just give up or, alternatively, send the work to a lower impact, fast-publishing journal and make my life easier while sending my scientific future to the low-impact bin, killing future opportunities. Competition is there. I see that data to be deposited is strictly confidential. I support the PDB to make the quality check work at the level you mention, but not a reviewer: People are nice but the world is big and competition is crazy… at least enough to make fraud or copy others' work. The latter is less difficult; by copying (simply copy and paste to my computer this nice structure that I was looking for!), there is no need to invent anything. About a wrongly fit compound, the reviewer can ask for images of the model in a map calculated at a specific sigma and in different orientations. Maria On 2 April 2012 18:43, Phoebe Rice pr...@uchicago.edu wrote: Can we leverage this to push journals to routinely allow reviewers access to coordinates and maps? 
Outright fraud is outrageous, but I'm actually more worried about ligands fit to marginal density and other issues of under-supervised model building. = Phoebe A. Rice Dept. of Biochemistry Molecular Biology The University of Chicago phone 773 834 1723 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123 http://www.rsc.org/shop/books/2008/9780854042722.asp Original message Date: Mon, 2 Apr 2012 08:41:02 -0700 From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com) Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication To: CCP4BB@JISCMAIL.AC.UK Robbie has restored the PDB_REDO of 3k78. It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 and Louise Jones from the IUCr office has kindly made the article open access. http://journals.iucr.org/f/issues/2012/04/00/issconts.html BR From: CCP4
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
That's pretty funny, isn't it? Andreas On 02/04/2012 6:52, Jacob Keller wrote: Sorry to beat a dead horse, but: * *Antiwitz* (/anti-joke/): A short, often absurd scene, which has the recognizable structure of a joke, but is illogical or lacking a punch-line. Example: /Two thick feet are crossing the street. Says one thick foot to the other thick foot: Hello!/ Other examples: Nachts ist es kälter als draußen (At night it's colder than outside) or Zu Fuß ist es kürzer als über'n Berg (Walking is faster than over the mountain).
Re: [ccp4bb] very informative - Trends in Data Fabrication
I still believe Prof. Dr. Hofkristallrat außer Dienst, is written as Bernhard - unless you are referring to some other guy with a French name Bernard. And the book indeed is a bible of xtallography. Jürgen ausser Dienst ... now I get it ... my German is a lot worse than just spelling names wrong ;-) (and sorry for the 'ss' - no clue where the Eszett is on my keyboard) and indeed the book is great - maybe get the publication year and count PDB structures before and after it ... We are in year 3 Anno Rupp ... (and no worries Bernhard about the errata ... the Bible likely contains many more ... each chapter is contradicting every other ... at least you are consistent!) Best - A.
[ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
The PDBe page for 3k78 says: "The experimental data has been deposited"; the data cif file says: "data is under question". Grump. Is it to late to refer to data as if there were more than one of them? Anyway, the data mtz file is here if you want to refine with it: http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz Paul.
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. Related instances are * a phenomenon (singular) vs. several phenomena (plural), * a criterion (singular) vs. several criteria (plural) and many more. And then there is the infamous mix-up between principal (adjective) and principle (noun, as in Principle of Least Action, or Peter's Principle) giving rise to the favourite hero, the Principle Investigator. This phenomena is now so widespread that perhaps compliance with ancient Greek or Latin morphology is no longer a relevant criteria ;-) . With best wishes, Gerard. -- On Sun, Apr 01, 2012 at 01:05:10PM +0100, Paul Emsley wrote: The PDBe page for 3k78 says: The experimental data has been deposited the data cif file says: data is under question Grump. Is it to late to refer to data as if there were more than one of them? Anyway, the data mtz file is here if you want to refine with it: http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz Paul. -- === * * * Gerard Bricogne g...@globalphasing.com * * * * Global Phasing Ltd. * * Sheraton House, Castle Park Tel: +44-(0)1223-353033 * * Cambridge CB3 0AX, UK Fax: +44-(0)1223-366889 * * * ===
Re: [ccp4bb] very informative - Trends in Data Fabrication
Dear CCP4BBers, The PDB_REDO entry Bernhard referred to in his interesting and very thorough article was automatically deleted because the original PDB entry was obsoleted. Since access to the 'experimental' data of any study is important, we have made a compressed copy of the PDB_REDO entry available at http://www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 Our apologies to those who have looked for this entry in vain. Best wishes, Robbie Joosten (on behalf of the PDB_REDO team) Biochemistry Netherlands Cancer Institute P.S. The whole fraud thing seems to have interfered with the annual April fools' post on CCP4BB. Let's hope this will not happen again. -Original Message- From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Michel Fodje Sent: Saturday, March 31, 2012 21:55 To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication Very interesting Response to Detection and analysis of unusual features in the structural model and structure-factor data of a birch pollen allergen doi:10.1107/S1744309112008433 a quote from the response: Author Schwarzenbacher admits to the allegations of data fabrication and deeply apologizes to the co-authors and the scientific community for all the problems this has caused . Note added in proof: subsequent to the acceptance of this article for publication, author Schwarzenbacher withdrew his admission of the allegations. From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp (Hofkristallrat a.D.) [hofkristall...@gmail.com] Sent: Saturday, March 31, 2012 12:42 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication This is an unresolved problem, and no real satisfactory solution exists, because the underlying reasons for zero occupancy can be different. For people who understand this and look at electron density, it is not a problem. For users who rely on some graphics program displaying only atom coordinates, it can be. 
The same holds for manipulation of B-factors, trading high B-factors against reduced occupancy, and other (almost always purely cosmetic but still confusing or inconsistent) practices. Best, BR From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Nian Huang Sent: Saturday, March 31, 2012 11:29 AM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication I don't model zero occupancy in my model. But can't the refinement programs just treat those atoms with zero occupancy as missing atoms? Nian Huang On Sat, Mar 31, 2012 at 10:26 AM, Bosch, Juergen jubo...@jhsph.edu wrote: really fascinating, bringing back the discussion for a repository for your collected frames. Jürgen Acta Cryst. (2012). F68, 366-376 doi:10.1107/S1744309112008421 Detection and analysis of unusual features in the structural model and structure-factor data of a birch pollen allergen B. Rupp Abstract: Physically improbable features in the model of the birch pollen structure Bet v 1d (PDB entry 3k78) are faithfully reproduced in electron density generated with the deposited structure factors, but these structure factors themselves exhibit properties that are characteristic of data calculated from a simple model and are inconsistent with the data and error model obtained through experimental measurements. The refinement of the 3k78 model against these structure factors leads to an isomorphous structure different from the deposited model with an implausibly small R value (0.019). The abnormal refinement is compared with normal refinement of an isomorphous variant structure of Bet v 1l (PDB entry 1fm4). 
A variety of analytical tools, including the application of Diederichs plots, R plots and bulk-solvent analysis are discussed as promising aids in validation. The examination of the Bet v 1d structure also cautions against the practice of indicating poorly defined protein chain residues through zero occupancies. The recommendation to preserve diffraction images is amplified. .. Jürgen Bosch Johns Hopkins University Bloomberg School of Public Health Department of Biochemistry Molecular Biology Johns Hopkins Malaria Research Institute 615 North Wolfe Street, W8708 Baltimore, MD 21205 Office: +1-410-614-4742 Lab: +1-410-614-4894 Fax: +1-410-955-2926 http://web.mac.com/bosch_lab/
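For readers who rely on coordinates alone and want to see where the zero-occupancy practice discussed above has been used in an entry, a minimal Python sketch follows. It assumes the legacy fixed-column PDB format, in which the occupancy of an ATOM or HETATM record sits in columns 55-60; it does not read mmCIF. The names `format_atom` and `zero_occupancy_atoms` are hypothetical and exist only for this illustration, and the two demo records are made up rather than taken from 3k78.

```python
def format_atom(serial, name, resname, chain, resseq, x, y, z, occupancy, bfactor):
    """Build one fixed-column PDB ATOM record (hypothetical helper, demo only)."""
    return (
        f"{'ATOM':<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{bfactor:6.2f}"
    )


def zero_occupancy_atoms(pdb_lines):
    """Return (chain, residue name, residue number, atom name) for every
    ATOM/HETATM record whose occupancy field (columns 55-60) is zero."""
    flagged = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        try:
            occupancy = float(line[54:60])
        except ValueError:
            continue  # malformed or truncated record: skip rather than guess
        if occupancy == 0.0:
            flagged.append(
                (line[21], line[17:20].strip(), line[22:26].strip(), line[12:16].strip())
            )
    return flagged


if __name__ == "__main__":
    # Made-up records: one ordinary atom, one placed with zero occupancy.
    demo = [
        format_atom(1, " N  ", "GLY", "A", 1, 11.104, 6.134, -6.504, 1.00, 20.15),
        format_atom(2, " CA ", "GLY", "A", 1, 11.639, 6.071, -5.147, 0.00, 55.00),
    ]
    for atom in zero_occupancy_atoms(demo):
        print(atom)
```

To run it on a real entry one would pass the lines of the coordinate file, for example `zero_occupancy_atoms(open("entry.pdb"))` with whatever file name applies. A flagged atom only says that the depositor set the occupancy to zero; as noted above, the underlying reasons can differ, so the electron density still has to be inspected.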
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Perhaps the world could use a few more principle investigators? A Buffalo view Sent via BlackBerry by ATT -Original Message- From: Gerard Bricogne g...@globalphasing.com Sender: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK Date: Sun, 1 Apr 2012 15:18:15 To: CCP4BB@JISCMAIL.AC.UK Reply-To: Gerard Bricogne g...@globalphasing.com Subject: Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication] Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. Related instances are * a phenomenon (singular) vs. several phenomena (plural), * a criterion (singular) vs. several criteria (plural) and many more. And then there is the infamous mix-up between principal (adjective) and principle (noun, as in Principle of Least Action, or Peter's Principle) giving rise to the favourite hero, the Principle Investigator. This phenomena is now so widespread that perhaps compliance with ancient Greek or Latin morphology is no longer a relevant criteria ;-) . With best wishes, Gerard. -- On Sun, Apr 01, 2012 at 01:05:10PM +0100, Paul Emsley wrote: The PDBe page for 3k78 says: The experimental data has been deposited the data cif file says: data is under question Grump. Is it to late to refer to data as if there were more than one of them? Anyway, the data mtz file is here if you want to refine with it: http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz Paul. -- === * * * Gerard Bricogne g...@globalphasing.com * * * * Global Phasing Ltd. * * Sheraton House, Castle Park Tel: +44-(0)1223-353033 * * Cambridge CB3 0AX, UK Fax: +44-(0)1223-366889 * * * ===
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
You can find all the principle investigators you want collecting datums ;) at the ESRF, as that is how the French spell it on the application form for beam time! (Unless it has _finally_ been corrected: haven't checked since I submitted my last BAG application in April.) Adrian On 1 Apr 2012, at 17:52, George T. DeTitta wrote: Perhaps the world could use a few more principle investigators? A Buffalo view Sent via BlackBerry by ATT -Original Message- From: Gerard Bricogne g...@globalphasing.com Sender: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK Date: Sun, 1 Apr 2012 15:18:15 To: CCP4BB@JISCMAIL.AC.UK Reply-To: Gerard Bricogne g...@globalphasing.com Subject: Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication] Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. Related instances are * a phenomenon (singular) vs. several phenomena (plural), * a criterion (singular) vs. several criteria (plural) and many more. And then there is the infamous mix-up between principal (adjective) and principle (noun, as in Principle of Least Action, or Peter's Principle) giving rise to the favourite hero, the Principle Investigator. This phenomena is now so widespread that perhaps compliance with ancient Greek or Latin morphology is no longer a relevant criteria ;-) . With best wishes, Gerard. -- On Sun, Apr 01, 2012 at 01:05:10PM +0100, Paul Emsley wrote: The PDBe page for 3k78 says: The experimental data has been deposited the data cif file says: data is under question Grump. Is it to late to refer to data as if there were more than one of them? Anyway, the data mtz file is here if you want to refine with it: http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz Paul. -- === * * * Gerard Bricogne g...@globalphasing.com * * * * Global Phasing Ltd. * * Sheraton House, Castle Park Tel: +44-(0)1223-353033 * * Cambridge CB3 0AX, UK Fax: +44-(0)1223-366889 * * * ===
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Hear, hear! I'm glad to know I'm not the last grump left standing. When I raise this point every year, my students regard me with bemused stares, as though they've just seen a coelacanth swim past their window... On 1 Apr 2012, at 10:18 AM, Gerard Bricogne wrote: Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. Related instances are * a phenomenon (singular) vs. several phenomena (plural), * a criterion (singular) vs. several criteria (plural) and many more. And then there is the infamous mix-up between principal (adjective) and principle (noun, as in Principle of Least Action, or Peter's Principle) giving rise to the favourite hero, the Principle Investigator. This phenomena is now so widespread that perhaps compliance with ancient Greek or Latin morphology is no longer a relevant criteria ;-) . With best wishes, Gerard. -- On Sun, Apr 01, 2012 at 01:05:10PM +0100, Paul Emsley wrote: The PDBe page for 3k78 says: The experimental data has been deposited the data cif file says: data is under question Grump. Is it to late to refer to data as if there were more than one of them? Anyway, the data mtz file is here if you want to refine with it: http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz Paul. -- === * * * Gerard Bricogne g...@globalphasing.com * * * * Global Phasing Ltd. * * Sheraton House, Castle Park Tel: +44-(0)1223-353033 * * Cambridge CB3 0AX, UK Fax: +44-(0)1223-366889 * * * ===
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
another singular/plural grump: Recently we can read: phage are. Phage is singular, the plural is phages (and this does not have that much to do with latin or greek). more reading: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3109450/ Quoting Paul Emsley: The PDBe page for 3k78 says: The experimental data has been deposited the data cif file says: data is under question Grump. Is it to late to refer to data as if there were more than one of them? Anyway, the data mtz file is here if you want to refine with it: http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz Paul. Mark J van Raaij Laboratorio M-4 Dpto de Estructura de Macromoléculas Centro Nacional de Biotecnología - CSIC c/Darwin 3, Campus Cantoblanco 28049 Madrid tel. 91 585 4616 email: mjvanra...@cnb.csic.es
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Think the jury might be out on this one... A quick snip from WikiDictionary... The plural word phages refers to different types of phage, whereas in common usage the word phage can be both singular and plural, referring in the plural sense to particles of the same type of phage. Maloy et al: Microbial Genetics, 2nd ed., 1984 Tony. --- Mobile Account --- On 1 Apr 2012, at 16:29, VAN RAAIJ , MARK JOHAN mjvanra...@cnb.csic.es wrote: another singular/plural grump: Recently we can read: phage are. Phage is singular, the plural is phages (and this does not have that much to do with latin or greek). more reading: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3109450/ Quoting Paul Emsley: The PDBe page for 3k78 says: The experimental data has been deposited the data cif file says: data is under question Grump. Is it to late to refer to data as if there were more than one of them? Anyway, the data mtz file is here if you want to refine with it: http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz Paul. Mark J van Raaij Laboratorio M-4 Dpto de Estructura de Macromoléculas Centro Nacional de Biotecnología - CSIC c/Darwin 3, Campus Cantoblanco 28049 Madrid tel. 91 585 4616 email: mjvanra...@cnb.csic.es
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
Is it to late to refer to data as if there were more than one of them? Is it too late to explain the difference between to and too? --A much mellowed CD
Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]
On 04/01/12 10:18, Gerard Bricogne wrote: Dear Paul, May I join the mostly silent chorus of Greek/Latin-aware grumps who wince when seeing data treated as singular when it is plural. When it are plural? At any rate, I heard a Nobel laureate use it incorrectly just two days ago. -- === All Things Serve the Beam === David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu