Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread Phil Evans
On 8 Apr 2012, at 21:18, aaleshin wrote:

 What I suggested with respect to PDB data validation was adding some
 additional information that would allow independent validation of such
 parameters as the resolution and data quality (catching model fabrications
 would be a byproduct of this process). Does the current system allow those
 parameters to be overestimated? I believe so (but I might be wrong, correct
 me!). Periodically, people ask on ccp4bb how to determine the resolution of
 their data, but some idiots may decide to do it on their own and add 30%
 noise to their structure factors. As James mentioned, one does not need to
 be extremely smart to do so; moreover, such an idiot would have fewer
 restraints than an educated crystallographer, because the idiot believes
 that nobody would notice his cheating. His moral principles are not
 corrupted, because he thinks that the model is correct and no harm is done.
 But the harm is still there, because people are forced to trust the model
 more than it deserves.
 
 The question is still open to me: what percentage of PDB structures
 overestimate data quality in terms of resolution? Is it possible to make it
 less dependent on the opinion of the persons submitting the data? We all have
 such different opinions about everything...
 
 Regards,
 Alex Aleshin

Using the weak high resolution data in a structure determination is not
cheating. We should use data out to the point where there is no more
significant signal, as long as it helps the structure determination and
refinement, and provided that we are using appropriate statistical treatment
of the errors. We have become addicted to the idea that resolution is a single
indicator of quality, and that is a gross over-simplification. Resolution
tells us how many data were used, not their quality nor the quality of the
model.

Phil

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread aaleshin
Thank you, Phil, for clarifying my point, but it looks like cheating in the
current situation, where an author has to fit three-dimensional statistics
into a one-dimensional table. Moreover, many journal reviewers may never have
worked with low-resolution data and so may not appreciate that every A^3
counts. It is not clear to me how to report the resolution of data when it is
3A in one direction, 3.5A in another and 5A in the third.

Alex


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread Boaz Shaanan
How about such a footnote to Table 1: 

The resolution of the data is 3A in the a direction, 3.5A in the b direction
and 5A in the c direction

Wouldn't this do the trick?

 Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710








Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread Phil Evans
I've done that in papers.

The more fundamental problem is that, in the end, what we want to know are
things like "is residue 43 close to residue 146?", "which side chains interact
with the ligand?", etc., and resolution is only a very rough guide to the
correctness of such conclusions.


Phil



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread David Schuller

On 04/09/12 12:32, Boaz Shaanan wrote:

 How about such a footnote to Table 1:

 The resolution of the data is 3A in the a direction, 3.5A in the b direction
 and 5A in the c direction

 Wouldn't this do the trick?

Usually there's a requirement for a table of statistics, including 
completeness and R in the outer shell. In the case of anisotropic data, 
what constitutes the outer shell?


This is not a rhetorical question, I have some anisotropic data myself 
and will be facing these questions when it comes time to publish.


This looks like a good place to plug the UCLA MBI Diffraction Anisotropy 
Server, which I found to be useful:

http://services.mbi.ucla.edu/anisoscale/

Cheers,

--
===
All Things Serve the Beam
===
   David J. Schuller
   modern man in a post-modern world
   MacCHESS, Cornell University
   schul...@cornell.edu
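
The question of what constitutes the "outer shell" for anisotropic data can be made concrete with a small sketch. It assumes an orthorhombic cell and a resolution ellipsoid aligned with the reciprocal axes; the cell dimensions and the function name inside_ellipsoid are hypothetical, and this is not a description of how the UCLA server or any other program defines its cutoff.

```python
def inside_ellipsoid(hkl, cell, limits):
    """True if reflection hkl lies inside an axis-aligned resolution ellipsoid.

    cell   -- (a, b, c) of an orthorhombic cell in A (assumption: all angles 90)
    limits -- (d_a, d_b, d_c), the resolution limits in A along a*, b*, c*
    """
    h, k, l = hkl
    a, b, c = cell
    d_a, d_b, d_c = limits
    # Reciprocal-space components are h/a, k/b, l/c; the ellipsoid has
    # semi-axes 1/d_a, 1/d_b, 1/d_c along a*, b*, c*.
    return (h * d_a / a) ** 2 + (k * d_b / b) ** 2 + (l * d_c / c) ** 2 <= 1.0


# Hypothetical cell, with the limits from the footnote example (3, 3.5, 5 A).
cell = (60.0, 70.0, 100.0)
limits = (3.0, 3.5, 5.0)

for hkl in [(15, 0, 0), (0, 0, 25), (0, 0, 15), (10, 10, 10)]:
    print(hkl, inside_ellipsoid(hkl, cell, limits))
```

In this toy case the reflection (0, 0, 25) has d = 4A, so a spherical 3A cutoff would count it as a possible reflection, while the ellipsoid excludes it. Completeness computed in spherical shells would therefore look poor near 3A even if every reflection inside the ellipsoid had been measured.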


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread aaleshin
It is a wonderful server indeed, but its default setting cuts the resolution at
3 sigma (if I remember correctly). That is too stringent in my opinion. Also,
it is not clear to me whether to submit all data out to the highest resolution
point, or only the data that come from the server. But then again, the question
remains: at what sigma level should they be cut?

Alex



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread Richard Gillilan
On Apr 9, 2012, at 11:47 AM, aaleshin wrote:

 Thank you, Phil, for clarifying my point, but it looks like cheating in the
 current situation, where an author has to fit three-dimensional statistics
 into a one-dimensional table. Moreover, many journal reviewers may never have
 worked with low-resolution data and so may not appreciate that every A^3
 counts. It is not clear to me how to report the resolution of data when it is
 3A in one direction, 3.5A in another and 5A in the third.
 
 Alex
 

In the very low resolution world of SAXS, the whole idea of resolution is 
problematic. One can quote the minimum d-spacing (maximum angle) measured, but 
it is not a useful number to report.  People are much more concerned about the 
quality of the data at maximum d-spacing (lowest angle). Perhaps very 
low-resolution crystallography is starting to enter this regime as well in 
which resolution concerns are turned upside down. 

Granted, SAXS is a heavily averaged experiment which can densely sample q
space, but which does not even attempt to produce density.
But the point that I think is appreciated in the SAXS community is that the
connection between the extent of data in reciprocal space and model features
is not simple.

Richard Gillilan
MacCHESS
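
For readers less used to the SAXS convention, the relation between scattering angle, momentum transfer q and d-spacing mentioned above can be sketched as follows. The formulas q = 4*pi*sin(theta)/lambda and d = 2*pi/q are the standard definitions; the function names and the example angles and wavelength are illustrative assumptions only.

```python
import math


def q_from_two_theta(two_theta_deg, wavelength):
    """Momentum transfer q = 4*pi*sin(theta)/lambda (inverse units of wavelength)."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


def d_from_q(q):
    """Real-space d-spacing corresponding to q, via d = 2*pi/q."""
    return 2.0 * math.pi / q


# Hypothetical measurement range at a wavelength of 1.0 A.
for two_theta in (0.1, 5.0):
    q = q_from_two_theta(two_theta, 1.0)
    print(f"2theta = {two_theta} deg  q = {q:.4f} 1/A  d = {d_from_q(q):.1f} A")
```

The lowest angle gives the maximum d-spacing, which is the end of the data that the SAXS community tends to scrutinise most closely.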


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread Pavel Afonine
Hi Alex,

 It is not clear to me how to report the resolution of data when it is 3A
 in one direction, 3.5A in another and 5A in the third.


It can't be easier, I guess: just switch from characterizing data sets with one
single number (which is suboptimal, at least, as Phil pointed out earlier)
and show statistics by resolution instead. For example, R-factors, data
completeness and Fobs shown in resolution bins are obviously much more
informative metrics than a single number.

If you want to be even more sophisticated, you can. See for example:

A program to analyze the distributions of unmeasured reflections
J. Appl. Cryst. (2011). 44, 865-872
L. Urzhumtseva and A. Urzhumtsev

Pavel
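
As a rough illustration of reporting a statistic in resolution bins rather than as a single number, here is a minimal sketch that computes completeness per shell. It assumes an orthorhombic cell and takes the list of possible reflections as given; the function names, the toy cell and the toy reflection lists are all hypothetical, and real programs handle symmetry and systematic absences, which this sketch ignores.

```python
import math


def d_spacing(hkl, cell):
    """Resolution d (in A) of reflection hkl for an orthorhombic cell (a, b, c)."""
    h, k, l = hkl
    a, b, c = cell
    inv_d2 = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inv_d2)


def completeness_by_shell(possible, observed, cell, edges):
    """Return (d_low, d_high, n_observed, n_possible, fraction) for each shell.

    edges are shell boundaries in A, ordered from low resolution (large d)
    to high resolution (small d).
    """
    observed = set(observed)
    rows = []
    for d_low, d_high in zip(edges[:-1], edges[1:]):
        in_shell = [r for r in possible if d_high <= d_spacing(r, cell) < d_low]
        n_obs = sum(1 for r in in_shell if r in observed)
        frac = n_obs / len(in_shell) if in_shell else float("nan")
        rows.append((d_low, d_high, n_obs, len(in_shell), frac))
    return rows


# Toy data: a 10 A cubic-shaped cell and a handful of reflections.
cell = (10.0, 10.0, 10.0)
possible = [(1, 0, 0), (2, 0, 0), (0, 2, 0), (3, 0, 0), (0, 0, 3)]
observed = [(1, 0, 0), (2, 0, 0), (3, 0, 0)]

for row in completeness_by_shell(possible, observed, cell, [20.0, 6.0, 4.0, 3.0]):
    print("%5.1f - %4.1f A: %d / %d  (%.2f)" % row)
```

The overall completeness here is 3 of 5, but the per-shell table shows that the lowest-resolution shell is complete and the two higher shells are each half empty, which is the kind of detail a single number hides.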


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread aaleshin
Hi Pavel,
Reporting the table that you suggested would raise more red flags for the
reviewers and readers than explaining how to understand the resolution of my
data. We need more studies of this issue (the correlation between the
resolution of anisotropic data and model quality). And there should be a common
rule for how to report and interpret such data (IMHO).

Regards,
Alex




Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread Pavel Afonine
Alex,

I think you are mixing two things here: presenting statistics that
characterize the data, and interpreting them.

Looking at data completeness as a single number tells something but not a
lot, while looking at the same metric per resolution shell reveals a whole lot
more information (for example, the distribution of missing data in reciprocal
space may tell you why your maps look funny).

Pavel






Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-09 Thread Felix Frolow
Or as a tensor; see the classic:
Shakked, Z. (1983). Anisotropic scaling of 3-dimensional intensity data.
Acta Crystallographica Section A, 39, 278-279.
DOI: 10.1107/S0108767383000665


I guess this or something similar is implemented in SHELXL.
See also: J. F. Nye, Physical Properties of Crystals: Their
Representation by Tensors and Matrices


Dr Felix Frolow   
Professor of Structural Biology and Biotechnology
Department of Molecular Microbiology
and Biotechnology
Tel Aviv University 69978, Israel

Acta Crystallographica F, co-editor

e-mail: mbfro...@post.tau.ac.il
Tel:  ++972-3640-8723
Fax: ++972-3640-9407
Cellular: 0547 459 608




Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-08 Thread James Holton

On 4/2/2012 6:03 AM, herman.schreu...@sanofi.com wrote:

If James Holton had been involved, the fabrication would not have been
discovered.
Herman


Uhh.  Thanks.  I think?

Apologies for remaining uncharacteristically quiet.  I have been keeping 
up with the discussion, but not sure how much difference one more vote 
would make on the various issues.  Especially since most of this has 
come up before.  I agree that fraud is sick and wrong.  I think backing 
up your data is a good idea, etc. etc.  However, I seem to have been 
declared a leading expert on fake data, so I suppose I ought to say 
something about that.  Not quite sure I want to volunteer to be the 
Defense Against The Dark Arts Teacher (they always seem to end badly).  
But, here goes:


I think the core of the fraud problem lies in our need for models, and 
I mean models in the general scientific sense not just PDB files.  
Fundamental to the practice of science is coming up with a model that 
explains the observations you made, preferably to within experimental 
error.  One is also generally expected to estimate what the experimental 
error was.  That is, if you plot a bunch of points on a graph, you need 
to fit some sort of curve to them, and that curve had better fit to 
within the error bars, or you have some explaining to do.  Protein 
structures are really nothing more than a ~50,000 parameter curve fit to 
~50,000 data points.  So, given that the technology for constructing 
models is widely available (be it gnuplot or refmac), as is the 
technology for estimating errors and generating random numbers, all the 
hard work a would-be fraud needs to make a plausible forgery has already 
been done.  This is not something unique to crystallography!  It is a 
general property of any mature science.
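
The "fit to within the error bars" criterion in the paragraph above is commonly summarised by a reduced chi-square. The sketch below is a generic illustration with made-up numbers, not a description of how refmac or any other refinement program scores a model; the function name is hypothetical.

```python
def reduced_chi_square(observed, predicted, sigmas, n_params):
    """Chi-square per degree of freedom for a fitted model.

    Values near 1 mean the model agrees with the data to within the stated
    errors; much larger values mean there is some explaining to do.
    """
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than fitted parameters")
    chi2 = sum(((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigmas))
    return chi2 / dof


# Made-up example: four points, a two-parameter straight-line model.
observed = [1.1, 1.9, 3.2, 3.9]
predicted = [1.0, 2.0, 3.0, 4.0]
sigmas = [0.1, 0.1, 0.1, 0.1]
print(reduced_chi_square(observed, predicted, sigmas, n_params=2))
```

When the number of parameters approaches the number of data points, as in the ~50,000 versus ~50,000 case described above, very few degrees of freedom remain, so a good-looking fit by itself says little.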


Indeed, fake data is not only a common tool in science but an 
inextricable part of it.  Simulated diffraction images appear in the 
literature at least as early as Arndt and Wonacott (1976), and I'm sure 
even Moseley and Darwin (1913) made some fake data when trying to 
figure out all the sources of systematic error they were dealing with 
measuring reflected x-ray beams.  At its heart, fake data is a 
control.  Remember controls from science class?  They come in two 
flavors: positive and negative, and you are supposed to have both.  In 
fact, all a fraud really is is someone who in some way, shape or form 
takes a positive control and calls it their experiment.  Pasting gel 
lanes together is an example of this.  I think this is why fraud is so 
hard to prevent in science.  You can't do science without controls, but 
anyone who has access to the technology for doing a control can also 
use it for evil.  The labels are everything.


Personally, I classify fraud as an intentionally incorrect result.  
This separates it from unintentionally incorrect results (mistakes), 
which are far more common.  Validation is meant to catch the incorrect 
part, but can never be expected to establish intent!  In fact, I expect 
a mildly clever fraud might actually plan to hide behind the "we made a 
mistake in the deposition/figure/paper but now can't find the original 
data" defense.  The case at hand (Zaborsky et al. 2010) may be a very 
good example of this.  A new validation procedure (Rupp 2012) drew 
attention to the fabricated 3k78 structure as well as real structures 
where Fcalc was accidentally deposited instead of Fobs (there are a 
number of these).  Rupp's follow-up on 3k78 found troubling 
irregularities, but could it still be a mistake?  If there is a 
combination of buttons in some GUI somewhere that lets you do this then 
I imagine at least one idiot may have discovered it, perhaps even 
pleased with themselves for finding a new way to get their R factor 
down.  The best evidence that Fobs simply does not exist for 3k78 was in 
the response (Zaborsky et al. 2012).


The same validation procedure also drew attention to other cases.  Two 
of them, 1n0r and 1n0q (Mosavi et al. 2002), were from my beamline (ALS 
8.3.1), so finding the original images was simply a matter of flipping 
through the books of old DVDs I have in my office.  They cost us $0.25 
each in 2002.  Yes, I do back up every image, primarily because figuring 
out which ones were worth backing up was actually a more expensive 
proposition.  Even in adjusted dollars, I think the cost of the whole 
archive is still cheaper than what it would have cost Dan to re-grow his 
crystals and collect the data again in 2012.  It is also nice to be able 
to say that the data for 1n0r were collected on Jan 30 2002 from 9:47 pm 
to 11:48 pm and 1n0q was collected on Mar 15 2002 from 12:52 pm until 
3:48 pm.  I was there!  I saw the whole thing!  Yes, I know, since I am 
the guy who can fake images I am not the best witness (the Defense 
Against the Dark Arts Teacher never is), but for whatever it is worth I 
DO recommend keeping your old images around.  You never know when a 
forgotten slip of 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-08 Thread aaleshin
Since I was the person who started the public outcry to do something, I shall
explain myself to my critics. Like all of you, I do not care much about those
few instances of structure fabrication. I might have put too much emphasis on
them to initiate the discussion, but they are, indeed, only tiny blips on the
ocean of science. But could they be the tip of a huge iceberg? That was my
concern. I believe that the enormous competition in science that we experience
nowadays makes many of us desperate, and desperation forces people to cheat.
Is the current validation system at the PDB good enough to catch various
aspects of data cheating? Is there a simple but efficient way to make cheating
more difficult and, hence, less desirable?

Good sportsmen (in terms of sporting ability) sometimes get caught taking
performance enhancers. I bet everyone would do it if drug control did not
exist. Many sportsmen would do it against their will, just because there was no
other way to win. Don't you think a similar situation can develop in science?

 I suppose as social animals we like to think we can trust and be trusted
Well, I suppose that these two antagonistic abilities of social animals (trust
and cheating) developed in parallel as means to promote evolution. In a very
hierarchical society with no legal means to change one's social status,
cheating has been an important tool for contributing one's genes to the
society. Socially unjust societies still exist, and their members may have a
slightly different view of the morality of cheating than those from just
societies. Moreover, the ability to cheat often correlates with intellect.
Could it not be called cheating when someone is told to do something one way,
but does it his own way because he believes it is more efficient? When a
scientist feels that he is right about the validity of his results, but they
do not look good enough to be sold to validators, he is supposed to do more
research. But he is out of time, so why not hide the weak spots of the work if
he knows that the major conclusions are RIGHT? Even if someone redoes the work
later, they will be reproduced, right? In my opinion, this is the major motive
for cheating in science.

What I suggested with respect to PDB data validation was adding some
additional information that would allow independent validation of such
parameters as the resolution and data quality (catching model fabrications
would be a byproduct of this process). Does the current system allow those
parameters to be overestimated? I believe so (but I might be wrong, correct
me!). Periodically, people ask on ccp4bb how to determine the resolution of
their data, but some idiots may decide to do it on their own and add 30%
noise to their structure factors. As James mentioned, one does not need to be
extremely smart to do so; moreover, such an idiot would have fewer restraints
than an educated crystallographer, because the idiot believes that nobody
would notice his cheating. His moral principles are not corrupted, because he
thinks that the model is correct and no harm is done. But the harm is still
there, because people are forced to trust the model more than it deserves.

The question is still open to me: what percentage of PDB structures
overestimate data quality in terms of resolution? Is it possible to make it
less dependent on the opinion of the persons submitting the data? We all have
such different opinions about everything...

People invented laws to create conditions when they can trust each other. 
Sociopaths who do not follow the rules get caught and excluded from a society, 
which maintains the trust. But when the trust is abused, it quickly disappears. 
Many of those who wrote on the matter expressed a strong opinion that the 
system is not broken and we should continue trusting each other. Great! I do 
not mind the status quo. 

Regards,
Alex Aleshin


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-08 Thread Bernhard Rupp (Hofkristallrat a.D.)
You never know when a forgotten slip of the mouse when using AutoDep ten
years ago will come back to haunt you.

For the paper James refers to, and for which he found the data, an added
mystery was that the postdoc who may have slipped disappeared without much of
a trace and the PI died. Dan was the only survivor. Still, they found the data.

BR


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-07 Thread Jrh
Dear Ron,

Quite so, and who cannot laugh at the Yes Minister "perfect hospital ward /
operating theatre" sketch (thank you, James W).

Anyway:-
Let's not get too hung up on one detail of your point 3. Your various points,
including point 3, added several missing elements to this CCP4bb thread.

Overall, what I am saying is that, to me, it is good that my University at
least is gearing up to provide a local Data Archive service. Since I wish in
future to link my raw data sets to my publications via doi registrations, this
will give them a longevity that I cannot guarantee with a 'my raw data are in
my desk drawer' approach. These data could be useful for future reuse, i.e. I
foresee various improvements, firstly, in understanding diffuse scattering and,
secondly, in squeezing more diffraction resolution out of the Bragg data as
computer hardware and software both improve.

Once in my career I nearly made a mistake on a space group choice (I wrote it
up as an educational story in 1996 in Acta D); if I had made that mistake, the
literature would finally have caught up, I suppose, and said: where are the raw
data, let's check that space group choice. This latter type of challenge is of
course catchable via depositing processed Bragg data as triclinic; it probably
doesn't need raw images.

Finally, I have a project on which I have worked for some years now to solve
the structure; there are two, possibly several, crystal lattices and diffuse
streaks. If I finally have to give up, I will make the data available via doi
on my university raw data archive; meanwhile, of course, we make new protein
and recrystallise etc., the other approach!

Greetings,
John

Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol.
Chair School of Chemistry, University of Manchester, Athena Swan Team.
http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html
 
 

On 6 Apr 2012, at 17:23, Ronald E Stenkamp stenk...@u.washington.edu wrote:

 Dear John,
 
 Your points are well taken and they're consistent with policies and practices 
 in the US as well.  
 
 I wonder about the nature of the employer's responsibility though.  I sit on 
 some university committees, and the impression I get is that much of the 
 time, the employers are interested in reducing their legal liabilities, not 
 protecting the integrity of science.  The end result is the same though in 
 that the employers get involved and oversee the handling of scientific 
 misconduct.  
 
 What is unclear to me is whether the system for dealing with misconduct is 
 broken.  It seems to work pretty well from my viewpoint.  No system is 
 perfect for identifying fraud, errors, etc, and I understand the idea that 
 improvements might be possible.  However, too many improvements might break 
 the system as well.
 
 Ron 
 
 On Fri, 6 Apr 2012, John R Helliwell wrote:
 
 Dear Ron,
 Re (3):-
 Yes of course the investigator has that responsibility.
 The additional point I would make is that the employer has a share in
 that responsibility. Indeed in such cases the employer university
 convenes a research fraud investigating committee to form the final
 judgement on continued employment.
 A research fraud policy, at least ours, also includes the need for
 avoiding inadvertent loss of raw data, which is also deemed to be
 research malpractice.
 Thus the local data repository, with doi registration for data sets
 that underpin publication, seems to me and many others, ie in other
 research fields, a practical way forward for these data sets.
 It also allows the employer to properly serve the research
 investigations of its employees and be duly diligent to the research
 sponsors whose grants it accepts. That said there is a variation of
 funding that at least our UK agencies will commit to 'Data management
 plans'.
 Greetings,
 John
 
 
 
 2012/4/5 Ronald E Stenkamp stenk...@u.washington.edu:
 This discussion has been interesting, and it's provided an interesting 
 forum for those interested in dealing with fraud in science.  I've not 
 contributed anything to this thread, but the message from Alexander Aleshin 
 prodded me to say some things that I haven't heard expressed before.
 
 1.  The sky is not falling!  The errors in the birch pollen antigen pointed 
 out by Bernhard are interesting, and the reasons behind them might be 
 troubling.  However, the self-correcting functions of scientific research 
 found the errors, and current publication methods permitted an airing of 
 the problem.  It took some effort, but the scientific method prevailed.
 
 2.  Depositing raw data frames will make little difference in identifying 
 and correcting structural problems like this one.  Nor will new 
 requirements for deposition of this or that detail.  What's needed for 
 finding the problems is time and interest on the part of someone who's able 
 to look at a structure critically.  Deposition of additional information 
 could be important for that critical look, but deposition alone (at 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-07 Thread Eric Bennett
I doubt many people completely fail to archive data but maintaining data 
archives can be a pain so I'm not sure what the useful age of the average 
archive is.  Do people who archived to tape keep their tapes in a format that 
can be read by modern tape drives?  Do people who archived data to a hard drive 
10 years ago have something that can still read an Irix EFS-formatted SCSI hard 
drive today and, if not, did they bother to move the data to some other storage 
medium?

-Eric



On Apr 5, 2012, at 9:08 AM, Roger Rowlett wrote:

 FYI, every NSF grant proposal now must have a data management plan that 
 describes how all experimental data will be archived and in what formats. I'm 
 not sure how seriously these plans are monitored, but a plan must be provided 
 nevertheless. Is anyone NOT archiving their original data in some way?
 
 Roger Rowlett
 



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-06 Thread Ronald E Stenkamp
Dear John,

Your points are well taken and they're consistent with policies and practices 
in the US as well.  

I wonder about the nature of the employer's responsibility though.  I sit on 
some university committees, and the impression I get is that much of the time, 
the employers are interested in reducing their legal liabilities, not 
protecting the integrity of science.  The end result is the same though in that 
the employers get involved and oversee the handling of scientific misconduct.  

What is unclear to me is whether the system for dealing with misconduct is 
broken.  It seems to work pretty well from my viewpoint.  No system is perfect 
for identifying fraud, errors, etc, and I understand the idea that improvements 
might be possible.  However, too many improvements might break the system as 
well.

Ron 

On Fri, 6 Apr 2012, John R Helliwell wrote:

 Dear Ron,
 Re (3):-
 Yes of course the investigator has that responsibility.
 The additional point I would make is that the employer has a share in
 that responsibility. Indeed in such cases the employer university
 convenes a research fraud investigating committee to form the final
 judgement on continued employment.
 A research fraud policy, at least ours, also includes the need for
 avoiding inadvertent loss of raw data, which is also deemed to be
 research malpractice.
 Thus the local data repository, with doi registration for data sets
 that underpin publication, seems to me and many others, ie in other
 research fields, a practical way forward for these data sets.
 It also allows the employer to properly serve the research
 investigations of its employees and be duly diligent to the research
 sponsors whose grants it accepts. That said there is a variation of
 funding that at least our UK agencies will commit to 'Data management
 plans'.
 Greetings,
 John



 2012/4/5 Ronald E Stenkamp stenk...@u.washington.edu:
 This discussion has been interesting, and it's provided an interesting forum 
 for those interested in dealing with fraud in science.  I've not contributed 
 anything to this thread, but the message from Alexander Aleshin prodded me 
 to say some things that I haven't heard expressed before.

 1.  The sky is not falling!  The errors in the birch pollen antigen pointed 
 out by Bernhard are interesting, and the reasons behind them might be 
 troubling.  However, the self-correcting functions of scientific research 
 found the errors, and current publication methods permitted an airing of the 
 problem.  It took some effort, but the scientific method prevailed.

 2.  Depositing raw data frames will make little difference in identifying 
 and correcting structural problems like this one.  Nor will new requirements 
 for deposition of this or that detail.  What's needed for finding the 
 problems is time and interest on the part of someone who's able to look at a 
 structure critically.  Deposition of additional information could be 
 important for that critical look, but deposition alone (at least with 
 today's software) will not be sufficient to find incorrect structures.

 3.  The responsibility for a fraudulent or wrong or poorly-determined 
 structure lies with the investigator, not the society of crystallographers.  
 My political leanings are left-of-central, but I still believe in individual 
 responsibility for behavior and actions.  If someone messes up a structure, 
 they're accountable for the results.

 4.  Adding to the deposition requirements will not make our science more 
 efficient.  Perhaps it's different in other countries, but the 
 administrative burden for doing research in the United States is growing.  
 It would be interesting to know the balance between the waste that comes 
 from a wrong structure and the waste that comes from having each of us deal 
 with additional deposition requirements.

 5.  The real danger that arises from cases of wrong or fraudulent science is 
 that it erodes the trust we have in each other's results.  No one has time or 
 resources to check everything, so science is based on trust.  There are 
 efforts underway outside crystallographic circles to address this larger 
 threat to all science, and we should be participating in those discussions 
 as much as possible.

 Ron

 On Thu, 5 Apr 2012, aaleshin wrote:

 Dear John,
 Thank you for a very informative letter about the IUCr activities towards 
 archiving the experimental data. I feel that I did not explain myself 
 properly. I do not object to archiving the raw data, I just believe that the 
 current methodology of validating data at PDB is insufficiently robust and 
 requires a modification. Implementation of the raw image storage and 
 validation will take a considerable time, while the recent incidents of 
 presumed data fraud demonstrate that the issue is urgent. Moreover, 
 presenting the calculated structure factors in place of the experimental 
 data is not the only abuse that the current validation procedure encourages 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-06 Thread Patrick Loll
Ron makes an excellent point. Many institutions devote far more energy to 
limiting risk than to doing the right thing. This leads administrators to a 
frightening, but logical conclusion: The less science we do, the less chance of 
our doing something that could invite a penalty on the university. This 
translates into rules intended to head off bad behavior, but which in fact make 
it more difficult to do honest science, and increase the administrative burden 
(our IT group has already made great strides in this direction--if you can't 
connect to the network, then you can't use it to violate HIPAA!).
So I agree that we should be cautious about improvements.
Pat


On 6 Apr 2012, at 12:23 PM, Ronald E Stenkamp wrote:

 Dear John,
 
 Your points are well taken and they're consistent with policies and practices 
 in the US as well.  
 
 I wonder about the nature of the employer's responsibility though.  I sit on 
 some university committees, and the impression I get is that much of the 
 time, the employers are interested in reducing their legal liabilities, not 
 protecting the integrity of science.  The end result is the same though in 
 that the employers get involved and oversee the handling of scientific 
 misconduct.  
 
 What is unclear to me is whether the system for dealing with misconduct is 
 broken.  It seems to work pretty well from my viewpoint.  No system is 
 perfect for identifying fraud, errors, etc, and I understand the idea that 
 improvements might be possible.  However, too many improvements might break 
 the system as well.
 
 Ron 
 

---
Patrick J. Loll, Ph. D.  
Professor of Biochemistry & Molecular Biology
Director, Biochemistry Graduate Program
Drexel University College of Medicine
Room 10-102 New College Building
245 N. 15th St., Mailstop 497
Philadelphia, PA  19102-1192  USA

(215) 762-7706
pat.l...@drexelmed.edu


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-06 Thread James Whisstock
As famously observed by the Yes Minister team... the dream outcome for any 
organization:

http://www.youtube.com/watch?v=x-5zEb1oS9A&feature=youtube_gdata_player

J
Sent from my iPhone

On 07/04/2012, at 3:16 AM, Patrick Loll pat.l...@drexel.edu wrote:

 Ron makes an excellent point. Many institutions devote far more energy to 
 limiting risk than to doing the right thing. This leads administrators to a 
 frightening, but logical conclusion: The less science we do, the less chance 
 of our doing something that could invite a penalty on the university. This 
 translates into rules intended to head off bad behavior, but which in fact 
 make it more difficult to do honest science, and increase the administrative 
 burden (our IT group has already made great strides in this direction--if you 
 can't connect to the network, then you can't use it to violate HIPAA!).
 So I agree that we should be cautious about improvements.
 Pat
 
 
 On 6 Apr 2012, at 12:23 PM, Ronald E Stenkamp wrote:
 
 Dear John,
 
 Your points are well taken and they're consistent with policies and 
 practices in the US as well.  
 
 I wonder about the nature of the employer's responsibility though.  I sit on 
 some university committees, and the impression I get is that much of the 
 time, the employers are interested in reducing their legal liabilities, not 
 protecting the integrity of science.  The end result is the same though in 
 that the employers get involved and oversee the handling of scientific 
 misconduct.  
 
 What is unclear to me is whether the system for dealing with misconduct is 
 broken.  It seems to work pretty well from my viewpoint.  No system is 
 perfect for identifying fraud, errors, etc, and I understand the idea that 
 improvements might be possible.  However, too many improvements might 
 break the system as well.
 
 Ron 
 
 
 ---
 Patrick J. Loll, Ph. D.  
 Professor of Biochemistry & Molecular Biology
 Director, Biochemistry Graduate Program
 Drexel University College of Medicine
 Room 10-102 New College Building
 245 N. 15th St., Mailstop 497
 Philadelphia, PA  19102-1192  USA
 
 (215) 762-7706
 pat.l...@drexelmed.edu


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread John R Helliwell
Dear 'aales...@burnham.org',

Re the pixel detector; yes this is an acknowledged raw data archiving
challenge; possible technical solutions include:- summing to make
coarser images ie in angular range, lossless compression (nicely
described on this CCP4bb by James Holton) or preserving a sufficient
sample of data (but nb this debate is certainly not yet concluded).

Re "And all this hassle is for the only real purpose of preventing data fraud?"

Well. Why publish data?
Please let me offer some reasons:
• To enhance the reproducibility of a scientific experiment
• To verify or support the validity of deductions from an experiment
• To safeguard against error
• To allow other scholars to conduct further research based on
experiments already conducted
• To allow reanalysis at a later date, especially to extract 'new'
science as new techniques are developed
• To provide example materials for teaching and learning
• To provide long-term preservation of experimental results and future
access to them
• To permit systematic collection for comparative studies
• And, yes, To better safeguard against fraud than is apparently the
case at present

Also to (probably) comply with your funding agency's grant conditions:-
Increasingly, funding agencies are requesting or requiring data
management policies (including provision for retention and access) to
be taken into account when awarding grants. See e.g. the Research
Councils UK Common Principles on Data Policy
(http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital
Curation Centre overview of funding policies in the UK
(http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies).
See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion
on policies relevant to crystallography in other countries. Nb these
policies extend over derived, processed and raw data, ie without
really an adequate clarity of policy from one to the other stages of
the 'data pyramid' (see
http://www.stm-assoc.org/integration-of-data-and-publications).


And just to mention IUCr Journals Notes for Authors for biological
macromolecular structures, where we have our ie macromolecular
crystallography's version of the 'data pyramid' :-

(1) Derived data
• Atomic coordinates, anisotropic or isotropic displacement
parameters, space group information, secondary structure and
information about biological functionality must be deposited with the
Protein Data Bank before or in concert with article publication; the
article will link to the PDB deposition using the PDB reference code.
• Relevant experimental parameters, unit-cell dimensions are required
as an integral part of article submission and are published within the
article.

(2) Processed experimental data
• Structure factors must be deposited with the Protein Data Bank
before or in concert with article publication; the article will link
to the PDB deposition using the PDB reference code.

(3) Primary experimental data (here I give small and macromolecule
Notes for Authors details):-
For small-unit-cell crystal/molecular structures and macromolecular
structures IUCr journals have no current binding policy regarding
publication of diffraction images or similar raw data entities.
However, the journals welcome efforts made to preserve and provide
primary experimental data sets. Authors are encouraged to make
arrangements for the diffraction data images for their structure to be
archived and available on request.
For articles that present the results of powder diffraction profile
fitting or refinement (Rietveld) methods, the primary diffraction
data, i.e. the numerical intensity of each measured point on the
profile as a function of scattering angle, should be deposited.
Fibre data should contain appropriate information such as a photograph
of the data. As primary diffraction data cannot be satisfactorily
extracted from such figures, the basic digital diffraction data should
be deposited.


Finally to mention that many IUCr Commissions are interested in the
possibility of establishing community practices for the orderly
retention and referencing of raw data sets, and the IUCr would like to
see such data sets become part of the routine record of scientific
research in the future, to the extent that this proves feasible and
cost-effective.
I draw your attention therefore to the IUCr Forum on such matters at:-
http://forums.iucr.org/
Within this Forum you can find for example the ICSU convened Strategic
Coordinating Committee on Information and Data fairly recent report;
within this we learn of many other areas of science efforts on data
archiving and eg that the radio astronomy square kilometre array will
pose the biggest raw data archiving challenge on the planet. [Our needs
are thereby relatively modest.]

The IUCr Diffraction Data Deposition Working Group is actively
addressing all these various issues.
We welcome your input at the IUCr Forum, which will thereby be most
timely. Thank you.

Best wishes,
Yours 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Herbert J. Bernstein

Dear Colleagues,

  Clearly, no system will be able to perfectly preserve every pixel of
every dataset collected at a cost that can be afforded.  Resources are
finite and we must set priorities.  I would suggest that, in order
of declining priority, we try our best to retain:

  1.  raw data that might tend to refute published results
  2.  raw data that might tend to support published results
  3.  raw data that may be of significant use in currently
ongoing studies either in refutation or support
  4.  raw data that may be of significant use in future
studies

While no archiving system can be perfect, we should not let the
search for a perfect solution prevent us from working with
currently available good solutions, and even in this era of tight
budgets, there are good solutions.

  Regards,
Herbert

On 4/5/12 7:16 AM, John R Helliwell wrote:

Dear 'aales...@burnham.org',

Re the pixel detector; yes this is an acknowledged raw data archiving
challenge; possible technical solutions include:- summing to make
coarser images ie in angular range, lossless compression (nicely
described on this CCP4bb by James Holton) or preserving a sufficient
sample of data (but nb this debate is certainly not yet concluded).

Re "And all this hassle is for the only real purpose of preventing data fraud?"

Well. Why publish data?
Please let me offer some reasons:
• To enhance the reproducibility of a scientific experiment
• To verify or support the validity of deductions from an experiment
• To safeguard against error
• To allow other scholars to conduct further research based on
experiments already conducted
• To allow reanalysis at a later date, especially to extract 'new'
science as new techniques are developed
• To provide example materials for teaching and learning
• To provide long-term preservation of experimental results and future
access to them
• To permit systematic collection for comparative studies
• And, yes, To better safeguard against fraud than is apparently the
case at present

Also to (probably) comply with your funding agency's grant conditions:-
Increasingly, funding agencies are requesting or requiring data
management policies (including provision for retention and access) to
be taken into account when awarding grants. See e.g. the Research
Councils UK Common Principles on Data Policy
(http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital
Curation Centre overview of funding policies in the UK
(http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies).
See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion
on policies relevant to crystallography in other countries. Nb these
policies extend over derived, processed and raw data, ie without
really an adequate clarity of policy from one to the other stages of
the 'data pyramid' (see
http://www.stm-assoc.org/integration-of-data-and-publications).


And just to mention IUCr Journals Notes for Authors for biological
macromolecular structures, where we have our ie macromolecular
crystallography's version of the 'data pyramid' :-

(1) Derived data
• Atomic coordinates, anisotropic or isotropic displacement
parameters, space group information, secondary structure and
information about biological functionality must be deposited with the
Protein Data Bank before or in concert with article publication; the
article will link to the PDB deposition using the PDB reference code.
• Relevant experimental parameters, unit-cell dimensions are required
as an integral part of article submission and are published within the
article.

(2) Processed experimental data
• Structure factors must be deposited with the Protein Data Bank
before or in concert with article publication; the article will link
to the PDB deposition using the PDB reference code.

(3) Primary experimental data (here I give small and macromolecule
Notes for Authors details):-
For small-unit-cell crystal/molecular structures and macromolecular
structures IUCr journals have no current binding policy regarding
publication of diffraction images or similar raw data entities.
However, the journals welcome efforts made to preserve and provide
primary experimental data sets. Authors are encouraged to make
arrangements for the diffraction data images for their structure to be
archived and available on request.
For articles that present the results of powder diffraction profile
fitting or refinement (Rietveld) methods, the primary diffraction
data, i.e. the numerical intensity of each measured point on the
profile as a function of scattering angle, should be deposited.
Fibre data should contain appropriate information such as a photograph
of the data. As primary diffraction data cannot be satisfactorily
extracted from such figures, the basic digital diffraction data should
be deposited.


Finally to mention that many IUCr Commissions are interested in the
possibility of establishing community practices for the orderly
retention and referencing of raw data 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Roger Rowlett
FYI, every NSF grant proposal now must have a data management plan that
describes how all experimental data will be archived and in what formats.
I'm not sure how seriously these plans are monitored, but a plan must be
provided nevertheless. Is anyone NOT archiving their original data in some
way?

Roger Rowlett
On Apr 5, 2012 7:16 AM, John R Helliwell jrhelliw...@gmail.com wrote:

 Dear 'aales...@burnham.org',

 Re the pixel detector; yes this is an acknowledged raw data archiving
 challenge; possible technical solutions include:- summing to make
 coarser images ie in angular range, lossless compression (nicely
 described on this CCP4bb by James Holton) or preserving a sufficient
 sample of data (but nb this debate is certainly not yet concluded).

 Re "And all this hassle is for the only real purpose of preventing data
 fraud?"

 Well. Why publish data?
 Please let me offer some reasons:
 • To enhance the reproducibility of a scientific experiment
 • To verify or support the validity of deductions from an experiment
 • To safeguard against error
 • To allow other scholars to conduct further research based on
 experiments already conducted
 • To allow reanalysis at a later date, especially to extract 'new'
 science as new techniques are developed
 • To provide example materials for teaching and learning
 • To provide long-term preservation of experimental results and future
 access to them
 • To permit systematic collection for comparative studies
 • And, yes, To better safeguard against fraud than is apparently the
 case at present

 Also to (probably) comply with your funding agency's grant conditions:-
 Increasingly, funding agencies are requesting or requiring data
 management policies (including provision for retention and access) to
 be taken into account when awarding grants. See e.g. the Research
 Councils UK Common Principles on Data Policy
 (http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital
 Curation Centre overview of funding policies in the UK
 (
 http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
 ).
 See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion
 on policies relevant to crystallography in other countries. Nb these
 policies extend over derived, processed and raw data, ie without
 really an adequate clarity of policy from one to the other stages of
 the 'data pyramid' (see
 http://www.stm-assoc.org/integration-of-data-and-publications).


 And just to mention IUCr Journals Notes for Authors for biological
 macromolecular structures, where we have our ie macromolecular
 crystallography's version of the 'data pyramid' :-

 (1) Derived data
 • Atomic coordinates, anisotropic or isotropic displacement
 parameters, space group information, secondary structure and
 information about biological functionality must be deposited with the
 Protein Data Bank before or in concert with article publication; the
 article will link to the PDB deposition using the PDB reference code.
 • Relevant experimental parameters, unit-cell dimensions are required
 as an integral part of article submission and are published within the
 article.

 (2) Processed experimental data
 • Structure factors must be deposited with the Protein Data Bank
 before or in concert with article publication; the article will link
 to the PDB deposition using the PDB reference code.

 (3) Primary experimental data (here I give small and macromolecule
 Notes for Authors details):-
 For small-unit-cell crystal/molecular structures and macromolecular
 structures IUCr journals have no current binding policy regarding
 publication of diffraction images or similar raw data entities.
 However, the journals welcome efforts made to preserve and provide
 primary experimental data sets. Authors are encouraged to make
 arrangements for the diffraction data images for their structure to be
 archived and available on request.
 For articles that present the results of powder diffraction profile
 fitting or refinement (Rietveld) methods, the primary diffraction
 data, i.e. the numerical intensity of each measured point on the
 profile as a function of scattering angle, should be deposited.
 Fibre data should contain appropriate information such as a photograph
 of the data. As primary diffraction data cannot be satisfactorily
 extracted from such figures, the basic digital diffraction data should
 be deposited.


 Finally to mention that many IUCr Commissions are interested in the
 possibility of establishing community practices for the orderly
 retention and referencing of raw data sets, and the IUCr would like to
 see such data sets become part of the routine record of scientific
 research in the future, to the extent that this proves feasible and
 cost-effective.
 I draw your attention therefore to the IUCr Forum on such matters at:-
 http://forums.iucr.org/
 Within this Forum you can find for example the ICSU convened Strategic
 Coordinating Committee on Information and Data 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Bosch, Juergen
I would say everybody keeps probably too many junk datasets around - at least I 
do. And I run into the trouble of having to buy new TB plates every now and 
then.
I think on average per year my group acquires currently ~700 GB of raw images 
(compressed), now if we were to only keep the useful datasets we probably would 
be down to 10% of that. But as always you hope for the best and keep some data 
considered junk in 2009 which might be useful in 2015.

Jürgen

On Apr 5, 2012, at 9:08 AM, Roger Rowlett wrote:


FYI, every NSF grant proposal now must have a data management plan that 
describes how all experimental data will be archived and in what formats. I'm 
not sure how seriously these plans are monitored, but a plan must be provided 
nevertheless. Is anyone NOT archiving their original data in some way?

Roger Rowlett

On Apr 5, 2012 7:16 AM, John R Helliwell 
jrhelliw...@gmail.com wrote:
Dear 'aales...@burnham.org',

Re the pixel detector; yes this is an acknowledged raw data archiving
challenge; possible technical solutions include:- summing to make
coarser images ie in angular range, lossless compression (nicely
described on this CCP4bb by James Holton) or preserving a sufficient
sample of data (but nb this debate is certainly not yet concluded).

Re "And all this hassle is for the only real purpose of preventing data fraud?"

Well. Why publish data?
Please let me offer some reasons:
• To enhance the reproducibility of a scientific experiment
• To verify or support the validity of deductions from an experiment
• To safeguard against error
• To allow other scholars to conduct further research based on
experiments already conducted
• To allow reanalysis at a later date, especially to extract 'new'
science as new techniques are developed
• To provide example materials for teaching and learning
• To provide long-term preservation of experimental results and future
access to them
• To permit systematic collection for comparative studies
• And, yes, To better safeguard against fraud than is apparently the
case at present

Also to (probably) comply with your funding agency's grant conditions:-
Increasingly, funding agencies are requesting or requiring data
management policies (including provision for retention and access) to
be taken into account when awarding grants. See e.g. the Research
Councils UK Common Principles on Data Policy
(http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx) and the Digital
Curation Centre overview of funding policies in the UK
(http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies).
See also http://forums.iucr.org/viewtopic.php?f=21&t=58 for discussion
on policies relevant to crystallography in other countries. Nb these
policies extend over derived, processed and raw data, ie without
really an adequate clarity of policy from one to the other stages of
the 'data pyramid' ((see
http://www.stm-assoc.org/integration-of-data-and-publications).


And just to mention IUCr Journals Notes for Authors for biological
macromolecular structures, where we have our (ie macromolecular
crystallography's) version of the 'data pyramid':-

(1) Derived data
• Atomic coordinates, anisotropic or isotropic displacement
parameters, space group information, secondary structure and
information about biological functionality must be deposited with the
Protein Data Bank before or in concert with article publication; the
article will link to the PDB deposition using the PDB reference code.
• Relevant experimental parameters, unit-cell dimensions are required
as an integral part of article submission and are published within the
article.

(2) Processed experimental data
• Structure factors must be deposited with the Protein Data Bank
before or in concert with article publication; the article will link
to the PDB deposition using the PDB reference code.

(3) Primary experimental data (here I give small and macromolecule
Notes for Authors details):-
For small-unit-cell crystal/molecular structures and macromolecular
structures IUCr journals have no current binding policy regarding
publication of diffraction images or similar raw data entities.
However, the journals welcome efforts made to preserve and provide
primary experimental data sets. Authors are encouraged to make
arrangements for the diffraction data images for their structure to be
archived and available on request.
For articles that present the results of powder diffraction profile
fitting or refinement (Rietveld) methods, the primary diffraction
data, i.e. the numerical intensity of each measured point on the
profile as a function of scattering angle, should be deposited.
Fibre data should contain appropriate information such as a photograph
of the data. As primary diffraction data cannot be satisfactorily
extracted from such figures, the basic digital diffraction data should
be deposited.


Finally to mention that many IUCr Commissions are interested in the

[ccp4bb] Category 4 Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Jrh
Dear Herbert,
Category 4 is, we find in Manchester, tricky, for want of a better word. 
Needless to say, we have collaborators on our Crystallography Research 
Service who request data sets from eg ten years ago that are now urgent for 
writing up for publication. So we are keeping everything, although the raw 
diffraction images only for recent years, and nb we will soon be assisted by 
the Univ Manchester centralised Data Repository for its researchers. 
(Incidentally I have kept all of my film oscillation data, inc later Laue 
data, back to approx 1977, which fills a whole wall shelf's worth, ~10 metres.)
Greetings,
John

Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol.
Chair School of Chemistry, University of Manchester, Athena Swan Team.
http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html
 
 

On 5 Apr 2012, at 13:50, Herbert J. Bernstein y...@bernstein-plus-sons.com 
wrote:

 Dear Colleagues,
 
  Clearly, no system will be able to perfectly preserve every pixel of
 every dataset collected at a cost that can be afforded.  Resources are
 finite and we must set priorities.  I would suggest that, in order
 of declining priority, we try our best to retain:
 
  1.  raw data that might tend to refute published results
  2.  raw data that might tend to support published results
  3.  raw data that may be of significant use in currently
 ongoing studies either in refutation or support
  4.  raw data that may be of significant use in future
 studies
 
 While no archiving system can be perfect, we should not let the
 search for a perfect solution prevent us from working with
 currently available good solutions, and even in this era of tight
 budgets, there are good solutions.
 
  Regards,
Herbert
 

[ccp4bb] Via Annual Reports...Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Jrh
Dear Roger,
At the recent ICSTI Workshop on Delivering Data in science the NSF presenter, 
when I asked about monitoring, replied that the PIs' annual reports should 
include data management aspects.
See http://www.icsti.org/spip.php?rubrique42
Best wishes,
John

Prof John R Helliwell DSc FInstP CPhys FRSC CChem F Soc Biol.
Chair School of Chemistry, University of Manchester, Athena Swan Team.
http://www.chemistry.manchester.ac.uk/aboutus/athena/index.html
 
 

On 5 Apr 2012, at 14:08, Roger Rowlett rrowl...@colgate.edu wrote:

 FYI, every NSF grant proposal now must have a data management plan that 
 describes how all experimental data will be archived and in what formats. I'm 
 not sure how seriously these plans are monitored, but a plan must be provided 
 nevertheless. Is anyone NOT archiving their original data in some way?
 
 Roger Rowlett
 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread aaleshin
Dear John,
Thank you for a very informative letter about the IUCr activities towards 
archiving the experimental data. I feel that I did not explain myself properly. 
I do not object to archiving the raw data; I just believe that the current 
methodology of validating data at the PDB is insufficiently robust and requires 
modification. Implementing raw image storage and validation will take 
considerable time, while the recent incidents of presumed data fraud show that 
the issue is urgent. Moreover, presenting calculated structure factors in 
place of the experimental data is not the only abuse that the current 
validation procedure invites. There may be far more numerous occurrences of 
data massaging, such as overestimation of the resolution or data quality, and 
the system does not allow them to be verified. The IUCr and PDB follow the 
American taxation policy, where the responsibility for fraud is placed on 
people and the agency does not take sufficient action to prevent it. I believe 
this is inefficient and inhumane. Making a routine check of submitted data at a 
slightly lower level would reduce the temptation to overestimate the unclearly 
defined quality statistics and make model fabrication more difficult to 
accomplish. Many people do it unknowingly, and catching them afterwards does no 
good.

I suggested turning the current incident, which might be too complex for 
burning heretics, into something productive that is done as soon as possible, 
something that will prevent fraud from occurring.

Since my persistent trolling at ccp4bb did not have any effect (until now), I 
wrote a bad-English letter to the PDB administration, encouraging them to 
take urgent action. Those who are willing to count grammar mistakes in it can 
read the message below.

With best regards,
Alexander Aleshin, staff scientist
Sanford-Burnham Medical Research Institute 
10901 North Torrey Pines Road
La Jolla, California 92037

Dear PDB administrators,

I am writing to you regarding the recently publicized story about the 
submission of calculated structure factors to the PDB entry 3k79 
(http://journals.iucr.org/f/issues/2012/04/00/issconts.html). This presumed 
fraud (or mistake) occurred just several years after another, more massive 
fabrication of PDB structures (Acta Cryst. (2010). D66, 115) that affected many 
scientists including myself. The recurrence of these events indicates that 
the current mechanism of structure validation by the PDB is not sufficiently 
robust. Moreover, it is completely incapable of detecting smaller mischief such 
as overestimation of the data resolution and quality.

There are two approaches to handling fraud problems: (1) increasing 
policing and punishment, or (2) making fraud too difficult to carry out. 
Obviously, the second approach is more humane and efficient.

This issue has been discussed on several occasions by the ccp4bb 
community, and some members began promoting the idea of submitting raw 
crystallographic images as a fraud repellent. However, this validation approach 
is neither easy nor cheap; moreover, it requires considerable manpower to 
conduct it on a day-to-day basis. Indeed, indexing data sets is sometimes a 
nontrivial problem that cannot be accomplished automatically. For this reason, 
submitting the indexed and partially integrated data (such as .x files from 
HKL2000 or the output .mtz file from Mosflm) appears to be a cheaper substitute 
for storing and validating the images.

Analysis of the partially integrated data provides almost the same 
means of fraud prevention as the images.  Indeed, the observed cases of 
data fraud suggest that fraud would most likely be attempted by a 
biochemist-crystallographer who is insufficiently educated to fabricate 
partially processed data. A method developer, on the contrary, does not have a 
reasonable incentive to forge a particular structure, unless he teams up with a 
similarly minded biologist. But the latter scenario is very improbable and has 
not been detected yet.

The most valuable benefit of using the partially processed data as 
a validation tool would be a standardized definition of data 
resolution and the detection of inappropriate massaging of experimental data.

Implementing this approach requires only a minuscule adaptation of 
the current system, which most practicing crystallographers would accept (in 
my humble opinion). The data storage requirement would be only ~1000 
fold higher than the current one, and transferring the new data to the PDB 
could still be done over the Internet. Moreover, storing the raw data is not 
required after the validation is done.

A program such as Scala of CCP4 could easily be adapted to process 
the validation data and compare them with a conventional set of structure 
factors.  Precise consistency of the two sets is not necessary. They only need 
to agree within statistically 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Ronald E Stenkamp
This discussion has been interesting, and it's provided an interesting forum 
for those interested in dealing with fraud in science.  I've not contributed 
anything to this thread, but the message from Alexander Aleshin prodded me to 
say some things that I haven't heard expressed before.

1.  The sky is not falling!  The errors in the birch pollen antigen pointed out 
by Bernhard are interesting, and the reasons behind them might be troubling.  
However, the self-correcting functions of scientific research found the errors, 
and current publication methods permitted an airing of the problem.  It took 
some effort, but the scientific method prevailed.   

2.  Depositing raw data frames will make little difference in identifying and 
correcting structural problems like this one.  Nor will new requirements for 
deposition of this or that detail.  What's needed for finding the problems is 
time and interest on the part of someone who's able to look at a structure 
critically.  Deposition of additional information could be important for that 
critical look, but deposition alone (at least with today's software) will not 
be sufficient to find incorrect structures.

3.  The responsibility for a fraudulent or wrong or poorly-determined structure 
lies with the investigator, not the society of crystallographers.  My political 
leanings are left-of-central, but I still believe in individual responsibility 
for behavior and actions.  If someone messes up a structure, they're 
accountable for the results.  

4.  Adding to the deposition requirements will not make our science more 
efficient.  Perhaps it's different in other countries, but the administrative 
burden for doing research in the United States is growing.  It would be 
interesting to know the balance between the waste that comes from a wrong 
structure and the waste that comes from having each of us deal with additional 
deposition requirements.  

5.  The real danger that arises from cases of wrong or fraudulent science is 
that it erodes the trust we have in each other's results.  No one has time or 
resources to check everything, so science is based on trust.  There are efforts 
underway outside crystallographic circles to address this larger threat to all 
science, and we should be participating in those discussions as much as 
possible.  

Ron


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Bernhard Rupp (Hofkristallrat a.D.)
I also don't really worry about the images as a primary means of fraud
prevention, although that may be a useful side effect. These cases are
spectacular but so rare that they indeed would not by themselves justify the
effort. That it can be a useful political instrument to make that argument
and get funding may be so, but that is a bit of a double-edged sword, and
harm can be done, see (5).

The real point to me seems -
a) Is there something in the images and in between the casually indexed main
reflections that we do not use right now and that would allow us to
ultimately get better structures? I think there is, and it has been said
before: superstructures, modulation, diffuse contributions etc etc.
A processed data file does not help here. But do we need the old image data
for that, or should we rather use new ones from modern detectors? Where is
the cost/benefit cutoff here?

b) Looking at how some structures are refined, there is little reason to
believe that data processing would be done more competently by untrained
casual users (except that much of the data processing is done with the help
of beam line personnel who rather know how to do it). Had we images, the next
step then could be PDB_reprocess. A processed data file does not help much
there either.

c) Discarding your primary data is generally considered bad form...
 
@AlexA:  Arguing with the PDB is not really useful. They did not generate
the bad data.

Best, BR


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread aaleshin
Well, it looks like my opinion about the importance of validating data at the 
moment of their submission has not found much support; it is sad but 
understandable. 

Automatically redoing the PDB structures by professionals is a good idea. I 
myself suggested a similar thing 10 years ago at Accelrys (we were developing a 
tool that allowed detecting and remodeling changes in protein-ligand structures 
due to ligand binding), but there was not much financial interest. How much the 
raw images would enhance the remodeling process is an open question, but good 
luck in getting it funded. 

 c) Discarding your primary data is generally considered bad form...
Agreed, but it is a big burden on labs to maintain archives of their raw data 
indefinitely. Even the IRS allows records to be discarded after some time. What 
is wrong with partially integrated data in terms of structure validation? 

 @AlexA:  Arguing with the PDB is not really useful. 
I have not argued yet, but I'll take your advice.

 They did not generate the bad data.
This is genuinely American thinking! But they might create conditions that 
would prevent their deposition.

I think I should stop heating up this discussion. 

Regards,
Alex


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Bernhard Rupp (Hofkristallrat a.D.)
Ojweh

 c) Discarding your primary data is generally considered bad form...
Agreed, but it is a big burden on labs to maintain archives of their raw
data indefinitely.
Even the IRS allows you to discard them after some time.

But you DO have to file in the first place, right? How long to keep is an
entirely different question. 

 What is wrong with partially integrated data in terms of structure
validation? 

Who thinks something is wrong with that idea? Section 3.1 under figure 3 of
said incendiary pamphlet states: '...yadayada... when unmerged data or images
for proper reprocessing are not available owing to the unfortunate absence of
a formal obligation to deposit unmerged intensity data or diffraction images.'

 They did not generate the bad data.
This is genuine American thinking!

Ok, the US citizens on BB might take this one up on my behalf, gospodin ;-)
See you at the Lubyanka.

But they might create conditions that would prevent their deposition.

Sure. We are back to the 2001 Reid shoe bomber argument. If you make PDB
deposition a total pain for everybody, you don't get compliance, you get
defiance. Ever seen any happy faces in a TSA check line?

Anyhow, image deposition will come.

Over and out, BR 


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread aaleshin
Alright, if the image deposition is the only way out, then I am for it, but 
please make sure that synchrotrons will do it for me...



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread Bosch, Juergen
How should they?
They have no clue which of the 20 datasets was actually useful to solve your 
structure.

If you ask James Holton, he has suggested going back to the archived data 
after a certain time and trying to solve the undeposited structures then :-)
[Where is James anyhow? Haven't seen a post from him recently.]
Seriously, I think it is in our own interest to submit the images that led 
to a structure solution somewhere. And, as others mentioned, bad data or 
good data can always serve educational purposes.
Just as an example:
http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/1Y13

Jürgen


..
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-2926
http://web.mac.com/bosch_lab/






Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-05 Thread aaleshin
Did you play a game called broken telephone as a child? Someone whispers 
something quickly to a neighbor, and so on, until the words come back to 
the author. Very funny game. 

My original thesis was that downloading/depositing the raw images would be a 
pain in the neck for crystallographers, so why not begin with partially 
processed data, such as the .x files from HKL2000?  People should be 
introduced to hardship gradually...





Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-04 Thread Frank von Delft
No James, you're not alone - astonishing petty pile-on (bullying?) on 
this board the last few days.


Wikipedia says:

   In Internet slang, a *troll* is someone who posts inflammatory,
   extraneous, or off-topic messages in an online community, such as an
   online discussion forum, chat room, or blog, with the primary intent of
   provoking readers into an emotional response or of otherwise disrupting
   normal on-topic discussion.
   (http://en.wikipedia.org/wiki/Troll_%28Internet%29)

The emotional and disruptive response certainly fits the definition, but 
that's about all.  And while Kevin's tiny blips in my inbox were trivial 
to delete and ignore, the resulting email hurricane of pompous 
indignation was not.


Yuk.
phx



On 04/04/2012 00:29, James Stroud wrote:
I read the first part of the page you linked to. I'm not sure what the 
descent into troll etymology says about the CCP4BB 
community--especially in response to your seemingly innocent post.


My understanding is that the goal of the CCP4BB is to educate and not 
belittle the naivety of other members of the community. I hope I am 
not alone.


James



On Apr 3, 2012, at 4:33 PM, Kevin Jin wrote:


Dear All,
 Here may be another example of the importance of image storage.
http://www.jinkai.org/DERA/DERA_1O0Y_3R12.html
Regards,
Kevin





Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-04 Thread Eric Bennett

Then everyone's data can be lost at once in the next cloud failure.  Progress!


"The hardware failed in such a way that we could not forensically restore the 
data.  What we were able to recover has been made available via a snapshot, 
although the data is in such a state that it may have little to no utility..."
- Amazon, to some of its cloud customers following their major crash last year


http://articles.businessinsider.com/2011-04-28/tech/29958976_1_amazon-customer-customers-data-data-loss


-Eric



On Apr 3, 2012, at 9:22 PM, Zhijie Li wrote:

 Hi,
  
 Regarding the online image file storage issue, I just googled cloud storage 
 and had a look at the current pricing of such services. To my surprise, some 
 companies are offering unlimited storage for as low as $5 a month. So that's 
 $600 for 10 years. I am afraid that these companies will feel really sorry to 
 learn that there are some monsters called crystallographers living on our 
 planet.
  
 In our lab, some pre-21st century data sets were stored on tapes, newer ones 
 on DVD discs and IDE hard drives. All these media have become or will become 
 obsolete pretty soon. Not to mention that CRC errors become more likely as 
 the medium ages. Admittedly, it may become quite a job to upload all the 
 image files that the whole crystallographic community generates per year. 
 But for individual labs, I think storing data in the cloud might become 
 something worth thinking about.
  
 Zhijie
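
The flat-rate figure quoted above is easy to check, and the same arithmetic gives a rough feel for what a single lab might upload. A minimal Python sketch; the function names and the per-lab numbers (datasets per year, images per dataset, image size) are illustrative assumptions, not figures from this thread:

```python
def flat_rate_cost(monthly_fee_usd, years):
    """Total cost in USD of a flat-rate plan billed monthly."""
    return monthly_fee_usd * 12 * years


def yearly_volume_gb(datasets_per_year, images_per_dataset, mb_per_image):
    """Raw image volume per year in GB (1 GB taken as 1000 MB)."""
    return datasets_per_year * images_per_dataset * mb_per_image / 1000


# The $5/month plan from the message above, kept for 10 years.
ten_year_cost = flat_rate_cost(5, 10)

# Hypothetical lab: 200 datasets/year, 360 images each, 6 MB per image.
lab_volume_gb = yearly_volume_gb(200, 360, 6)

print(f"10-year flat-rate cost: ${ten_year_cost}")
print(f"Assumed yearly raw data: {lab_volume_gb:.0f} GB")
```

Whether an "unlimited" plan would really tolerate that volume year after year is exactly the open question raised in the message.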
  
  



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-04 Thread aaleshin
People who raise their voices for prolonged storage of raw images miss the 
simple fact that the volume of collected data increases as fast as, if not 
faster than, the cost of storage space drops. I just had an opportunity to 
collect data with the PILATUS detector at SSRL, and I can tell you that this 
monster allows slicing the data 4-5 times thinner than other detectors do. 
Some people also like collecting very redundant data sets. Even now, 
transferring and storing raw data from a synchrotron is a pain in the neck, 
and in a few years it may become simply impractical. And all this hassle is 
for the sole real purpose of preventing data fraud? Aren't there cheaper and 
more adequate solutions to the problem? 

I also wonder why, after the first occurrence of data fraud several years 
ago, the PDB did not take any action to prevent its recurrence. Or are 
administrative actions simply impossible nowadays without a mega-dollar 
grant?
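
The argument above is a race between two rates: how fast the data volume grows and how fast the price per terabyte falls. A minimal Python sketch of that comparison; all rates and prices are hypothetical placeholders, chosen only to show both outcomes, and the function names are made up for illustration:

```python
def images_per_sweep(total_degrees, slice_degrees):
    """Number of frames needed to cover a rotation range at a given slice width."""
    return round(total_degrees / slice_degrees)


def yearly_spend(first_year_tb, usd_per_tb, volume_growth, price_drop, years):
    """Storage spend on each year's new data.

    volume_growth: fractional yearly increase in data volume (0.5 = +50%).
    price_drop: fractional yearly decrease in price per TB (0.3 = -30%).
    """
    spend = []
    for t in range(years):
        volume = first_year_tb * (1 + volume_growth) ** t
        price = usd_per_tb * (1 - price_drop) ** t
        spend.append(volume * price)
    return spend


# Slicing 5 times thinner means 5 times as many frames for the same sweep.
coarse = images_per_sweep(180, 1.0)
fine = images_per_sweep(180, 0.2)

# Hypothetical rates: volume +50%/year against price -30%/year.
rising = yearly_spend(1.0, 100.0, 0.5, 0.3, 5)
# Hypothetical rates where the price drop outpaces the growth.
falling = yearly_spend(1.0, 100.0, 0.2, 0.3, 5)

print(coarse, fine)
print([round(x, 2) for x in rising])
print([round(x, 2) for x in falling])
```

Yearly spend rises whenever (1 + growth) * (1 - price drop) exceeds 1, so which side of the argument holds depends entirely on the real rates, which this sketch does not know.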




Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-04 Thread Zhijie Li
Hi Eric,

My previous email may have been a little misleading, but I do not recommend 
deleting the originals from the hard drives/discs/tapes. Data in the cloud 
are better viewed as an extra copy (considering that our labs and offices are 
quite prone to fire, and to theft too), or as a copy that can be easily 
accessed from anywhere. A disaster on the cloud servers certainly would not 
be accepted as a valid excuse for not being able to provide the raw images 
when their very existence is in question. 

Zhijie









Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Mark J van Raaij
In fact, I would put it even more strongly: if we know a referee is being 
dishonest, it is our duty to make sure he is removed from science, 
blacklisted from the journal, etc.

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij



On 3 Apr 2012, at 19:13, Maria Sola i Vilarrubias wrote:

 Mark,
 
 I know some stories (which of course I'll not post here) from the 
 crystallography field and from other fields where reviewers profit from the 
 fact that they suddenly have new, interpreted data which fit very well with 
 their own results. Stories such as blocking a manuscript, or asking for more 
 results, so that the reviewer can submit his or her own paper (with new 
 ideas) in time; copying a structure from the figures; asking for experiments 
 that only the reviewer can do, so that he/she is included in the paper; or 
 submitting as fast as possible to another journal with an extremely short 
 acceptance delay (e.g. 10 days, without revision? talking to the editorial 
 board?); things like this. Well, it is not a question of making a full list 
 here! The whole problem comes from publishing first, from competition.
 
 The hope with fraud in X-ray data is that it seems to be detectable, thanks 
 to the valuable people who develop methods to detect it. But it is very 
 difficult to demonstrate that your work, ideas or results have been copied. 
 How do you defend yourself against this? And how, after giving them the 
 valuable PDB?
 
 Finally, how many crystallographers are in the world? 5000?  The concept of 
 ethics can change from one place to another and, more than this, there is the 
 fact that the reviewer is anonymous.
 
 I try to respond to my reviewers the best I can and I really trust their 
 criteria, sometimes a bit too much, indeed. I think they all have done a very 
 nice job. But some of the stories above happened to me or close to me, 
 and I feel really insecure about the idea of sending a manuscript, the X-ray 
 data and the PDB, all together, to a reviewer shielded by anonymity. It's too 
 risky: with an easy molecular replacement someone can solve a difficult 
 structure and publish it first. And then the only thing left for the bad 
 reviewer to do is to change the author list! (And for the true author, what 
 is left is to feel like an idiot.)
 
 In my humble opinion, we must be strict but not kill ourselves. Trust authors 
 as we trust reviewers. Otherwise, the whole effort might be useless.
 
 Maria
 
 Dep. Structural Biology
 IBMB-CSIC
 Baldiri Reixach 10-12
 08028 BARCELONA
 Spain
 Tel: (+34) 93 403 4950
 Fax: (+34) 93 403 4979
 e-mail: maria.s...@ibmb.csic.es
 
 On 3 April 2012 16:58, Mark J van Raaij mjvanra...@cnb.csic.es wrote:
 The remedy for the fact that some reviewers act unethically is not 
 withholding coordinates and structure factors, but a more active role for the 
 authors to denounce these possible violations and more effective 
 investigations by the journals whose reviewers are suspected by the authors 
 of committing these violations.
 I have witnessed authors being hesitant to complain about possible violations 
 and journals not always taking complaints seriously enough.
 
 Mark J van Raaij
 Laboratorio M-4
 Dpto de Estructura de Macromoleculas
 Centro Nacional de Biotecnologia - CSIC
 c/Darwin 3
 E-28049 Madrid, Spain
 tel. (+34) 91 585 4616
 http://www.cnb.csic.es/~mjvanraaij
 
 
 
 On 3 Apr 2012, at 16:45, Bosch, Juergen wrote:
 
  Hi Fred,
 
  I'll go public on this one. This happened to me. I will not reveal who 
  reviewed my paper or which paper it was, only that your naive assumption 
  might not always be correct. I have learned my lesson and exclude people 
  with overlapping interests (even though they might actually be the best 
  critical reviewers for your work). Unfortunately you don't really have 
  control if the journal still decides to pick those excluded reviewers.
  As a suggestion to people out there: make sure not to send your comments as 
  an encrypted, password-protected PDF. That's how I found out the identity 
  of the reviewer, as the file couldn't be changed by the journal.
 
  I agree though that it shouldn't happen and I hope it only happens in very 
  few cases.
 
  Jürgen
 
 
  On Apr 3, 2012, at 9:10 AM, Dyda wrote:
 
  I think the argument that this may give a competitive advantage
  to the referee, who him or herself may be working on the same thing,
  should be moot, as I thought article refereeing was supposed to
  be a confidential process. Breaching this would be a serious
  ethical violation. In my experience, before agreeing to review,
  we see the abstract, and I was always taught that I was supposed to
  decline if there is a potential conflict with my own work.
  Perhaps naively, but I always assumed that everyone acts like this.
 
 
  ..
  Jürgen Bosch
  Johns Hopkins University
  Bloomberg 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Herbert J. Bernstein

Dear Colleagues,

  One thing that would help in avoiding misappropriated priority of research
results would be to join the math and physics community in their robust use
of open-access preprints in arXiv.  Such public preprints establish reliable
timelines for research credit and help to ensure timely access to new results
by the entire community. Fully peer-reviewed publications in real journals
are still desirable, but to make this work, our journals would have to be
willing to accept papers for which such a preprint system has been used.  To
understand the complexity of the issue, see

http://nanoscale.blogspot.com/2008/01/arxiv-and-publishing.html

I believe the IUCr is willing to accept papers that are posted on a preprint
server (somebody correct me if I am wrong).

  It works for the math and physics community.  Perhaps it would work for the
crystallographic community.



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Tom Peat
I agree with Herbert that a pre-print setup is one way to establish priority 
and get useful comments for an author. 
And I know this has been discussed before, but another way is to remove the 
anonymous aspect of the review, as this would achieve the same as the community 
pre-print distribution (at least in many ways). 
I would be happy to give my name when reviewing, as I feel it is my job to 
improve the paper, and I can still face my colleagues after the exercise. 
cheers, tom


Tom Peat
Biophysics Group
CSIRO, CMSE
343 Royal Parade
Parkville, VIC, 3052
+613 9662 7304
+614 57 539 419
tom.p...@csiro.au

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Herbert J. 
Bernstein [y...@bernstein-plus-sons.com]
Sent: Wednesday, April 04, 2012 4:33 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication

Dear Colleagues,

   One thing that would help is avoiding misappropriated priority of
research
results would be to join the math and physics community in their robust
use of open-access
preprints in arXiv.  Such public preprints establish reliable timelines
for research credit
and help to ensure timely access to new results by the entire community.
Fully peer-reviewed publications in real journals are still desirable,
but to make
this work, our journals would have to be willing to accept papers for
which such
a preprint system has been used.  To understand the complexity of the issue,
see

http://nanoscale.blogspot.com/2008/01/arxiv-and-publishing.html

I believe the IUCr is willing to accept papers that are posted on a
preprint server (somebody
correct me if I am wrong).

   It works for the math and physics community.  Perhaps it would work
for the
crystallographic community.


On 4/3/12 1:28 PM, Mark J van Raaij wrote:
 In fact, I would put it even stronger, if we know a referee is being 
 dishonest, it is our duty to make sure he is removed from science, 
 blacklisted from the journal etc.

 Mark J van Raaij
 Laboratorio M-4
 Dpto de Estructura de Macromoleculas
 Centro Nacional de Biotecnologia - CSIC
 c/Darwin 3
 E-28049 Madrid, Spain
 tel. (+34) 91 585 4616
 http://www.cnb.csic.es/~mjvanraaij



 On 3 Apr 2012, at 19:13, Maria Sola i Vilarrubias wrote:


 Mark,

 I know some stories (which of course I'll not post here)  from the 
 Crystallography field and from other fields where reviewers profit from the 
 fact that suddenly they have new, interpreted data which fits very well with 
 their own results. Stories like to block a manuscript or ask for more 
 results for the reviewer to be able to submit its own paper (with new 
 ideas) in time, or copy a structure from the figures, or ask for experiments 
 that only the reviewer can do so he/she is included in the paper, or submit 
 as fast as possible in another journal with an extremely short delay of 
 acceptance (e.g. 10 days,  without revision?, talking to the editorial 
 board?) things like this. Well, it is not question of making a full list, 
 here!. The whole problem comes from publishing first, from competition.

 The hope with fraud with X-ray data is that it seems to be detectable, 
 thanks to valuable people that develop methods to detect it. But it is very 
 difficult to demonstrate that your work, ideas or results have been copied. 
 How do you defend from this? And how after giving to them the valuable PDB?

 Finally, how many crystallographers are in the world? 5000?  The concept of 
 ethics can change from one place to another and, more than this, there is 
 the fact that the reviewer is anonymous.

 I try to respond to my reviewers the best I can and I really trust their 
 criteria, sometimes a bit too much, indeed. I think they all have done a 
 very nice job. But some of the stories from above happened to me or close to 
 me and I feel really insecure with the idea of sending a manuscript, the 
 X-ray data and the PDB, altogether, to a reviewer shielded by anonymity. 
 It's too risky: with an easy molecular replacement someone can solve a 
 difficult structure and publish it first. And then the only thing left to 
 the bad reviewer is to change the author's list! (and for the true 
 author what is left is to feel like an idiot).

 In my humble opinion, we must be strict but not kill ourselves. Trust 
 authors as we trust reviewers. Otherwise, the whole effort might be useless.

 Maria

 Dep. Structural Biology
 IBMB-CSIC
 Baldiri Reixach 10-12
 08028 BARCELONA
 Spain
 Tel: (+34) 93 403 4950
 Fax: (+34) 93 403 4979
 e-mail: maria.s...@ibmb.csic.es

 On 3 April 2012 16:58, Mark J van Raaij mjvanra...@cnb.csic.es wrote:
 The remedy for the fact that some reviewers act unethically is not 
 withholding coordinates and structure factors, but a more active role for 
 the authors to denounce these possible violations and more effective 
 investigations by the journals whose reviewers are suspected by the authors 
 of committing these violations.
 I

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Bryan Lepore
On the topic of MX fraud : could not an encryption algorithm be
applied to answer the question of truth or falsity of a pdb/wwpdb/pdbe
entry? has anyone proposed such an idea before?

for example (admittedly this is a mess):

* a detector parameter - perhaps the serial number - is used as a
public key. the detector parameter is shared among
beamlines/companies/*pdb. specifically, the experimenter requests it
at beamtime.

* experimenter voluntarily encrypts something, using GPLv3 programs,
small but essential to the deposition materials, like the R-free set
indices (or please suggest something better), using their private key.
maybe symmetric cipher would work better for this. or the Free R set
indices are used to generate a key.

* at deposition time, the *pdb unencrypts the relevant entry
components using their private key related to the detector used.
existing deposition methods pass or fail based on this (so maybe not
the Free R set).

* why do this : at deposition time, *pdb will have a yes-or-no result
from a single string of characters. can be a stop-gap measure until
images can be archived easily. all elements of the chain are required
to be free and unencumbered by proprietary interests. importantly, it
is voluntary. this will prevent entries such as Schwarzenbacher or
Ajees getting past deposition - so admittedly, not many.

references:
http://en.wikipedia.org/wiki/RSA_(algorithm)
http://en.wikipedia.org/wiki/Diffie-Hellman_key_exchange

-Bryan


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Dale Tronrud
   I'm not sure how encryption can solve a problem of truth or falsity.
Public key encryption only says that the message that is decrypted using
the public key must have been encrypted by someone who knows the private
key.  A person can use their private key to encrypt a lie as well as the
truth.

   I don't quite follow your prescription, but if you are saying that
the beamline gives the depositor a code when they collect data that
proves data were collected, how do the beamline personnel know the
contents of the crystal?  Couldn't one simply collect HEWL and then
deposit any model they like?

   The beamline could encrypt all images with their private key, and
the data integration program could decrypt the images using the public
key.  That way when a depositor presents a set of images it could be
proved that those images came, unmodified, from that beamline.  The
images would still have to be deposited, however.  (And this provides
no protection against forgeries of home source data sets.)
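
The signing idea in the paragraph above can be sketched in a few lines. This
is only a toy illustration, assuming textbook RSA on a SHA-256 digest with a
deliberately tiny key built from two Mersenne primes; the function names and
the "frame" bytes are made up for the example, and a real beamline would use
a vetted cryptographic library with proper padding and key sizes.

```python
import hashlib

# Toy RSA key pair built from two Mersenne primes. This is far too small and
# too structured for real use; it only illustrates the sign/verify idea.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def image_digest(image_bytes: bytes) -> int:
    """Hash the image and reduce the hash into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(image_bytes).digest(), "big") % N


def sign_image(image_bytes: bytes) -> int:
    """Beamline side: transform the digest with the private exponent."""
    return pow(image_digest(image_bytes), D, N)


def verify_image(image_bytes: bytes, signature: int) -> bool:
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(signature, E, N) == image_digest(image_bytes)


frame = b"hypothetical detector frame 0001"   # stand-in for real image bytes
sig = sign_image(frame)

print(verify_image(frame, sig))               # True: unmodified frame
print(verify_image(frame + b"edited", sig))   # False: altered frame
```

Note that the check only shows that the bytes are the ones the key holder
signed. It says nothing about what was in the crystal, which is the HEWL
objection raised above.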

Dale Tronrud



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Kevin Jin
Dear All,
 Here may be another example for the importance of  image storage.

http://www.jinkai.org/DERA/DERA_1O0Y_3R12.html

Regards,

Kevin


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Bryan Lepore
On Tue, Apr 3, 2012 at 5:16 PM, Dale Tronrud det...@uoxray.uoregon.edu wrote:
 I'm not sure how encryption can solve a problem of truth or falsity.

AFAIU any given checksum will tell you if a file is corrupted or not.
My brain decided to interpret that as true or false. and 

  A person can use their private key to encrypt a lie as well as the truth.
 [...] I don't quite follow your prescription,

...I admitted it is a mess - and sorry to mix up the various
algorithms. also I must emphasize I do not have a clear picture of how
encryption would work here.

can I step back - it *seems* that following facts point to a checksum
of sorts for a *pdb entry:

* random number generator seed
* randomly chosen Free R set
* integer indices of the Free R set
* detector things - serial number, or fingerprint of sorts - known to *pdb only.

... by checksum of sorts for a *pdb entry, what that means is an
easy way to verify if all parts of the entry originated with
diffraction images. detector things indicates that I am wondering if
something besides an SN on a detector would be useful.

... so a scenario that comes to mind is the deposition team runs the
checksum (or whatever), and gets the Free R set (for example). they
run the battery of tests. they find that refinement is a disaster.
they go check the detector specs they have, etc., etc., there were no
images used.
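
A "checksum of sorts" over the ingredients listed above might look roughly
like the sketch below. Everything here is an assumption made for
illustration: the function name, the choice of SHA-256 over canonical JSON,
and the placeholder seed, indices and detector identifier are not part of
any existing deposition tool.

```python
import hashlib
import json


def entry_fingerprint(seed: int, free_r_indices: list, detector_id: str) -> str:
    """Hash the listed ingredients into one hex string (hypothetical scheme)."""
    record = {
        "seed": seed,
        "free_r_indices": sorted(free_r_indices),
        "detector_id": detector_id,
    }
    # Canonical JSON so that the same inputs always give the same bytes.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(payload).hexdigest()


at_collection = entry_fingerprint(42, [3, 17, 256, 1024], "DET-0000")
at_deposition = entry_fingerprint(42, [3, 17, 256, 1024], "DET-0000")
tampered = entry_fingerprint(42, [3, 17, 256, 1025], "DET-0000")

print(at_collection == at_deposition)  # True: same inputs, same fingerprint
print(at_collection == tampered)       # False: one index changed
```

Such a fingerprint only shows that the pieces are consistent with each
other; as pointed out earlier in the thread, it cannot by itself show that
diffraction images ever existed.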

  The beamline could encrypt all images with their private key, and [...] it
  could be proved that those images came, unmodified, from that beamline.

would encryption of images significantly increase the integration
time? Also, I am not following the image deposition forum elsewhere.

... anyways, this sounds like it was just an exercise. Thanks anyway.

Regards,

-Bryan


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Bernhard Rupp (Hofkristallrat a.D.)
Orcus,

if you put yourself persistently into the face of guys who play hard, you
need to learn to take a few hits and shake it off. Maybe a little
retrospection on why your postings might perhaps possibly maybe be perceived
as somewhat self-promoting and ungracious could be helpful.

The skill of presentation is at least as important in Science as being
right.

Best, BR 

 




Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Kevin Jin
Thanks for the education. I got it.

By the way, what does Orcus mean here?

Regards,

Kevin





-- 
Kevin Jin

Sharing knowledge each other is always very joyful..

Website: http://www.jinkai.org/


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Bosch, Juergen
Trollus maximus perhaps ? But it could have different meanings e.g. in German 
there is something going south if it went down the orcus :-)

Don't worry too much and relax.

Jürgen




..
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-2926
http://web.mac.com/bosch_lab/






Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Katherine Sippel
Might I suggest looking to Sean Seaver and P212121.com as an example of
a successful crystallographer science blogger, though the site has shifted
more towards a consumables supplier in recent years.

I would also consider looking into adding an RSS feed to your site so that
those people interested in your articles can be informed without spamming
the boards. The gods of the interwebz have blessed us with the gift of RSS
so that we may be made aware of when someone might be yelling something
potentially interesting into the void (that and to bring us silly pictures
of cats covered in phonetically spelled captions when we have a failed
experiment).
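
For what it is worth, a feed can be as small as one static XML file placed
next to the pages. The sketch below builds a minimal RSS 2.0 document with
the Python standard library; the channel title, description and article
title are placeholders, and the two URLs are simply the ones already posted
in this thread.

```python
import xml.etree.ElementTree as ET

# Minimal RSS 2.0 skeleton: one channel with a single item.
rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "Example crystallography blog"
ET.SubElement(channel, "link").text = "http://www.jinkai.org/"
ET.SubElement(channel, "description").text = "New articles (placeholder text)"

item = ET.SubElement(channel, "item")
ET.SubElement(item, "title").text = "Placeholder article title"
ET.SubElement(item, "link").text = "http://www.jinkai.org/DERA/DERA_1O0Y_3R12.html"

# Serialise to a string that could be saved as e.g. feed.xml on the site.
feed_xml = ET.tostring(rss, encoding="unicode")
print(feed_xml)
```

Readers who subscribe to that file are then notified of each new item
without anything being posted to the list.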

It is my hope that this will not discourage you from taking every
opportunity to improve your writing skills but help you find a more
appropriate means of disseminating your product.

Cheers,

Katherine






Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Kendall Nettles
My intent with the troll joke was to give a humorous reminder that a little 
self promotion is ok, but a couple times a day is annoying. Orcus means troll, 
as in Internet troll, meaning one who subverts the intended use of the site and 
is annoying people. You have made a number of on topic posts that were very 
nice, but also a number that were clearly off topic and viewed as self 
promotion, with links to your consulting service. A couple times a day is a bit 
much.  No one wants to be rude, so we try to humor you into toning it down a 
little. Compared to many Internet forums, this is likely one of the nicer 
responses you could expect.
all the best,
Kendall





Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Zhijie Li
Hi,

Regarding the online image file storage issue, I just googled cloud storage 
and had a look at the current pricing of such services. To my surprise, some 
companies are offering unlimited storage for as low as $5 a month. So that's 
$600 for 10 years. I am afraid that these companies will feel really sorry to 
learn that there are some monsters called crystallographers living on our 
planet. 

In our lab, some pre-21st century data sets were stored on tapes, newer ones on 
DVD discs and IDE hard drives. All these media have become or will become 
obsolete pretty soon, not to mention that CRC errors become more likely as the 
medium ages. Admittedly, it may become quite a job to upload all the image files 
that the whole crystallographic community generates per year. But for individual 
labs, I think moving data to the cloud might become something worth thinking 
about.

Zhijie



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread James Stroud

On Apr 3, 2012, at 7:19 PM, Katherine Sippel wrote:

 I would also consider looking into adding an RSS feed to your site so that 
 those people interested in your articles can be informed without spamming the 
 boards.

Why continue to punish him? Adding an RSS feed means installing and configuring 
an RSS server. Aren't there rules against cruel and inhumane punishment?

There are many free newsfeed disseminators. Twitter is the most famous. There 
are others, maybe better, so I'm not being a twittervangelist here.

My point is this: free and easy is better than difficult.

James



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Kendall Nettles
James makes an important point. I've come to regret my joke as showing poor 
manners. I hesitate to add more email that no one cares about, but I do 
think it is important to contribute the idea that the positive tone of this 
forum needs to be protected.  I apologize, and  suggest my comments should have 
been offered directly and off-line in order to be constructive and not 
off-putting to others who would want to contribute or ask questions.

Kendall




Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Ravi Nookala
The sad situation is that more and more scientists are becoming 
desperate (for funding or tenure or both) and are told 'publish or 
perish'; they become obsessed with impact factors, sensationalise the 
data in the process (be it complete fabrication or 'massaging' the 
results) and rush to publish to be the 'first' to do so.


This was recently highlighted in the following article: 
http://www.nature.com/nature/journal/v483/n7391/full/483531a.html


I personally think that the whole review process should be open and 
transparent where the coordinates are available for everyone to see 
(after deposition and with authors' consent) along with the names and 
comments of the reviewers. If sloppy mistakes are made (deliberately or 
otherwise), they will be picked up by the wider scientific community if 
not the reviewers.


Regards
Ravi

On 02/04/2012 19:00, Maria Sola i Vilarrubias wrote:

Dear Phoebe,

I cannot imagine myself delivering maps and coordinates (after years 
of work... I insist: after years of work) to a reviewer who could 
be, by whatever chance, my best competitor (even if I suggested to 
the editor not to include him/her as a reviewer... but editors' 
decisions are of all kinds).

I simply prefer not to imagine this after two publications fuelled by 
clear, direct and strong competition. That was stressful enough 
already. If I have to add to this stress the thought that my 
coordinates can fall into the wrong hands, then I think I would just 
give up or, alternatively, send the work to a lower impact, 
fast-publishing journal and make my life easier while sending my 
scientific future to the low-impact bin, killing future opportunities.

Competition is there. I see that data to be deposited is strictly 
confidential. I support the PDB doing the quality check at the 
level you mention, but not a reviewer: people are nice but the world 
is big and competition is crazy... at least crazy enough to commit fraud or 
copy others' work. The latter is less difficult; by copying (simply 
copy and paste to my computer this nice structure that I was looking 
for!), there is no need to invent anything.

About a wrongly fit compound, the reviewer can ask for images of the 
model in a map calculated at a specific sigma and in different 
orientations.


Maria


On 2 April 2012 18:43, Phoebe Rice pr...@uchicago.edu wrote:


Can we leverage this to push journals to routinely allow reviewers
access to coordinates and maps?

Outright fraud is outrageous, but I'm actually more worried about
ligands fit to marginal density and other issues of
under-supervised model building.

=
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723

http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp


 Original message 
Date: Mon, 2 Apr 2012 08:41:02 -0700
From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of Bernhard Rupp
(Hofkristallrat a.D.) hofkristall...@gmail.com)
Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication
To: CCP4BB@JISCMAIL.AC.UK

   Robbie has restored the PDB_REDO of 3k78

   It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2

   and Louise Jones from the IUCr office has kindly
   made the article open access.

   http://journals.iucr.org/f/issues/2012/04/00/issconts.html

   BR

   From: CCP4 bulletin board
   [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard
   Rupp (Hofkristallrat a.D.)
   Sent: Sunday, April 01, 2012 06:06
   To: CCP4BB@JISCMAIL.AC.UK
   Subject: Re: [ccp4bb] very informative - Trends in
   Data Fabrication

   Hofkristallrat außer Dienst, is written as
   Bernhard - unless you are referring to some other
   guy with a french name Bernard.

   As one may extrapolate given my recent paper, I have
   been called names a lot worse

   And the book indeed is a bible of xtallography.

   Enough of this - it is becoming embarrassing. I wish
   I had done a more careful job proofing, as over 500
   errata attest to,
   and we all are only seeing further because we are
   standing on the shoulders of giants. So once again
   thanks
   to all the contributors I have pestered with my
   questions on BB and then some, and to all those who
   actually read BMC

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Herman . Schreuder
If journals required that not only coordinates but also structure factors 
be made publicly available immediately AFTER publication, any sloppy 
author would be caught within days by the Rupps, redo people and Bricognes. 
Anyone who would then still submit and publish questionable data has chosen 
the wrong metier and, as has been mentioned before, should probably look for a 
job in the financial sector. 
 
my 2 cents,
Herman 





Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Dyda
I think that reviewing a paper containing a structure derived from
crystallographic data should indeed involve the referee having access
to coordinates and to the electron density. Without this access it
is not possible to judge the quality and very often even the 
soundness of statements in the paper.

I think the argument that this may give a competitive advantage
to the referee, who him or herself may be working on the same thing,
should be moot, as I thought article refereeing was supposed to
be a confidential process. Breaching this would be a serious 
ethical violation. In my experience, before agreeing to review,
we see the abstract; I was always taught that I was supposed to
decline if there is a potential conflict with my own work. 
Perhaps naively, but I always assumed that everyone acts like this.

Unfortunately however, there is another serious issue.

After a very troubling experience with a paper I reviewed, I discussed
this issue with journal editors. What they said was that they already
have a hell of a time finding people who agree to referee; by raising the
task level (asking refs to look at coords and density) they feared
that no one would agree. Actually, perhaps many have noticed the
large number of 5-liner referee reports saying really not much about a
full-length research article. People simply don't have the time to
put the effort in. So I am not sure how realistic it is to ask even more,
for something that at some level is pro bono work.

   
Fred
***
Fred Dyda, Ph.D.                  Phone: 301-402-4496
Laboratory of Molecular Biology   Fax: 301-496-0201
DHHS/NIH/NIDDK                    e-mail: fred.d...@nih.gov
Bldg. 5. Room 303
Bethesda, MD 20892-0560           URGENT message e-mail: 2022476...@mms.att.net
Google maps coords: 39.000597, -77.102102
http://www2.niddk.nih.gov/NIDDKLabs/IntramuralFaculty/DydaFred
***


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread James Whisstock
Hi

I was thinking about the last statement in the Acta editorial - "It is 
important to note, however, that in neither of these cases was a single frame 
of data collected. Not one." This brought me back to the images...

To date there is no global acceptance that original diffraction images must 
be deposited (though I personally think there should be).  Many of the 
arguments around this issue relate to the time and space required to house such 
data.  However (and apologies if this has already been raised and I have missed 
it), if our sole intent is to ascertain that there's no trouble at t'mill then 
deposition of a modest wedge of data and / or a 0 and 90, while not ideal, may 
be sufficient to provide a decent additional check and balance, particularly if 
such images, headers etc were automatically analysed as part of the already 
excellent validation tools in development.  

I'm sure there are a number of clever ways (that could be unadvertised or kept 
confidential to the pdb) that could be used to check off sufficient variables 
within such data such that it should (?) be very difficult to falsify images 
without triggering alarm bells.

Of course this would probably then drive those that are truly bonkers to 
attempt to fabricate realistically noisy false diffraction images, however I 
would hope that such a scheme might make things just a little more difficult 
for those with fraudulent intent, particularly if no one (apart from the 
developers) knows precisely how and what the checking software checks!

While it seems sad that it's come to this, cell biologists and biochemists have 
had to deal with more and more sophisticated versions of the photoshopped 
western for years.  Accordingly, most high profile journals run figures 
through commercial software that does a reasonable job of detection of such 
issues.

J






Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Bosch, Juergen
Hi Fred,

I'll go public on this one. This happened to me. I will not reveal who reviewed 
my paper or which paper it was, only that your naive assumption might not 
always be correct. I have learned my lesson and exclude people with overlapping 
interests (even though they might actually be the best critical reviewers for 
your work). Unfortunately, you don't really have control if the journal still 
decides to pick those excluded reviewers.
As a suggestion to people out there, make sure not to encrypt your comments as 
a PDF and password-protect them - that's how I found out the identity of the 
reviewer - as it couldn't be changed by the journal.

I agree though that it shouldn't happen and I hope it only happens in very few 
cases.

Jürgen


On Apr 3, 2012, at 9:10 AM, Dyda wrote:

I think the argument that this may give a competitive advantage
to the referee who him or herself may be working on the same thing
should be moot, as I thought article refereeing was supposed to
be a confidential process. Breaching this would be a serious
ethical violation. In my experience, before agreeing to review,
we see the abstract, and I was always taught that I was supposed to
decline if there is a potential conflict with my own work.
Perhaps naively, but I always assumed that everyone acts like this.


..
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-2926
http://web.mac.com/bosch_lab/






Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Mark J van Raaij
The remedy for the fact that some reviewers act unethically is not withholding 
coordinates and structure factors, but a more active role for the authors to 
denounce these possible violations and more effective investigations by the 
journals whose reviewers are suspected by the authors of committing these 
violations.
I have witnessed authors being hesitant to complain about possible violations 
and journals not always taking complaints seriously enough.

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij



On 3 Apr 2012, at 16:45, Bosch, Juergen wrote:

 Hi Fred,
 
 I'll go public on this one. This happened to me. I will not reveal who 
 reviewed my paper or which paper it was, only that your naive assumption 
 might not always be correct. I have learned my lesson and exclude people with 
 overlapping interests (even though they might actually be the best critical 
 reviewers for your work). Unfortunately, you don't really have control if the 
 journal still decides to pick those excluded reviewers.
 As a suggestion to people out there, make sure not to encrypt your comments 
 as a PDF and password-protect them - that's how I found out the identity of 
 the reviewer - as it couldn't be changed by the journal.
 
 I agree though that it shouldn't happen and I hope it only happens in very 
 few cases.
 
 Jürgen
 
 
 On Apr 3, 2012, at 9:10 AM, Dyda wrote:
 
 I think the argument that this may give a competitive advantage
 to the referee who him or herself may be working on the same thing
 should be moot, as I thought article refereeing was supposed to
 be a confidential process. Breaching this would be a serious
 ethical violation. In my experience, before agreeing to review,
 we see the abstract, and I was always taught that I was supposed to
 decline if there is a potential conflict with my own work.
 Perhaps naively, but I always assumed that everyone acts like this.
 
 
 ..
 Jürgen Bosch
 Johns Hopkins University
 Bloomberg School of Public Health
 Department of Biochemistry & Molecular Biology
 Johns Hopkins Malaria Research Institute
 615 North Wolfe Street, W8708
 Baltimore, MD 21205
 Office: +1-410-614-4742
 Lab:  +1-410-614-4894
 Fax:  +1-410-955-2926
 http://web.mac.com/bosch_lab/
 
 
 
 


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Maria Sola i Vilarrubias
Mark,

I know some stories (which of course I'll not post here) from the
crystallography field and from other fields where reviewers profit from the
fact that suddenly they have new, interpreted data which fits very well
with their own results. Stories such as blocking a manuscript or asking for
more results so that the reviewer can submit his or her own paper (with new
ideas) in time, or copying a structure from the figures, or asking for
experiments that only the reviewer can do so that he/she is included in the
paper, or submitting as fast as possible to another journal with an extremely
short delay of acceptance (e.g. 10 days, without revision? talking to the
editorial board?), things like this. Well, it is not a question of making a
full list here! The whole problem comes from publishing first, from
competition.

The hope with fraud in X-ray data is that it seems to be detectable,
thanks to valuable people who develop methods to detect it. But it is very
difficult to demonstrate that your work, ideas or results have been copied.
How do you defend yourself against this? And how, after giving them the
valuable PDB file?

Finally, how many crystallographers are in the world? 5000?  The concept of
ethics can change from one place to another and, more than this, there is
the fact that the reviewer is anonymous.

I try to respond to my reviewers the best I can and I really trust their
criteria, sometimes a bit too much, indeed. I think they all have done a
very nice job. But some of the stories above happened to me or close
to me, and I feel really insecure about the idea of sending a manuscript, the
X-ray data and the PDB file, all together, to a reviewer shielded by anonymity.
It's too risky: with an easy molecular replacement someone can solve a
difficult structure and publish it first. And then the only thing left for
the bad reviewer to do is change the author list! (And for the true
author, what is left is to feel like an idiot.)

In my humble opinion, we must be strict but not kill ourselves. Trust
authors as we trust reviewers. Otherwise, the whole effort might be useless.

Maria

Dep. Structural Biology
IBMB-CSIC
Baldiri Reixach 10-12
08028 BARCELONA
Spain
Tel: (+34) 93 403 4950
Fax: (+34) 93 403 4979
e-mail: maria.s...@ibmb.csic.es

On 3 April 2012 16:58, Mark J van Raaij mjvanra...@cnb.csic.es wrote:

 The remedy for the fact that some reviewers act unethically is not
 withholding coordinates and structure factors, but a more active role for
 the authors to denounce these possible violations and more effective
 investigations by the journals whose reviewers are suspected by the authors
 of committing these violations.
 I have witnessed authors being hesitant to complain about possible
 violations and journals not always taking complaints seriously enough.

 Mark J van Raaij
 Laboratorio M-4
 Dpto de Estructura de Macromoleculas
 Centro Nacional de Biotecnologia - CSIC
 c/Darwin 3
 E-28049 Madrid, Spain
 tel. (+34) 91 585 4616
 http://www.cnb.csic.es/~mjvanraaij



 On 3 Apr 2012, at 16:45, Bosch, Juergen wrote:

  Hi Fred,
 
  I'll go public on this one. This happened to me. I will not reveal who
 reviewed my paper or which paper it was, only that your naive assumption
 might not always be correct. I have learned my lesson and exclude people
 with overlapping interests (even though they might actually be the best
 critical reviewers for your work). Unfortunately, you don't really have
 control if the journal still decides to pick those excluded reviewers.
  As a suggestion to people out there, make sure not to encrypt your
 comments as a PDF and password-protect them - that's how I found out the
 identity of the reviewer - as it couldn't be changed by the journal.
 
  I agree though that it shouldn't happen and I hope it only happens in
 very few cases.
 
  Jürgen
 
 
  On Apr 3, 2012, at 9:10 AM, Dyda wrote:
 
  I think the argument that this may give a competitive advantage
  to the referee who him or herself may be working on the same thing
  should be moot, as I thought article refereeing was supposed to
  be a confidential process. Breaching this would be a serious
  ethical violation. In my experience, before agreeing to review,
  we see the abstract, and I was always taught that I was supposed to
  decline if there is a potential conflict with my own work.
  Perhaps naively, but I always assumed that everyone acts like this.
 
 
  ..
  Jürgen Bosch
  Johns Hopkins University
  Bloomberg School of Public Health
  Department of Biochemistry & Molecular Biology
  Johns Hopkins Malaria Research Institute
  615 North Wolfe Street, W8708
  Baltimore, MD 21205
  Office: +1-410-614-4742
  Lab:  +1-410-614-4894
  Fax:  +1-410-955-2926
  http://web.mac.com/bosch_lab/
 
 
 
 




--


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Mark J van Raaij
I don't agree: if we know a referee is dishonest, we should try and ruin his 
whole career, not just prevent him from scooping us in this one case.

Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij



On 3 Apr 2012, at 19:13, Maria Sola i Vilarrubias wrote:

 Mark,
 
 I know some stories (which of course I'll not post here) from the 
 crystallography field and from other fields where reviewers profit from the 
 fact that suddenly they have new, interpreted data which fits very well with 
 their own results. Stories such as blocking a manuscript or asking for more 
 results so that the reviewer can submit his or her own paper (with new ideas) 
 in time, or copying a structure from the figures, or asking for experiments 
 that only the reviewer can do so that he/she is included in the paper, or 
 submitting as fast as possible to another journal with an extremely short 
 delay of acceptance (e.g. 10 days, without revision? talking to the editorial 
 board?), things like this. Well, it is not a question of making a full list 
 here! The whole problem comes from publishing first, from competition.
 
 The hope with fraud in X-ray data is that it seems to be detectable, thanks 
 to valuable people who develop methods to detect it. But it is very 
 difficult to demonstrate that your work, ideas or results have been copied. 
 How do you defend yourself against this? And how, after giving them the 
 valuable PDB file?
 
 Finally, how many crystallographers are in the world? 5000?  The concept of 
 ethics can change from one place to another and, more than this, there is the 
 fact that the reviewer is anonymous.
 
 I try to respond to my reviewers the best I can and I really trust their 
 criteria, sometimes a bit too much, indeed. I think they all have done a very 
 nice job. But some of the stories above happened to me or close to me, 
 and I feel really insecure about the idea of sending a manuscript, the X-ray 
 data and the PDB file, all together, to a reviewer shielded by anonymity. It's 
 too risky: with an easy molecular replacement someone can solve a difficult 
 structure and publish it first. And then the only thing left for the bad 
 reviewer to do is change the author list! (And for the true author, what is 
 left is to feel like an idiot.)
 
 In my humble opinion, we must be strict but not kill ourselves. Trust authors 
 as we trust reviewers. Otherwise, the whole effort might be useless.
 
 Maria
 
 Dep. Structural Biology
 IBMB-CSIC
 Baldiri Reixach 10-12
 08028 BARCELONA
 Spain
 Tel: (+34) 93 403 4950
 Fax: (+34) 93 403 4979
 e-mail: maria.s...@ibmb.csic.es
 
 On 3 April 2012 16:58, Mark J van Raaij mjvanra...@cnb.csic.es wrote:
 The remedy for the fact that some reviewers act unethically is not 
 withholding coordinates and structure factors, but a more active role for the 
 authors to denounce these possible violations and more effective 
 investigations by the journals whose reviewers are suspected by the authors 
 of committing these violations.
 I have witnessed authors being hesitant to complain about possible violations 
 and journals not always taking complaints seriously enough.
 
 Mark J van Raaij
 Laboratorio M-4
 Dpto de Estructura de Macromoleculas
 Centro Nacional de Biotecnologia - CSIC
 c/Darwin 3
 E-28049 Madrid, Spain
 tel. (+34) 91 585 4616
 http://www.cnb.csic.es/~mjvanraaij
 
 
 
 On 3 Apr 2012, at 16:45, Bosch, Juergen wrote:
 
  Hi Fred,
 
  I'll go public on this one. This happened to me. I will not reveal who 
  reviewed my paper or which paper it was, only that your naive assumption 
  might not always be correct. I have learned my lesson and exclude people 
  with overlapping interests (even though they might actually be the best 
  critical reviewers for your work). Unfortunately, you don't really have 
  control if the journal still decides to pick those excluded reviewers.
  As a suggestion to people out there, make sure not to encrypt your comments 
  as a PDF and password-protect them - that's how I found out the identity of 
  the reviewer - as it couldn't be changed by the journal.
 
  I agree though that it shouldn't happen and I hope it only happens in very 
  few cases.
 
  Jürgen
 
 
  On Apr 3, 2012, at 9:10 AM, Dyda wrote:
 
  I think the argument that this may give a competitive advantage
  to the referee who him or herself may be working on the same thing
  should be moot, as I thought article refereeing was supposed to
  be a confidential process. Breaching this would be a serious
  ethical violation. In my experience, before agreeing to review,
  we see the abstract, and I was always taught that I was supposed to
  decline if there is a potential conflict with my own work.
  Perhaps naively, but I always assumed that everyone acts like this.
 
 
  ..
  Jürgen Bosch
  Johns Hopkins University
  Bloomberg School of Public Health
  

Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Manfred S. Weiss

Dear all,

I find this discussion most amazing. Here, we are dealing with the most
serious issue that has happened to Macromolecular Crystallography since the
Alabama case, and the whole discussion is centered around singular and
plural and Greek and Latin words and what not.

In psychology such a phenomenon is referred to as displacement activity.

If you are interested, here is the Macmillan definition of it:

http://www.macmillandictionary.com/dictionary/british/displacement-activity

Cheers,

Manfred


On 01.04.2012 19:35, Gerard Bricogne wrote:

On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote:

On 04/01/12 10:18, Gerard Bricogne wrote:

Dear Paul,

   May I join the mostly silent chorus of Greek/Latin-aware grumps who
wince when seeing data treated as singular when it is plural.

When it are plural?

  Good nit-picking :-) . In my mind the quotes around data would have
had the same effect as writing 'the word data', and referring to that word
by the 'it'. So there is only one word, while its grammatical number is
plural.



At any rate, I heard a Nobel laureate use it incorrectly just two days ago.

  We shouldn't learn to write by imitating Nobel laureates, then.


  With best wishes,

   Gerard.


--
===
All Things Serve the Beam
===
David J. Schuller
modern man in a post-modern world
MacCHESS, Cornell University
schul...@cornell.edu


--
Dr. Manfred. S. Weiss
Helmholtz-Zentrum Berlin für Materialien und Energie
Macromolecular Crystallography (HZB-MX)
Albert-Einstein-Str. 15
D-12489 Berlin
GERMANY
Fon:   +49-30-806213149
Fax:   +49-30-806214975
Web:   http://www.helmholtz-berlin.de/bessy-mx
Email: mswe...@helmholtz-berlin.de




Helmholtz-Zentrum Berlin für Materialien und Energie GmbH

Mitglied der Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V.

Aufsichtsrat: Vorsitzender Prof. Dr. Dr. h.c. mult. Joachim Treusch, stv. 
Vorsitzende Dr. Beatrix Vierkorn-Rudolph
Geschäftsführerin: Prof. Dr. Anke Rita Kaysser-Pyzalla

Sitz Berlin, AG Charlottenburg, 89 HRB 5583

Postadresse:
Hahn-Meitner-Platz 1
D-14109 Berlin

http://www.helmholtz-berlin.de


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Antony Oliver
To my mind it just points to the fact that many scientists are generally
unable to focus on one task or 'thing' at a time.
i.e. very short attention spans...

[before the flamers start: this is meant as a joke]

Tony.

---
Dr Antony W Oliver

Senior Research Fellow
CR-UK DNA Repair Enzymes Group
Genome Damage and Stability Centre
Science Park Road
University of Sussex
Falmer, Brighton, BN1 9RQ

email: antony.oli...@sussex.ac.uk
tel (office): +44 (0)1273 678349
tel (lab): +44 (0)1273 677512






On 4/2/12 9:47 AM, Manfred S. Weiss manfred.we...@helmholtz-berlin.de
wrote:

Dear all,

I find this discussion most amazing. Here, we are dealing with the most
serious issue that has happened to Macromolecular Crystallography since the
Alabama case, and the whole discussion is centered around singular and
plural and Greek and Latin words and what not.

In psychology such a phenomenon is referred to as displacement activity.

If you are interested, here is the Macmillan definition of it:

http://www.macmillandictionary.com/dictionary/british/displacement-activity

Cheers,

Manfred


On 01.04.2012 19:35, Gerard Bricogne wrote:
 On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote:
 On 04/01/12 10:18, Gerard Bricogne wrote:
 Dear Paul,

May I join the mostly silent chorus of Greek/Latin-aware
grumps who
 wince when seeing data treated as singular when it is plural.
 When it are plural?
   Good nit-picking :-) . In my mind the quotes around data would
have
 had the same effect as writing 'the word data', and referring to that
word
 by the 'it'. So there is only one word, while its grammatical number is
 plural.


 At any rate, I heard a Nobel laureate use it incorrectly just two days
ago.
   We shouldn't learn to write by imitating Nobel laureates, then.


   With best wishes,

Gerard.

 --
 ===
 All Things Serve the Beam
 ===
 David J. Schuller
 modern man in a post-modern world
 MacCHESS, Cornell University
 schul...@cornell.edu

--
Dr. Manfred. S. Weiss
Helmholtz-Zentrum Berlin für Materialien und Energie
Macromolecular Crystallography (HZB-MX)
Albert-Einstein-Str. 15
D-12489 Berlin
GERMANY
Fon:   +49-30-806213149
Fax:   +49-30-806214975
Web:   http://www.helmholtz-berlin.de/bessy-mx
Email: mswe...@helmholtz-berlin.de






Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread John R Helliwell
Dear Colleagues,
This is a further instance of likely scientific fraud in
macromolecular crystallography, i.e. one under formal investigation at the
relevant university.

Both Bernhard and the Acta D and F Editors further document aspects in
their written pieces related to the need for the availability of diffraction
data images. The call for a 'universal system' by the Editors, in
their Editorial, is also what the IUCr Forum on these matters has
been discussing. A possible convergence on local raw data
repositories, with each data set doi registered where it underpins a
publication, detailed by the IUCr DDD WG thus far, is unlikely to be
'universal' in its global coverage. But setting standards by
encouraging raw data archives in our field will afford a much needed
clarity in favour of retaining raw data wherever possible. A separate
issue will be, in my view, the certain expansion of current validation
checks. Indeed it is the standard practice in chemical crystallography
submissions to IUCr journals for Co-Editors to validate the structure
determination and refinement, including omit map calculations where
appropriate. Of course this is most often a much easier task in
chemical crystallography, per crystal structure checked, than would be
the case for macromolecular crystallography.

Again I encourage colleagues to lodge their inputs at the IUCr Forum
on any aspect of principle or practice in achieving diffraction raw
data archiving.

Best wishes,
John

John R Helliwell


On Mon, Apr 2, 2012 at 9:47 AM, Manfred S. Weiss
manfred.we...@helmholtz-berlin.de wrote:
 Dear all,

 I find this discussion most amazing. Here, we are dealing with the most
 serious issue that has happened to Macromolecular Crystallography since the
 Alabama case, and the whole discussion is centered around singular and
 plural and Greek and Latin words and what not.

 In psychology such a phenomenon is referred to as displacement activity.

 If you are interested, here is the Macmillan definition of it:

 http://www.macmillandictionary.com/dictionary/british/displacement-activity

 Cheers,

 Manfred



 On 01.04.2012 19:35, Gerard Bricogne wrote:

 On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote:

 On 04/01/12 10:18, Gerard Bricogne wrote:

 Dear Paul,

       May I join the mostly silent chorus of Greek/Latin-aware grumps
 who
 wince when seeing data treated as singular when it is plural.

 When it are plural?

      Good nit-picking :-) . In my mind the quotes around data would have
 had the same effect as writing 'the word data', and referring to that
 word
 by the 'it'. So there is only one word, while its grammatical number is
 plural.


 At any rate, I heard a Nobel laureate use it incorrectly just two days
 ago.

      We shouldn't learn to write by imitating Nobel laureates, then.


      With best wishes,

           Gerard.

 --
 ===
 All Things Serve the Beam
 ===
                                David J. Schuller
                                modern man in a post-modern world
                                MacCHESS, Cornell University
                                schul...@cornell.edu


 --
 Dr. Manfred. S. Weiss
 Helmholtz-Zentrum Berlin für Materialien und Energie
 Macromolecular Crystallography (HZB-MX)
 Albert-Einstein-Str. 15
 D-12489 Berlin
 GERMANY
 Fon:   +49-30-806213149
 Fax:   +49-30-806214975
 Web:   http://www.helmholtz-berlin.de/bessy-mx
 Email: mswe...@helmholtz-berlin.de


 




-- 
Professor John R Helliwell DSc


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Gerard Bricogne
Dear Manfred,

 I understand your surprise and indignation, but for the sake of
fairness you might also remember that I argued rather insistently at the end
of last year in favour of the deposition of raw diffraction images, which is
the crux of this problem.


 With best wishes,
 
  Gerard.

--
On Mon, Apr 02, 2012 at 10:47:26AM +0200, Manfred S. Weiss wrote:
 Dear all,

 I find this discussion most amazing. Here, we are dealing with the most
 serious issue that has happened to Macromolecular Crystallography since the
 Alabama case, and the whole discussion is centered around singular and
 plural and Greek and Latin words and what not.

 In psychology such a phenomenon is referred to as displacement activity.

 If you are interested, here is the Macmillan definition of it:

 http://www.macmillandictionary.com/dictionary/british/displacement-activity

 Cheers,

 Manfred


 On 01.04.2012 19:35, Gerard Bricogne wrote:
 On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote:
 On 04/01/12 10:18, Gerard Bricogne wrote:
 Dear Paul,

May I join the mostly silent chorus of Greek/Latin-aware grumps 
 who
 wince when seeing data treated as singular when it is plural.
 When it are plural?
   Good nit-picking :-) . In my mind the quotes around data would 
 have
 had the same effect as writing 'the word data', and referring to that 
 word
 by the 'it'. So there is only one word, while its grammatical number is
 plural.


 At any rate, I heard a Nobel laureate use it incorrectly just two days 
 ago.
   We shouldn't learn to write by imitating Nobel laureates, then.


   With best wishes,

Gerard.

 --
 ===
 All Things Serve the Beam
 ===
 David J. Schuller
 modern man in a post-modern world
 MacCHESS, Cornell University
 schul...@cornell.edu

 --
 Dr. Manfred. S. Weiss
 Helmholtz-Zentrum Berlin für Materialien und Energie
 Macromolecular Crystallography (HZB-MX)
 Albert-Einstein-Str. 15
 D-12489 Berlin
 GERMANY
 Fon:   +49-30-806213149
 Fax:   +49-30-806214975
 Web:   http://www.helmholtz-berlin.de/bessy-mx
 Email: mswe...@helmholtz-berlin.de


 


-- 

 ===
 * *
 * Gerard Bricogne g...@globalphasing.com  *
 * *
 * Global Phasing Ltd. *
 * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
 * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
 * *
 ===


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread jens Preben Morth

For the latest documentary on trolls in Norway see
http://www.imdb.com/title/tt1740707/

The documentary describes both the classification system of Norwegian 
trolls and why they are sensitive to sunlight, i.e. turn to stone.
Depending on the species, some trolls apparently prefer bridges and 
others caves. They are all attracted to Christian blood, though.

cheers
Preben

On 4/1/12 10:42 PM, Ethan Merritt wrote:

On Sunday, 01 April 2012, Kendall Nettles wrote:

What is the single Latin word for troll?

Kendall


According to Google Translate, it is Troglodytarum.
But I'm dubious.
I thought trolls lived under bridges rather than in caves.
Except for the ones who inhabit the internet, of course.

Ethan


--
J. Preben Morth, Ph.D
Group Leader
Membrane Transport Group
Nordic EMBL Partnership
Centre for Molecular Medicine Norway (NCMM)
University of Oslo
P.O.Box 1137 Blindern
0318 Oslo, Norway

Email: j.p.mo...@ncmm.uio.no
Tel: +47 2284 0794

http://www.jpmorth.dk


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Herman . Schreuder
Dear Kevin et al.,
 
At the risk of being flamed as well, I could not resist this opportunity for 
shameless self-promotion. During my Ph.D. I worked on a flavoprotein as well 
and found flavin bending angles of 10 and 19°. I even published pictures of the 
electron density of the flavin (J.Mol.Biol.(1989), 208:679-696) and cited a 
reference from 1987 reporting a flavin bending angle of 20° for another 
flavoprotein. At that time, one had to personally modify the PROLSQ restraints 
by hand, since if something was defined as being flat, it would become flat, no 
matter what the electron density was trying to say. "Trying", since there was no 
Rfree, no maximum likelihood refinement and no CCP4BB, so the maps were heavily 
biased. 
 
Although this period is commonly referred to as the stone age of protein 
crystallography, many crystal structures were solved in this time that are 
still valid today. Before reinventing wheels, one could look a little further 
back in the literature than the last 7 years. The question remains how this 
incorrect FMN definition could remain in the CCP4 package for so long. We need 
more people like Kevin, who loudly complain about errors in the CCP4 
definitions instead of just fixing their personal definitions.
 
Cheers!
Herman




From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
Kevin Jin
Sent: Sunday, April 01, 2012 9:06 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication


I hope and believe that this is not the case.  Even basically-trained 
crystallographers should be able to calculate and interpret difference maps 
of the kind described by Bernhard.  And with the EDS and PDB_REDO server, one 
does not even need to know how to generate a difference map...
 
You are right!  
 
Actually, I am not an experienced protein crystallographer. I have 
learnt a lot from CCP4BB. I may have paid too much attention to bond angles 
and bond lengths, as in small molecules. This may be an example to share with 
you. 
 
When I worked on those nitroreductases complexed with FMN in 2009 (?), I 
always observed that the flavin ring presented a strange geometry after 
refinement. Indeed, I had used the definition of FMN from the CCP4 library all 
the time.
 
In some cases, the methyl group at either position 7a or 8a was bent out of 
the plane of the aromatic ring if the rest of the flavin was restrained to a 
flat plane.  According to my limited knowledge of organic chemistry, carbons 7 
and 8 of the flavin ring are sp2 hybridized and coplanar. How could 
those methyl groups be bent as if sp3 hybridized? Is there any chemistry behind it?
 
With increased resolution (1.6 ~ 1.8 Å), I observed that the electron 
density of the flavin was bent along the N5-N10 axis. The bend angle was around 
16 degrees.   Again, I asked myself why it was bent. Could this be correct?
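
For readers who want to check such a bend on their own coordinates, here is a minimal sketch (not part of the original message) of one simple way to estimate a bend about the N5-N10 axis. It assumes a simplified definition: each side of the ring system is represented by the centroid of a few hand-picked atoms, and the bend is the angle between the two half-planes that each contain the N5-N10 axis and one centroid. Published values are usually derived from least-squares planes, so numbers from this sketch may differ somewhat. The function name bend_angle and the choice of atoms are illustrative assumptions.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def bend_angle(side_a, n5, n10, side_b):
    """Deviation from planarity (degrees) about the n5-n10 axis.

    side_a and side_b are lists of (x, y, z) atom positions on either
    side of the axis; which atoms to include is left to the user
    (an assumption of this sketch, not a standard definition).
    """
    axis = sub(n10, n5)
    # Normal of the half-plane holding the axis and each side's centroid.
    # The operand order is swapped for side_b so that both normals point
    # the same way when the ring system is flat.
    normal_a = cross(axis, sub(centroid(side_a), n5))
    normal_b = cross(sub(centroid(side_b), n5), axis)
    # atan2 stays numerically well-behaved for angles close to zero.
    c = cross(normal_a, normal_b)
    return math.degrees(math.atan2(math.sqrt(dot(c, c)),
                                   dot(normal_a, normal_b)))
```

A perfectly flat ring system gives 0 degrees with this definition. In practice side_a and side_b would be the ring atoms on the benzene side and on the pyrimidine side, taken from the refined model.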
 
According to my limited knowledge of chemistry, N10 should have an sp3 
configuration even if FMN is in its oxidized form, in which case the flavin 
ring should be bent. A quick Google search immediately gave me a link to a very 
nice paper published by David W. Rodgers in 2002.  
 
http://www.jbc.org/content/277/13/11513.full.pdf+html
 
According to this paper, Yes!  In the oxidized enzyme, the flavin ring 
system adopts a strongly bent (16°) conformation, and the bend increases (25°) 
in the reduced form of the enzyme,...
 
When I reported this in the group meeting, I was laughed at and told that 
this was just model bias and that it was over-interpreted.  Nobody has such 
sharp vision on an electron density map.  If this was correct, why had nobody 
found this and reported it to CCP4 within the last 7 years? 
 
Eventually, a senior team member emailed CCP4 about this issue. 
Since then, the definition of FMN has been updated, according to my suggestion. 
 
I was asked "how did you find it?"... "why did you believe you were so 
right?"  I really don't know how to answer. 
 
Je pense donc je suis
 
Kevin
 
 
On Sun, Apr 1, 2012 at 8:09 AM, Paul Emsley 
paul.ems...@bioch.ox.ac.uk wrote:
 On 31/03/12 23:08, Kevin Jin wrote:


 I really wish PDB could have some people to review those important
 structures, like paper reviewer.


 So do the wwPDB, I would imagine. 

 But they can't just magic funding and positions into existence...

 If the coordinate is downloaded for modeling and docking, people may 
not
 check the density and model by themself. However this is not the 
worst case,
 since the original data was fabricated.


 1. All of data was correct and real

Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Boaz Shaanan
OK, following on our psychological displacement:

The examples Phoebe gave are mostly of collective nouns

http://en.wikipedia.org/wiki/Collective_noun

to be distinguished from mass nouns:

http://en.wikipedia.org/wiki/Mass_noun

Strictly speaking, "data" is not a collective noun and is the plural of "datum".
Use of the singular form is accepted nowadays but it doesn't mean that it's
correct. To quote Merriam-Webster: "...Data leads its own life independent of
datum..."
See:
http://www.merriam-webster.com/dictionary/data

And by the way, what do you answer to "how much data did you collect"? A lot?
Just a little?
Had we asked: how complete is your data? How many frames did you collect?
How many data sets? Wouldn't we have got a much more informative answer?

  Boaz

Most crystallographers use the word "data" as a mass noun - that is, the syntax
of "data" follows that of "gravel" or "mud", not that of "pebble/pebbles".
People who pounce on the phrase "data is" routinely say "data collection" and
"data processing".  But note that the proper way to construct compound nouns
such as those is to use the singular form - one would never say "rocks
collection" or "apples picking".  So if we have to say "data are" then we
should be discussing how (not) to fabricate a "datum set".  Also note that when
people come back from the synchrotron, we ask "how much data did you collect",
not "how many".  "Much" is generally used with mass nouns.

That doesn't mean we can't ALSO use the word as one with discrete singular and
plural forms, especially when we have a few, individual observations rather than
a huge pile that blurs into an aggregate.  In that case, I see nothing incorrect
about discussing an individual "datum" and using "data" as the plural form.

Sometimes it is the artificial, over-simplified rule that is stupid, not the
native speakers of a language.


=
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723
http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread David Schuller
I am surprised that James Holton was not listed as a co-author, I 
understand that he has been expending a great deal of effort into how to 
accurately fabricate data.


--
===
All Things Serve the Beam
===
   David J. Schuller
   modern man in a post-modern world
   MacCHESS, Cornell University
   schul...@cornell.edu


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Herman . Schreuder
If James Holton had been involved, the fabrication would not have been
discovered. 
Herman

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
David Schuller
Sent: Monday, April 02, 2012 2:56 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication

I am surprised that James Holton was not listed as a co-author, I
understand that he has been expending a great deal of effort into how to
accurately fabricate data.

--
===
All Things Serve the Beam
===
David J. Schuller
modern man in a post-modern world
MacCHESS, Cornell University
schul...@cornell.edu


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Gerard DVD Kleywegt

Dear Manfred,

Outside Germany, such excursions are called humour. If you are interested, 
here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour


--Gerard

PS: It was on a Sunday so all levity was perpetrated in people's own time. 
Today we'll all be serious again and frown and tut-tut appropriately.




On Mon, 2 Apr 2012, Manfred S. Weiss wrote:


Dear all,

I find this discussion most amazing. Here, we are dealing with the most
serious issue that has happened to Macromolecular Crystallography since the
Alabama case, and the whole discussion is centered around singular and plural
and Greek and Latin words and what not.

In psychology such a phenomenon is referred to as displacement activity.

If you are interested, here is the Macmillan definition of it:

http://www.macmillandictionary.com/dictionary/british/displacement-activity

Cheers,

Manfred


On 01.04.2012 19:35, Gerard Bricogne wrote:

On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote:

On 04/01/12 10:18, Gerard Bricogne wrote:

Dear Paul,

   May I join the mostly silent chorus of Greek/Latin-aware grumps who
wince when seeing "data" treated as singular when it is plural.

When it are plural?

  Good nit-picking :-) . In my mind the quotes around "data" would have
had the same effect as writing 'the word "data"', and referring to that word
by the 'it'. So there is only one word, while its grammatical number is
plural.


At any rate, I heard a Nobel laureate use it incorrectly just two days ago.

  We shouldn't learn to write by imitating Nobel laureates, then.


  With best wishes,

   Gerard.


--
===
All Things Serve the Beam
===
David J. Schuller
modern man in a post-modern world
MacCHESS, Cornell University
schul...@cornell.edu


--
Dr. Manfred. S. Weiss
Helmholtz-Zentrum Berlin für Materialien und Energie
Macromolecular Crystallography (HZB-MX)
Albert-Einstein-Str. 15
D-12489 Berlin
GERMANY
Fon:   +49-30-806213149
Fax:   +49-30-806214975
Web:   http://www.helmholtz-berlin.de/bessy-mx
Email: mswe...@helmholtz-berlin.de




Helmholtz-Zentrum Berlin für Materialien und Energie GmbH

Mitglied der Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V.

Aufsichtsrat: Vorsitzender Prof. Dr. Dr. h.c. mult. Joachim Treusch, stv.
Vorsitzende Dr. Beatrix Vierkorn-Rudolph

Geschäftsführerin: Prof. Dr. Anke Rita Kaysser-Pyzalla

Sitz Berlin, AG Charlottenburg, 89 HRB 5583

Postadresse:
Hahn-Meitner-Platz 1
D-14109 Berlin

http://www.helmholtz-berlin.de




Best wishes,

--Gerard

**
   Gerard J. Kleywegt

  http://xray.bmc.uu.se/gerard   mailto:ger...@xray.bmc.uu.se
**
   The opinions in this message are fictional.  Any similarity
   to actual opinions, living or dead, is purely coincidental.
**
   Little known gastromathematical curiosity: let z be the
   radius and a the thickness of a pizza. Then the volume
of that pizza is equal to pi*z*z*a !
**


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Robert Sweet

I thought Ethan was looking for the verb -- you know, fishing!!!


On Mon, 2 Apr 2012, jens Preben Morth wrote:


For the latest documentary on trolls in Norway see
http://www.imdb.com/title/tt1740707/

The documentary describes both the classification system of Norwegian Trolls
and why they are sensitive to sunlight, i.e. turn to stone.
Depending on the species, some Trolls apparently prefer bridges and others
caves. They are all attracted to Christian blood though.

cheers
Preben

On 4/1/12 10:42 PM, Ethan Merritt wrote:

On Sunday, 01 April 2012, Kendall Nettles wrote:

What is the single Latin word for troll?

Kendall


According to Google Translate, it is Troglodytarum.
But I'm dubious.
I thought trolls lived under bridges rather than in caves.
Except for the ones who inhabit the internet, of course.

Ethan





--
=
Robert M. Sweet                          E-Dress: sw...@bnl.gov
Group Leader, PXRR: Macromolecular                ^ (that's L
  Crystallography Research Resource at NSLS          not 1)
  http://px.nsls.bnl.gov/
Biology Dept
Brookhaven Nat'l Lab.                    Phones:
Upton, NY  11973                         631 344 3401  (Office)
U.S.A.                                   631 344 2741  (Facsimile)
=


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Bernhard Rupp (Hofkristallrat a.D.)
Guys,

http://www.youtube.com/watch?v=CobZuaPMQHw

second 9 in this 22 sec video 


-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Gerard
DVD Kleywegt
Sent: Monday, April 02, 2012 8:04 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very
informative - Trends in Data Fabrication]

Dear Manfred,

Outside Germany, such excursions are called humour. If you are interested,
here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour

--Gerard


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Francis E Reyes
I'm now preparing for the flood of 'unsubscribe ccp4bb' requests


On Apr 2, 2012, at 9:15 AM, Bernhard Rupp (Hofkristallrat a.D.) wrote:

 Guys,
 
 http://www.youtube.com/watch?v=CobZuaPMQHw
 
 second 9 in this 22 sec video 
 
 
 -Original Message-
 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Gerard
 DVD Kleywegt
 Sent: Monday, April 02, 2012 8:04 AM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very
 informative - Trends in Data Fabrication]
 
 Dear Manfred,
 
 Outside Germany, such excursions are called humour. If you are interested,
 here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour
 
 --Gerard


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Andreas Förster

Dear Gerard,

inside Germany it's apparently called German Humour.  There's a 
Wikipedia entry for that as well.  Go figure:


http://en.wikipedia.org/wiki/German_humor


Andreas

(still living on Sunday time)


On 02/04/2012 4:03, Gerard DVD Kleywegt wrote:

Dear Manfred,

Outside Germany, such excursions are called humour. If you are
interested, here is the Wikipedia page for it:
http://en.wikipedia.org/wiki/Humour

--Gerard

PS: It was on a Sunday so all levity was perpetrated in people's own
time. Today we'll all be serious again and frown and tut-tut appropriately.



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Bernhard Rupp (Hofkristallrat a.D.)
Robbie has restored the PDB_REDO of 3k78.

It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2

and Louise Jones from the IUCr office has kindly made the article open
access.

http://journals.iucr.org/f/issues/2012/04/00/issconts.html

BR

 

 

 

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
Bernhard Rupp (Hofkristallrat a.D.)
Sent: Sunday, April 01, 2012 06:06
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication

 

 Hofkristallrat außer Dienst, is written as Bernhard - unless you are
referring to some other guy with a French name Bernard.

As one may extrapolate given my recent paper, I have been called names a lot
worse...

> And the book indeed is a bible of xtallography.

Enough of this - it is becoming embarrassing. I wish I had done a more
careful job proofing, as over 500 errata attest to, and we all are only
seeing further because we are standing on the shoulders of giants. So once
again thanks to all the contributors I have pestered with my questions on BB
and then some, and to all those who actually read BMC and submitted errata.

 

Best regards, BR

-
Bernhard Hieronimus Rupp, Hofkristallrat a.D.
001 (925) 209-7429
+43 (676) 571-0536
hofkristall...@gmail.com
b...@hofkristallamt.org
http://www.ruppweb.org/
--
Once the sun of science is standing low, even dwarfs cast tall shadows
--

 

 



Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Gerard DVD Kleywegt

Dear Andreas,

That page confirms the old adage: German humour is no laughing matter.

--Gerard


On Mon, 2 Apr 2012, Andreas Förster wrote:


Dear Gerard,

inside Germany it's apparently called German Humour.  There's a Wikipedia 
entry for that as well.  Go figure:


http://en.wikipedia.org/wiki/German_humor


Andreas

(still living on Sunday time)


On 02/04/2012 4:03, Gerard DVD Kleywegt wrote:

Dear Manfred,

Outside Germany, such excursions are called humour. If you are
interested, here is the Wikipedia page for it:
http://en.wikipedia.org/wiki/Humour

--Gerard

PS: It was on a Sunday so all levity was perpetrated in people's own
time. Today we'll all be serious again and frown and tut-tut appropriately.






Best wishes,

--Gerard

**
   Gerard J. Kleywegt

  http://xray.bmc.uu.se/gerard   mailto:ger...@xray.bmc.uu.se
**
   The opinions in this message are fictional.  Any similarity
   to actual opinions, living or dead, is purely coincidental.
**
   Little known gastromathematical curiosity: let z be the
   radius and a the thickness of a pizza. Then the volume
of that pizza is equal to pi*z*z*a !
**


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread George T. DeTitta
And please consider the date of Sunday's posts. 

We take this stuff seriously. That's what's nice about science. We ferret out 
mischief and bring it to the public. Nothing up my sleeve - all tricks will be 
exposed and dealt with harshly 

A Buffalo view. 



-Original Message-
From: Gerard DVD Kleywegt ger...@xray.bmc.uu.se
Sender: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK
Date: Mon, 2 Apr 2012 17:03:42 
To: CCP4BB@JISCMAIL.AC.UK
Reply-To: Gerard DVD Kleywegt ger...@xray.bmc.uu.se
Subject: Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - 
Trends in Data Fabrication]

Dear Manfred,

Outside Germany, such excursions are called humour. If you are interested,
here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour

--Gerard

PS: It was on a Sunday so all levity was perpetrated in people's own time.
Today we'll all be serious again and frown and tut-tut appropriately.





Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Phoebe Rice
Can we leverage this to push journals to routinely allow reviewers access to
coordinates and maps?

Outright fraud is outrageous, but I'm actually more worried about ligands fit 
to marginal density and other issues of under-supervised model building.   

=
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723
http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp


 Original message 
Date: Mon, 2 Apr 2012 08:41:02 -0700
From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of Bernhard Rupp 
(Hofkristallrat a.D.) hofkristall...@gmail.com)
Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication  
To: CCP4BB@JISCMAIL.AC.UK

   Robbie has restored the PDB_REDO of 3k78



   It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2



   and Louise Jones from the IUCr office has kindly
   made the article open access.



   http://journals.iucr.org/f/issues/2012/04/00/issconts.html



   BR








Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Jacob Keller
I like your point--somehow we should enlist the evil inclination to power
our science, a la Faust. How is it that those hackers are so innovative for
so little reward? I remember a Smithsonian article years ago which quoted
the calculated mean $/hr rate of money counterfeiters as being ~pennies/hr,
and I assume hackers would fit right in there...

JPK

On Sun, Apr 1, 2012 at 11:45 PM, Artem Evdokimov
artem.evdoki...@gmail.com wrote:

 I can't resist asking: If we assume that the data fabrication
 techniques and the techniques for discovery of such activities should
 have the same sort of arms race as the development of viruses and
 anti-malware software (but of course on a much more modest scale since
 structural biology is a relatively niche discipline) - can we then
 speculate further that eventually the most sophisticated fabrication
 techniques would be equivalent to de novo structure prediction :) It's
 really too bad that there's no real money in this (again, relatively
 speaking - not as much money as there is in software development),
 because if there was then the structural biology equivalent of 'virus
 hackers' would in reality approximate the same development trajectory
 as the most successful (and legitimate) protein modelers. Given the
 ingenuity of hackers and like-minded people in general, I sometimes
 wonder if this isn't a better way to develop structure prediction
 tools...

 Artem

 On Sun, Apr 1, 2012 at 10:09 AM, Paul Emsley paul.ems...@bioch.ox.ac.uk
 wrote:
  On 31/03/12 23:08, Kevin Jin wrote:
 
 
  I really wish PDB could have some people to review those important
  structures, like paper reviewer.
 
 
  So do the wwPDB, I would imagine.
 
  But they can't just magic funding and positions into existence...
 
  If the coordinate is downloaded for modeling and docking, people may not
  check the density and model by themself. However this is not the worst
  case, since the original data was fabricated.
 
 
  1. All of data was correct and real,
 
 
  Hmmm...
 
   It will be very difficult for people to check the density and coordinates
  if he/she is not a well-trained crystallographer.
 
 
  I hope and believe that this is not the case.  Even basically-trained
  crystallographers should be able to calculate and interpret difference maps
  of the kind described by Bernhard.  And with the EDS and PDB_REDO server,
  one does not even need to know how to generate a difference map...
 
  Paul.
 
 




-- 
***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
email: j-kell...@northwestern.edu
***


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Prince, D Bryan
I thought we had evidence for hackers doing this already. :-)



http://www.nature.com/nature/journal/v477/n7365/full/477373e.html



(no flames, please-'tis intended to be funny, not factual)



Bryan



From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
Jacob Keller
Sent: Monday, April 02, 2012 1:25 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication



I like your point--somehow we should enlist the evil inclination to
power our science, a la Faust. How is it that those hackers are so
innovative for so little reward? I remember a Smithsonian article years
ago which quoted the calculated mean $/hr rate of money counterfeiters
as being ~pennies/hr, and I assume hackers would fit right in there...



JPK



 


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Bosch, Juergen
Hm, last I checked my passport said German - still think I can make lots of fun
of myself. Some Germans are epigenetically marked with humor-suppressor genes,
others not.

Jürgen

On Apr 2, 2012, at 11:03 AM, Gerard DVD Kleywegt wrote:

Dear Manfred,

Outside Germany, such excursions are called humour. If you are interested,
here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour

--Gerard

PS: It was on a Sunday so all levity was perpetrated in people's own time.
Today we'll all be serious again and frown and tut-tut appropriately.




..
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-2926
http://web.mac.com/bosch_lab/






Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Kendall Nettles
My favorite part of the German humor link:

Some German humorists such as Loriot
(http://en.wikipedia.org/wiki/Vicco_von_B%C3%BClow) use seriousness as
means of humor.


On Apr 2, 2012, at 1:38 PM, Bosch, Juergen wrote:

Hm, last I checked my passport said German - still think I can make lots of fun 
of myself. Some Germans are epigenetically marked with humor-suppressor genes 
others not.

Jürgen

On Apr 2, 2012, at 11:03 AM, Gerard DVD Kleywegt wrote:

Dear Manfred,

Outside Germany, such excursions are called humour. If you are interested,
here is the Wikipedia page for it: http://en.wikipedia.org/wiki/Humour

--Gerard

PS: It was on a Sunday so all levity was perpetrated in people's own time.
Today we'll all be serious again and frown and tut-tut appropriately.



On Mon, 2 Apr 2012, Manfred S. Weiss wrote:

Dear all,

I find this discussion most amazing. Here, we are dealing with the most
serious issue
that happened to Macromolecular Crystallography since the Alabama case,
and the
whole discussion is centered around singular and plural and Greek and
Latin words
and what not.

In psychology such phenomenon is referred to as displacement activity.

If you are interested, here is the MacMillon definition of it:

http://www.macmillandictionary.com/dictionary/british/displacement-activity

Cheers,

Manfred


On 01.04.2012 19:35, Gerard Bricogne wrote:
On Sun, Apr 01, 2012 at 01:18:15PM -0400, David Schuller wrote:
On 04/01/12 10:18, Gerard Bricogne wrote:
Dear Paul,

  May I join the mostly silent chorus of Greek/Latin-aware grumps
who
wince when seeing data treated as singular when it is plural.
When it are plural?
 Good nit-picking :-) . In my mind the quotes around data would have
had the same effect as writing 'the word data', and referring to that
word
by the 'it'. So there is only one word, while its grammatical number is
plural.


At any rate, I heard a Nobel laureate use it incorrectly just two days
ago.
 We shouldn't learn to write by imitating Nobel laureates, then.


 With best wishes,

  Gerard.

--
===
All Things Serve the Beam
===
   David J. Schuller
   modern man in a post-modern world
   MacCHESS, Cornell University
   schul...@cornell.edumailto:schul...@cornell.edu

--
Dr. Manfred. S. Weiss
Helmholtz-Zentrum Berlin f?r Materialien und Energie
Macromolecular Crystallography (HZB-MX)
Albert-Einstein-Str. 15
D-12489 Berlin
GERMANY
Fon:   +49-30-806213149
Fax:   +49-30-806214975
Web:   http://www.helmholtz-berlin.de/bessy-mx
Email: mswe...@helmholtz-berlin.demailto:mswe...@helmholtz-berlin.de




Helmholtz-Zentrum Berlin f?r Materialien und Energie GmbH

Mitglied der Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren
e.V.

Aufsichtsrat: Vorsitzender Prof. Dr. Dr. h.c. mult. Joachim Treusch, stv.
Vorsitzende Dr. Beatrix Vierkorn-Rudolph
Gesch?ftsf?hrerin: Prof. Dr. Anke Rita Kaysser-Pyzalla

Sitz Berlin, AG Charlottenburg, 89 HRB 5583

Postadresse:
Hahn-Meitner-Platz 1
D-14109 Berlin

http://www.helmholtz-berlin.dehttp://www.helmholtz-berlin.de/



Best wishes,

--Gerard

**
   Gerard J. Kleywegt

  http://xray.bmc.uu.se/gerard   mailto:ger...@xray.bmc.uu.se
**
   The opinions in this message are fictional.  Any similarity
   to actual opinions, living or dead, is purely coincidental.
**
   Little known gastromathematical curiosity: let z be the
   radius and a the thickness of a pizza. Then the volume
of that pizza is equal to pi*z*z*a !
**

..
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-2926
http://web.mac.com/bosch_lab/







Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Tim Gruene
And the summary indicates that "outside Germany" = "English-speaking
world", which probably reveals its author as American ;-)



On 04/02/12 18:25, Gerard DVD Kleywegt wrote:
 Dear Andreas,
 
 That page confirms the old adage: German humour is no laughing matter.
 
 --Gerard
 
 
 On Mon, 2 Apr 2012, Andreas Förster wrote:
 
 Dear Gerard,

 inside Germany it's apparently called German Humour.  There's a
 Wikipedia entry for that as well.  Go figure:

 http://en.wikipedia.org/wiki/German_humor


 Andreas

 (still living on Sunday time)


 On 02/04/2012 4:03, Gerard DVD Kleywegt wrote:
 Dear Manfred,

 Outside Germany, such excursions are called humour. If you are
 interested, here is the Wikipedia page for it:
 http://en.wikipedia.org/wiki/Humour

 --Gerard

 PS: It was on a Sunday so all levity was perpetrated in people's own
 time. Today we'll all be serious again and frown and tut-tut
 appropriately.


 
 
 Best wishes,
 
 --Gerard
 
 **
Gerard J. Kleywegt
 
   http://xray.bmc.uu.se/gerard   mailto:ger...@xray.bmc.uu.se
 **
The opinions in this message are fictional.  Any similarity
to actual opinions, living or dead, is purely coincidental.
 **
Little known gastromathematical curiosity: let z be the
radius and a the thickness of a pizza. Then the volume
 of that pizza is equal to pi*z*z*a !
 **
 

-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Maria Sola i Vilarrubias
Dear Phoebe,

I cannot imagine myself delivering maps and coordinates (after years of
work... I insist: after years of work) to a reviewer who could, by
whatever chance, be my best competitor (even if I had asked the editor not
to include him/her as a reviewer... but editors' decisions are of all
kinds).

I simply prefer not to imagine this after two publications fuelled by clear,
direct and strong competition. That was stressful enough already. If I
had to add to this stress the thought that my coordinates could fall into the
wrong hands, then I think I would just give up or, alternatively, send
the work to a lower-impact, fast-publishing journal and make my life easier
while sending my scientific future to the low-impact bin, killing future
opportunities.

Competition is there. I see that data to be deposited are strictly
confidential. I support the PDB doing the quality check at the level
you mention, but not a reviewer: people are nice, but the world is big and
competition is crazy… at least crazy enough to commit fraud or copy others' work.
The latter is less difficult; by copying (simply copy and paste to my
computer this nice structure that I was looking for!), there is no need to
invent anything.

About a wrongly fit compound, the reviewer can ask for images of the model
in a map calculated at a specific sigma and in different orientations.

Maria


On 2 April 2012 18:43, Phoebe Rice pr...@uchicago.edu wrote:

 Can we leverage this to push journals to routinely allow reviewers access
 to coordinates and maps?

 Outright fraud is outrageous, but I'm actually more worried about ligands
 fit to marginal density and other issues of under-supervised model building.

 =
 Phoebe A. Rice
 Dept. of Biochemistry & Molecular Biology
 The University of Chicago
 phone 773 834 1723

 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
 http://www.rsc.org/shop/books/2008/9780854042722.asp


  Original message 
 Date: Mon, 2 Apr 2012 08:41:02 -0700
 From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of
 Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com)
 Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication
 To: CCP4BB@JISCMAIL.AC.UK
 
Robbie has restored the PDB_REDO of 3k78
 
 
 
It is at www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2
 
 
 
and Louise Jones from the IUCr office has kindly
made the article open access.
 
 
 
http://journals.iucr.org/f/issues/2012/04/00/issconts.html
 
 
 
BR
 
 
 
 
 
 
 
From: CCP4 bulletin board
[mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard
Rupp (Hofkristallrat a.D.)
Sent: Sunday, April 01, 2012 06:06
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] very informative - Trends in
Data Fabrication
 
 
 
 Hofkristallrat außer Dienst, is written as
Bernhard - unless you are referring to some other
guy with a French name Bernard.
 
 
 
As one may extrapolate given my recent paper, I have
been called names a lot worse
 
 
 
And the book indeed is a bible of xtallography.
 
 
 
Enough of this - it is becoming embarrassing. I wish
I had done a more careful job proofing, as over 500
errata attest to, and we all are only seeing further
because we are standing on the shoulders of giants.
So once again thanks to all the contributors I have
pestered with my questions on BB and then some, and
to all those who actually read BMC and submitted errata.
 
 
 
Best regards, BR
 
-
Bernhard Hieronimus Rupp, Hofkristallrat a.D.
001 (925) 209-7429
+43 (676) 571-0536
hofkristall...@gmail.com
b...@hofkristallamt.org
http://www.ruppweb.org/
--
Once the sun of science is standing low, even dwarfs
cast tall shadows
--
 
 
 
 




-- 
Maria Solà
Dep. Structural Biology
IBMB-CSIC
Baldiri Reixach 10-12
08028 BARCELONA
Spain
Tel: (+34) 93 403 4950
Fax: (+34) 93 403 4979
e-mail: maria.s...@ibmb.csic.es


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Nat Echols
On Mon, Apr 2, 2012 at 11:00 AM, Maria Sola i Vilarrubias
msv...@ibmb.csic.es wrote:
 About a wrongly fit compound, the reviewer can ask for images of the model in
 a map calculated at a specific sigma and in different orientations.

This will often be insufficient, I'm afraid.  We generally assume good
faith on the part of the authors: if the caption says the 2mFo-DFc
map is shown contoured at 1.5sigma, we assume that this is an honest
statement, but we also have no way of verifying it until the
experimental data are available.  I know of at least one case offhand
where the maps could not possibly have been contoured at that level -
the ligands are not misfit, they are simply not present in the
crystals, and the paper is misleading (deliberately or not, I don't
know).  Most reviewers do not have the patience to spend weeks
pursuing these issues.  (Although it would certainly help if reviewers
insisted that the density around ligands not be shown in isolation.)

That aside, I completely understand why someone would be reluctant to
share their data with potential competitors.  Someone once suggested
making the model and maps viewable via a web applet (AstexViewer or
similar), but even that sounds like it could be prone to abuse.

-Nat


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Pete Meyer

Artem Evdokimov wrote:

I can't resist asking: If we assume that the data fabrication
techniques and the techniques for discovery of such activities should
have the same sort of arms race as the development of viruses and
anti-malware software (but of course on a much more modest scale since
structural biology is a relatively niche discipline) - can we then


I don't think this assumption holds for structure prediction, except in
the extreme asymptotic limit.  All of the cases of fabricated data that
I've heard of were detected because the fabricated data didn't look like
actual experimental data - because our models for calculating data are
missing a variety of things that occur experimentally.

So a hypothetical arms race might result in a better model of the
various components (and potential sources) of errors during data
collection and processing.  But this would be a much more interesting
development in itself than any use for fabricating data.

Pete


speculate further that eventually the most sophisticated fabrication
techniques would be equivalent to de novo structure prediction :) It's
really too bad that there's no real money in this (again, relatively
speaking - not as much money as there is in software development),
because if there was then the structural biology equivalent of 'virus
hackers' would in reality approximate the same development trajectory
as the most successful (and legitimate) protein modelers. Given the
ingenuity of hackers and like-minded people in general, I sometimes
wonder if this isn't a better way to develop structure prediction
tools...

Artem

On Sun, Apr 1, 2012 at 10:09 AM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote:

On 31/03/12 23:08, Kevin Jin wrote:


I really wish the PDB could have some people to review those important
structures, like paper reviewers.


So do the wwPDB, I would imagine.

But they can't just magic funding and positions into existence...

If the coordinates are downloaded for modeling and docking, people may not
check the density and model by themselves. However, this is not the worst case,
since the original data were fabricated.


1. All of data was correct and real,


Hmmm...

 It will be very difficult for people to check the density and coordinates
if he/she is not a well-trained crystallographer.


I hope and believe that this is not the case.  Even basically-trained
crystallographers should be able to calculate and interpret difference maps
of the kind described by Bernhard.  And with the EDS and PDB_REDO server,
one does not even need to know how to generate a difference map...

Paul.




Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Phoebe Rice
That's very sad, but a good point.  I may be a bit naive because I haven't had 
to worry as much about direct competition.  

However, I do find it very frustrating as a reviewer to try to pass judgement 
on a crystal structure based only on the standard Table 1.  Sometimes I'm 
tempted to write "based on the information presented, darned if I know!"

Maybe 3rd-party validation through the pdb (with a report sent to the 
reviewers) is more appropriate?  

Phoebe

=
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723
http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp



Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Randy Read
Dear Phoebe,

As it happens, validation through the PDB is exactly what the X-ray Validation 
Task Force proposed (to be honest, it was a suggestion made by George Sheldrick 
the last time there was a debate like this on the CCP4-BB!), and the wwPDB is 
currently implementing the pipeline needed to automatically produce a good 
validation report.  A preliminary version of such a report is already available 
when you deposit a structure now; the IUCr journals already require this for 
papers describing structures, and there seems to be interest from some other 
journals.  In the meantime, if you're refereeing a paper for a journal that 
doesn't require the validation report to be submitted with the paper, you can 
always ask them to get it from the author.

Best wishes,

Randy

-
Randy J. Read
Department of Haematology, University of Cambridge
Cambridge Institute for Medical Research      Tel: +44 1223 336500
Wellcome Trust/MRC Building                   Fax: +44 1223 336827
Hills Road                                    E-mail: rj...@cam.ac.uk
Cambridge CB2 0XY, U.K.                       www-structmed.cimr.cam.ac.uk


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-02 Thread Andreas Förster

That's pretty funny, isn't it?


Andreas



On 02/04/2012 6:52, Jacob Keller wrote:

Sorry to beat a dead horse, but:


  * *Antiwitz* (/anti-joke/): A short, often absurd scene, which has the
recognizable structure of a joke, but is illogical or lacking a
punch-line.

Example: /Two thick feet are crossing the street. Says one thick
foot to the other thick foot: Hello!/

Other examples: Nachts ist es kälter als draußen (At night it's
colder than outside) or Zu Fuß ist es kürzer als über'n Berg
(Walking is faster than over the mountain).





Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-01 Thread Anastassis Perrakis
 I still believe Prof. Dr. Hofkristallrat außer Dienst, is written as 
 Bernhard - unless you are referring to some other guy with a  french name 
 Bernard. And the book indeed is a bible of xtallography.
Jürgen

ausser Dienst ... now I get it ... my German is a lot worse than just spelling 
names wrong ;-)
(and sorry for the 'ss' - no clue where the Eszett is on my keyboard)

and indeed the book is great - maybe get the publication year and count PDB 
structures before and after it ... We are in year 3 Anno Rupp ... (and no 
worries Bernhard about the errata ... the Bible likely contains many more ... 
each chapter is contradicting every other ... at least you are consistent!)

Best -

A.


[ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread Paul Emsley

The PDBe page for 3k78 says:

The experimental data has been deposited

the data cif file says:

data is under question

Grump.

Is it too late to refer to data as if there were more than one of them?

Anyway, the data mtz file is here if you want to refine with it:

http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz

Paul.


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread Gerard Bricogne
Dear Paul,

 May I join the mostly silent chorus of Greek/Latin-aware grumps who
wince when seeing data treated as singular when it is plural. Related
instances are

 * a phenomenon (singular) vs. several phenomena (plural),
 
 * a criterion (singular) vs. several criteria (plural)
 
and many more.

 And then there is the infamous mix-up between principal (adjective)
and principle (noun, as in Principle of Least Action, or Peter's
Principle) giving rise to the favourite hero, the Principle Investigator.

 This phenomena is now so widespread that perhaps compliance with
ancient Greek or Latin morphology is no longer a relevant criteria ;-) . 


 With best wishes,
 
  Gerard.

-- 

 ===
 * *
 * Gerard Bricogne g...@globalphasing.com  *
 * *
 * Global Phasing Ltd. *
 * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
 * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
 * *
 ===


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-01 Thread Robbie Joosten
Dear CCP4BBers,

The PDB_REDO entry Bernhard referred to in his interesting and very thorough
article was automatically deleted because the original PDB entry was
obsoleted. Since access to the 'experimental' data of any study is
important, we have made a compressed copy of the PDB_REDO entry available at
http://www.cmbi.ru.nl/pdb_redo/others/3k78.tar.bz2 
Our apologies to those who have looked for this entry in vain.

Best wishes,
Robbie Joosten (on behalf of the PDB_REDO team)

Biochemistry
Netherlands Cancer Institute

P.S. The whole fraud thing seems to have interfered with the annual April
fools' post on CCP4BB. Let's hope this will not happen again. 





 -Original Message-
 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
 Michel Fodje
 Sent: Saturday, March 31, 2012 21:55
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication
 
 Very interesting
 
 Response to Detection and analysis of unusual features in the structural
 model and structure-factor data of a birch pollen allergen
 doi:10.1107/S1744309112008433
 
 a quote from the response:
 
 Author Schwarzenbacher admits to the allegations of data fabrication and
 deeply apologizes to the co-authors and the scientific community for all the
 problems this has caused
 
 .
 
 Note added in proof: subsequent to the acceptance of this article for
 publication, author Schwarzenbacher withdrew his admission of the
 allegations.
 
 
 
 
 
 
 From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of
 Bernhard Rupp (Hofkristallrat a.D.) [hofkristall...@gmail.com]
 Sent: Saturday, March 31, 2012 12:42 PM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication
 
 This is an unresolved problem, and no real satisfactory solution exists,
 because the underlying reasons for zero occupancy can be different.
 For people who understand this and look at electron density, it is not a
 problem. For users who rely on some graphics program displaying only atom
 coordinates, it can be. The same holds for manipulation of B-factors, ‘trading’
 high B-factors against reduced occupancy, and other (almost always purely
 cosmetic but still confusing or inconsistent) practices.
 
 Best, BR
 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
 Nian Huang
 Sent: Saturday, March 31, 2012 11:29 AM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] very informative - Trends in Data Fabrication
 
 I don't model zero occupancy in my model. But can't the refinement
 programs just treat those atoms with zero occupancy as missing atoms?
 
 Nian Huang
 On Sat, Mar 31, 2012 at 10:26 AM, Bosch, Juergen
 jubo...@jhsph.edu wrote:
 really fascinating, bringing back the discussion for a repository for your
 collected frames.
 
 Jürgen
 
 
 Acta Cryst. (2012). F68, 366-376
 doi:10.1107/S1744309112008421 (http://dx.doi.org/10.1107/S1744309112008421)
 
 Detection and analysis of unusual features in the structural model and
 structure-factor data of a birch pollen allergen
 B. Rupp
 
 Abstract: Physically improbable features in the model of the birch pollen
 structure Bet v 1d (PDB entry 3k78) are faithfully reproduced in electron
 density generated with the deposited structure factors, but these structure
 factors themselves exhibit properties that are characteristic of data
 calculated from a simple model and are inconsistent with the data and error
 model obtained through experimental measurements. The refinement of the 3k78
 model against these structure factors leads to an isomorphous structure
 different from the deposited model with an implausibly small R value (0.019).
 The abnormal refinement is compared with normal refinement of an isomorphous
 variant structure of Bet v 1l (PDB entry 1fm4). A variety of analytical
 tools, including the application of Diederichs plots, R plots and
 bulk-solvent analysis are discussed as promising aids in validation. The
 examination of the Bet v 1d structure also cautions against the practice of
 indicating poorly defined protein chain residues through zero occupancies.
 The recommendation to preserve diffraction images is amplified.
 ..
 Jürgen Bosch
 Johns Hopkins University
 Bloomberg School of Public Health
 Department of Biochemistry & Molecular Biology
 Johns Hopkins Malaria Research Institute
 615 North Wolfe Street, W8708
 Baltimore, MD 21205
 Office: +1-410-614-4742
 Lab:  +1-410-614-4894
 Fax:  +1-410-955-2926
 http://web.mac.com/bosch_lab/


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread George T. DeTitta
Perhaps the world could use a few more principle investigators?

A Buffalo view



Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread Adrian Goldman
You can find all the principle investigators you want collecting datums ;) at 
the ESRF,  as that is how the French spell it on the application form for beam 
time!  (Unless it has _finally_ been corrected: haven't checked since I 
submitted my last BAG application in April.)

Adrian




Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread Patrick Loll
Hear, hear! I'm glad to know I'm not the last grump left standing. When I raise 
this point every year, my students regard me with bemused stares, as though 
they've just seen a coelacanth swim past their window...


On 1 Apr 2012, at 10:18 AM, Gerard Bricogne wrote:

 Dear Paul,
 
 May I join the mostly silent chorus of Greek/Latin-aware grumps who
 wince when seeing data treated as singular when it is plural. Related
 instances are
 
 * a phenomenon (singular) vs. several phenomena (plural),

 * a criterion (singular) vs. several criteria (plural)

 and many more.
 
 And then there is the infamous mix-up between principal (adjective)
 and principle (noun, as in Principle of Least Action, or Peter's
 Principle) giving rise to the favourite hero, the Principle Investigator.
 
 This phenomena is now so widespread that perhaps compliance with
 ancient Greek or Latin morphology is no longer a relevant criteria ;-) . 
 
 
 With best wishes,
 
  Gerard.
 
 --
 On Sun, Apr 01, 2012 at 01:05:10PM +0100, Paul Emsley wrote:
 The PDBe page for 3k78 says:
 
 The experimental data has been deposited
 
 the data cif file says:
 
 data is under question
 
 Grump.
 
 Is it to late to refer to data as if there were more than one of them?
 
 Anyway, the data mtz file is here if you want to refine with it:
 
 http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz
 
 Paul.
 
 -- 
 
 ===
 * *
 * Gerard Bricogne g...@globalphasing.com  *
 * *
 * Global Phasing Ltd. *
 * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
 * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
 * *
 ===


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread VAN RAAIJ , MARK JOHAN

another singular/plural grump:
Recently we can read: phage are.
Phage is singular, the plural is phages (and this does not have that  
much to do with latin or greek).

more reading:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3109450/

Quoting Paul Emsley:


The PDBe page for 3k78 says:

The experimental data has been deposited

the data cif file says:

data is under question

Grump.

Is it to late to refer to data as if there were more than one of them?

Anyway, the data mtz file is here if you want to refine with it:

http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz

Paul.





Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoléculas
Centro Nacional de Biotecnología - CSIC
c/Darwin 3, Campus Cantoblanco
28049 Madrid
tel. 91 585 4616
email: mjvanra...@cnb.csic.es


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread Antony Oliver
Think the jury might be out on this one... A quick snip from WikiDictionary...

The plural word phages refers to different types of phage, whereas in common 
usage the word phage can be both singular and plural, referring in the plural 
sense to particles of the same type of phage. Maloy et al: Microbial Genetics, 
2nd ed., 1984

Tony.

---
Mobile Account
---

On 1 Apr 2012, at 16:29, VAN RAAIJ , MARK JOHAN 
mjvanra...@cnb.csic.es wrote:

another singular/plural grump:
Recently we can read: phage are.
Phage is singular, the plural is phages (and this does not have that much to do 
with latin or greek).
more reading:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3109450/

Quoting Paul Emsley:

The PDBe page for 3k78 says:

The experimental data has been deposited

the data cif file says:

data is under question

Grump.

Is it to late to refer to data as if there were more than one of them?

Anyway, the data mtz file is here if you want to refine with it:

http://lmb.bioch.ox.ac.uk/emsley/data/r3k78sf.mtz

Paul.




Mark J van Raaij
Laboratorio M-4
Dpto de Estructura de Macromoléculas
Centro Nacional de Biotecnología - CSIC
c/Darwin 3, Campus Cantoblanco
28049 Madrid
tel. 91 585 4616
email: mjvanra...@cnb.csic.es


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread Gerard DVD Kleywegt

Is it to late to refer to data as if there were more than one of them?


Is it too late to explain the difference between to and too?

--A much mellowed CD


Re: [ccp4bb] one datum many data? [was Re: [ccp4bb] very informative - Trends in Data Fabrication]

2012-04-01 Thread David Schuller

On 04/01/12 10:18, Gerard Bricogne wrote:

Dear Paul,

  May I join the mostly silent chorus of Greek/Latin-aware grumps who
wince when seeing data treated as singular when it is plural.

When it are plural?
At any rate, I heard a Nobel laureate use it incorrectly just two days ago.

--
===
All Things Serve the Beam
===
   David J. Schuller
   modern man in a post-modern world
   MacCHESS, Cornell University
   schul...@cornell.edu

