While the topic of "fabrication" is still hot, I thought I too could add 
a few thoughts. 

Our mathematician friends always make fun of us (biologists/ biochemists/ 
crystallographers!) because our papers are accepted within 4-8 weeks of 
submission. 
This is not to speak of Science/ Nature/ Cell, where even more rapid reviews 
are the norm. In the mathematics world it is customary to have a one-year 
review of manuscripts, and prior announcement of the work on the respective 
web sites. The one-year review, and the prior announcements on web sites, 
allow others to review the results independently. That perhaps brings 
in the required rigor in the results. Consequently, there are not as many 
retractions in mathematics as we see in our area. It is perhaps not 
possible in our (crystallographic) world to have every structure checked 
independently by others. Yet a longer review, along with access to raw data, 
might allow reviewers to check the finer details of the structures. I would 
strongly suggest that raw data be made available to reviewers, and that 
reviewers should check the structures before the papers are accepted. For any 
error in the final published structures, blame should also lie partly with 
the reviewer. The back-to-back controversies are bound to hurt the 
crystallographic community as a whole, and the IUCr should consider better 
checks for the future. 

Shekhar Mande
Hyderabad, INDIA
---------REPLY TO-------------
Date:Thu Aug 16 21:22:20 GMT+08:00 2007
FROM: Randy J. Read  <[EMAIL PROTECTED]>
To: [email protected]
Subject: Re: [ccp4bb] The importance of USING our validation tools
On Aug 16 2007, Eleanor Dodson wrote:

>The weighting in REFMAC is a function of SigmA ( plotted in log file). 
>For this example it will be nearly 1 for all resolutions ranges so the 
>weights are pretty constant. There is also a contribution from the 
>"experimental" sigma, which in this case seems to be proportional to |F| 

Originally I expected that the publication of our Brief Communication in 
Nature would stimulate a lot of discussion on the bulletin board, but 
clearly it hasn't. One reason is probably that we couldn't be as forthright 
as we wished to be. For its own good reasons, Nature did not allow us to 
use the word "fabricated". Nor were we allowed to discuss other structures 
from the same group, if they weren't published in Nature.

Another reason is an understandable reluctance to make allegations in 
public, and the CCP4 bulletin board probably isn't the best place to do 
that.

But I think the case raises essential topics for the community to discuss, 
and this is a good forum for those discussions. We need to consider how to 
ensure the integrity of the structural databases and the associated 
publications.

So here are some questions to start a discussion, with some suggestions of 
partial answers.

1. How many structures in the PDB are fabricated?

I don't know, but I think (or at least hope) that the number is very small. 

2. How easy is it to fabricate a structure?

It's very easy, if no-one will be examining it with a suspicious mind, but 
it's extremely difficult to do well. No matter how well a structure is 
fabricated, it will violate something that is known now or learned later 
about the properties of real macromolecules and their diffraction data. If 
you're clever enough to do this really well, then you should be clever 
enough to determine the real structure of an interesting protein.

3. How can we tell whether structures in the PDB are fabricated, or just 
poorly refined?

The current standard validation tools are aimed at detecting errors in 
structure determination or the effects of poor refinement practice. None of 
them are aimed at detecting specific signs of fabrication because we assume 
(almost always correctly) that others are acting in good faith.

The more information that is available, the easier it will be to detect 
fabrication (because it is harder to make up more information 
convincingly). For instance, if the diffraction data are deposited, we can 
check for consistency with the known properties of real macromolecular 
crystals, e.g. that they contain disordered solvent and not vacuum. As 
Tassos Perrakis has discovered, there are characteristic ways in which the 
standard deviations depend on the intensities and the resolution. If 
unmerged data are deposited, there will probably be evidence of radiation 
damage, weak effects from intrinsic anomalous scatterers, etc. Raw images 
are probably even harder to simulate convincingly.
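
One such consistency check can be sketched in a few lines of Python. This is 
only a minimal illustration under stated assumptions: the function names, the 
use of the coefficient of variation, and the 0.05 threshold are all 
hypothetical choices, not an established validation criterion. It flags a 
data set whose standard deviations are almost exactly proportional to |F|, 
which, going by the discussion above, might merit a closer look.

```python
import statistics


def sigma_ratio_spread(amplitudes, sigmas):
    """Coefficient of variation of sigma/|F| over reflections with |F| > 0.

    A value near zero means sigma is almost exactly proportional to |F|.
    (Hypothetical helper, for illustration only.)
    """
    ratios = [s / f for f, s in zip(amplitudes, sigmas) if f > 0]
    if len(ratios) < 2:
        raise ValueError("need at least two reflections with |F| > 0")
    mean_ratio = statistics.fmean(ratios)
    if mean_ratio == 0:
        raise ValueError("all sigmas are zero")
    return statistics.pstdev(ratios) / mean_ratio


def looks_proportional(amplitudes, sigmas, threshold=0.05):
    """True if sigma/|F| is nearly constant.

    The threshold of 0.05 is an arbitrary assumption, not a calibrated value.
    """
    return sigma_ratio_spread(amplitudes, sigmas) < threshold


# Sigmas that are exactly 10% of |F| are flagged.
print(looks_proportional([10.0, 20.0, 40.0, 80.0], [1.0, 2.0, 4.0, 8.0]))
# Sigmas that vary independently of |F| are not.
print(looks_proportional([10.0, 20.0, 40.0, 80.0], [1.0, 1.2, 1.1, 1.3]))
```

A flag from a test like this would only indicate that the data deserve 
closer examination; it is not evidence of anything on its own.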

If a structure is fabricated by making up a new crystal form, perhaps a 
complex of previously-known components, then the crystal packing 
interactions should look like the interactions seen in real crystals. If 
it's fabricated by homology modelling, then the internal packing is likely 
to be suboptimal. I'm told by David Baker (who knows a thing or two about 
this) that it is extremely difficult to make a homology model that both 
obeys what we know about torsion angle preferences and is packed as well as 
a real protein structure.

I'm very interested in hearing about new ideas along these lines. The wwPDB 
has agreed to sponsor a workshop next year where we will propose and test 
new validation criteria.

4. If new validation criteria are applied at the PDB, won't someone who 
wants to fabricate a structure just keep improving their fabricated model 
until it passes all the tests?

That's a possibility, but I think the deterrence effect of knowing that 
there are measures to detect fabrication will outweigh this. And it isn't 
enough for a fabricated structure to pass today's tests; it has to pass all 
the new tests devised for the rest of the person's life, or at least their 
career.

5. What should we do if tests suggest that a structure may be fabricated? 

I think we need to be extremely careful. Conclusions should not be drawn on 
the basis of a few numbers. The tests can just point up which structures 
should be examined closely. Close examination would then involve less 
automated criteria, such as whether the structure agrees with all the 
biochemical data about the system. As in the process followed by Nature, 
you also have to start by giving the people who deposited the structure an 
opportunity to explain the anomalies.

Randy Read
