***  For details on how to be removed from this list visit the  ***
***          CCP4 home page http://www.ccp4.ac.uk         ***


While there may not have been obvious warning signs in the published data (or maps), two unfortunate aspects of the refinement may have obscured them.

The first involves the selection of the test sets of reflections. All five retracted papers report the presence of non-crystallographic symmetry (e.g. 8-fold NCS in 1JSQ). All five PDB files report that the test sets were chosen randomly, meaning that test set reflections were potentially coupled (perhaps strongly) by NCS to reflections included in the working set. This can lead to artificially low Rfree values and thus mask more fundamental errors in a structure. A way to avoid biasing Rfree values is to choose the test set in thin resolution shells whenever NCS is present. Currently, this precaution is often ignored. It should become a de facto standard for publication of structures containing NCS.
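
The thin-shell selection suggested above can be sketched in a few lines of Python. This is only an illustration, not the procedure of any particular crystallographic package: the function name `thin_shell_free_flags` and its parameters are hypothetical, and the input is assumed to be the d-spacing (in Angstrom) of each reflection. Established programs offer their own options for this and should be preferred in practice.

```python
import numpy as np

def thin_shell_free_flags(d_spacings, free_fraction=0.05, n_shells=20, seed=0):
    """Flag whole thin resolution shells as the test set (hypothetical helper).

    d_spacings: resolution of each reflection in Angstrom.
    Returns a boolean array, True for test-set reflections.
    """
    d = np.asarray(d_spacings, dtype=float)
    # Equal steps in 1/d^3 give slices of roughly equal reciprocal-space volume.
    s3 = 1.0 / d**3
    # Cut the resolution range into many thin slices; each test shell is one slice.
    n_slices = int(round(n_shells / free_fraction))
    edges = np.linspace(s3.min(), s3.max(), n_slices + 1)
    slice_index = np.clip(np.digitize(s3, edges) - 1, 0, n_slices - 1)
    # Choose one slice at a random offset within each group of consecutive slices,
    # so the test shells are spread over the whole resolution range.
    rng = np.random.default_rng(seed)
    period = n_slices // n_shells
    offsets = rng.integers(0, period, size=n_shells)
    chosen = np.arange(n_shells) * period + offsets
    return np.isin(slice_index, chosen)
```

Because every reflection in a chosen shell goes into the test set, reflections related by NCS at (nearly) the same resolution tend to end up on the same side of the working/test split, rather than being divided at random between the two sets.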

The second aspect concerns over-parametrization. In the retracted structures, the potential for cross-talk between the working and test sets was all the more serious because the authors chose to use multicopy refinement procedures, expanding the number of free parameters by as much as 16-fold (although harmonic constraints were also employed). In the retraction, Chang et al. themselves note that "Unfortunately, the use of the multicopy refinement procedure still allowed us to obtain reasonable refinement values for the wrong structures." While the reported R values are not great, the Rfree values from single-copy refinement, in the cases where they are described, were clearly incompatible with a correct solution (Rfree > 40%).

It has been argued that multicopy refinement captures genuine aspects of the data, based on the observed decline in Rfree (e.g. in the retracted JMB paper). However, given the fact that Rfree was probably coupled to Rwork by NCS, the drop in Rfree cannot be taken as validation of the multicopy approach. Instead, it probably reflected a significant level of overfitting, which "leaked through" to the Rfree. In hindsight, it is hard to see how twelve or sixteen incorrect structures could be genuinely better than one, and yet they yielded much more attractive statistics. Unless multicopy refinement can be rigorously justified, it should probably be avoided, particularly for low-resolution structures in which the ratio of observations to parameters is low even for a single-copy refinement.
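
To make the parameter-count argument concrete, here is a rough sketch of how the observation-to-parameter ratio shrinks with the number of copies. The reflection and atom counts are hypothetical and chosen only for illustration; they are not taken from the retracted structures. The helper `obs_to_param_ratio` is likewise hypothetical, and it deliberately ignores restraints and NCS constraints, which reduce the effective number of free parameters.

```python
def obs_to_param_ratio(n_reflections, n_atoms, n_copies=1, params_per_atom=4):
    """Observations per refined parameter, ignoring restraints and constraints.

    params_per_atom=4 assumes x, y, z and one isotropic B factor per atom.
    """
    n_params = n_atoms * params_per_atom * n_copies
    return n_reflections / n_params

# Hypothetical numbers, for illustration only.
single = obs_to_param_ratio(n_reflections=20_000, n_atoms=5_000)
multi = obs_to_param_ratio(n_reflections=20_000, n_atoms=5_000, n_copies=16)
print(single, multi)
```

With these made-up numbers, a single-copy model already has only about one observation per parameter, and a 16-copy model has sixteen times fewer.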

Without multicopy refinement, these structures probably never would have been published. And even with multicopy refinement, a more rigorous test set based on resolution shells might have been more resistant to overfitting.

Dean


PS My apologies if an earlier version of this message also arrives. It appears to have been tied up by the server for several days.


-------- Original Message --------
Subject: [ccp4bb]: Retraction of ABC transporter structures - were there warning signs?
Date: Sat, 23 Dec 2006 13:32:15 -0500
From: Arun Malhotra <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Organization: University of Miami School of Medicine
To: [email protected]

I was shocked to see the retraction in yesterday's issue of Science (Dec
22, 2006) of several ABC transporter structures and papers from the
Chang lab, including three published in Science.  The retraction says
that the structures have the wrong hand and topology due to an
"in-house" program that inverted the signs on the anomalous pairs.

I have no expertise in ABC transporters, but were there warning signs in
the structures? Were red flags raised by PDB or the other servers such
as EDI, EDS, etc.? Looking at some of these papers, these are
low-resolution structures and I see very high R/Rfree, but there must
have been other signs of problems as well.

In the past few years, there have been almost no structures retracted
due to gross errors and the checks being used by the structural biology
community seemed to be working quite well - what can we learn from this
tragic and sad error?

--
Arun Malhotra                              Phone: (305) 243-2826
Associate Professor                        Lab:   (305) 243-2890
Dept. of Biochemistry & Molecular Biology  Fax:   (305) 243-3955
University of Miami School of Medicine
PO Box 016129                         E-Mail: [EMAIL PROTECTED]
Miami, FL 33101              Web: http://structure.med.miami.edu


--
Dean R. Madden, Ph.D.
Department of Biochemistry
Dartmouth Medical School
7200 Vail Building
Hanover, NH 03755-3844 USA

tel: +1 (603) 650-1164
fax: +1 (603) 650-1128
e-mail: [EMAIL PROTECTED]

