Re: [ccp4bb] a challenge

George Sheldrick Sat, 12 Jan 2013 14:00:55 -0800

James,

I had in fact just come to the conclusion that the indexing was consistent with 3dko for 'possible' but not for 'impossible',

which I suppose was logical.


George

Whoops! Sorry, folks. I made a mistake with the I(+)/I(-) entries. They had the wrong axis convention relative to 3dko and the F in the same file. Sorry about that.
The files on the website now should be right.
http://bl831.als.lbl.gov/~jamesh/challenge/possible.mtz
http://bl831.als.lbl.gov/~jamesh/challenge/impossible.mtz

md5 sums:
c4bdb32a08c884884229e8080228d166  impossible.mtz
caf05437132841b595be1c0dc1151123  possible.mtz
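
For anyone who downloaded the files before the correction, a minimal Python sketch for checking local copies against the md5 sums above. It assumes the two files were saved under their original names in the current directory; the helper names `md5_of_file` and `check_downloads` are made up for illustration and are not part of any CCP4 tool.

```python
import hashlib
from pathlib import Path

# Checksums as posted in this message.
EXPECTED_MD5 = {
    "impossible.mtz": "c4bdb32a08c884884229e8080228d166",
    "possible.mtz": "caf05437132841b595be1c0dc1151123",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory="."):
    """Map each expected file name to True only if it exists and its md5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in check_downloads().items():
        print(f"{name}: {'OK' if ok else 'MISSING or MISMATCH'}")
```

A mismatch most likely means the copy predates the corrected upload and should be fetched again.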

-James Holton
MAD Scientist

On 1/12/2013 8:25 AM, James Holton wrote:
Fair enough!
I have just now added DANO and I(+)/I(-) to the files. I'll be very interested to see what you can come up with! For the record, the phases therein came from running mlphare with default parameters but exactly the correct heavy-atom constellation (all the sulfur atoms in 3dko), and then running dm with default parameters.
Yes, there are other ways to run mlphare and dm that give better phases, but I was only able to determine those parameters by "cheating" (comparing the resulting map to the right answer), so I don't think it is "fair" to use those maps.
I have had a few questions about what is "cheating" and what is not cheating. I don't have a problem with the use of sequence information because that actually is something that you realistically would know about your protein when you sat down to collect data. The sequence of this molecule is that of 3dko:
http://bl831.als.lbl.gov/~jamesh/challenge/seq.pir
I also don't have a problem with anyone actually using an automation program to _help_ them solve the "impossible" dataset as long as they can explain what they did. Simply putting the above sequence into BALBES would, of course, be cheating! I suppose one could try eliminating 3dko and its "homologs" from the BALBES search, but that, in and of itself, is perhaps relevant to the challenge: "what is the most distant homolog that still allows you to solve the structure?". That, I think, is also a stringent test of model-building skill.
I have already tried ARP/wARP, phenix.autobuild and buccaneer/refmac. With default parameters, all of these programs fail on both the "possible" and "impossible" datasets. It was only with some substantial tweaking that I found a way to get phenix.autobuild to crack the "possible" dataset (using 20 models in parallel). I have not yet found a way to get any automation program to build its way out of the "impossible" dataset. Personally, I think that the breakthrough might be something like what Tom Terwilliger mentioned. If you build a good enough starting set of atoms, then I think an automation program should be able to take you the rest of the way. If that is the case, then it means people like Tom who develop such programs for us might be able to use that insight to improve the software, and that is something that will benefit all of us.
Or, it is entirely possible that I'm just not running the current software properly! If so, I'd love it if someone who knows better (such as their developers) could enlighten me.
-James Holton
MAD Scientist

On 1/12/2013 3:07 AM, Pavol Skubak wrote:
Dear James,

your challenge in its current form ignores an important source
of information for model building that is available for your
simulated data - namely, it does not allow the use of anomalous
phase information in the model building. In difficult cases on
the edge of success such as this one, this typically makes
the difference between building and not building.

If you can make the F+/F- and Se substructure available, we
can test whether this is the case indeed. However, while I
expect this would push the challenge further significantly,
most likely you would be able to decrease the Se incorporation
of your simulated data further to such levels that the anomalous
signal is again no longer sufficient to build the structure. And
most likely, there would again exist an edge where a small
decrease in the Se incorporation would lead from a model built
to no model built.

Best regards,

--
Pavol Skubak
Biophysical Structural Chemistry
Gorlaeus Laboratories
Einsteinweg 55
Leiden University
LEIDEN  2333CC
the Netherlands
tel: 0031715274414
web: http://bsc.lic.leidenuniv.nl/people/skubak-0



--
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582
