Here is another take on this topic, by Carlos Outeiral (@c_outeiral), from a non-crystallographer's point of view, covering many of the points discussed in this thread (incl. an example of the model guiding correction of the experimental structure).

https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/

Marko

On 08/12/2020 13:25, Tristan Croll wrote:
This is a number that needs to be interpreted with some care. 2 Å crystal structures in general achieve an RMSD of 0.2 Å on the portion of the crystal that's resolved, including loops that are often only in well-resolved conformations due to physiologically-irrelevant crystal packing interactions. The predicted models, on the other hand, are in isolation. Once you get to the level achieved by this last round of predictions, that starts making fair comparison somewhat more difficult*. Two obvious options that I see: (1) limit the comparison only to the stable core of the protein (in which case many of the predictions have RMSDs in the very low fractions of an Angstrom), or (2) compare ensembles derived from MD simulations starting from the experimental and predicted structure, and see how well they overlap.
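
A minimal sketch of option (1), restricting the RMSD to a stable core. It assumes the two models are already superposed and that the caller supplies the core residue indices; the function name `rmsd`, the toy coordinates and the choice of `core` are illustrative assumptions, not data from CASP14.

```python
import math

def rmsd(coords_a, coords_b, residues=None):
    """RMSD (in Å) between two equal-length lists of (x, y, z) tuples.

    Assumes the models are already superposed. If `residues` is given,
    only those indices (e.g. the stable core) contribute.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    idx = list(range(len(coords_a)) if residues is None else residues)
    if not idx:
        raise ValueError("no residues selected")
    total = 0.0
    for i in idx:
        total += sum((p - q) ** 2 for p, q in zip(coords_a[i], coords_b[i]))
    return math.sqrt(total / len(idx))

# Toy data (hypothetical): four core residues agree to 0.1 Å,
# while a fifth residue in a flexible loop is displaced by 5 Å.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
                (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
predicted = [(0.1, 0.0, 0.0), (3.9, 0.0, 0.0), (7.7, 0.0, 0.0),
             (11.5, 0.0, 0.0), (15.2, 5.0, 0.0)]
core = [0, 1, 2, 3]  # assumed core selection

all_rmsd = rmsd(experimental, predicted)
core_rmsd = rmsd(experimental, predicted, core)
print(f"all residues: {all_rmsd:.2f} Å")   # 2.24 Å
print(f"core only:    {core_rmsd:.2f} Å")  # 0.10 Å
```

In this toy case a single displaced loop residue dominates the overall figure, which is why the two numbers are not directly comparable.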

-- Tristan

* There's one more thorny issue when you get to this level: it becomes more and more possible (even likely) that the prediction gets some things right that are wrong in the experimental structure. 

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of Ian Tickle <ianj...@gmail.com>
Sent: 08 December 2020 13:04
To: CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)
 

There was a little bit of press-release hype: the release stated "a score of around 90 GDT is informally considered to be competitive with results obtained from experimental methods" and "our latest AlphaFold system achieves a median score of 92.4 GDT overall across all targets. This means that our predictions have an average error (RMSD) of approximately 1.6 Angstroms,".

Experimental methods achieve an average error of around 0.2 Ang. or better at 2 Ang. resolution, and of course much better at atomic resolution (1 Ang. or better), or around 0.5 Ang. at 3 Ang. resolution. For ligand-binding studies I would say you need 3 Ang. resolution or better. 1.6 Ang. error is probably equivalent to around 4 Ang. resolution. No doubt that will improve with time and experience, though I think it will be an uphill struggle to get better than 1 Ang. error, simply because the method can't be better than the data that go into it and 1-1.5 Ang. represents a typical spread of homologous models in the PDB. So yes, very competitive if you're desperate for an MR starting model, but not quite there yet for a refined high-resolution structure.
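
For readers unfamiliar with the GDT score quoted in the press release, here is a rough sketch of how GDT_TS is conventionally defined: the mean, over cutoffs of 1, 2, 4 and 8 Ang., of the percentage of C-alpha atoms lying within that cutoff of their reference positions. This simplified version assumes a single fixed superposition; the real score searches over many superpositions to maximise each percentage. The function name `gdt_ts` and the toy coordinates are hypothetical.

```python
import math

def gdt_ts(coords_ref, coords_model, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0-100) for one fixed superposition.

    Both arguments are equal-length lists of C-alpha (x, y, z) tuples in Å.
    """
    if len(coords_ref) != len(coords_model) or not coords_ref:
        raise ValueError("need two non-empty lists of equal length")
    n = len(coords_ref)
    distances = [math.dist(p, q) for p, q in zip(coords_ref, coords_model)]
    # Fraction of residues within each distance cutoff
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy data (hypothetical): per-residue deviations of 0.5, 1.5, 3, 6 and 10 Å
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
             (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0),
         (11.4, 6.0, 0.0), (15.2, 10.0, 0.0)]

score = gdt_ts(reference, model)
print(f"GDT_TS = {score:.1f}")  # 50.0
```

Because the score counts residues within thresholds rather than averaging squared deviations, a GDT value and an RMSD value describe different things about the same pair of models.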

Cheers

-- Ian


On Tue, 8 Dec 2020 at 12:11, Harry Powell - CCP4BB <0000193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:
Hi

It’s a bit more than science by press release - they took part in CASP14 where they were given sequences but no other experimental data, and did significantly better than the other homology modellers (who had access to the same data) when judged by independent analysis. There were things wrong with their structures, sure, but in almost every case they were less wrong than the other modellers (many of whom have been working on this problem for decades).

It _will_ be more impressive once the methods they used (or equivalents) are implemented by other groups and are available to the “public” (I haven’t found an AlphaFold webserver to submit a sequence to, whereas the other groups in the field do make their methods readily available), but it’s still a step-change in protein structure prediction - it shows it can be done pretty well.

Michel is right, of course; you can’t have homology modelling without homologous models, which are drawn from the PDB - but the other modellers had the same access to the PDB (just as we all do…).

Just my two ha’porth.

Harry

> On 8 Dec 2020, at 11:33, Goldman, Adrian <adrian.gold...@helsinki.fi> wrote:
>
> My impression is that they haven’t published the code, and it is science by press-release. If one of us tried it, we would - rightly - get hounded out of town.
>
> Adrian
>
>
>
>> On 4 Dec 2020, at 15:57, Michel Fodje <michel.fo...@lightsource.ca> wrote:
>>
>> I think the results from AlphaFold2, although exciting and a breakthrough, are being exaggerated just a bit. We know that all the information required for the 3D structure is in the sequence. The protein folding problem is simply how to go from a sequence to the 3D structure. This is not a complex problem in the sense that cells solve it deterministically. Thus the problem is due to lack of understanding and not due to complexity. AlphaFold and all the others trying to solve this problem are “cheating” in that they are not just using the sequence, they are using other sequences like it (multiple-sequence alignments), and they are using all the structural information contained in the PDB. None of this information is used by the cells. In short, unless AlphaFold2 now allows us to understand how exactly a single protein sequence produces a particular 3D structure, the protein folding problem is hardly solved in a theoretical sense. The only reason we know how well AlphaFold2 did is because the structures were solved and we could compare with the predictions, which means verification is lacking.
>> 
>> The protein folding problem will be solved when we understand how to go from a sequence to a structure, and can verify a given structure to be correct without experimental data. Even if AlphaFold2 got 99% of structures right, your next interesting target protein might be the 1%. How would you know?   Until then, what AlphaFold2 is telling us right now is that all (most) of the information present in the sequence that determines the 3D structure can be gleaned in bits and pieces scattered between homologous sequences, multiple-sequence alignments, and other protein 3D structures in the PDB.  Deep Learning allows a huge amount of data to be thrown at a problem and the back-propagation of the networks then allows careful fine-tuning of weights which determine how relevant different pieces of information are to the prediction.  The networks used here are humongous and a detailed look at the weights (if at all feasible) may point us in the right direction.
>> 
>> 
>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Nave, Colin (DLSLtd,RAL,LSCI)
>> Sent: December 4, 2020 9:14 AM
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)
>> 
>> The subject line for Isabel’s email is very good.
>> 
>> I do have a question (more a request) for the more computer-science-oriented people. I think it is relevant for where this technology will be going. It comes from trying to understand whether problems addressed by Alpha are NP, NP hard, NP complete etc. My understanding is that the previous successes of Alpha were for complete information games such as Chess and Go. Both the rules and the present position were available to both sides. The folding problem might be in a different category. It would be nice if someone could explain the difference (if any) between Go and the protein folding problem, perhaps using the NP type categories.
>> 
>> Colin
>> 
>> 
>> 
>> From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> On Behalf Of Isabel Garcia-Saez
>> Sent: 03 December 2020 11:18
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: [ccp4bb] AlphaFold: more thinking and less pipetting (?)
>> 
>> Dear all,
>> 
>> Just commenting that after the stunning performance of AlphaFold, which uses AI from Google, maybe some of us could dedicate ourselves to the noble art of gardening, baking, doing Chinese calligraphy, watching the clouds pass, or everything together (just in case, I have already prepared my subscription to Netflix).
>> 
>> https://www.nature.com/articles/d41586-020-03348-4
>> 
>> Well, I suppose that we still have the structures of complexes (at the moment). I am wondering how the labs will have access to this technology in the future (would it be for free coming from the company DeepMind - Google?). It seems that they have already published some code. Well, exciting times.
>> 
>> Cheers,
>> 
>> Isabel
>> 
>> 
>> Isabel Garcia-Saez              PhD
>> Institut de Biologie Structurale
>> Viral Infection and Cancer Group (VIC)-Cell Division Team
>> 71, Avenue des Martyrs
>> CS 10090
>> 38044 Grenoble Cedex 9
>> France
>> Tel.: 00 33 (0) 457 42 86 15
>> e-mail: isabel.gar...@ibs.fr
>> FAX: 00 33 (0) 476 50 18 90
>> http://www.ibs.fr/
>> 
>> 
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>>
>> 
>>
>> 
>>
>> 


This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/



-- 

Marko Hyvonen
Department of Biochemistry, University of Cambridge
mh...@cam.ac.uk
+44 (0)1223 766 044
@HyvonenGroup
http://hyvonen.bioc.cam.ac.uk
 


