Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Lindsay Sawyer
In reply to James, BLAST will align sequences but your premise is that 
the function of the 'unknown' sequence/structure is the same as that of 
the 'known'. The lipocalin family is one which has a wide distribution 
and the functions vary considerably from involvement with embryo 
implantation to colouration of the lobster carapace to brain-related 
enzyme activity. Each easily predictable from the other? Possibly, but 
knowing what ligand binds doesn't necessarily give the physiological 
function!


Lindsay

On 12/11/2020 3:54 PM, James Holton wrote:
Well, that problem was solved a long time ago.  An excellent 
function-from-sequence predictor is here:

https://blast.ncbi.nlm.nih.gov/Blast.cgi

AlphaFold2 is doing rather much the same thing.  Just with a 3D output 
rather than 1D, and an underlying model with a LOT more fittable 
parameters.


-James Holton
MAD Scientist

On 12/11/2020 4:42 AM, Phil Evans wrote:
Alpha-fold looks great and is clearly a long way towards answering 
the question “this is the sequence, what is the structure?”


But I’ve always thought the more interesting question is “this is the 
structure, what does it do?”  Is there any progress on that question?


Phil



On 11 Dec 2020, at 12:12, Tristan Croll  wrote:

I'm not Randy, but I do have an answer: like this. This is T1049-D1. 
AlphaFold prediction in red, experimental structure (6y4f) in green. 
Agreement is close to perfect, apart from the C-terminal tail which 
is way off - but clearly flexible and only resolved in this 
conformation in the crystal due to packing interactions. GDT_TS is 
93.1; RMS_CA is 3.68 - but if you exclude those tail residues, it's 
0.79. With an alignment cutoff of 1 A, you can align 109 of 134 CAs 
with an RMSD of 0.46 A.
From: CCP4 bulletin board  on behalf of 
Leonid Sazanov 

Sent: 11 December 2020 10:36
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more 
thinking and less pipetting (?)

  Dear Randy,

Can you comment on why for some of AlphaFold2 models with GDT_TS > 
90 (supposedly as good as experimental model) the RMS_CA (backbone) 
is > 3.0 Angstrom? Such a deviation can hardly be described as good 
as experimental. Could it be that GDT_TS is kind of designed to 
evaluate how well the general sub-domain level fold is predicted, 
rather than overall detail?


Thanks,
Leonid


Several people have mentioned lack of peer review as a reason to 
doubt the significance of the AlphaFold2 results.  There are 
different routes to peer review and, while the results have not been 
published in a peer review journal, I would have to say (as someone 
who has been an assessor for two CASPs, as well as having editorial 
responsibilities for a peer-reviewed journal), the peer review at 
CASP is much more rigorous than the peer review that most papers 
undergo.  The targets are selected from structures that have 
recently been solved but not published or disseminated, and even 
just tweeting a C-alpha trace is probably enough to get a target 
cancelled. In some cases (as we’ve heard here) the people 
determining the structure are overly optimistic about when their 
structure solution will be finished, so even they may not know the 
structure at the time it is predicted.  The assessors are blinded to 
the identities of the predictors, and they carry out months of 
calculations and inspections of the models, computing ranking scores 
before they find out who made the predictions.  Most assessors try 
to bring something new to the assessment, because the criteria 
should get more stringent as the predictions get better, and they 
have new ideas of what to look for, but there’s always some overlap 
with “traditional” measures such as GDT-TS, GDT-HA (more stringent 
high-accuracy version of GDT) and lDDT.




Of course we’d all like to know the details of how AlphaFold2 works, 
and the DeepMind people could have been (and should be) much more 
forthcoming, but their results are real.  They didn’t have any way 
of cheating, being selective about what they reported, or gaming the 
system in any other way that the other groups couldn’t do.  (And 
yes, when we learned that DeepMind was behind the exceptionally good 
results two years ago at CASP13, we made the same half-jokes about 
whether Gmail had been in the database they were mining!)




Best wishes,



Randy Read

 



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a 
mailing list hosted by www.jiscmail.ac.uk, terms & conditions are 
available at https://www.jiscmail.ac.uk/policyandsecurity/



Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread James Holton
Well, that problem was solved a long time ago.  An excellent 
function-from-sequence predictor is here:

https://blast.ncbi.nlm.nih.gov/Blast.cgi

AlphaFold2 is doing rather much the same thing.  Just with a 3D output 
rather than 1D, and an underlying model with a LOT more fittable parameters.


-James Holton
MAD Scientist

On 12/11/2020 4:42 AM, Phil Evans wrote:

Alpha-fold looks great and is clearly a long way towards answering the question 
“this is the sequence, what is the structure?”

But I’ve always thought the more interesting question is “this is the 
structure, what does it do?”  Is there any progress on that question?

Phil



On 11 Dec 2020, at 12:12, Tristan Croll  wrote:

I'm not Randy, but I do have an answer: like this. This is T1049-D1. AlphaFold 
prediction in red, experimental structure (6y4f) in green. Agreement is close 
to perfect, apart from the C-terminal tail which is way off - but clearly 
flexible and only resolved in this conformation in the crystal due to packing 
interactions. GDT_TS is 93.1; RMS_CA is 3.68 - but if you exclude those tail 
residues, it's 0.79. With an alignment cutoff of 1 A, you can align 109 of 134 
CAs with an RMSD of 0.46 A.
From: CCP4 bulletin board  on behalf of Leonid Sazanov 

Sent: 11 December 2020 10:36
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)
  
Dear Randy,


Can you comment on why for some of AlphaFold2 models with GDT_TS > 90 (supposedly 
as good as experimental model) the RMS_CA (backbone) is > 3.0 Angstrom? Such a 
deviation can hardly be described as good as experimental. Could it be that GDT_TS is 
kind of designed to evaluate how well the general sub-domain level fold is predicted, 
rather than overall detail?

Thanks,
Leonid


Several people have mentioned lack of peer review as a reason to doubt the 
significance of the AlphaFold2 results.  There are different routes to peer 
review and, while the results have not been published in a peer review journal, 
I would have to say (as someone who has been an assessor for two CASPs, as well 
as having editorial responsibilities for a peer-reviewed journal), the peer 
review at CASP is much more rigorous than the peer review that most papers 
undergo.  The targets are selected from structures that have recently been 
solved but not published or disseminated, and even just tweeting a C-alpha 
trace is probably enough to get a target cancelled.  In some cases (as we’ve 
heard here) the people determining the structure are overly optimistic about 
when their structure solution will be finished, so even they may not know the 
structure at the time it is predicted.  The assessors are blinded to the 
identities of the predictors, and they carry out months of calculations and 
inspections of the models, computing ranking scores before they find out who 
made the predictions.  Most assessors try to bring something new to the 
assessment, because the criteria should get more stringent as the predictions 
get better, and they have new ideas of what to look for, but there’s always 
some overlap with “traditional” measures such as GDT-TS, GDT-HA (more stringent 
high-accuracy version of GDT) and lDDT.



Of course we’d all like to know the details of how AlphaFold2 works, and the 
DeepMind people could have been (and should be) much more forthcoming, but 
their results are real.  They didn’t have any way of cheating, being selective 
about what they reported, or gaming the system in any other way that the other 
groups couldn’t do.  (And yes, when we learned that DeepMind was behind the 
exceptionally good results two years ago at CASP13, we made the same half-jokes 
about whether Gmail had been in the database they were mining!)



Best wishes,



Randy Read




Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Leonid Sazanov

Thanks, I will try this.

Also, on the CASP website there are such scores as RMS_ALL (can be seen in 
tables) and GDC_SC (for side-chains, not visible in tables for some reason).


RMS_ALL presumably includes side-chains and seems good for AlphaFold2 
models, between 1 and 2 Angstrom (apart from the same outliers as 
RMS_CA), although that is not quite at the experimental level.


Were any scores including side-chains included in ranking/evaluation (as 
we hear mostly about GDT_TS)?


If not, how can "experimental level" precision be claimed?


Thanks,

Leonid



On 11.12.20 13:56, Tristan Croll wrote:

I agree the website can be quite cryptic!

You can get all the targets as a tarball from 
https://predictioncenter.org/download_area/CASP14/targets/
For the predictions, you can either get them as PDB files on a 
case-by-case basis from the results section, or tarballs of all 
predictions for a given target from 
https://predictioncenter.org/download_area/CASP14/predictions_trimmed_to_domains/
In the latter case, each file is essentially a PDB file without the 
.pdb extension, except with 4 lines added to the front looking 
something like:


PFRMAT TS
TARGET T1049
MODEL 2
PARENT N/A

Depending on your choice of viewer, you may need to remove these lines 
before attempting to open it.
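
The header removal described above could be scripted along the following lines. This is only a minimal sketch: it assumes the four header lines appear in exactly the order shown (PFRMAT, TARGET, MODEL, PARENT) and leaves the file untouched otherwise. The function name and the ATOM record in the example are made up for illustration.

```python
# Keywords of the four CASP header lines, in the order shown in the message above.
CASP_HEADER_KEYS = ("PFRMAT", "TARGET", "MODEL", "PARENT")


def strip_casp_header(text):
    """Return text without the leading four-line CASP header.

    Assumption: the header is exactly PFRMAT, TARGET, MODEL, PARENT in
    that order. If that pattern is not found, the text is returned unchanged.
    """
    lines = text.splitlines(keepends=True)
    matched = 0
    while matched < len(CASP_HEADER_KEYS) and matched < len(lines):
        tokens = lines[matched].split()
        if not tokens or tokens[0] != CASP_HEADER_KEYS[matched]:
            break
        matched += 1
    if matched == len(CASP_HEADER_KEYS):
        return "".join(lines[matched:])
    return text


# Hypothetical file content; the coordinates are invented for the example.
example = (
    "PFRMAT TS\n"
    "TARGET T1049\n"
    "MODEL 2\n"
    "PARENT N/A\n"
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 90.00           N\n"
    "TER\n"
    "END\n"
)
cleaned = strip_casp_header(example)
```

Matching the header strictly in order avoids accidentally deleting a genuine PDB MODEL record further down the file.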


The GDT_TS score only considers alpha carbons, so in principle it /is/ 
possible to get a high score on it while still having a model that's 
rubbish in every other respect. It's certainly worth complementing it 
with other scores - e.g. good old MolProbity, or SphereGrinder. The 
latter is quite good in principle - essentially, it places a 6 A 
radius sphere at each CA atom of the target, finds all heavy atoms in 
the sphere, and measures their RMSD to the corresponding atoms in the 
prediction. The actual implementation for CASP is a bit broad-brush, 
though - the score is just the fraction of spheres whose RMSD is under 
2 A.
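
To make the sphere-based score described above concrete, here is a rough Python sketch. It is not the CASP implementation: it assumes both structures are already in the same frame, that the two atom lists contain only heavy atoms in one-to-one matching order, and that no per-sphere re-superposition is done. The function name and the toy coordinates are hypothetical.

```python
import math


def sphere_score(target_atoms, model_atoms, radius=6.0, rmsd_cutoff=2.0):
    """Fraction of CA-centred spheres whose local RMSD is under rmsd_cutoff.

    target_atoms and model_atoms are lists of (atom_name, (x, y, z)) with
    matching order (an assumption of this sketch).
    """
    n_spheres = 0
    n_good = 0
    for name, centre in target_atoms:
        if name != "CA":
            continue
        n_spheres += 1
        # All target atoms within the sphere around this CA (includes the CA itself).
        inside = [i for i, (_, xyz) in enumerate(target_atoms)
                  if math.dist(centre, xyz) <= radius]
        sq_sum = sum(math.dist(target_atoms[i][1], model_atoms[i][1]) ** 2
                     for i in inside)
        if math.sqrt(sq_sum / len(inside)) < rmsd_cutoff:
            n_good += 1
    return n_good / n_spheres if n_spheres else 0.0


# Toy example: two residues 20 A apart; the second is displaced by 3 A in the model.
target = [("CA", (0.0, 0.0, 0.0)), ("CB", (1.5, 0.0, 0.0)),
          ("CA", (20.0, 0.0, 0.0)), ("CB", (21.5, 0.0, 0.0))]
model = [("CA", (0.0, 0.0, 0.0)), ("CB", (1.5, 0.0, 0.0)),
         ("CA", (20.0, 3.0, 0.0)), ("CB", (21.5, 3.0, 0.0))]
score = sphere_score(target, model)
```

In the toy example one sphere matches exactly and the other has a local RMSD of 3 A, so half of the spheres pass the 2 A threshold.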


In the last CASP round I pushed for the need to start adding metrics 
that directly compared the models in torsion space - far from the 
first time that's been suggested, but it's arguably only in the past 
few rounds that models have gotten good enough for this to be a useful 
discriminating measure. It doesn't appear that this has been added to 
the standard measures for CASP14, but if it had I can see that 
AlphaFold2 would have done extremely well - I only showed the ribbon 
representation for T1049 in my last email, but the sidechains in the 
core show pretty amazing agreement with the target.
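
One way a torsion-space comparison might be sketched is shown below. This is an illustration only, not a CASP metric: the function names are invented, and the score is simply the mean absolute difference between matching torsion angles, with wrap-around at +/-180 degrees handled explicitly.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [x / norm for x in b1]
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, [_dot(b0, b1) * x for x in b1])
    w = _sub(b2, [_dot(b2, b1) * x for x in b1])
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def mean_torsion_deviation(target_angles, model_angles):
    """Mean absolute torsion difference over matching angle lists."""
    diffs = [angle_diff(t, m) for t, m in zip(target_angles, model_angles)]
    return sum(diffs) / len(diffs)
```

The wrap-around matters in practice: angles of 179 and -179 degrees differ by 2 degrees, not 358.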


Best regards,

Tristan

*From:* Leonid Sazanov 
*Sent:* 11 December 2020 12:32
*To:* Tristan Croll ; CCP4BB@JISCMAIL.AC.UK 

*Subject:* Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more 
thinking and less pipetting (?)


I see, thanks, that looks good.

Where can one download predicted_model+exp_model PDBs together?

I could easily find the predicted models but not the experimental ones; 
the CASP website seems very cryptic.


Also, can you comment on how much GDT_TS depends on CA and how much on 
side-chain positioning?


E.g. if it is >90, can one be sure that most side-chains are in the 
right place?


Thanks.

Leonid


On 11.12.20 13:12, Tristan Croll wrote:
I'm not Randy, but I do have an answer: like this. This is T1049-D1. 
AlphaFold prediction in red, experimental structure (6y4f) in green. 
Agreement is close to perfect, apart from the C-terminal tail which 
is way off - but clearly flexible and only resolved in this 
conformation in the crystal due to packing interactions. GDT_TS is 
93.1; RMS_CA is 3.68 - but if you exclude those tail residues, it's 
0.79. With an alignment cutoff of 1 A, you can align 109 of 134 CAs 
with an RMSD of 0.46 A.


*From:* CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of 
Leonid Sazanov <saza...@ist.ac.at>

*Sent:* 11 December 2020 10:36
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more 
thinking and less pipetting (?)

Dear Randy,

Can you comment on why for some of AlphaFold2 models with GDT_TS > 90 
(supposedly as good as experimental model) the RMS_CA (backbone) is > 
3.0 Angstrom? Such a deviation can hardly be described as good as 
experimental. Could it be that GDT_TS is kind of designed to evaluate 
how well the general sub-domain level fold is predicted, rather than 
overall detail?


Thanks,
Leonid


>>>>>
Several people have mentioned lack of peer review as a reason to 
doubt the significance of the AlphaFold2 results.  There are 
different routes to peer review and, while the r

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Brandstetter Johann
Eventually computational methods (like AlphaFold) should provide reliable 
information on the spectrum of metastable conformational substates that a 
protein can adopt, i.e. its dynamics. This information will be valuable in 
answering the question of a protein's function, and also of its 
crystallization, even if the answer is only: difficult!

Best,
Hans

-Original Message-
From: CCP4 bulletin board  On Behalf Of Bryan Lepore
Sent: Freitag, 11. Dezember 2020 15:03
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

> On Dec 11, 2020, at 07:42, Phil Evans  wrote:
> 
> But I’ve always thought the more interesting question is “this is the 
> structure, what does it do?”

It sounds compelling though, that methods of the sort implemented in the CASP 
work are perfectly poised to make progress on the question:

“how might this protein crystallize?”

-Bryan W. Lepore




Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Bryan Lepore
> On Dec 11, 2020, at 07:42, Phil Evans  wrote:
> 
> But I’ve always thought the more interesting question is “this is the 
> structure, what does it do?”

It sounds compelling though, that methods of the sort implemented in the CASP 
work are perfectly poised to make progress on the question:

“how might this protein crystallize?”

-Bryan W. Lepore




Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Panne, Daniel (Prof.)
I agree with Phil!

Yes, it is nice to be able to obtain better models but interesting biological 
function resides usually in the most variable and least predictable features of 
a protein, how it associates with other proteins etc. Even when a fold can 
be predicted, such folds alone frequently fail to predict function (which doomed 
structural genomics from the outset).

Daniel




On 11 Dec 2020, at 12:42, Phil Evans <p...@mrc-lmb.cam.ac.uk> wrote:

Alpha-fold looks great and is clearly a long way towards answering the question 
“this is the sequence, what is the structure?”

But I’ve always thought the more interesting question is “this is the 
structure, what does it do?”  Is there any progress on that question?

Phil


On 11 Dec 2020, at 12:12, Tristan Croll <ti...@cam.ac.uk> wrote:

I'm not Randy, but I do have an answer: like this. This is T1049-D1. AlphaFold 
prediction in red, experimental structure (6y4f) in green. Agreement is close 
to perfect, apart from the C-terminal tail which is way off - but clearly 
flexible and only resolved in this conformation in the crystal due to packing 
interactions. GDT_TS is 93.1; RMS_CA is 3.68 - but if you exclude those tail 
residues, it's 0.79. With an alignment cutoff of 1 A, you can align 109 of 134 
CAs with an RMSD of 0.46 A.
From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of Leonid Sazanov <saza...@ist.ac.at>
Sent: 11 December 2020 10:36
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

Dear Randy,

Can you comment on why for some of AlphaFold2 models with GDT_TS > 90 
(supposedly as good as experimental model) the RMS_CA (backbone) is > 3.0 
Angstrom? Such a deviation can hardly be described as good as experimental. 
Could it be that GDT_TS is kind of designed to evaluate how well the general 
sub-domain level fold is predicted, rather than overall detail?

Thanks,
Leonid



Several people have mentioned lack of peer review as a reason to doubt the 
significance of the AlphaFold2 results.  There are different routes to peer 
review and, while the results have not been published in a peer review journal, 
I would have to say (as someone who has been an assessor for two CASPs, as well 
as having editorial responsibilities for a peer-reviewed journal), the peer 
review at CASP is much more rigorous than the peer review that most papers 
undergo. The targets are selected from structures that have recently been 
solved but not published or disseminated, and even just tweeting a C-alpha 
trace is probably enough to get a target cancelled.  In some cases (as we’ve 
heard here) the people determining the structure are overly optimistic about 
when their structure solution will be finished, so even they may not know the 
structure at the time it is predicted.  The assessors are blinded to the 
identities of the predictors, and they carry out months of calculations and 
inspections of the models, computing ranking scores before they find out who 
made the predictions.  Most assessors try to bring something new to the 
assessment, because the criteria should get more stringent as the predictions 
get better, and they have new ideas of what to look for, but there’s always 
some overlap with “traditional” measures such as GDT-TS, GDT-HA (more stringent 
high-accuracy version of GDT) and lDDT.



Of course we’d all like to know the details of how AlphaFold2 works, and the 
DeepMind people could have been (and should be) much more forthcoming, but 
their results are real.  They didn’t have any way of cheating, being selective 
about what they reported, or gaming the system in any other way that the other 
groups couldn’t do.  (And yes, when we learned that DeepMind was behind the 
exceptionally good results two years ago at CASP13, we made the same half-jokes 
about whether Gmail had been in the database they were mining!)



Best wishes,



Randy Read




Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Phil Evans
Alpha-fold looks great and is clearly a long way towards answering the question 
“this is the sequence, what is the structure?”

But I’ve always thought the more interesting question is “this is the 
structure, what does it do?”  Is there any progress on that question?

Phil


> On 11 Dec 2020, at 12:12, Tristan Croll  wrote:
> 
> I'm not Randy, but I do have an answer: like this. This is T1049-D1. 
> AlphaFold prediction in red, experimental structure (6y4f) in green. 
> Agreement is close to perfect, apart from the C-terminal tail which is way 
> off - but clearly flexible and only resolved in this conformation in the 
> crystal due to packing interactions. GDT_TS is 93.1; RMS_CA is 3.68 - but if 
> you exclude those tail residues, it's 0.79. With an alignment cutoff of 1 A, 
> you can align 109 of 134 CAs with an RMSD of 0.46 A.
> From: CCP4 bulletin board  on behalf of Leonid Sazanov 
> 
> Sent: 11 December 2020 10:36
> To: CCP4BB@JISCMAIL.AC.UK 
> Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and 
> less pipetting (?)
>  
> Dear Randy,
> 
> Can you comment on why for some of AlphaFold2 models with GDT_TS > 90 
> (supposedly as good as experimental model) the RMS_CA (backbone) is > 3.0 
> Angstrom? Such a deviation can hardly be described as good as experimental. 
> Could it be that GDT_TS is kind of designed to evaluate how well the general 
> sub-domain level fold is predicted, rather than overall detail?
> 
> Thanks,
> Leonid
> 
> 
> >>>>>
> Several people have mentioned lack of peer review as a reason to doubt the 
> significance of the AlphaFold2 results.  There are different routes to peer 
> review and, while the results have not been published in a peer review 
> journal, I would have to say (as someone who has been an assessor for two 
> CASPs, as well as having editorial responsibilities for a peer-reviewed 
> journal), the peer review at CASP is much more rigorous than the peer review 
> that most papers undergo.  The targets are selected from structures that have 
> recently been solved but not published or disseminated, and even just 
> tweeting a C-alpha trace is probably enough to get a target cancelled.  In 
> some cases (as we’ve heard here) the people determining the structure are 
> overly optimistic about when their structure solution will be finished, so 
> even they may not know the structure at the time it is predicted.  The 
> assessors are blinded to the identities of the predictors, and they carry out 
> months of calculations and inspections of the models, computing ranking 
> scores before they find out who made the predictions.  Most assessors try to 
> bring something new to the assessment, because the criteria should get more 
> stringent as the predictions get better, and they have new ideas of what to 
> look for, but there’s always some overlap with “traditional” measures such as 
> GDT-TS, GDT-HA (more stringent high-accuracy version of GDT) and lDDT.
> 
> 
> 
> Of course we’d all like to know the details of how AlphaFold2 works, and the 
> DeepMind people could have been (and should be) much more forthcoming, but 
> their results are real.  They didn’t have any way of cheating, being 
> selective about what they reported, or gaming the system in any other way 
> that the other groups couldn’t do.  (And yes, when we learned that DeepMind 
> was behind the exceptionally good results two years ago at CASP13, we made 
> the same half-jokes about whether Gmail had been in the database they were 
> mining!)
> 
> 
> 
> Best wishes,
> 
> 
> 
> Randy Read
> 
> 
> 


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Leonid Sazanov

I see, thanks, that looks good.

Where can one download predicted_model+exp_model PDBs together?

I could easily find the predicted models but not the experimental ones; 
the CASP website seems very cryptic.


Also, can you comment on how much GDT_TS depends on CA and how much on 
side-chain positioning?


E.g. if it is >90, can one be sure that most side-chains are in the 
right place?


Thanks.

Leonid


On 11.12.20 13:12, Tristan Croll wrote:
I'm not Randy, but I do have an answer: like this. This is T1049-D1. 
AlphaFold prediction in red, experimental structure (6y4f) in green. 
Agreement is close to perfect, apart from the C-terminal tail which is 
way off - but clearly flexible and only resolved in this conformation 
in the crystal due to packing interactions. GDT_TS is 93.1; RMS_CA is 
3.68 - but if you exclude those tail residues, it's 0.79. With an 
alignment cutoff of 1 A, you can align 109 of 134 CAs with an RMSD of 
0.46 A.


*From:* CCP4 bulletin board  on behalf of 
Leonid Sazanov 

*Sent:* 11 December 2020 10:36
*To:* CCP4BB@JISCMAIL.AC.UK 
*Subject:* Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more 
thinking and less pipetting (?)

Dear Randy,

Can you comment on why for some of AlphaFold2 models with GDT_TS > 90 
(supposedly as good as experimental model) the RMS_CA (backbone) is > 
3.0 Angstrom? Such a deviation can hardly be described as good as 
experimental. Could it be that GDT_TS is kind of designed to evaluate 
how well the general sub-domain level fold is predicted, rather than 
overall detail?


Thanks,
Leonid


>>>>>
Several people have mentioned lack of peer review as a reason to doubt 
the significance of the AlphaFold2 results.  There are different 
routes to peer review and, while the results have not been published 
in a peer review journal, I would have to say (as someone who has been 
an assessor for two CASPs, as well as having editorial 
responsibilities for a peer-reviewed journal), the peer review at CASP 
is much more rigorous than the peer review that most papers undergo.  
The targets are selected from structures that have recently been 
solved but not published or disseminated, and even just tweeting a 
C-alpha trace is probably enough to get a target cancelled.  In some 
cases (as we’ve heard here) the people determining the structure are 
overly optimistic about when their structure solution will be 
finished, so even they may not know the structure at the time it is 
predicted. The assessors are blinded to the identities of the 
predictors, and they carry out months of calculations and inspections 
of the models, computing ranking scores before they find out who made 
the predictions.  Most assessors try to bring something new to the 
assessment, because the criteria should get more stringent as the 
predictions get better, and they have new ideas of what to look for, 
but there’s always some overlap with “traditional” measures such as 
GDT-TS, GDT-HA (more stringent high-accuracy version of GDT) and lDDT.




Of course we’d all like to know the details of how AlphaFold2 works, 
and the DeepMind people could have been (and should be) much more 
forthcoming, but their results are real.  They didn’t have any way of 
cheating, being selective about what they reported, or gaming the 
system in any other way that the other groups couldn’t do.  (And yes, 
when we learned that DeepMind was behind the exceptionally good 
results two years ago at CASP13, we made the same half-jokes about 
whether Gmail had been in the database they were mining!)




Best wishes,



Randy Read





--
Prof. Leonid Sazanov FRS
IST Austria
Am Campus 1
A-3400 Klosterneuburg
Austria

Phone: +43 2243 9000 3026
E-mail: saza...@ist.ac.at






Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Tristan Croll
I'm not Randy, but I do have an answer: like this. This is T1049-D1. AlphaFold 
prediction in red, experimental structure (6y4f) in green. Agreement is close 
to perfect, apart from the C-terminal tail which is way off - but clearly 
flexible and only resolved in this conformation in the crystal due to packing 
interactions. GDT_TS is 93.1; RMS_CA is 3.68 - but if you exclude those tail 
residues, it's 0.79. With an alignment cutoff of 1 A, you can align 109 of 134 
CAs with an RMSD of 0.46 A.
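
The difference between the two numbers can be illustrated with a small Python sketch. It is a simplification under stated assumptions: real GDT_TS searches over many superpositions, whereas this version takes per-residue CA distances from a single, already-made superposition and averages the percentage of residues within 1, 2, 4 and 8 A. The distances in the example are invented, not taken from T1049.

```python
import math


def ca_distances(model, target):
    """Per-residue CA-CA distances for two already-superposed coordinate lists."""
    return [math.dist(m, t) for m, t in zip(model, target)]


def rmsd(distances):
    """Root-mean-square of a list of per-residue distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt_like(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Mean percentage of residues within each cutoff (single superposition)."""
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Invented example: 120 well-predicted residues plus a 14-residue tail far off.
core = [0.5] * 120
tail = [12.0] * 14
everything = core + tail

print(f"RMSD, all residues: {rmsd(everything):.2f} A")
print(f"RMSD, core only:    {rmsd(core):.2f} A")
print(f"GDT_TS-like score:  {gdt_like(everything):.1f}")
```

Because RMSD squares each deviation, a handful of badly placed tail residues dominates it, while a cutoff-based score only records that those residues miss every threshold.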


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Leonid Sazanov
Dear Randy,

Can you comment on why, for some of the AlphaFold2 models with GDT_TS > 90 
(supposedly as good as an experimental model), the RMS_CA (backbone) is > 3.0 
Å? Such a deviation can hardly be described as being as good as experimental. 
Could it be that GDT_TS is designed to evaluate how well the general 
sub-domain-level fold is predicted, rather than the overall detail?

Thanks,
Leonid



Re: [ccp4bb] [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-10 Thread Randy John Read
Several people have mentioned lack of peer review as a reason to doubt the 
significance of the AlphaFold2 results.  There are different routes to peer 
review and, while the results have not been published in a peer-reviewed journal, 
I would have to say (as someone who has been an assessor for two CASPs, as well 
as having editorial responsibilities for a peer-reviewed journal), the peer 
review at CASP is much more rigorous than the peer review that most papers 
undergo.  The targets are selected from structures that have recently been 
solved but not published or disseminated, and even just tweeting a C-alpha 
trace is probably enough to get a target cancelled.  In some cases (as we’ve 
heard here) the people determining the structure are overly optimistic about 
when their structure solution will be finished, so even they may not know the 
structure at the time it is predicted.  The assessors are blinded to the 
identities of the predictors, and they carry out months of calculations and 
inspections of the models, computing ranking scores before they find out who 
made the predictions.  Most assessors try to bring something new to the 
assessment, because the criteria should get more stringent as the predictions 
get better, and they have new ideas of what to look for, but there’s always 
some overlap with “traditional” measures such as GDT-TS, GDT-HA (more stringent 
high-accuracy version of GDT) and lDDT.
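
For readers unfamiliar with these measures, a rough sketch of how they differ 
follows. It assumes CA-only coordinates, a single fixed superposition for the 
GDT scores (the real calculation optimises the superposition), and a simplified 
CA-only lDDT (the published score uses all atoms and additional checks). The 
toy two-domain coordinates and the function names are invented for illustration.

```python
import numpy as np

def gdt(model, ref, cutoffs):
    """Mean percentage of atoms within each distance cutoff, for one fixed fit."""
    d = np.linalg.norm(model - ref, axis=1)
    return float(np.mean([100.0 * np.mean(d <= c) for c in cutoffs]))

def lddt_ca(model, ref, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified CA-only lDDT on a 0-1 scale; needs no superposition.

    For every atom pair closer than `radius` in the reference, check whether
    the model reproduces that distance to within each threshold.
    """
    d_ref = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=-1)
    d_mod = np.linalg.norm(model[:, None, :] - model[None, :, :], axis=-1)
    i, j = np.triu_indices(len(ref), k=1)   # each pair counted once
    keep = d_ref[i, j] < radius              # local pairs only
    diff = np.abs(d_ref[i, j] - d_mod[i, j])[keep]
    return float(np.mean([np.mean(diff < t) for t in thresholds]))

# Toy target: two 30-atom domains placed 40 Angstrom apart along x.
rng = np.random.default_rng(1)
domain_a = rng.uniform(0.0, 10.0, size=(30, 3))
domain_b = rng.uniform(0.0, 10.0, size=(30, 3)) + np.array([40.0, 0.0, 0.0])
ref = np.vstack([domain_a, domain_b])

# Toy model: both domains internally perfect, second domain shifted by 5 Angstrom.
# The shared frame amounts to a superposition on the first domain.
model = ref.copy()
model[30:] += np.array([0.0, 5.0, 0.0])

gdt_ts = gdt(model, ref, (1.0, 2.0, 4.0, 8.0))   # traditional cutoffs
gdt_ha = gdt(model, ref, (0.5, 1.0, 2.0, 4.0))   # high-accuracy cutoffs
lddt = lddt_ca(model, ref)

print(f"GDT_TS = {gdt_ts:.1f}, GDT_HA = {gdt_ha:.1f}, lDDT = {lddt:.2f}")
```

In this toy case the rigid shift of one domain costs half of the GDT_HA score 
and part of the GDT_TS score, while lDDT is unaffected because every local 
distance is preserved. That is one reason assessors look at several measures 
rather than a single number.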

Of course we’d all like to know the details of how AlphaFold2 works, and the 
DeepMind people could have been (and should be) much more forthcoming, but 
their results are real.  They didn’t have any way of cheating, being selective 
about what they reported, or gaming the system in any other way that the other 
groups couldn’t do.  (And yes, when we learned that DeepMind was behind the 
exceptionally good results two years ago at CASP13, we made the same half-jokes 
about whether Gmail had been in the database they were mining!)

Best wishes,

Randy Read


Re: [ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Patrick Shaw Stewart
>they can maintain an advantage through several routes - they can
> publish in patents (so people can see what they’ve done, but not legally
> implement it )


In Europe and I think some other countries, inventions can only be patented
if they have *industrial applicability.*

In any case, academics all over the world tend to ignore them.




Re: [ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Harry Powell - CCP4BB
Hi

Actually, since DeepMind is a commercial organization (funded by shareholders 
and people who buy their services), I don’t think they are subject to the same 
rules as academia as regards making their source code public. It would be very 
nice if they would (could?) make their code public, but I don’t see any 
obligation to do so. Their responsibility is primarily to their shareholders 
(you can argue the rights and wrongs of that until the cows come home).

Commercially, they can maintain an advantage through several routes - they can 
publish in patents (so people can see what they’ve done, but not legally 
implement it without a licence), they can keep it all confidential and hope 
that no-one manages to reverse engineer and implement it (at the risk of 
someone else publishing the details and removing their advantage), they can 
publish something that is honest but just misleading enough (or lacking in 
detail) to throw people off the scent, or…

If they can provoke other developers to work out where they have gone wrong and 
produce something that competes with AlphaFold2, that would be great. If they 
can provide something like a web service that allows users to run their method, 
that would be great too, but the important thing is that (unless they had prior 
knowledge of the structures in CASP14) they’ve done something that no-one else 
has managed to do as well in spite of years of trying.

Just my two ha’porth.

Harry


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Matthew Snee
It seems immensely powerful, but my impression is it shows just how much 
information can be extrapolated from the PDB if a technique that can make use 
of "deep similarity" can be employed.

Obviously AlphaFold2 can make use of relationships that aren't limited to direct 
homology, but if there is a fundamental "cellular context-free" relationship 
between sequence and structure (I'm sceptical about this) then it must be via 
the sidechains.

If the sidechain predictions are worse than the backbone, and loops are also 
imperfect, then it strongly suggests that the process is still inferring the 
structure (albeit in a very clever way that can determine and weight 
similarities that go far beyond those implied by direct homology) rather than 
"building" it de novo.

Obviously sidechain and loop positions are important when we think about the 
applications of macromolecular structures, but I'm not qualified to say 
whether there is actually enough data in the PDB to beat the law of diminishing 
returns and reliably get trustworthy "experimental quality" predictions, or 
how that will scale with complex proteins which may be very context dependent 
in their ability to fold.

We probably don't need a universal understanding of sequence/structure to get 
there, but the claim that this is just a matter of time only really follows 
from the assumption of a true de novo method.  Without it, the learning set may 
need to be bigger than all solved (or even solvable) structures.

This could have been framed as something really exciting and complementary to 
experimental structural biology (trivial MR, much better de novo EM, etc.) at a 
time when multi-disciplinary approaches are producing incredible insights, but 
the press that has been generated seems misleading, and I fear this is what 
the public and funders will base their decisions upon.

Just my two cents.

Matthew.





[ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Hughes, Jonathan
i think the answer to all these doubts and questions is quite simple: the 
AlphaFold2 people must make all details of their methods public (source code) 
and, as would probably be necessary, open their system for inspection and use 
by independent experts. isn't that what peer review and reproducibility are all 
about? those rules date from the time before every tom, dick and henriette 
could publicize anything they like inside their own zuckerberg bubble. my 
opinion is that this is a virtual infectious disease that will cause humanity 
far bigger problems than corona ever will – i just hope i'm wrong!
best
jon



Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Mark J van Raaij
On the day the news came out, I did wonder if the AlphaFold2 team somehow had 
access to all the preliminary PDB files sent around via Gmail (which belongs to 
the same company), but more as a joke/conspiratorial thought.
"Our" target, T1052, was also predicted very well by domains and as a monomer. 
It will be interesting to see how well future iterations of the method can 
assemble the complete protein chain and the complete protein chains into the 
correct heteromer.

Mark J van Raaij
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
calle Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
Section Editor Acta Crystallographica F
https://journals.iucr.org/f/


> On 9 Dec 2020, at 10:37, Cedric Govaerts  wrote:
> 
> Dear All
> 
> After about 10 (!) years of (very) hard work we solved the structures of our 
> dearest membrane transporter.  Dataset at 2.9 And resolution, fairly 
> anisotropic, experimental phasing, and many long nights with Coot and 
> Buster to achieve model refinement. 
> 
> The experimental structure had a well defined ligand nicely coordinated but 
> also a lipid embedded inside the binding cavity (a complete surprise but 
> biologically relevant) and two detergent molecules well defined 
> (experimental/crystallisation artefact).
> 
> As our paper was accepted basically when CASP organisers were calling for 
> targets I offered my baby to the computing Gods. However we only provided the 
> sequence to CASP, no info regarding any ligand or lipid.
> 
> Less than a month after, the CASP team contacted us and send us the best 
> model.  In fact it was 2 half models as the transporter is a pseudo dimer, 
> with the N-lobe and C-lobe moving relative to each other during transport 
> cycle, thus divided as two domains in CASP.
> 
> The results were breathtaking. 0.7 And RSMD on one half, 0.6 on the other. 
> And yes, group 427 was the superpower (did not know at the time that it was 
> AlphaFold).
> 
> We had long discussions with the CASP team, as -for us- this almost exact 
> modelling was dream-like (or science fiction) and -at some point- we were 
> even suspecting fraud, as our coordinates had travelled over the internet a 
> few times around when interacting with colleagues.  The organisers reassured 
> us that we were not the only target that had been “nailed” so no reason to 
> suspect any wrongdoing.
> 
> To this day I am still baffled and I would be happy to hear from the 
> community, maybe from some of the CASP participants.
> 
> The target is T1024; the “perfect” models are the domain-split versions 
> (T1024-D1 and T1024-D2), as AlphaFold2 did not perform so well on the complete 
> assembly.
> Deposited PDB is 6T1Z
> 
> Cedric
> 
> PS: I should also note that many other groups performed very well, much 
> better than I would have dreamed, including on the full protein but just not 
> as crazy-good.
> —
> Prof. Cedric Govaerts, Ph.D.
> Universite Libre de Bruxelles
> Campus Plaine. Phone :+32 2 650 53 77
> Building BC, Room 1C4 203
> Boulevard du Triomphe, Acces 2
> 1050 Brussels
> Belgium
> http://govaertslab.ulb.ac.be/ 
> 
> 
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1 
> 



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Cedric Govaerts
Dear All

After about 10 (!) years of (very) hard work we solved the structures of our 
dearest membrane transporter.  Dataset at 2.9 Å resolution, fairly 
anisotropic, experimental phasing, and many long nights with Coot and 
Buster to achieve model refinement. 

The experimental structure had a well defined ligand nicely coordinated but 
also a lipid embedded inside the binding cavity (a complete surprise but 
biologically relevant) and two detergent molecules well defined 
(experimental/crystallisation artefact).

As our paper was accepted basically when CASP organisers were calling for 
targets I offered my baby to the computing Gods. However we only provided the 
sequence to CASP, no info regarding any ligand or lipid.

Less than a month later, the CASP team contacted us and sent us the best model. 
 In fact it was 2 half models as the transporter is a pseudo dimer, with the 
N-lobe and C-lobe moving relative to each other during transport cycle, thus 
divided as two domains in CASP.

The results were breathtaking. 0.7 Å RMSD on one half, 0.6 on the other. And 
yes, group 427 was the superpower (did not know at the time that it was 
AlphaFold).

We had long discussions with the CASP team, as -for us- this almost exact 
modelling was dream-like (or science fiction) and -at some point- we were even 
suspecting fraud, as our coordinates had travelled over the internet a few 
times around when interacting with colleagues.  The organisers reassured us 
that we were not the only target that had been “nailed” so no reason to suspect 
any wrongdoing.

To this day I am still baffled and I would be happy to hear from the community, 
maybe from some of the CASP participants.

The target is T1024; the “perfect” models are the domain-split versions 
(T1024-D1 and T1024-D2), as AlphaFold2 did not perform so well on the complete 
assembly.
Deposited PDB is 6T1Z

Cedric

PS: I should also note that many other groups performed very well, much better 
than I would have dreamed, including on the full protein but just not as 
crazy-good.
—
Prof. Cedric Govaerts, Ph.D.
Universite Libre de Bruxelles
Campus Plaine. Phone :+32 2 650 53 77
Building BC, Room 1C4 203
Boulevard du Triomphe, Acces 2
1050 Brussels
Belgium
http://govaertslab.ulb.ac.be/






Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Tristan Croll
... and of course I meant "between model and target".

From: Tristan Croll 
Sent: 08 December 2020 16:35
To: CCP4BB@JISCMAIL.AC.UK ; Marko Hyvonen 

Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

An example: this is TS1038-D1 - designated by the CASP organisers as in the 
"free modelling" category due to the absence of any close homologues in the 
wwPDB. The experimental model is in tan, the AlphaFold2 prediction in cyan. As 
far as I'm concerned, the only way to describe this is "nailed it". Using 
ChimeraX's MatchMaker to do the alignment, 84 of 114 residues align to a 
CA-RMSD of 0.57 Å (2.3 Å across all residues, with the outliers being one 
flexible-looking loop and the N-terminal tail). Further than that, it's nailed 
almost all the details - if you exclude surface-exposed residues, I count fewer 
than half a dozen sidechains with significantly different rotamers compared to 
the template. The upshot is that the difference between model and template 
appears easily within the range of variation you'd expect to see between 
different crystal forms of the same protein.
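
The gap between the aligned-core RMSD and the all-residue RMSD quoted above can 
be illustrated with a minimal sketch. The distances below are hypothetical 
(chosen only to resemble the numbers in this message, not taken from the real 
target), and the single-cutoff filter is a simplification: as far as I 
understand, MatchMaker prunes and re-fits iteratively, which is not reproduced 
here.

```python
import math

def rmsd(distances):
    """Root-mean-square of per-atom distances (in Å) between matched atom pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def within(distances, cutoff):
    """Keep only the pairs within the cutoff (a one-pass stand-in for pruning)."""
    return [d for d in distances if d <= cutoff]

# Hypothetical CA-CA distances after superposition, in Å (not real CASP data):
# 84 tightly matching residues plus 30 residues in a loop and a terminal tail.
distances = [0.57] * 84 + [4.4] * 30

core = within(distances, 2.0)
print(f"{len(core)} of {len(distances)} CAs, core RMSD {rmsd(core):.2f} Å")
print(f"all-residue RMSD {rmsd(distances):.2f} Å")
```

Because the deviations are squared before averaging, a minority of poorly 
matching residues dominates the all-residue figure.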

For comparison, the next best group got the three-strand beta-sheet at bottom 
right essentially correct, but everything else (apart from the vague fold) 
wrong. MatchMaker aligns 28 CA atoms with an RMSD of 0.64 Å, but the overall 
CA-RMSD blows out to 9.6 Å. So I don't think there's any denying that this is a 
spectacular advance that will change the field markedly.

Best regards,

Tristan



From: CCP4 bulletin board  on behalf of Marko Hyvonen 

Sent: 08 December 2020 15:07
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

Hi Ian,

The data on AlphaFold2 target RMSDs seems to be correct, but that "resolution 
around 2.5 Å" makes no sense, I agree - had not noticed that before. I can see 
that this has been raised in the Twitter feed comments to his post too.

I was highlighting this more for the alternative viewpoint on the discussion 
and also on the interesting detail on the resources needed/available (assuming 
correct!).

Marko

On 08/12/2020 14:02, Ian Tickle wrote:

Hi Marko

I hope he hasn't confused resolution with RMSD error:

"Just keep in mind that (1) a lower RMSD represents a better predicted 
structure, and that (2) most experimental structures have a resolution around 
2.5 Å. Taking this into consideration, about a third (36%) of Group 427’s 
submitted targets were predicted with a root-mean-square deviation (RMSD) under 
2 Å, and 86% were under 5 Å, with a total mean of 3.8 Å."

Cheers

-- Ian



On Tue, 8 Dec 2020 at 13:51, Marko Hyvonen <mh...@cam.ac.uk> wrote:
Here is another take on this topic, by Carlos Outeiral (@c_outeiral), from a 
non-crystallographer's point of view, covering many of the points discussed in 
this thread  (incl. an example of the model guiding correction of the 
experimental structure).

https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/

Marko

On 08/12/2020 13:25, Tristan Croll wrote:
This is a number that needs to be interpreted with some care. 2 Å crystal 
structures in general achieve an RMSD of 0.2 Å on the portion of the crystal 
that's resolved, including loops that are often only in well-resolved 
conformations due to physiologically-irrelevant crystal packing interactions. 
The predicted models, on the other hand, are in isolation. Once you get to the 
level achieved by this last round of predictions, that starts making fair 
comparison somewhat more difficult*. Two obvious options that I see: (1) limit 
the comparison only to the stable core of the protein (in which case many of 
the predictions have RMSDs in the very low fractions of an Angstrom), or (2) 
compare ensembles derived from MD simulations starting from the experimental 
and predicted structure, and see how well they overlap.

-- Tristan

* There's one more thorny issue when you get to this level: it becomes more and 
more possible (even likely) that the prediction gets some things right that are 
wrong in the experimental structure.

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of Ian Tickle 
<ianj...@gmail.com>
Sent: 08 December 2020 13:04
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)


There was a little bit of press-release hype: the release stated "a score of 
around 90 GDT is informally considered to be competitive with results obtained 
from experimental methods" and "our latest AlphaFold system achieves a median 
score of 92.

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Marko Hyvonen
Hi Ian,

The data on AlphaFold2 target RMSDs seems to be correct, but that "resolution
around 2.5 Å" makes no sense, I agree - had not noticed that before. I can see
that this has been raised in the Twitter feed comments to his post too.

I was highlighting this more for the alternative viewpoint on the discussion
and also on the interesting detail on the resources needed/available (assuming
correct!).

Marko

On 08/12/2020 14:02, Ian Tickle wrote:

Hi Marko

I hope he hasn't confused resolution with RMSD error:

"Just keep in mind that (1) a lower RMSD represents a better predicted
structure, and that (2) most experimental structures have a resolution around
2.5 Å. Taking this into consideration, about a third (36%) of Group 427’s
submitted targets were predicted with a root-mean-square deviation (RMSD) under
2 Å, and 86% were under 5 Å, with a total mean of 3.8 Å."

Cheers

-- Ian

On Tue, 8 Dec 2020 at 13:51, Marko Hyvonen <mh...@cam.ac.uk> wrote:

Here is another take on this topic, by Carlos Outeiral (@c_outeiral), from a
non-crystallographer's point of view, covering many of the points discussed in
this thread (incl. an example of the model guiding correction of the
experimental structure).

https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/

Marko

On 08/12/2020 13:25, Tristan Croll wrote:

This is a number that needs to be interpreted with some care. 2 Å crystal
structures in general achieve an RMSD of 0.2 Å on the portion of the crystal
that's resolved, including loops that are often only in well-resolved
conformations due to physiologically-irrelevant crystal packing interactions.
The predicted models, on the other hand, are in isolation. Once you get to the
level achieved by this last round of predictions, that starts making fair
comparison somewhat more difficult*. Two obvious options that I see: (1) limit
the comparison only to the stable core of the protein (in which case many of
the predictions have RMSDs in the very low fractions of an Angstrom), or (2)
compare ensembles derived from MD simulations starting from the experimental
and predicted structure, and see how well they overlap.

-- Tristan

* There's one more thorny issue when you get to this level: it becomes more and
more possible (even likely) that the prediction gets some things right that are
wrong in the experimental structure.

From: CCP4 bulletin board on behalf of Ian Tickle
Sent: 08 December 2020 13:04
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less
pipetting (?)

There was a little bit of press-release hype: the release stated "a score of
around 90 GDT is informally considered to be competitive with results obtained
from experimental methods" and "our latest AlphaFold system achieves a median
score of 92.4 GDT overall across all targets. This means that our predictions
have an average error (RMSD) of approximately 1.6 Angstroms,".

Experimental methods achieve an average error of around

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Ian Tickle
Hi Marko

I hope he hasn't confused resolution with RMSD error:

"Just keep in mind that (1) a lower RMSD represents a better predicted
structure, and that (2) most experimental structures have a resolution
around 2.5 Å. Taking this into consideration, about a third (36%) of Group
427’s submitted targets were predicted with a root-mean-square deviation
(RMSD) under 2 Å, and 86% were under 5 Å, with a total mean of 3.8 Å."

Cheers

-- Ian



On Tue, 8 Dec 2020 at 13:51, Marko Hyvonen  wrote:

> Here is another take on this topic, by Carlos Outeiral (@c_outeiral), from
> a non-crystallographer's point of view, covering many of the points discussed
> in this thread  (incl. an example of the model guiding correction of the
> experimental structure).
>
>
> https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/
>
> Marko
>
> On 08/12/2020 13:25, Tristan Croll wrote:
>
> This is a number that needs to be interpreted with some care. 2 Å crystal
> structures in general achieve an RMSD of 0.2 Å on the portion of the
> crystal that's resolved, including loops that are often only in
> well-resolved conformations due to physiologically-irrelevant crystal
> packing interactions. The predicted models, on the other hand, are in
> isolation. Once you get to the level achieved by this last round of
> predictions, that starts making fair comparison somewhat more difficult*.
> Two obvious options that I see: (1) limit the comparison only to the stable
> core of the protein (in which case many of the predictions have RMSDs in
> the very low fractions of an Angstrom), or (2) compare ensembles derived
> from MD simulations starting from the experimental and predicted structure,
> and see how well they overlap.
>
> -- Tristan
>
> * There's one more thorny issue when you get to this level: it becomes
> more and more possible (even likely) that the prediction gets some things
> right that are wrong in the experimental structure.
> --
> *From:* CCP4 bulletin board 
>  on behalf of Ian Tickle 
> 
> *Sent:* 08 December 2020 13:04
> *To:* CCP4BB@JISCMAIL.AC.UK 
> 
> *Subject:* Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking
> and less pipetting (?)
>
>
> There was a little bit of press-release hype: the release stated "a score
> of around 90 GDT is informally considered to be competitive with results
> obtained from experimental methods" and "our latest AlphaFold system
> achieves a median score of 92.4 GDT overall across all targets. This means
> that our predictions have an average error (RMSD
> <https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions>)
> of approximately 1.6 Angstroms <https://en.wikipedia.org/wiki/Angstrom>,".
>
> Experimental methods achieve an average error of around 0.2 Ang. or better
> at 2 Ang. resolution, and of course much better at atomic resolution (1
> Ang. or better), or around 0.5 Ang. at 3 Ang. resolution.  For
> ligand-binding studies I would say you need 3 Ang. resolution or better.
> 1.6 Ang. error is probably equivalent to around 4 Ang. resolution.  No
> doubt that will improve with time and experience, though I think it will be
> an uphill struggle to get better than 1 Ang. error, simply because the
> method can't be better than the data that go into it and 1-1.5 Ang.
> represents a typical spread of homologous models in the PDB.  So yes very
> competitive if you're desperate for a MR starting model, but not quite yet
> there for a refined high-resolution structure.
>
> Cheers
>
> -- Ian
>
>
> On Tue, 8 Dec 2020 at 12:11, Harry Powell - CCP4BB <
> 193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:
>
> Hi
>
> It’s a bit more than science by press release - they took part in CASP14
> where they were given sequences but no other experimental data, and did
> significantly better than the other homology modellers (who had access to
> the same data) when judged by independent analysis. There were things wrong
> with their structures, sure, but in almost every case they were less wrong
> than the other modellers (many of whom have been working on this problem
> for decades).
>
> It _will_ be more impressive once the methods they used (or equivalents)
> are implemented by other groups and are available to the “public” (I
> haven’t found an AlphaFold webserver to submit a sequence to, whereas the
> other groups in the field do make their methods readily available), but
> it’s still a step-change in protein structure prediction - it shows it can
> be done pretty well.
>
> Michel is right, of course; you can’t have homology modelling wit

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Marko Hyvonen
Here is another take on this topic, by Carlos Outeiral (@c_outeiral), from a
non-crystallographer's point of view, covering many of the points discussed in
this thread (incl. an example of the model guiding correction of the
experimental structure).

https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/

Marko

On 08/12/2020 13:25, Tristan Croll wrote:

This is a number that needs to be interpreted with some care. 2 Å crystal
structures in general achieve an RMSD of 0.2 Å on the portion of the crystal
that's resolved, including loops that are often only in well-resolved
conformations due to physiologically-irrelevant crystal packing interactions.
The predicted models, on the other hand, are in isolation. Once you get to the
level achieved by this last round of predictions, that starts making fair
comparison somewhat more difficult*. Two obvious options that I see: (1) limit
the comparison only to the stable core of the protein (in which case many of
the predictions have RMSDs in the very low fractions of an Angstrom), or (2)
compare ensembles derived from MD simulations starting from the experimental
and predicted structure, and see how well they overlap.

-- Tristan

* There's one more thorny issue when you get to this level: it becomes more and
more possible (even likely) that the prediction gets some things right that are
wrong in the experimental structure.

From: CCP4 bulletin board on behalf of Ian Tickle
Sent: 08 December 2020 13:04
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less
pipetting (?)

There was a little bit of press-release hype: the release stated "a score of
around 90 GDT is informally considered to be competitive with results obtained
from experimental methods" and "our latest AlphaFold system achieves a median
score of 92.4 GDT overall across all targets. This means that our predictions
have an average error (RMSD) of approximately 1.6 Angstroms,".

Experimental methods achieve an average error of around 0.2 Ang. or better at 2
Ang. resolution, and of course much better at atomic resolution (1 Ang. or
better), or around 0.5 Ang. at 3 Ang. resolution.  For ligand-binding studies I
would say you need 3 Ang. resolution or better.  1.6 Ang. error is probably
equivalent to around 4 Ang. resolution.  No doubt that will improve with time
and experience, though I think it will be an uphill struggle to get better than
1 Ang. error, simply because the method can't be better than the data that go
into it and 1-1.5 Ang. represents a typical spread of homologous models in the
PDB.  So yes very competitive if you're desperate for a MR starting model, but
not quite yet there for a refined high-resolution structure.

Cheers

-- Ian

On Tue, 8 Dec 2020 at 12:11, Harry Powell - CCP4BB
<193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:

Hi

It’s a bit more than science by press release - they took part in CASP14 where
they were given sequences but no other experimental data, and did significantly
better than the other homology modellers (who had access to the same data) when
judged by independent analysis. There were things wrong with their structures,
sure, but in almost every case they were less wrong than the other modellers
(many of whom have been working on this problem for decades).

It _will_ be more impressive once the methods they used (or equivalents) are
implemented by other groups and are available to the “public” (I 

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Ian Tickle
Hi Tristan,

Point taken: unobserved parts of the structure have a very large (if not
undefined) experimental error!

I'd be interested to see how that average 1.6 Ang. error is distributed in
space: presumably that data is in the CASP analysis somewhere.
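
One way to look at how an average error is distributed is to count residues 
within a set of distance cutoffs, which is roughly what a GDT_TS-style score 
does. The sketch below is a simplification under stated assumptions: it uses a 
single fixed superposition (the real CASP score, as I understand it, searches 
over many superpositions for each cutoff), and the per-residue distances are 
hypothetical, not CASP data.

```python
import math

def gdt_ts_like(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Mean, over the cutoffs, of the percentage of residues within each cutoff.

    Simplified: one fixed superposition, no per-cutoff optimisation.
    """
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

def rmsd(distances):
    """Root-mean-square of the per-residue distances (Å)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

# Hypothetical CA-CA distances (Å): 90 residues close, 6 moderately off, 4 far off.
distances = [0.5] * 90 + [3.0] * 6 + [12.0] * 4

print(f"GDT_TS-like score: {gdt_ts_like(distances):.1f}")
print(f"RMSD over all residues: {rmsd(distances):.2f} Å")
```

In this made-up case the threshold-based score stays high while the RMSD is 
inflated by four residues, so the two numbers need not translate into each 
other in any simple way.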

Cheers

-- Ian


On Tue, 8 Dec 2020 at 13:25, Tristan Croll  wrote:

> This is a number that needs to be interpreted with some care. 2 Å crystal
> structures in general achieve an RMSD of 0.2 Å on the portion of the
> crystal that's resolved, including loops that are often only in
> well-resolved conformations due to physiologically-irrelevant crystal
> packing interactions. The predicted models, on the other hand, are in
> isolation. Once you get to the level achieved by this last round of
> predictions, that starts making fair comparison somewhat more difficult*.
> Two obvious options that I see: (1) limit the comparison only to the stable
> core of the protein (in which case many of the predictions have RMSDs in
> the very low fractions of an Angstrom), or (2) compare ensembles derived
> from MD simulations starting from the experimental and predicted structure,
> and see how well they overlap.
>
> -- Tristan
>
> * There's one more thorny issue when you get to this level: it becomes
> more and more possible (even likely) that the prediction gets some things
> right that are wrong in the experimental structure.
> --
> *From:* CCP4 bulletin board  on behalf of Ian
> Tickle 
> *Sent:* 08 December 2020 13:04
> *To:* CCP4BB@JISCMAIL.AC.UK 
> *Subject:* Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking
> and less pipetting (?)
>
>
> There was a little bit of press-release hype: the release stated "a score
> of around 90 GDT is informally considered to be competitive with results
> obtained from experimental methods" and "our latest AlphaFold system
> achieves a median score of 92.4 GDT overall across all targets. This means
> that our predictions have an average error (RMSD
> <https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions>)
> of approximately 1.6 Angstroms <https://en.wikipedia.org/wiki/Angstrom>,".
>
> Experimental methods achieve an average error of around 0.2 Ang. or better
> at 2 Ang. resolution, and of course much better at atomic resolution (1
> Ang. or better), or around 0.5 Ang. at 3 Ang. resolution.  For
> ligand-binding studies I would say you need 3 Ang. resolution or better.
> 1.6 Ang. error is probably equivalent to around 4 Ang. resolution.  No
> doubt that will improve with time and experience, though I think it will be
> an uphill struggle to get better than 1 Ang. error, simply because the
> method can't be better than the data that go into it and 1-1.5 Ang.
> represents a typical spread of homologous models in the PDB.  So yes very
> competitive if you're desperate for a MR starting model, but not quite yet
> there for a refined high-resolution structure.
>
> Cheers
>
> -- Ian
>
>
> On Tue, 8 Dec 2020 at 12:11, Harry Powell - CCP4BB <
> 193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:
>
> Hi
>
> It’s a bit more than science by press release - they took part in CASP14
> where they were given sequences but no other experimental data, and did
> significantly better than the other homology modellers (who had access to
> the same data) when judged by independent analysis. There were things wrong
> with their structures, sure, but in almost every case they were less wrong
> than the other modellers (many of whom have been working on this problem
> for decades).
>
> It _will_ be more impressive once the methods they used (or equivalents)
> are implemented by other groups and are available to the “public” (I
> haven’t found an AlphaFold webserver to submit a sequence to, whereas the
> other groups in the field do make their methods readily available), but
> it’s still a step-change in protein structure prediction - it shows it can
> be done pretty well.
>
> Michel is right, of course; you can’t have homology modelling without
> homologous models, which are drawn from the PDB - but the other modellers
> had the same access to the PDB (just as we all do…).
>
> Just my two ha’porth.
>
> Harry
>
> > On 8 Dec 2020, at 11:33, Goldman, Adrian 
> wrote:
> >
> > My impression is that they haven’t published the code, and it is science
> by press-release.  If one of us tried it, we would - rightly - get hounded
> out of time.
> >
> > Adrian
> >
> >
> >
> >> On 4 Dec 2020, at 15:57, Michel Fodje 
> wrote:
> >>
> >> I think the results from AlphaFold2, although exciting and a
> breakthrough are being exaggerated just a bit.

Re: [ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Artem Evdokimov
Well that is sad, and true, and also very common. I have personally
experienced dozens of cases where methods from literature do not reproduce
because (and this is important) the authors "just slap some generic
boilerplate" instead of the actual methods. My favorite is always to read
stuff like "such and such protein was cloned into bacterial expression
vector, expressed and purified using standard methods" and then later
find out through considerable effort and twisting hands of original
researchers that the protein can only be expressed when fused with a Spider
Monkey cadherin domain and expressed in minimal medium supplemented with 5%
Pregnant Horse Urine at exactly 13.5 degrees C. And then purified using the
Spider Monkey cadherin monoclonal antibody. And the yield is 1 mg in 24
liters. None of which was ever disclosed in literature...

Sorry for the rant, I guess I am just saying that literature, IMO, has long
ago stopped being generally directly reproducible. Not getting into the
obvious reasons as to why it happened, but still sad that it happened.

Artem

On Tue, Dec 8, 2020, 8:28 AM Hughes, Jonathan <
jon.hug...@bot3.bio.uni-giessen.de> wrote:

> scientific research requires that experimental results must be testable,
> so you have to publish your methods too. if the alphafold2 people don't
> make their code accessible, they are playing a game with different rules.
> maybe it's called capitalism: i gather they're a private company
>
> best
>
> jon
>
>
>
> *From:* CCP4 bulletin board  *On behalf of *Goldman,
> Adrian
> *Sent:* Tuesday, 8 December 2020 12:33
> *To:* CCP4BB@JISCMAIL.AC.UK
> *Subject:* Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking
> and less pipetting (?)
>
>
>
> My impression is that they haven’t published the code, and it is science
> by press-release.  If one of us tried it, we would - rightly - get hounded
> out of time.
>
>
>
> Adrian
>
>
>
>
>
>
>
> On 4 Dec 2020, at 15:57, Michel Fodje  wrote:
>
>
>
> I think the results from AlphaFold2, although exciting and a breakthrough
> are being exaggerated just a bit.  We know that all the information
> required for the 3D structure is in the sequence. The protein folding
> problem is simply how to go from a sequence to the 3D structure. This is
> not a complex problem in the sense that cells solve it deterministically.
> Thus the problem is due to lack of understanding and not due to
> complexity.  AlphaFold and all the others trying to solve this problem are
> “cheating” in that they are not just using the sequence, they are using
> other sequences like it (multiple-sequence alignments), and they are using
> all the structural information contained in the PDB.  All of this
> information is not used by the cells.   In short, unless AlphaFold2 now
> allows us to understand how exactly a single protein sequence produces a
> particular 3D structure, the protein folding problem is hardly solved in a
> theoretical sense. The only reason we know how well AlphaFold2 did is
> because the structures were solved and we could compare with the
> predictions, which means verification is lacking.
>
>
>
> The protein folding problem will be solved when we understand how to go
> from a sequence to a structure, and can verify a given structure to be
> correct without experimental data. Even if AlphaFold2 got 99% of structures
> right, your next interesting target protein might be the 1%. How would you
> know?   Until then, what AlphaFold2 is telling us right now is that all
> (most) of the information present in the sequence that determines the 3D
> structure can be gleaned in bits and pieces scattered between homologous
> sequences, multiple-sequence alignments, and other protein 3D structures in
> the PDB.  Deep Learning allows a huge amount of data to be thrown at a
> problem and the back-propagation of the networks then allows careful
> fine-tuning of weights which determine how relevant different pieces of
> information are to the prediction.  The networks used here are humongous
> and a detailed look at the weights (if at all feasible) may point us in the
> right direction.
>
>
>
>
>
> *From:* CCP4 bulletin board  *On Behalf Of *Nave,
> Colin (DLSLtd,RAL,LSCI)
> *Sent:* December 4, 2020 9:14 AM
> *To:* CCP4BB@JISCMAIL.AC.UK
> *Subject:* External: Re: [ccp4bb] AlphaFold: more thinking and less
> pipetting (?)
>
>
>
> The subject line for Isabel’s email is very good.
>
>
>
> I do have a question (more a request) for the more computer scientist
> oriented people. I think it is relevant for where this technology will be
> going. It comes from trying to understand whether problems addressed by
> Alpha are NP, NP hard, 

[ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Hughes, Jonathan
scientific research requires that experimental results must be testable, so you 
have to publish your methods too. if the alphafold2 people don't make their 
code accessible, they are playing a game with different rules. maybe it's 
called capitalism: i gather they're a private company
best
jon

From: CCP4 bulletin board  On behalf of Goldman, Adrian
Sent: Tuesday, 8 December 2020 12:33
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

My impression is that they haven’t published the code, and it is science by 
press-release.  If one of us tried it, we would - rightly - get hounded out of 
time.

Adrian




On 4 Dec 2020, at 15:57, Michel Fodje <michel.fo...@lightsource.ca> wrote:

I think the results from AlphaFold2, although exciting and a breakthrough are 
being exaggerated just a bit.  We know that all the information required for 
the 3D structure is in the sequence. The protein folding problem is simply how 
to go from a sequence to the 3D structure. This is not a complex problem in the 
sense that cells solve it deterministically.  Thus the problem is due to lack 
of understanding and not due to complexity.  AlphaFold and all the others 
trying to solve this problem are “cheating” in that they are not just using the 
sequence, they are using other sequences like it (multiple-sequence 
alignments), and they are using all the structural information contained in the 
PDB.  All of this information is not used by the cells.   In short, unless 
AlphaFold2 now allows us to understand how exactly a single protein sequence 
produces a particular 3D structure, the protein folding problem is hardly 
solved in a theoretical sense. The only reason we know how well AlphaFold2 did 
is because the structures were solved and we could compare with the 
predictions, which means verification is lacking.

The protein folding problem will be solved when we understand how to go from a 
sequence to a structure, and can verify a given structure to be correct without 
experimental data. Even if AlphaFold2 got 99% of structures right, your next 
interesting target protein might be the 1%. How would you know?   Until then, 
what AlphaFold2 is telling us right now is that all (most) of the information 
present in the sequence that determines the 3D structure can be gleaned in bits 
and pieces scattered between homologous sequences, multiple-sequence 
alignments, and other protein 3D structures in the PDB.  Deep Learning allows a 
huge amount of data to be thrown at a problem and the back-propagation of the 
networks then allows careful fine-tuning of weights which determine how 
relevant different pieces of information are to the prediction.  The networks 
used here are humongous and a detailed look at the weights (if at all feasible) 
may point us in the right direction.


From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Nave, Colin (DLSLtd,RAL,LSCI)
Sent: December 4, 2020 9:14 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

The subject line for Isabel’s email is very good.

I do have a question (more a request) for the more computer scientist oriented 
people. I think it is relevant for where this technology will be going. It 
comes from trying to understand whether problems addressed by Alpha are NP, NP 
hard, NP complete etc. My understanding is that the previous successes of Alpha 
were for complete information games such as Chess and Go. Both the rules and 
the present position were available to both sides. The folding problem might be 
in a different category. It would be nice if someone could explain the 
difference (if any) between Go and the protein folding problem perhaps using 
the NP type categories.

Colin



From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Isabel Garcia-Saez
Sent: 03 December 2020 11:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Dear all,

Just commenting that, after the stunning performance of AlphaFold, which uses AI 
from Google, maybe some of us could dedicate ourselves to the noble art of 
gardening, baking, doing Chinese calligraphy, watching the clouds pass, or 
all of them together (just in case, I have already prepared my subscription to 
Netflix).

https://www.nature.com/articles/d41586-020-03348-4

Well, I suppose that we still have the structures of complexes (at the moment). 
I am wondering how the labs will have access to this technology in the future 
(would it be for free coming from the company DeepMind - Google?). It seems 
that they have already published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez  PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des M

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Tristan Croll
This is a number that needs to be interpreted with some care. 2 Å crystal 
structures in general achieve an RMSD of 0.2 Å on the portion of the crystal 
that's resolved, including loops that are often only in well-resolved 
conformations due to physiologically-irrelevant crystal packing interactions. 
The predicted models, on the other hand, are in isolation. Once you get to the 
level achieved by this last round of predictions, that starts making fair 
comparison somewhat more difficult*. Two obvious options that I see: (1) limit 
the comparison only to the stable core of the protein (in which case many of 
the predictions have RMSDs in the very low fractions of an Angstrom), or (2) 
compare ensembles derived from MD simulations starting from the experimental 
and predicted structure, and see how well they overlap.

-- Tristan

* There's one more thorny issue when you get to this level: it becomes more and 
more possible (even likely) that the prediction gets some things right that are 
wrong in the experimental structure.
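
To make option (1) concrete, here is a minimal Python sketch of how a couple of badly placed tail residues can dominate an RMSD, and how restricting the comparison to the core changes the number. It rests on assumptions that are not from this thread: the coordinates are made-up toy data, the two models are taken to be already superposed (a real comparison would superpose first, e.g. with a Kabsch fit), and the choice of which residues count as "core" is simply hard-coded.

```python
import math

def rmsd(coords_a, coords_b, indices=None):
    """RMSD over matched atoms; assumes the two models are already superposed."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    selected = list(range(len(coords_a)) if indices is None else indices)
    if not selected:
        raise ValueError("no atoms selected")
    total = 0.0
    for i in selected:
        # squared distance between the matched atoms i in both models
        total += sum((p - q) ** 2 for p, q in zip(coords_a[i], coords_b[i]))
    return math.sqrt(total / len(selected))

# Hypothetical toy data: 10 CA atoms spaced 3.8 A apart along x.
experimental = [(3.8 * i, 0.0, 0.0) for i in range(10)]
# "Prediction": the first 8 atoms are off by 0.3 A, the last 2 (a floppy tail) by 8 A.
predicted = (
    [(x, y + 0.3, z) for (x, y, z) in experimental[:8]]
    + [(x, y + 8.0, z) for (x, y, z) in experimental[8:]]
)

core = range(8)  # assumption: residues 0-7 form the stable core
full_rmsd = rmsd(experimental, predicted)
core_rmsd = rmsd(experimental, predicted, core)

print(f"all atoms: {full_rmsd:.2f} A, core only: {core_rmsd:.2f} A")
```

In this toy case two tail atoms out of ten lift the all-atom RMSD more than tenfold above the core-only value, which is why a single overall figure needs careful interpretation.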

From: CCP4 bulletin board  on behalf of Ian Tickle 

Sent: 08 December 2020 13:04
To: CCP4BB@JISCMAIL.AC.UK 
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)


There was a little bit of press-release hype: the release stated "a score of 
around 90 GDT is informally considered to be competitive with results obtained 
from experimental methods" and "our latest AlphaFold system achieves a median 
score of 92.4 GDT overall across all targets. This means that our predictions 
have an average error (RMSD) of approximately 1.6 Angstroms,".

Experimental methods achieve an average error of around 0.2 Ang. or better at 2 
Ang. resolution, and of course much better at atomic resolution (1 Ang. or 
better), or around 0.5 Ang. at 3 Ang. resolution.  For ligand-binding studies I 
would say you need 3 Ang. resolution or better.  1.6 Ang. error is probably 
equivalent to around 4 Ang. resolution.  No doubt that will improve with time 
and experience, though I think it will be an uphill struggle to get better than 
1 Ang. error, simply because the method can't be better than the data that go 
into it and 1-1.5 Ang. represents a typical spread of homologous models in the 
PDB.  So yes very competitive if you're desperate for a MR starting model, but 
not quite yet there for a refined high-resolution structure.

Cheers

-- Ian
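
For readers less familiar with the GDT score quoted in the press release, the following Python sketch shows roughly how a GDT_TS-style number relates to per-atom deviations. It is a simplification under stated assumptions: the real GDT_TS searches over many superpositions to maximise the number of CA atoms within each cutoff, whereas this sketch uses one fixed superposition and so can only give a lower bound. The function name and the toy coordinates are hypothetical.

```python
import math

# The four distance cutoffs (in Angstroms) averaged by GDT_TS.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)

def gdt_ts_fixed_superposition(model, reference):
    """GDT_TS-style score (0-100) for one fixed superposition of matched CA atoms."""
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(model, reference)]
    # fraction of atoms within each cutoff, then the mean of the four fractions
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in GDT_TS_CUTOFFS
    ]
    return 100.0 * sum(fractions) / len(fractions)

# Hypothetical toy data: 10 CA atoms, displaced along y by the listed amounts.
offsets = [0.5] * 6 + [1.5] * 2 + [3.0, 10.0]
reference = [(3.8 * i, 0.0, 0.0) for i in range(10)]
model = [(x, y + d, z) for (x, y, z), d in zip(reference, offsets)]

score = gdt_ts_fixed_superposition(model, reference)
print(f"GDT_TS (fixed superposition): {score:.1f}")
```

Because each atom only contributes by crossing a cutoff, one atom that is 10 A out costs no more than one that is 8.1 A out, so GDT and RMSD can rank the same pair of models quite differently.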


On Tue, 8 Dec 2020 at 12:11, Harry Powell - CCP4BB 
<193323b1e616-dmarc-requ...@jiscmail.ac.uk> wrote:
Hi

It’s a bit more than science by press release - they took part in CASP14 where 
they were given sequences but no other experimental data, and did significantly 
better than the other homology modellers (who had access to the same data) when 
judged by independent analysis. There were things wrong with their structures, 
sure, but in almost every case they were less wrong than the other modellers 
(many of whom have been working on this problem for decades).

It _will_ be more impressive once the methods they used (or equivalents) are 
implemented by other groups and are available to the “public” (I haven’t found 
an AlphaFold webserver to submit a sequence to, whereas the other groups in the 
field do make their methods readily available), but it’s still a step-change in 
protein structure prediction - it shows it can be done pretty well.

Michel is right, of course; you can’t have homology modelling without 
homologous models, which are drawn from the PDB - but the other modellers had 
the same access to the PDB (just as we all do…).

Just my two ha’porth.

Harry

> On 8 Dec 2020, at 11:33, Goldman, Adrian <adrian.gold...@helsinki.fi> wrote:
>
> My impression is that they haven’t published the code, and it is science by 
> press-release.  If one of us tried it, we would - rightly - get hounded out 
> of time.
>
> Adrian
>
>
>
>> On 4 Dec 2020, at 15:57, Michel Fodje <michel.fo...@lightsource.ca> wrote:
>>
>> I think the results from AlphaFold2, although exciting and a breakthrough 
>> are being exaggerated just a bit.  We know that all the information required 
>> for the 3D structure is in the sequence. The protein folding problem is 
>> simply how to go from a sequence to the 3D structure. This is not a complex 
>> problem in the sense that cells solve it deterministically.  Thus the 
>> problem is due to lack of understanding and not due to complexity.  
>> AlphaFold and all the others trying to solve this problem are “cheating” in 
>> that they are not just using the sequence, they are using other sequences 
>> like it (multiple-sequence a


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Ian Tickle

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Harry Powell - CCP4BB
Hi

It’s a bit more than science by press release - they took part in CASP14 where 
they were given sequences but no other experimental data, and did significantly 
better than the other homology modellers (who had access to the same data) when 
judged by independent analysis. There were things wrong with their structures, 
sure, but in almost every case they were less wrong than the other modellers 
(many of whom have been working on this problem for decades).

It _will_ be more impressive once the methods they used (or equivalents) are 
implemented by other groups and are available to the “public” (I haven’t found 
an AlphaFold webserver to submit a sequence to, whereas the other groups in the 
field do make their methods readily available), but it’s still a step-change in 
protein structure prediction - it shows it can be done pretty well.

Michel is right, of course; you can’t have homology modelling without 
homologous models, which are drawn from the PDB - but the other modellers had 
the same access to the PDB (just as we all do…).

Just my two ha’porth.

Harry

> On 8 Dec 2020, at 11:33, Goldman, Adrian  wrote:
> 
> My impression is that they haven’t published the code, and it is science by 
> press-release.  If one of us tried it, we would - rightly - get hounded out 
> of time.
> 
> Adrian
> 
> 
> 
>> On 4 Dec 2020, at 15:57, Michel Fodje  wrote:
>> 
>> I think the results from AlphaFold2, although exciting and a breakthrough 
>> are being exaggerated just a bit.  We know that all the information required 
>> for the 3D structure is in the sequence. The protein folding problem is 
>> simply how to go from a sequence to the 3D structure. This is not a complex 
>> problem in the sense that cells solve it deterministically.  Thus the 
>> problem is due to lack of understanding and not due to complexity.  
>> AlphaFold and all the others trying to solve this problem are “cheating” in 
>> that they are not just using the sequence, they are using other sequences 
>> like it (multiple-sequence alignments), and they are using all the 
>> structural information contained in the PDB.  All of this information is not 
>> used by the cells.   In short, unless AlphaFold2 now allows us to understand 
>> how exactly a single protein sequence produces a particular 3D structure, 
>> the protein folding problem is hardly solved in a theoretical sense. The 
>> only reason we know how well AlphaFold2 did is because the structures were 
>> solved and we could compare with the predictions, which means verification 
>> is lacking.
>>  
>> The protein folding problem will be solved when we understand how to go from 
>> a sequence to a structure, and can verify a given structure to be correct 
>> without experimental data. Even if AlphaFold2 got 99% of structures right, 
>> your next interesting target protein might be the 1%. How would you know?   
>> Until then, what AlphaFold2 is telling us right now is that all (most) of 
>> the information present in the sequence that determines the 3D structure can 
>> be gleaned in bits and pieces scattered between homologous sequences, 
>> multiple-sequence alignments, and other protein 3D structures in the PDB.  
>> Deep Learning allows a huge amount of data to be thrown at a problem and the 
>> back-propagation of the networks then allows careful fine-tuning of weights 
>> which determine how relevant different pieces of information are to the 
>> prediction.  The networks used here are humongous and a detailed look at the 
>> weights (if at all feasible) may point us in the right direction.
>>  
>>  
>> From: CCP4 bulletin board  On Behalf Of Nave, Colin 
>> (DLSLtd,RAL,LSCI)
>> Sent: December 4, 2020 9:14 AM
>> To: CCP4BB@JISCMAIL.AC.UK
>> Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting 
>> (?)
>>  
>> The subject line for Isabel’s email is very good.
>>  
>> I do have a question (more a request) for the more computer scientist 
>> oriented people. I think it is relevant for where this technology will be 
>> going. It comes from trying to understand whether problems addressed by 
>> Alpha are NP, NP hard, NP complete etc. My understanding is that the 
>> previous successes of Alpha were for complete information games such as 
>> Chess and Go. Both the rules and the present position were available to both 
>> sides. The folding problem might be in a different category. It would be 
>> nice if someone could explain the difference (if any) between Go and the 
>> protein folding problem perhaps using the NP type categories.
>>  
>> Colin
>>  
>>  
>>  
>> From: CCP4 bulletin b

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Goldman, Adrian
My impression is that they haven’t published the code, and it is science by 
press-release.  If one of us tried it, we would - rightly - get hounded out of 
town.

Adrian



On 4 Dec 2020, at 15:57, Michel Fodje <michel.fo...@lightsource.ca> wrote:

I think the results from AlphaFold2, although exciting and a breakthrough are 
being exaggerated just a bit.  We know that all the information required for 
the 3D structure is in the sequence. The protein folding problem is simply how 
to go from a sequence to the 3D structure. This is not a complex problem in the 
sense that cells solve it deterministically.  Thus the problem is due to lack 
of understanding and not due to complexity.  AlphaFold and all the others 
trying to solve this problem are “cheating” in that they are not just using the 
sequence, they are using other sequences like it (multiple-sequence 
alignments), and they are using all the structural information contained in the 
PDB.  All of this information is not used by the cells.   In short, unless 
AlphaFold2 now allows us to understand how exactly a single protein sequence 
produces a particular 3D structure, the protein folding problem is hardly 
solved in a theoretical sense. The only reason we know how well AlphaFold2 did 
is because the structures were solved and we could compare with the 
predictions, which means verification is lacking.

The protein folding problem will be solved when we understand how to go from a 
sequence to a structure, and can verify a given structure to be correct without 
experimental data. Even if AlphaFold2 got 99% of structures right, your next 
interesting target protein might be the 1%. How would you know?   Until then, 
what AlphaFold2 is telling us right now is that all (most) of the information 
present in the sequence that determines the 3D structure can be gleaned in bits 
and pieces scattered between homologous sequences, multiple-sequence 
alignments, and other protein 3D structures in the PDB.  Deep Learning allows a 
huge amount of data to be thrown at a problem and the back-propagation of the 
networks then allows careful fine-tuning of weights which determine how 
relevant different pieces of information are to the prediction.  The networks 
used here are humongous and a detailed look at the weights (if at all feasible) 
may point us in the right direction.


From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Nave, Colin (DLSLtd,RAL,LSCI)
Sent: December 4, 2020 9:14 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

The subject line for Isabel’s email is very good.

I do have a question (more a request) for the more computer scientist oriented 
people. I think it is relevant for where this technology will be going. It 
comes from trying to understand whether problems addressed by Alpha are NP, NP 
hard, NP complete etc. My understanding is that the previous successes of Alpha 
were for complete information games such as Chess and Go. Both the rules and 
the present position were available to both sides. The folding problem might be 
in a different category. It would be nice if someone could explain the 
difference (if any) between Go and the protein folding problem perhaps using 
the NP type categories.

Colin



From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Isabel Garcia-Saez
Sent: 03 December 2020 11:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Dear all,

Just commenting that, after the stunning performance of AlphaFold, which uses AI 
from Google, maybe some of us could dedicate ourselves to the noble art of 
gardening, baking, doing Chinese calligraphy, watching the clouds pass, or 
all of them together (just in case, I have already prepared my subscription to 
Netflix).

https://www.nature.com/articles/d41586-020-03348-4

Well, I suppose that we still have the structures of complexes (at the moment). 
I am wondering how the labs will have access to this technology in the future 
(would it be for free coming from the company DeepMind - Google?). It seems 
that they have already published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez  PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des Martyrs
CS 10090
38044 Grenoble Cedex 9
France
Tel.: 00 33 (0) 457 42 86 15
e-mail: isabel.gar...@ibs.fr
FAX: 00 33 (0) 476 50 18 90
http://www.ibs.fr/




To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1



--

This e-mail and any attachments may contain confidential, copyright and or 
privileged material, and are for the use of the intended addressee only. If you 
are not the

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Nave, Colin (DLSLtd,RAL,LSCI)
Michel
Yes, a good point, and relevant to the difference between AlphaGo and AlphaFold2. 
My understanding is that AlphaGo did begin with information about previous 
games but, after this, it played against itself and became significantly better. 
AlphaFold2 relied perhaps completely on knowledge of previous "games" but 
didn't have an opponent to play against.

There is a difference between the intrinsic nature of the folding problem and 
the successful implementation, using additional information, of AlphaFold2. I 
was really asking about the intrinsic nature of the folding problem (and Chess, 
Go) but, in practice, the question is probably not particularly relevant.

It might be true, for single isolated proteins, that "all the information 
required for the 3D structure is in the sequence." However, many proteins can 
and do form amyloids. I think it was Chris Dobson who pointed out that most 
sequences would form amyloids and only a small number of sequences, tuned by 
natural selection, would form useful folds. Even these could easily revert to 
amyloids (otherwise known as the precipitate in the crystallisation well). 
Chaperones get involved and there is the issue of kinetic rather than 
thermodynamic control. See also James Holton's comments about energy 
minimisation. All this just indicates that the problem would be very hard 
without known structures. However, the advantage for predicting structure from 
sequence is that one can assume that the vast majority of sequences people are 
interested in will fold into something useful, rather than an amyloid. Of 
course spider silk forms amyloid fibres and they are structurally useful.

All interesting issues
  Colin


From: CCP4 bulletin board  On Behalf Of Michel Fodje
Sent: 04 December 2020 15:58
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

I think the results from AlphaFold2, although exciting and a breakthrough are 
being exaggerated just a bit.  We know that all the information required for 
the 3D structure is in the sequence. The protein folding problem is simply how 
to go from a sequence to the 3D structure. This is not a complex problem in the 
sense that cells solve it deterministically.  Thus the problem is due to lack 
of understanding and not due to complexity.  AlphaFold and all the others 
trying to solve this problem are "cheating" in that they are not just using the 
sequence, they are using other sequences like it (multiple-sequence 
alignments), and they are using all the structural information contained in the 
PDB.  All of this information is not used by the cells.   In short, unless 
AlphaFold2 now allows us to understand how exactly a single protein sequence 
produces a particular 3D structure, the protein folding problem is hardly 
solved in a theoretical sense. The only reason we know how well AlphaFold2 did 
is because the structures were solved and we could compare with the 
predictions, which means verification is lacking.

The protein folding problem will be solved when we understand how to go from a 
sequence to a structure, and can verify a given structure to be correct without 
experimental data. Even if AlphaFold2 got 99% of structures right, your next 
interesting target protein might be the 1%. How would you know?   Until then, 
what AlphaFold2 is telling us right now is that all (most) of the information 
present in the sequence that determines the 3D structure can be gleaned in bits 
and pieces scattered between homologous sequences, multiple-sequence 
alignments, and other protein 3D structures in the PDB.  Deep Learning allows a 
huge amount of data to be thrown at a problem and the back-propagation of the 
networks then allows careful fine-tuning of weights which determine how 
relevant different pieces of information are to the prediction.  The networks 
used here are humongous and a detailed look at the weights (if at all feasible) 
may point us in the right direction.


From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Nave, Colin (DLSLtd,RAL,LSCI)
Sent: December 4, 2020 9:14 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

The subject line for Isabel's email is very good.

I do have a question (more a request) for the more computer-science-oriented 
people. I think it is relevant for where this technology will be going. It 
comes from trying to understand whether problems addressed by Alpha are NP, 
NP-hard, NP-complete etc. My understanding is that the previous successes of Alpha 
were for complete information games such as Chess and Go. Both the rules and 
the present position were available to both sides. The folding problem might be 
in a different category. It would be nice if someone could explain the 
difference (if any) between Go and the protein folding problem perhaps usi

Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-04 Thread Michel Fodje
I think the results from AlphaFold2, although exciting and a breakthrough, are 
being exaggerated just a bit.  We know that all the information required for 
the 3D structure is in the sequence. The protein folding problem is simply how 
to go from a sequence to the 3D structure. This is not a complex problem in the 
sense that cells solve it deterministically.  Thus the problem is due to lack 
of understanding and not due to complexity.  AlphaFold and all the others 
trying to solve this problem are "cheating" in that they are not just using the 
sequence, they are using other sequences like it (multiple-sequence 
alignments), and they are using all the structural information contained in the 
PDB.  None of this information is used by the cells.  In short, unless 
AlphaFold2 now allows us to understand how exactly a single protein sequence 
produces a particular 3D structure, the protein folding problem is hardly 
solved in a theoretical sense. The only reason we know how well AlphaFold2 did 
is because the structures were solved and we could compare with the 
predictions, which means verification is lacking.

The protein folding problem will be solved when we understand how to go from a 
sequence to a structure, and can verify a given structure to be correct without 
experimental data. Even if AlphaFold2 got 99% of structures right, your next 
interesting target protein might be the 1%. How would you know?   Until then, 
what AlphaFold2 is telling us right now is that all (most) of the information 
present in the sequence that determines the 3D structure can be gleaned in bits 
and pieces scattered between homologous sequences, multiple-sequence 
alignments, and other protein 3D structures in the PDB.  Deep Learning allows a 
huge amount of data to be thrown at a problem and the back-propagation of the 
networks then allows careful fine-tuning of weights which determine how 
relevant different pieces of information are to the prediction.  The networks 
used here are humongous and a detailed look at the weights (if at all feasible) 
may point us in the right direction.
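
To make the point about weights concrete, here is a minimal sketch in Python. It is a toy, not AlphaFold2's architecture or training code: a single-layer linear model, for which back-propagation reduces to plain gradient descent on the squared error. The function name train_weights, the two made-up input features and the synthetic data are all assumptions chosen purely for illustration. Only feature 0 actually determines the target, and the fitted weights reflect that.

```python
import random


def train_weights(samples, targets, n_features, lr=0.1, epochs=500):
    """Fit the weights of a linear model by gradient descent on mean squared error."""
    weights = [0.0] * n_features
    n = len(samples)
    for _ in range(epochs):
        grads = [0.0] * n_features
        for x, y in zip(samples, targets):
            pred = sum(w * xi for w, xi in zip(weights, x))
            err = pred - y
            # Gradient of the mean squared error with respect to each weight.
            for j in range(n_features):
                grads[j] += 2.0 * err * x[j] / n
        # Move each weight a small step against its gradient.
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Synthetic data (an assumption for illustration): feature 0 is informative,
# feature 1 is irrelevant noise.
rng = random.Random(0)
samples = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
targets = [2.0 * x[0] for x in samples]

weights = train_weights(samples, targets, n_features=2)
print(weights)
```

The weight on the informative feature converges towards 2 and the weight on the irrelevant one towards 0, which is the sense in which the trained weights say how relevant each input is. In a network with many layers and many millions of weights, reading them off like this is nowhere near as straightforward.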


From: CCP4 bulletin board  On Behalf Of Nave, Colin 
(DLSLtd,RAL,LSCI)
Sent: December 4, 2020 9:14 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

The subject line for Isabel's email is very good.

I do have a question (more a request) for the more computer-science-oriented 
people. I think it is relevant for where this technology will be going. It 
comes from trying to understand whether problems addressed by Alpha are NP, 
NP-hard, NP-complete etc. My understanding is that the previous successes of Alpha 
were for complete information games such as Chess and Go. Both the rules and 
the present position were available to both sides. The folding problem might be 
in a different category. It would be nice if someone could explain the 
difference (if any) between Go and the protein folding problem perhaps using 
the NP type categories.

Colin



From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Isabel Garcia-Saez
Sent: 03 December 2020 11:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Dear all,

Just commenting that after the stunning performance of AlphaFold, which uses AI 
from Google, maybe some of us could dedicate ourselves to the noble art of 
gardening, baking, doing Chinese calligraphy, watching the clouds pass or 
everything together (just in case, I have already prepared my subscription to 
Netflix).

https://www.nature.com/articles/d41586-020-03348-4

Well, I suppose that we still have the structures of complexes (at the moment). 
I am wondering how the labs will have access to this technology in the future 
(would it be for free coming from the company DeepMind - Google?). It seems 
that they have already published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez  PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des Martyrs
CS 10090
38044 Grenoble Cedex 9
France
Tel.: 00 33 (0) 457 42 86 15
e-mail: isabel.gar...@ibs.fr
FAX: 00 33 (0) 476 50 18 90
http://www.ibs.fr/




