Re: [ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Patrick Shaw Stewart
>they can maintain an advantage through several routes - they can
> publish in patents (so people can see what they’ve done, but not legally
> implement it )


In Europe and I think some other countries, inventions can only be patented
if they have *industrial applicability.*

In any case, academics all over the world tend to ignore patents.




Re: [ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Harry Powell - CCP4BB
Hi

Actually, since DeepMind is a commercial organization (funded by shareholders 
and people who buy their services), I don’t think they are subject to the same 
rules as academia as regards making their source code public. It would be very 
nice if they would (could?) make their code public, but I don’t see any 
obligation to do so. Their responsibility is primarily to their shareholders 
(you can argue the rights and wrongs of that until the cows come home).

Commercially, they can maintain an advantage through several routes - they can 
publish in patents (so people can see what they’ve done, but not legally 
implement it without a licence), they can keep it all confidential and hope 
that no-one manages to reverse engineer and implement it (at the risk of 
someone else publishing the details and removing their advantage), they can 
publish something that is honest but just misleading enough (or lacking in 
detail) to throw people off the scent, or…

If they can provoke other developers to work out where they have gone wrong and 
produce something that competes with AlphaFold2, that would be great. If they 
can provide something like a web service that allows users to run their method, 
that would be great too, but the important thing is that (unless they had prior 
knowledge of the structures in CASP14) they’ve done something that no-one else 
has managed to do as well in spite of years of trying.

Just my two ha’porth.

Harry


[ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-09 Thread Hughes, Jonathan
i think the answer to all these doubts and questions is quite simple: the 
AlphaFold2 people must make all details of their methods public (source code) 
and, as would probably be necessary, open their system for inspection and use 
by independent experts. isn't that what peer review and reproducibility are all 
about? those rules date from the time before every tom, dick and henriette 
could publicize anything they like inside their own zuckerberg bubble. my 
opinion is that this is a virtual infectious disease that will cause humanity 
far bigger problems than corona ever will – i just hope i'm wrong!
best
jon

From: CCP4 bulletin board On behalf of Mark J van Raaij
Sent: Wednesday, 9 December 2020 11:14
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

on the day the news came out, I did wonder if the AlphaFold2 team somehow had 
access to all the preliminary PDB files sent around via Gmail (which belongs to 
the same company), but more as a joke/conspiratorial thought.
"our" target, T1052, was also predicted very well by domains and as a monomer. 
It will be interesting to see how well future iterations of the method can 
assemble the complete protein chain and the complete protein chains into the 
correct heteromer.

Mark J van Raaij
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
calle Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
Section Editor Acta Crystallographica F
https://journals.iucr.org/f/

On 9 Dec 2020, at 10:37, Cedric Govaerts <cedric.govae...@ulb.ac.be> wrote:

Dear All

After about 10 (!) years of (very) hard work we solved the structures of our 
dearest membrane transporter.  Dataset at 2.9 Å resolution, fairly 
anisotropic, experimental phasing, and many long nights with Coot and 
Buster to achieve model refinement.

The experimental structure had a well defined ligand nicely coordinated but 
also a lipid embedded inside the binding cavity (a complete surprise but 
biologically relevant) and two detergent molecules well defined 
(experimental/crystallisation artefact).

As our paper was accepted basically when CASP organisers were calling for 
targets I offered my baby to the computing Gods. However we only provided the 
sequence to CASP, no info regarding any ligand or lipid.

Less than a month later, the CASP team contacted us and sent us the best model. 
In fact it was two half-models, as the transporter is a pseudo-dimer, with the 
N-lobe and C-lobe moving relative to each other during the transport cycle, and 
was thus divided into two domains in CASP.

The results were breathtaking. 0.7 Å RMSD on one half, 0.6 on the other. And 
yes, group 427 was the superpower (did not know at the time that it was 
AlphaFold).

We had long discussions with the CASP team, as -for us- this almost exact 
modelling was dream-like (or science fiction) and -at some point- we were even 
suspecting fraud, as our coordinates had travelled over the internet a few 
times around when interacting with colleagues.  The organisers reassured us 
that we were not the only target that had been “nailed” so no reason to suspect 
any wrongdoing.

To this day I am still baffled and I would be happy to hear from the community, 
maybe from some of the CASP participants.

The target is T024; the “perfect” models are the domain-split versions (T024-D1 
and T024-D2), as AlphaFold2 did not perform so well on the complete assembly.
Deposited PDB is 6T1Z

Cedric

PS: I should also note that many other groups performed very well, much better 
than I would have dreamed, including on the full protein but just not as 
crazy-good.
—
Prof. Cedric Govaerts, Ph.D.
Universite Libre de Bruxelles
Campus Plaine. Phone :+32 2 650 53 77
Building BC, Room 1C4 203
Boulevard du Triomphe, Acces 2
1050 Brussels
Belgium
http://govaertslab.ulb.ac.be/



To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1





This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Artem Evdokimov
Well that is sad, and true, and also very common. I have personally
experienced dozens of cases where methods from literature do not reproduce
because (and this is important) the authors "just slap some generic
boilerplate" instead of the actual methods. My favorite is always to read
stuff like "such and such protein was cloned into a bacterial expression
vector, expressed and purified using standard methods" and then later
find out, through considerable effort and arm-twisting of the original
researchers, that the protein can only be expressed when fused with a Spider
Monkey cadherin domain and expressed in minimal medium supplemented with 5%
Pregnant Horse Urine at exactly 13.5 degrees C. And then purified using the
Spider Monkey cadherin monoclonal antibody. And the yield is 1 mg in 24
liters. None of which was ever disclosed in literature...

Sorry for the rant, I guess I am just saying that literature, IMO, has long
ago stopped being generally directly reproducible. Not getting into the
obvious reasons as to why it happened, but still sad that it happened.

Artem


[ccp4bb] AW: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-08 Thread Hughes, Jonathan
scientific research requires that experimental results must be testable, so you 
have to publish your methods too. if the alphafold2 people don't make their 
code accessible, they are playing a game with different rules. maybe it's 
called capitalism: i gather they're a private company
best
jon

From: CCP4 bulletin board On behalf of Goldman, Adrian
Sent: Tuesday, 8 December 2020 12:33
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less 
pipetting (?)

My impression is that they haven’t published the code, and it is science by 
press release.  If one of us tried it, we would - rightly - get hounded out of 
town.

Adrian




On 4 Dec 2020, at 15:57, Michel Fodje <michel.fo...@lightsource.ca> wrote:

I think the results from AlphaFold2, although exciting and a breakthrough, are 
being exaggerated just a bit.  We know that all the information required for 
the 3D structure is in the sequence. The protein folding problem is simply how 
to go from a sequence to the 3D structure. This is not a complex problem in the 
sense that cells solve it deterministically.  Thus the problem is due to lack 
of understanding and not due to complexity.  AlphaFold and all the others 
trying to solve this problem are “cheating” in that they are not just using the 
sequence, they are using other sequences like it (multiple-sequence 
alignments), and they are using all the structural information contained in the 
PDB.  None of this information is used by the cells.  In short, unless 
AlphaFold2 now allows us to understand how exactly a single protein sequence 
produces a particular 3D structure, the protein folding problem is hardly 
solved in a theoretical sense. The only reason we know how well AlphaFold2 did 
is that the structures were solved and we could compare them with the 
predictions, which means a way to verify a prediction without experimental 
data is still lacking.

The protein folding problem will be solved when we understand how to go from a 
sequence to a structure, and can verify a given structure to be correct without 
experimental data. Even if AlphaFold2 got 99% of structures right, your next 
interesting target protein might be the 1%. How would you know?   Until then, 
what AlphaFold2 is telling us right now is that all (most) of the information 
present in the sequence that determines the 3D structure can be gleaned in bits 
and pieces scattered between homologous sequences, multiple-sequence 
alignments, and other protein 3D structures in the PDB.  Deep Learning allows a 
huge amount of data to be thrown at a problem and the back-propagation of the 
networks then allows careful fine-tuning of weights which determine how 
relevant different pieces of information are to the prediction.  The networks 
used here are humongous and a detailed look at the weights (if at all feasible) 
may point us in the right direction.


From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Nave, Colin (DLSLtd,RAL,LSCI)
Sent: December 4, 2020 9:14 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

The subject line for Isabel’s email is very good.

I do have a question (more a request) for the more computer-science-oriented 
people. I think it is relevant for where this technology will be going. It 
comes from trying to understand whether problems addressed by Alpha are NP, 
NP-hard, NP-complete, etc. My understanding is that the previous successes of 
Alpha were for complete-information games such as Chess and Go. Both the rules 
and the present position were available to both sides. The folding problem 
might be in a different category. It would be nice if someone could explain the 
difference (if any) between Go and the protein folding problem perhaps using 
the NP type categories.

Colin



From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> 
On Behalf Of Isabel Garcia-Saez
Sent: 03 December 2020 11:18
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

Dear all,

Just commenting that, after the stunning performance of AlphaFold, which uses 
AI from Google, maybe some of us could dedicate ourselves to the noble art of 
gardening, baking, doing Chinese calligraphy, watching the clouds pass, or 
everything together (just in case, I have already prepared my subscription to 
Netflix).

https://www.nature.com/articles/d41586-020-03348-4

Well, I suppose that we still have the structures of complexes (at the moment). 
I am wondering how the labs will have access to this technology in the future 
(would it be for free coming from the company DeepMind - Google?). It seems 
that they have already published some code. Well, exciting times.

Cheers,

Isabel


Isabel Garcia-Saez  PhD
Institut de Biologie Structurale
Viral Infection and Cancer Group (VIC)-Cell Division Team
71, Avenue des Martyrs
CS 10090
38044 Grenoble