Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Leonid Sazanov

Thanks, I will try this.

Also, the CASP website lists scores such as RMS_ALL (visible in the 
tables) and GDC_SC (for side chains; not visible in the tables for some reason).


RMS_ALL presumably includes side chains and looks good for AlphaFold2 
models, between 1 and 2 Angstrom (apart from the same outliers as in 
RMS_CA), although that is not quite at the experimental level.


Were any scores that include side chains used in the ranking/evaluation 
(we hear mostly about GDT_TS)?


If not, how can "experimental level" precision be claimed?


Thanks,

Leonid



On 11.12.20 13:56, Tristan Croll wrote:

I agree the website can be quite cryptic!

You can get all the targets as a tarball from 
https://predictioncenter.org/download_area/CASP14/targets/. For the 
predictions, you can either get them as PDB files on a case-by-case 
basis from the results section, or tarballs of all predictions for a 
given target from 
https://predictioncenter.org/download_area/CASP14/predictions_trimmed_to_domains/. 
In the latter case, each file is essentially a PDB file without the 
.pdb extension, except with 4 lines added to the front looking 
something like:


PFRMAT TS
TARGET T1049
MODEL 2
PARENT N/A

Depending on your choice of viewer, you may need to remove these lines 
before attempting to open it.
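
For anyone who wants to script the removal of those header lines, here is a minimal sketch in Python. The function names and the sample ATOM record are made up for illustration; the only thing taken from the description above is that the extra lines sit at the very top of the file and start with PFRMAT, TARGET, MODEL or PARENT.

```python
# Hypothetical helper: drop the CASP submission header so that a plain
# PDB reader sees only the coordinate records.
CASP_HEADER_KEYS = ("PFRMAT", "TARGET", "MODEL", "PARENT")


def strip_casp_header(lines):
    """Return `lines` without the leading run of CASP header records.

    Only lines at the very top are removed, so any record further down
    (for example a later MODEL/ENDMDL pair) is left untouched.
    """
    lines = list(lines)
    start = 0
    while start < len(lines):
        fields = lines[start].split()
        if fields and fields[0] in CASP_HEADER_KEYS:
            start += 1
        else:
            break
    return lines[start:]


def clean_file(src_path, dst_path):
    """Write a copy of a prediction file without the CASP header."""
    with open(src_path) as src:
        kept = strip_casp_header(src)
    with open(dst_path, "w") as dst:
        dst.writelines(kept)


# Synthetic example; the ATOM record is invented, not taken from T1049.
example = [
    "PFRMAT TS\n",
    "TARGET T1049\n",
    "MODEL 2\n",
    "PARENT N/A\n",
    "ATOM      1  N   MET A   1      11.104   6.134  -6.504  1.00  0.00\n",
    "TER\n",
    "END\n",
]
cleaned = strip_casp_header(example)
```

Whether a given viewer actually needs this step depends on the viewer, as noted above; some will skip the unknown records on their own.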


The GDT_TS score only considers alpha carbons, so in principle it /is/ 
possible to get a high score on it while still having a model that's 
rubbish in every other respect. It's certainly worth complementing it 
with other scores - e.g. good old MolProbity, or SphereGrinder. The 
latter is quite good in principle - essentially, it places a 6 A 
radius sphere at each CA atom of the target, finds all heavy atoms in 
the sphere, and measures their RMSD to the corresponding atoms in the 
prediction. The actual implementation for CASP is a bit broad-brush, 
though - the score is just the fraction of spheres whose RMSD is under 
2 A.
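
The sphere-based score described above can be sketched roughly as follows. This is not the CASP implementation: it assumes the prediction has already been superposed on the target, does no per-sphere refitting, and counts a sphere with no matching atoms in the prediction as a failure. The data layout (dictionaries keyed by residue number and atom name) and the toy coordinates are invented for illustration.

```python
import math


def sphere_score(target, model, radius=6.0, rmsd_cutoff=2.0):
    """Fraction of CA-centred spheres whose local RMSD is under the cutoff.

    target, model: dicts mapping (residue_number, atom_name) -> (x, y, z),
    heavy atoms only, in the same (already superposed) frame.
    """
    ca_ids = [atom for atom in target if atom[1] == "CA"]
    passed = 0
    for ca in ca_ids:
        centre = target[ca]
        # Heavy atoms of the target inside the sphere that also exist
        # in the prediction.
        in_sphere = [
            atom for atom, xyz in target.items()
            if math.dist(centre, xyz) <= radius and atom in model
        ]
        if not in_sphere:
            continue  # assumption: nothing to compare counts as a failure
        squared = [math.dist(target[a], model[a]) ** 2 for a in in_sphere]
        rmsd = math.sqrt(sum(squared) / len(squared))
        if rmsd < rmsd_cutoff:
            passed += 1
    return passed / len(ca_ids)


# Toy target: two residues far apart, so each sphere holds one residue.
target = {
    (1, "N"): (0.0, 0.0, 0.0), (1, "CA"): (1.5, 0.0, 0.0), (1, "C"): (2.0, 1.4, 0.0),
    (2, "N"): (20.0, 0.0, 0.0), (2, "CA"): (21.5, 0.0, 0.0), (2, "C"): (22.0, 1.4, 0.0),
}
# Toy prediction: residue 1 is exact, residue 2 is displaced by 3 A in z.
model = dict(target)
for name in ("N", "CA", "C"):
    x, y, z = target[(2, name)]
    model[(2, name)] = (x, y, z + 3.0)

score = sphere_score(target, model)
```

In this toy case one of the two spheres passes, which also illustrates the "broad-brush" point: the score only records pass or fail per sphere, not how far off the failing one is.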


In the last CASP round I pushed for the need to start adding metrics 
that directly compared the models in torsion space - far from the 
first time that's been suggested, but it's arguably only in the past 
few rounds that models have gotten good enough for this to be a useful 
discriminating measure. It doesn't appear that this has been added to 
the standard measures for CASP14, but if it had I can see that 
AlphaFold2 would have done extremely well - I only showed the ribbon 
representation for T1049 in my last email, but the sidechains in the 
core show pretty amazing agreement with the target.


Best regards,

Tristan
Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Leonid Sazanov

I see, thanks, that looks good.

Where can one download predicted_model+exp_model PDBs together?

I could easily find the predicted models but not the experimental ones - the CASP 
website seems very cryptic.


Also, can you comment on how much GDT_TS depends on CA positions and how much on 
side-chain positioning?


E.g. if it is >90, can one be sure that most side-chains are in the 
right place?


Thanks.

Leonid


On 11.12.20 13:12, Tristan Croll wrote:
I'm not Randy, but I do have an answer: like this. This is T1049-D1. 
AlphaFold prediction in red, experimental structure (6y4f) in green. 
Agreement is close to perfect, apart from the C-terminal tail which is 
way off - but clearly flexible and only resolved in this conformation 
in the crystal due to packing interactions. GDT_TS is 93.1; RMS_CA is 
3.68 - but if you exclude those tail residues, it's 0.79. With an 
alignment cutoff of 1 A, you can align 109 of 134 CAs with an RMSD of 
0.46 A.
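
To illustrate why a high GDT_TS and a large RMS_CA can coexist, here is a small Python sketch with synthetic distances (not the T1049 data). GDT_TS averages the percentage of CA atoms within 1, 2, 4 and 8 A of their target positions; the real calculation searches over many superpositions, whereas this sketch assumes a single fixed superposition and simply takes the per-residue CA distances as given.

```python
import math


def rmsd(distances):
    """Root-mean-square of per-atom deviations (in A)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average percentage of atoms within each cutoff.

    Simplification: one fixed superposition, no search for the best one.
    """
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(fractions)


# Synthetic model: 120 well-placed CAs (0.5 A off) plus a 14-residue
# tail that is 12 A away from where the target has it.
core = [0.5] * 120
tail = [12.0] * 14
distances = core + tail

all_rmsd = rmsd(distances)    # dominated by the tail
core_rmsd = rmsd(core)        # what is left once the tail is excluded
score = gdt_ts(distances)     # barely notices how far off the tail is
```

Because RMSD squares the deviations, a handful of badly placed residues dominates it, while GDT_TS only registers that those residues miss every cutoff, no matter by how much.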



--
Prof. Leonid Sazanov FRS
IST Austria
Am Campus 1
A-3400 Klosterneuburg
Austria

Phone: +43 2243 9000 3026
E-mail: saza...@ist.ac.at




To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] External: Re: [ccp4bb] AlphaFold: more thinking and less pipetting (?)

2020-12-11 Thread Leonid Sazanov
Dear Randy,

Can you comment on why, for some AlphaFold2 models with GDT_TS > 90 
(supposedly as good as an experimental model), the RMS_CA (backbone) is > 3.0 
Angstrom? Such a deviation can hardly be described as being as good as experimental. 
Could it be that GDT_TS is designed to evaluate how well the general 
sub-domain-level fold is predicted, rather than the overall detail?

Thanks,
Leonid


>
Several people have mentioned lack of peer review as a reason to doubt the 
significance of the AlphaFold2 results.  There are different routes to peer 
review and, while the results have not been published in a peer-reviewed journal, 
I would have to say (as someone who has been an assessor for two CASPs, as well 
as having editorial responsibilities for a peer-reviewed journal), the peer 
review at CASP is much more rigorous than the peer review that most papers 
undergo.  The targets are selected from structures that have recently been 
solved but not published or disseminated, and even just tweeting a C-alpha 
trace is probably enough to get a target cancelled.  In some cases (as we’ve 
heard here) the people determining the structure are overly optimistic about 
when their structure solution will be finished, so even they may not know the 
structure at the time it is predicted.  The assessors are blinded to the 
identities of the predictors, and they carry out months of calculations and 
inspections of the models, computing ranking scores before they find out who 
made the predictions.  Most assessors try to bring something new to the 
assessment, because the criteria should get more stringent as the predictions 
get better, and they have new ideas of what to look for, but there’s always 
some overlap with “traditional” measures such as GDT-TS, GDT-HA (more stringent 
high-accuracy version of GDT) and lDDT.



Of course we’d all like to know the details of how AlphaFold2 works, and the 
DeepMind people could have been (and should be) much more forthcoming, but 
their results are real.  They didn’t have any way of cheating, being selective 
about what they reported, or gaming the system in any other way that the other 
groups couldn’t do.  (And yes, when we learned that DeepMind was behind the 
exceptionally good results two years ago at CASP13, we made the same half-jokes 
about whether Gmail had been in the database they were mining!)



Best wishes,



Randy Read





[ccp4bb] refmac error

2018-08-02 Thread Leonid Sazanov
Hi, after updating to ccp4/7.0, I get this error when trying to run Refmac:

Dictionary path has not been defined
Check the environment variable CLIBD_MON
Current value of CLIBD_MON is
/mnt/nfs/clustersw/shared/ccp4/7.0/ccp4-7.0/lib/data/monomers
It should be set to wherever_dict/dic/
===> Error: Wrong path for the dictionary files
 Refmac:  Wrong path for the dictionary files

How to resolve this?
Many thanks!
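
A quick way to narrow this down is to check the variable from a script. The sketch below is only a diagnostic aid under stated assumptions: the function name is made up, and the trailing-slash check is a guess based solely on the example path in the error text ("wherever_dict/dic/"), not on documented Refmac behaviour.

```python
import os


def check_clibd_mon(environ=None):
    """Return a list of possible problems with the CLIBD_MON setting."""
    environ = os.environ if environ is None else environ
    problems = []
    value = environ.get("CLIBD_MON", "")
    if not value:
        problems.append("CLIBD_MON is not set")
        return problems
    # Guess: the example in the error message ends with a slash.
    if not value.endswith("/"):
        problems.append("CLIBD_MON does not end with '/'")
    if not os.path.isdir(value):
        problems.append("CLIBD_MON does not point to an existing directory")
    return problems


# The value reported in the error above, passed in explicitly so the
# check does not depend on the current shell environment.
reported = {
    "CLIBD_MON": "/mnt/nfs/clustersw/shared/ccp4/7.0/ccp4-7.0/lib/data/monomers"
}
problems = check_clibd_mon(reported)
```

Calling `check_clibd_mon()` with no argument inspects the live environment of the session in which Refmac is launched.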





[ccp4bb] PhD in structural biology of membrane protein complexes, IST Austria

2014-12-02 Thread Leonid Sazanov
Applications are invited for the PhD course at IST Austria, with specialisation 
in structural biology of large membrane protein complexes, in particular 
complex I. 

Complex I is central to bioenergetics – it is the first and largest enzyme of 
the respiratory chain in mitochondria and bacteria. It is also involved in a 
wide range of human diseases. The Sazanov lab uses the bacterial enzyme as a 
‘minimal’ model of the more elaborate mammalian complex I. We have determined, 
by X-ray crystallography, all currently known atomic structures of complex I, 
starting with the hydrophilic domain (Science 2005, 2006), followed by the 
membrane domain (Nature 2010, 2011) and, finally, the recent (Nature 2013) 
structure of the entire Thermus thermophilus complex (536 kDa, 16 subunits, 9 
Fe-S clusters, 64 TM helices). Structures suggest a unique mechanism of 
coupling between electron transfer in the hydrophilic domain and proton 
translocation in the membrane domain, via long-range (up to ~200 Å) 
conformational changes. Future studies are aimed at the elucidation of this 
mechanism and the determination of the structure of the much larger mitochondrial 
enzyme, as well as other related complexes 
(https://ist.ac.at/research/research-groups/sazanov-group/). We use both X-ray 
crystallography and the most advanced single particle cryo-EM methods.

The Institute of Science and Technology is located near Vienna on a dedicated 
campus.  IST Austria is a new institute dedicated to basic research, providing 
a world-class research environment and benefits. The PhD program at IST lasts 4-5 
years and involves a 1-year Phase 1 with rotations in 3 different groups, 
followed by Phase 2, conducting research in the selected group. Further 
information is available at https://ist.ac.at/graduate-school/phd-program/. The 
deadline for applications is 15th of January 2015. Informal queries should be 
forwarded to Dr. Leonid Sazanov (saza...@mrc-mbu.cam.ac.uk).


[ccp4bb] Post-doc position - membrane protein complex

2014-11-18 Thread Leonid Sazanov
A candidate is sought for a two-year position (an extension is possible) to 
work on structural characterisation of respiratory complex I from T. 
thermophilus.

Complex I is central to bioenergetics – it is the first and largest enzyme of 
the respiratory chain in mitochondria and bacteria. It is also involved in a 
wide range of human diseases. We use the bacterial enzyme as a ‘minimal’ model 
of the more elaborate mammalian complex I. We have determined, by X-ray 
crystallography, all currently known atomic structures of complex I, starting 
with the hydrophilic domain (Science 2005, 2006), followed by the membrane 
domain (Nature 2010, 2011) and, finally, the recent (Nature 2013) structure of 
the entire Thermus thermophilus complex (536 kDa, 16 subunits, 9 Fe-S clusters, 
64 TM helices). Structures suggest a unique mechanism of coupling between 
electron transfer in the hydrophilic domain and proton translocation in the 
membrane domain, via long-range (up to ~200 Å) conformational changes. 

In order to determine the intriguing coupling mechanism, we are now working on 
the structures of the complex in different redox states, with various bound 
substrates. We are also interested in establishing the mode of action of 
various inhibitors of complex I. The post holder will be involved in taking 
this challenging project further, which requires extensive experience in 
protein purification and X-ray crystallography, preferably acquired working 
with membrane proteins or macromolecular assemblies. The post is partly funded 
via collaboration with Pharma. Initially several months will be spent training 
in the MRC Mitochondrial Biology Unit in Cambridge, UK and then the project 
will be moved together with the rest of the group to the Institute of Science 
and Technology in Austria, near Vienna.  IST Austria is a new institute 
dedicated to basic research, providing a world-class research environment and 
benefits. We are looking to fill the post by early 2015. All queries and 
applications with CVs and names of 2-3 academic referees should be forwarded to 
Dr. Leonid Sazanov (saza...@mrc-mbu.cam.ac.uk).


Re: [ccp4bb] crystals with large solvent content -dehydratation

2013-10-29 Thread Leonid Sazanov
Hi, you could try dehydration in microdialysis buttons - this allows for a slow, 
gradual increase in PEG over a few days and full control of other parameters, 
including lowering the salt concentration.
It was the only dehydration method that worked well for our large membrane 
protein:
http://www.ncbi.nlm.nih.gov/pubmed/21822288
Described in more detail here:
http://www.ncbi.nlm.nih.gov/pubmed/24059518


Re: [ccp4bb] refmac5 vs phenix refine mixed up

2013-01-24 Thread Leonid Sazanov
The most likely scenario is that Phenix by default flags the Rfree (test) set 
as 1, while CCP4/Refmac uses 0.
That would explain your Rfree going down: your Rfree reflections were 
refined by Refmac.

It would be nice if the default setting were the same in different suites.
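
A toy illustration of the mix-up, using synthetic flags. The flag values (1 for the test set in one convention, 0 in the other) follow what is described above; the helper name and the 20-reflection list are invented, and real files should be checked for the values they actually contain.

```python
def free_set(flags, free_value):
    """Indices of reflections treated as the test (Rfree) set."""
    return {i for i, flag in enumerate(flags) if flag == free_value}


# Synthetic flag column written with "1 = test set":
# 20 reflections, 2 of them set aside as free.
flags = [0] * 20
flags[3] = 1
flags[11] = 1

intended = free_set(flags, free_value=1)  # what the writer meant
misread = free_set(flags, free_value=0)   # what a "0 = test set" reader sees
```

The two sets do not overlap, so a program reading these flags with the other convention refines against the intended test reflections and reports "Rfree" on what was meant to be the working set.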

Best wishes.


Re: [ccp4bb] crystal dehydration

2013-01-15 Thread Leonid Sazanov
If dehydration needs to be done slowly and under tight control of all 
parameters, one possibility is to use micro-dialysis buttons.

We used it for a large membrane protein complex and diffraction improved from 
~7 to 2.7 A. The crystal is fished out and put into mother liquor solution in 
the button, which is sealed with dialysis membrane and then placed into 
about 5 ml of mother liquor with a slightly higher PEG concentration. Then you 
just exchange the outside buffer every day or so for solutions containing higher 
concentrations of PEG. We went from ~9 to 30 % PEG4000 in about a week. You can 
easily observe the crystal under a microscope, and if it cracks you went too far 
or too quickly with PEG and need to use a bit less next time. Also, this method allows 
you to control all other components of the dehydrating solution - we needed to 
decrease the salt concentration at the same time as increasing PEG. You can also 
introduce or increase the cryo-protectant concentration at the same time. With these 
crystals, the otherwise excellent dehydration machines already mentioned did not 
work, possibly because the process had to be really slow. The reference is 
here: http://www.ncbi.nlm.nih.gov/pubmed/21822288
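
For planning the buffer exchanges, a simple schedule can be written out in advance. The sketch below assumes a linear ramp with one exchange per day; the only figures taken from the description above are the endpoints (~9 to 30 % PEG4000) and the duration (about a week). The step size that a given crystal tolerates has to be found by trial, as noted.

```python
def peg_schedule(start, end, days):
    """PEG concentration (%) of the outside buffer for each day.

    Assumption: equal daily steps from `start` to `end`.
    """
    step = (end - start) / days
    return [round(start + step * day, 1) for day in range(days + 1)]


# Day 0 is the starting mother liquor; day 7 is the final concentration.
schedule = peg_schedule(9.0, 30.0, 7)
```

With these endpoints the ramp works out to 3 % PEG per day. If crystals crack, stretching `days` gives smaller steps.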

Best wishes.


Re: [ccp4bb] Death of Rmerge

2012-06-01 Thread Leonid Sazanov
Hi, as we reported in our paper in Table 1 (actually Supplementary Table 1), at 
the end of Scaling 2, completeness in the outer shell after aniso truncation 
was 54%. The 96% completeness and I/sigma of 0.8 are, of course, from before aniso 
truncation. I/sigma after truncation would be higher, but it is not clear to me 
how to calculate that number exactly, since aniso truncation is done after data 
scaling. One could of course re-process the images in Mosflm with aniso 
limits applied and then scale the data, but that would not be exactly the same.

From many trials with strongly anisotropic data we found that for map 
calculation and refinement it is best to cut data anisotropically where F/sigma 
is approaching 2.5-2.7 in each direction, as long as completeness in the outer 
shell remains above 50% or so. Usually the highest useful resolution is also 
where the correlation coefficient between random half-data-set estimates of 
intensities in SCALA falls below about 0.5 (as advocated by Phil Evans, I 
think). CC seems to be less affected by anisotropy (in this case it reached 0.5 
at 3.0 angstrom, which was another criterion to cut data at 3.0).
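
As a rough sketch of two of these criteria (half-data-set CC of at least about 0.5, outer-shell completeness above about 50%), the following Python picks the last resolution shell that satisfies both. The shell statistics are synthetic, the dictionary keys are made-up names, and the per-direction F/sigma criterion is deliberately left out because it needs direction-resolved statistics.

```python
def last_useful_shell(shells, min_cc=0.5, min_completeness=50.0):
    """Return d_min (A) of the last shell passing both criteria.

    shells: list of dicts ordered from low to high resolution, each with
    keys "d_min", "cc_half" and "completeness" (percent).
    Returns None if even the first shell fails.
    """
    limit = None
    for shell in shells:
        if shell["cc_half"] < min_cc or shell["completeness"] < min_completeness:
            break
        limit = shell["d_min"]
    return limit


# Synthetic shell table, for illustration only.
shells = [
    {"d_min": 4.0, "cc_half": 0.98, "completeness": 99.0},
    {"d_min": 3.5, "cc_half": 0.90, "completeness": 95.0},
    {"d_min": 3.2, "cc_half": 0.71, "completeness": 80.0},
    {"d_min": 3.0, "cc_half": 0.55, "completeness": 60.0},
    {"d_min": 2.9, "cc_half": 0.31, "completeness": 30.0},
]
cutoff = last_useful_shell(shells)
```

The thresholds are passed as parameters because both are described above as approximate ("about 0.5", "50% or so").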

HTH.
Leo




I am a little curious about the anisotropically truncated data for 3RKO:

Percent Possible(All)   96.0
Mean I Over Sigma(Observed) 0.8

In the supplementary table of the nature paper it was made clear that this 
3.16-3.0A, I/sigmaI=0.8 and Rmerge=1.216 shell was the outer shell of the 
anisotropically truncated data. The authors did also report the 
isotropically truncated resolution to be 3.2A with I/sigmaI=1.3 and 
Rmerge=73%.

The authors also stated in the main text that

the best native data set was anisotropically scaled and truncated to 3.4 Å, 
3.0 Å and 3.0 Å resolution, where the F/σ ratio drops to ~2.6–2.8 along 
the a*, b* and c* axes, respectively (scaling 2, Supplementary Table 1)

My question is, is the I/sigmaI=0.8 a consequence of many reflections with 
nearly 0 I/sigmaI being included in the calculation? Then what does the 96% 
completeness mean? Does it mean that 96% completeness in the spherical shell 
of 3.16-3.0A was achieved, by including a great number of I=0 reflections?


Zhijie


Re: [ccp4bb] an ambiguous result of molecular replacement

2012-03-30 Thread Leonid Sazanov
Hi, we had the same case in apparent C2221, with many similarly shifted Phaser 
solutions with high scores. The reason was that the crystals were actually nearly 
perfectly twinned in P21, so indexing and processing indicated C2221. Once the data 
were re-processed in P21, Phaser could easily find two distinct solutions - one 
for each of the twin domains, with LLG scores roughly reflecting the twin ratios. 
A similar case is discussed in detail here:
http://www.ncbi.nlm.nih.gov/pubmed/15039553


[ccp4bb] merging parts of models in COOT

2011-10-19 Thread Leonid Sazanov

Hi,
If I have two somewhat different overlaid models, is it possible in 
COOT to replace part of one model with another?
Similar to the O command: merge_atoms from_molecule residue_start 
residue_end to_molecule residue_start ?

That's a useful feature in O, but I could not find it so far in COOT.
Thanks!

--
Dr. Leonid A. Sazanov
Research group leader
Medical Research Council
Mitochondrial Biology Unit
Wellcome Trust / MRC Building
Hills Road
Cambridge


Re: [ccp4bb] merging parts of models in COOT - SOLVED

2011-10-19 Thread Leonid Sazanov

Dear all, thanks for replies!

Indeed, text editing or other combinations of manipulations will do the 
trick of course, but I wanted to do it in one command, as I need to make 
many substitutions in a very big model as I go along.
The suggested (replace-fragment), or alternatively (copy-residue-range), does the 
desired job.

Thanks!

Dr. Leonid A. Sazanov
Research group leader
Medical Research Council
Mitochondrial Biology Unit
Wellcome Trust / MRC Building
Hills Road
Cambridge


On 19/10/2011 14:41, Ed Pozharski wrote:

On Wed, 2011-10-19 at 12:20 +0100, Leonid Sazanov wrote:

Hi,
If I have two somewhat different overlayed models, is it possible in
COOT to replace part of one model by another?
Similarly to O command: merge_atoms from_molecule residue_start
residue_end to_molecule residue_start ?
That's a useful feature in O, but could not find it so far in COOT.
Thanks!


IIUC what you want to do, shouldn't this be a trivial operation in a
text editor?



Re: [ccp4bb] ccp4-6.1.13 - Austwick

2010-06-11 Thread Leonid Sazanov
Is there information somewhere on what is new or modified in Buccaneer 1.4 (any 
new keywords?) I could not find it.
Thanks!


[ccp4bb] PhD studentship in structural biology, Cambridge, UK

2009-04-14 Thread Leonid Sazanov
Dear all, I would greatly appreciate it if the ad below were brought to the 
attention of students who may be interested. Thanks!

Applications are invited from recent graduates or final year undergraduates for 
the following PhD project in the MRC Mitochondrial Biology Unit, Cambridge, UK:

Structure of bacterial respiratory complex I

The aim of the project is to determine the crystal structure of respiratory 
complex I, which plays a central role in cellular energy production and is 
implicated in many human neurodegenerative diseases. We have determined 
the first X-ray structure of the catalytic core of this large molecular machine 
(Science 311, 1430-6 and Science 309, 771-4) and now aim to solve the 
structure of the entire complex by X-ray crystallography. 

This studentship will provide an excellent opportunity to gain broad experience 
in structural biology, X-ray crystallography and biochemistry. The Medical 
Research Council Mitochondrial Biology Unit is a leading international research 
centre that provides an opportunity for cutting-edge research. Facilities 
include a range of fermentors/chromatography systems for protein purification, 
electron microscopes, crystallisation robots and home X-ray sources. We have 
regular access to ESRF, SLS and Diamond synchrotrons. Our students are 
normally part of the Graduate School of Biological Sciences of Cambridge 
University and become members of one of the University colleges. The MRC 
studentship providing college fees and stipend (about £14500 pa subject to 
review) will be offered on a competitive basis subject to eligibility (EU 
citizens). The starting date is October 2009. 

Enquiries should be directed to Dr. L.A. Sazanov (saza...@mrc-mbu.cam.ac.uk)
Full details are available at http://www.mrc-mbu.cam.ac.uk/