[ccp4bb] Validation of structure prediction

2022-01-17 Thread dusan turk
Hi guys,

This may address a few of your questions. In 2019 we published a method that 
validates structures without bias from the terms used in refinement and works 
on predicted models too.

Pražnikar, J., Tomić, M. & Turk, D. Validation and quality assessment of 
macromolecular structures using complex network analysis. Sci Rep 9, 1678 
(2019). https://doi.org/10.1038/s41598-019-38658-9 



best, dusan



> On 18 Jan 2022, at 01:00, CCP4BB automatic digest system 
>  wrote:
> 
> There are 5 messages totaling 1446 lines in this issue.
> 
> Topics of the day:
> 
>  1. Validation of structure prediction (2)
>  2. Improved support for extended PDBx/mmCIF structure factor files (2)
>  3. Structural Biology Cryo-EM TT Asst. Professor at University of Nebraska
> 
> 
> 
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1
> 
> This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing 
> list hosted by www.jiscmail.ac.uk, terms & conditions are available at 
> https://www.jiscmail.ac.uk/policyandsecurity/
> 
> --
> 
> Date:Mon, 17 Jan 2022 09:39:33 +0100
> From:Jan Dohnalek 
> Subject: Re: Validation of structure prediction
> 
> I think quite a bit of this "inconsistency" with protein structures comes
> from the fact that with our larger globules it is much more true that our
> model is an approximate time and space average of something that could have
> the ideal geometry.
> I.e. the way we are trying to represent the density is actually not that
> appropriate. The only "improvement" to this I think is the multiple model
> approach.
> 
> My 2 c.
> 
> Jan
> 
> 
> On Sat, Jan 15, 2022 at 9:29 PM James Holton  wrote:
> 
>> 
>> On 1/13/2022 11:14 AM, Tristan Croll wrote:
>> 
>> (please don’t actually do this)
>> 
>> 
>> Too late!  I've been doing that for years.  What happens, of course, is
>> the "geometry" improves, but the R factors go through the roof.  This I
>> expect comes as no surprise to anyone who has played with the "weight"
>> parameters in refinement, but maybe it should?  What is it about our
>> knowledge of chemical bond lengths, angles, and radii that is inconsistent
>> with the electron density of macromolecules, but not small molecules?  Why
>> do macro-models have a burning desire to leap away from the configuration
>> we know they adopt in reality?  If you zoom in on those "bad clashes"
>> individually, they don't look like something that is supposed to happen.
>> There is a LOT of energy stored up in those little springs.  I have a hard
>> time thinking that's for real. The molecule is no doubt doing something
>> else and we're just not capturing it properly.  There is information to be
>> had here, a lot of information.
>> 
>> This is why I too am looking for an all-encompassing "geometry score".
>> Right now I'm multiplying other scores together:
>> 
>> score = (1+Clashscore)*sin(worst_omega)*1./(1+worst_rama)*1/(1+worst_rota)
>>         *Cbetadev*worst_nonbond*worst_bond*worst_angle*worst_dihedral*worst_chir*worst_plane
>> 
>> where, for example, worst_rama is the "%score" given to the worst
>> Ramachandran angle by phenix.ramalyze, and worst_bond is the largest
>> "residual" reported among all the bonds in the structure by molprobity or
>> phenix.geometry_minimization.  For "worst_nonbond" I'm plugging the
>> observed and ideal distances into a Lennard-Jones 6-12 potential to convert
>> it into an "energy" that is always positive.
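
A minimal Python sketch of the product score described above. The argument names follow the formula; everything else is an assumption of this example: worst_omega is taken to be in degrees, the Lennard-Jones energy is shifted by the well depth so that it cannot go negative, and the function names are invented. It is not James Holton's actual script.

```python
import math


def lj_energy(observed, ideal, well_depth=1.0):
    """6-12 Lennard-Jones energy, shifted up by well_depth so it is never negative.

    With ratio = ideal / observed this equals well_depth * (ratio**6 - 1)**2,
    which is zero at the ideal distance and grows steeply for clashes.
    """
    ratio = ideal / observed
    return well_depth * (ratio ** 12 - 2.0 * ratio ** 6 + 1.0)


def geometry_score(clashscore, worst_omega, worst_rama, worst_rota, cbetadev,
                   worst_nonbond, worst_bond, worst_angle, worst_dihedral,
                   worst_chir, worst_plane):
    """Multiply the individual terms together as in the formula above.

    worst_omega is assumed to be given in degrees (hypothetical convention).
    """
    return ((1.0 + clashscore)
            * math.sin(math.radians(worst_omega))
            / (1.0 + worst_rama)
            / (1.0 + worst_rota)
            * cbetadev * worst_nonbond * worst_bond * worst_angle
            * worst_dihedral * worst_chir * worst_plane)
```

Because the terms are multiplied, a single term that is exactly zero (for example a shifted Lennard-Jones energy at the ideal distance) would zero the whole score, so in practice each factor would need a small floor.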
>> 
>> With x-ray data in hand, I've been multiplying this whole thing by Rwork
>> and trying to find clever ways to minimize the product.  Rfree is then, as
>> always, the cross-check.
>> 
>> Or does someone have a better idea?
>> 
>> -James Holton
>> MAD Scientist
>> 
>> 
>> On 1/13/2022 11:14 AM, Tristan Croll wrote:
>> 
>> Hard but not impossible - even when you *are* fitting to low-res density.
>> See https://twitter.com/crolltristan/status/1381258326223290373?s=21 for
>> example - no Ramachandran outliers, 1.3% sidechain outliers, clashscore of
>> 2... yet multiple regions out of register by anywhere up to 15 residues! I
>> never publicly named the structure (although I did share my rebuilt model
>> with the authors), but the videos and images in that thread should be
>> enough to illustrate the scale of the problem.
>> 
>> And that was *with* a map to fit! Take away the map, and run some MD
>> energy minimisation (perhaps with added Ramachandran and rotamer
>> restraints), and I think it would be easy to get your model to fool most
>> “simple” validation metrics (please don’t actually do this). The upshot is
>> that I still think validation of predicted models in the absence of at
>> least moderate-resolution experimental data is still a major challenge
>> requiring very careful 

Re: [ccp4bb] visual mask editor - why

2020-05-30 Thread dusan turk
 Eleanor Dodson 
> Subject: Re: visual mask editor - why
> 
> Won't mapmask do that?  Find the fractional coordinates of the blob extent, then
> carve that section out of the map?
> Eleanor
> From documentation:
> 
> *XYZLIM [ASU] [CELL] [MATCH]: Set the output
> map extent as `extend'. The limits are given in grid units or in fractional
> coordinates. It is possible to automatically extend to the CCP4 default
> asymmetric unit, or a whole unit cell, by specifying `XYZLIM ASU' or
> `XYZLIM CELL'. It is also possible to extend the map to match another map
> (given as MAPLIM) by specifying `XYZLIM MATCH'. The default is to keep the
> extent of the input map.*
> 
> On Fri, 29 May 2020 at 08:03, Pavel Afonine  wrote:
> 
>> Hi Bernhard,
>> 
>> "Like comparing these map regions, excluding
>> 
>> intrusion of a solvent mask, etc.":
>> 
>> You didn't say much about the context.. So I'd say Polder map approach
>> comes to mind first based on these keywords. Next is "map comparison" (
>> https://doi.org/10.1107/S1399004714016289).
>> 
>> If none of the above: what we (or you) are missing?
>> 
>> Pavel
>> 
>> On Thu, May 28, 2020 at 11:17 AM Bernhard Rupp 
>> wrote:
>> 
>>> Maybe I should explain an example: Say coot detects an unmodelled blob
>>> (maybe a ligand). Now, I would like to do a number of things without
>>> biasing towards a model. Like comparing these map regions, excluding
>>> intrusion of a solvent mask, etc.
>>> 
>>> Now could coot for example just generate a mask around what it already
>>> knows are blobs? Possible useful items could be a solvent mask not
>>> including that region, or a density map that includes only features
>>> with a certain boundary around that blob.
>>> 
>>> I pilfered some kludges together from different sources, but let’s just
>>> say inelegant would be a compliment.
>>> 
>>> Best, BR
>>> 
>>> Brief question: Does something like a visual density mask editor exist?
>>> 
>>> Thx, BR
>>> 
>>> --
>>> 
>>> Bernhard Rupp
>>> 
>>> http://www.hofkristallamt.org/
>>> 
>>> b...@hofkristallamt.org
>>> 
>>> +1 925 209 7429
>>> 
>>> +43 676 571 0536
>>> 
>>> --
>>> 
>>> Many plausible ideas vanish
>>> 
>>> at the presence of thought
>>> 
>>> --
>>> 
>>> 
>>> 



Dr. Dusan Turk, Prof.
Head of Structural Biology Group http://stef.ijs.si/ 
Head of Centre for Protein  and Structure Production
Centre of excellence for Integrated Approaches in Chemistry and Biology of 
Proteins, Scientific Director
http://www.cipkebip.org/
e-mail: dusan.t...@ijs.si
phone: +386 1 477 3857   Dept. of Biochem.& Mol.& Struct. Biology
fax:  +386 1 477 3984   Jozef Stefan Institute
Jamova 39, 1 000 Ljubljana,Slovenia
Skype: dusan.turk (voice over internet: www.skype.com)





Re: [ccp4bb] [3dem] Which resolution?

2020-03-07 Thread dusan turk
James,

> On 7 Mar 2020, at 21:01, James Holton  wrote:
> 
> Yes, that's right.  Model B factors are fit to the data.  That Boverall gets 
> added to all atomic B factors in the model before the structure is written 
> out, yes?

Almost true. It depends on how the programs are written. In MAIN this is not 
necessary. 

> The best estimate we have of the "true" B factor is the model B factors we 
> get at the end of refinement, once everything is converged, after we have 
> done all the building we can.  It is this "true B factor" that is a property 
> of the data, not the model, and it has the relationship to resolution and map 
> appearance that I describe below.  Does that make sense?

This is how it almost always is. Sometimes the best fit is achieved when the model 
Baverage is higher than the fit of Fcalc to Fobs would suggest. In such cases the 
difference is subtracted again during Fcalc to Fobs scaling. I did not 
investigate this any further, but maybe someone else has an idea or an already 
established solution.

best, dusan



> 
> -James Holton





Re: [ccp4bb] [3dem] Which resolution?

2020-03-07 Thread dusan turk
James,

The case you’ve chosen is not a good illustration of the relationship between 
atomic B and resolution.   The problem is that scaling Fcalc to Fobs also 
minimizes the B-factor difference between the two sets of numbers. In the 
simplest form, with two constants Koverall and Boverall, it looks like this:

sum_to_be_minimized = sum (FOBS**2 -  Koverall * FCALC**2 * exp(-1/d**2 * Boverall) )

Then one can include bulk solvent correction, anisotropic scaling, … In PHENIX 
it gets quite complex.

Hence, almost regardless of the average model B you will always get the same 
map, because the “B" of the map will reflect the B of the FOBS.  When all 
atomic Bs are equal then they are also equal to average B.
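
To make the point above concrete, here is a minimal Python sketch of the two-constant scaling. It is an illustration only, not how MAIN or PHENIX do it: it assumes squared residuals and fits Koverall and Boverall by a log-linear least-squares line rather than minimizing the sum directly, and the function names and synthetic numbers are made up for the example. The exponent follows the form written above, exp(-Boverall/d**2); other programs use a different constant factor in the exponent.

```python
import math


def fit_overall_scale(d_spacings, fobs_sq, fcalc_sq):
    """Fit Koverall, Boverall in FOBS**2 ~ Koverall * FCALC**2 * exp(-Boverall / d**2).

    Taking logs gives a straight line, ln(FOBS**2 / FCALC**2) = ln(K) - B / d**2,
    which is fitted by ordinary least squares (an assumption of this sketch).
    """
    xs = [1.0 / (d * d) for d in d_spacings]
    ys = [math.log(fo / fc) for fo, fc in zip(fobs_sq, fcalc_sq)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def apply_scale(d_spacings, fcalc_sq, k_overall, b_overall):
    """Return Koverall * FCALC**2 * exp(-Boverall / d**2) for every reflection."""
    return [k_overall * fc * math.exp(-b_overall / (d * d))
            for d, fc in zip(d_spacings, fcalc_sq)]


# Synthetic, noise-free example: "observed" data generated with K = 2.5, B = 12.
d_spacings = [4.0, 3.0, 2.5, 2.0, 1.8, 1.5]
fcalc_sq = [900.0, 640.0, 410.0, 250.0, 130.0, 60.0]
fobs_sq = apply_scale(d_spacings, fcalc_sq, 2.5, 12.0)

k_fit, b_fit = fit_overall_scale(d_spacings, fobs_sq, fcalc_sq)

# Raise every model B uniformly: in this convention FCALC**2 gains exp(-5 / d**2).
fcalc_sq_shifted = apply_scale(d_spacings, fcalc_sq, 1.0, 5.0)
k_shift, b_shift = fit_overall_scale(d_spacings, fobs_sq, fcalc_sq_shifted)
```

In this noise-free example the fitted Boverall drops by the same amount by which the model B was raised, so the scaled Fcalc, and hence the map, is the same either way.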

best, dusan


> On 7 Mar 2020, at 01:01, CCP4BB automatic digest system 
>  wrote:
> 
>> On Thu, 5 Mar 2020 01:11:33 +0100, James Holton  wrote:
>> 
>>> The funny thing is, although we generally regard resolution as a primary
>>> indicator of data quality the appearance of a density map at the classic
>>> "1-sigma" contour has very little to do with resolution, and everything
>>> to do with the B factor.
>>> 
>>> Seriously, try it. Take any structure you like, set all the B factors to
>>> 30 with PDBSET, calculate a map with SFALL or phenix.fmodel and have a
>>> look at the density of tyrosine (Tyr) side chains.  Even if you
>>> calculate structure factors all the way out to 1.0 A the holes in the
>>> Tyr rings look exactly the same: just barely starting to form.  This is
>>> because the structure factors from atoms with B=30 are essentially zero
>>> out at 1.0 A, and adding zeroes does not change the map.  You can adjust
>>> the contour level, of course, and solvent content will have some effect
>>> on where the "1-sigma" contour lies, but generally B=30 is the point
>>> where Tyr side chains start to form their holes.  Traditionally, this is
>>> attributed to 1.8A resolution, but it is really at B=30.  The point
>>> where waters first start to poke out above the 1-sigma contour is at
>>> B=60, despite being generally attributed to d=2.7A.
>>> 
>>> Now, of course, if you cut off this B=30 data at 3.5A then the Tyr side
>>> chains become blobs, but that is equivalent to collecting data with the
>>> detector way too far away and losing your high-resolution spots off the
>>> edges.  I have seen a few people do that, but not usually for a
>>> published structure.  Most people fight very hard for those faint,
>>> barely-existing high-angle spots.  But why do we do that if the map is
>>> going to look the same anyway?  The reason is because resolution and B
>>> factors are linked.
>>> 
>>> Resolution is about separation vs width, and the width of the density
>>> peak from any atom is set by its B factor.  Yes, atoms have an intrinsic
>>> width, but it is very quickly washed out by even modest B factors (B >
>>> 10).  This is true for both x-ray and electron form factors. To a very
>>> good approximation, the FWHM of C, N and O atoms is given by:
>>> FWHM= sqrt(B*log(2))/pi+0.15
>>> 
>>> where "B" is the B factor assigned to the atom and the 0.15 fudge factor
>>> accounts for its intrinsic width when B=0.  Now that we know the peak
>>> width, we can start to ask if two peaks are "resolved".
>>> 
>>> Start with the classical definition of "resolution" (call it after Airy,
>>> Rayleigh, Dawes, or whatever famous person you like), but essentially you
>>> are asking the question: "how close can two peaks be before they merge
>>> into one peak?".  For Gaussian peaks this is 0.849*FWHM. Simple enough.
>>> However, when you look at the density of two atoms this far apart you
>>> will see the peak is highly oblong. Yes, the density has one maximum,
>>> but there are clearly two atoms in there.  It is also pretty obvious the
>>> long axis of the peak is the line between the two atoms, and if you fit
>>> two round atoms into this peak you recover the distance between them
>>> quite accurately.  Are they really not "resolved" if it is so clear
>>> where they are?
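
The two formulas quoted above are easy to tabulate. A small Python sketch, assuming log means the natural logarithm (which is what a Gaussian atom with B factor B gives) and with function names invented for this example:

```python
import math


def atom_fwhm(b_factor):
    """FWHM (in A) of a C, N or O density peak, using the approximation quoted above."""
    return math.sqrt(b_factor * math.log(2)) / math.pi + 0.15


def merge_distance(b_factor):
    """Separation (in A) below which two equal Gaussian peaks show a single maximum."""
    return 0.849 * atom_fwhm(b_factor)


# Tabulate a few B factors mentioned in the discussion.
for b in (10, 30, 60):
    print(f"B={b:3d}  FWHM={atom_fwhm(b):.2f} A  "
          f"merge distance={merge_distance(b):.2f} A")
```

The 0.15 A offset is the fudge factor from the text, so atom_fwhm(0) returns exactly that intrinsic width.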
>>> 
>>> In such cases you usually want to sharpen, as that will make the oblong
>>> blob turn into two resolved peaks.  Sharpening reduces the B factor and
>>> therefore FWHM of every atom, making the "resolution" (0.849*FWHM) a
>>> shorter distance.  So, we have improved resolution with sharpening!  Why
>>> don't we always do this?  Well, the reason is because of noise.
>>> Sharpening up-weights the noise of high-order Fourier terms and
>>> therefore degrades the overall signal-to-noise (SNR) of the map.  This
>>> is what I believe Colin would call reduced "contrast".  Of course, since
>>> we view maps with a threshold (aka contour) a map with SNR=5 will look
>>> almost identical to a map with SNR=500. The "noise floor" is generally
>>> well below the 1-sigma threshold, or even the 0-sigma threshold
>>> (https://doi.org/10.1073/pnas.1302823110).  As you turn up the
>>> sharpening you will see blobs split apart and also see new peaks rising
>>> 

Re: [ccp4bb] CCP4BB Digest - 27 Feb 2020 to 28 Feb 2020 (#2020-61)

2020-03-01 Thread dusan turk
Hi again,

First, I wish to thank everyone for their responses. I hope that no one minds 
if I include them in my response letter to the Editor?

The idea of citing the Karplus and Diederichs paper in Science was essentially 
used up already in our first response letter, which accompanied the submission 
of the revised manuscript, and it did not work. (Instead of the Science 
citation from 2012 we used a quote from the paper suggested below: Karplus and 
Diederichs, Curr Opin Struct Biol. 2015 Oct; 34: 60–68.)

As Phoebe mentioned, it would be good to use the momentum. My intention in 
writing to the bulletin board was not only to get help with argumentation, but 
also to raise the issue of how to address resolution cutoffs in 
crystallographic data in publications, to avoid such situations in general. I 
also think that the issue is more complicated than the last shell criterion 
with I/Isig > xx, Rmerge < yy, or cc1/2 > zz, or ...

1. The number of reflections in a shell affects the numbers significantly. The 
more shells there are, the larger the numbers will be.  The number of 
reflections in a shell also depends on the highest resolution and the unit cell 
size. Hence potential criteria depend on several parameters. 

2. Refinement and data processing programs use different numbers of shells and 
even different ways of calculating shells. REFMAC typically uses 20 shells of 
equal volume. PHENIX is more complicated, as far as I understand: the number 
of shells may depend on the number of TEST set reflections in an individual 
shell, and shells can be sliced by equal volume, by some kind of log 
dependency, or even linearly in real-space resolution. In MAIN there are 20 
shells by default, but one can choose any of the mentioned slicing rules.
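
Point 1 can be made concrete with a little arithmetic. The Python sketch below computes equal-volume shell boundaries (equal steps in 1/d**3) and counts reciprocal lattice points per shell from the unit cell volume. It ignores symmetry, Friedel pairs and systematic absences, and the function names and example numbers are made up for illustration; it is not how REFMAC, PHENIX or MAIN bin their data.

```python
import math


def equal_volume_shell_edges(d_max, d_min, n_shells=20):
    """Resolution limits (in A) of n_shells shells with equal reciprocal-space volume."""
    s3_low = 1.0 / d_max ** 3
    s3_high = 1.0 / d_min ** 3
    step = (s3_high - s3_low) / n_shells
    return [(s3_low + i * step) ** (-1.0 / 3.0) for i in range(n_shells + 1)]


def lattice_points_per_shell(cell_volume, d_max, d_min, n_shells=20):
    """Approximate number of reciprocal lattice points in each equal-volume shell.

    A sphere of radius 1/d in reciprocal space holds about
    (4/3) * pi * cell_volume / d**3 lattice points (no symmetry applied).
    """
    total = 4.0 / 3.0 * math.pi * cell_volume * (1.0 / d_min ** 3 - 1.0 / d_max ** 3)
    return total / n_shells


# Hypothetical example: a 500000 A^3 cell, data from 50 A to 2.0 A or 1.5 A.
edges = equal_volume_shell_edges(50.0, 2.0)
print("outermost shell:", round(edges[-2], 3), "-", round(edges[-1], 3), "A")
print("points per shell at 2.0 A:", round(lattice_points_per_shell(5.0e5, 50.0, 2.0)))
print("points per shell at 1.5 A:", round(lattice_points_per_shell(5.0e5, 50.0, 1.5)))
```

With a fixed number of shells, the count per shell scales linearly with cell volume and roughly with 1/d_min**3, so a per-shell statistic is computed from very different sample sizes in different projects.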

I suggest that we use this discussion to shape guidelines that can later be 
proposed for consideration to the IUCr committee for macromolecules.  I prefer 
soft as opposed to strict borders.  In the end, the structures do not speak for 
themselves, but are a means to support one or more biologically relevant 
conclusions.


best wishes,
dusan turk


> On 29 Feb 2020, at 01:00, CCP4BB automatic digest system 
>  wrote:
> 
> 
> Date:Fri, 28 Feb 2020 16:03:22 +
> From:"Phoebe A. Rice" 
> Subject: Re: What resolution - X-ray diffraction round this time
> 
> Can we get some momentum for the "standard table 1" including TWO numbers - 
> outer limit used in refinement, and nominal resolution based on some standard 
> such as I/sigI =2 (or 3, or whatever the community can agree on)? That would 
> hopefully cut down on all the reviewer complaints of overstated resolution.
> 
> ~~~
> Phoebe A. Rice
> Dept. of Biochem & Mol. Biol. and
>  Committee on Microbiology
> https://voices.uchicago.edu/phoebericelab/
> 
> 
> On 2/28/20, 6:56 AM, "CCP4 bulletin board on behalf of Malý Martin" 
>  wrote:
> 
>Dear colleagues,
> 
>I agree with all the previous responses, it is a pity to throw away
>useful high-resolution data. The problem of high-resolution cutoff
>estimation is also nicely summarized in another paper by Andrew Karplus
>and Kay Diederichs "Assessing and maximizing data quality in
>macromolecular crystallography"
>https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4684713/ . It is suggested
>using CC1/2 for the selection of the cutoff for data processing (not
>I/sigI or R_whatever). Later on, the decision should be validated
>performing the paired refinement protocol.
> 
>Good luck with the argumentation.
>Martin
> 
> 
>On 2/28/20 11:08 AM, LMB wrote:
>> Ask the referee - (apart from the other suggestions here)
>> 
>> ‘How would removing data Improve my model?”
>> 
>> Sent from my iPad
>> 
>>> On 28 Feb 2020, at 08:22, dusan turk  wrote:
>>> 
>>> Hi,
>>> 
>>> Browsing through the recent discussion on EM data resolution cutoff
>>> it occurred to me that the X-ray diffraction community isn’t that
>>> unanimous either.
>>> 
>>> My stand:
>>> 
>>> When the default resolution cutoff provided with the data processing
>>> software in electron density map calculation and refinement delivers
>>> quality maps noisier than expected and/or too high R-factors I start
>>> adjusting the resolution cutoff by lowering the resolution and trying
>>> alternative space group. Hence, I allow the data processing programs
>>> to suggest where to draw the line (be it CC1/2, I/sigI, R merge, R
>>> sym, R p.i.m. and R r.i.m, …) , unless there are problems.
>>> 
>>> Doing so, 

[ccp4bb] What resolution - X-ray diffraction round this time

2020-02-28 Thread dusan turk
Hi,

Browsing through the recent discussion on EM data resolution cutoff it occurred 
to me that the X-ray diffraction community isn’t that unanimous either.

My stand: 

When the default resolution cutoff provided by the data processing software 
delivers, in electron density map calculation and refinement, maps noisier 
than expected and/or R-factors that are too high, I start adjusting the 
resolution cutoff by lowering the resolution and trying alternative space 
groups.   Hence, I allow the data processing programs to suggest where to draw 
the line (be it CC1/2, I/sigI, R merge, R sym, R p.i.m. and R r.i.m., …), 
unless there are problems. 

Doing so, I came into a dispute with a referee who phrased his request as follows: 

“It is well accepted that the criteria for resolution cutoff should consider 
both I/SigI and Rmerge for the outer most shell. For data sets collected at 
synchrotron sources, the criteria of I/SigI > 5 and Rmerge <50% can be taken as 
a good practical reference.”

So where do we stand? Which are the most objective criteria for resolution 
cutoff to be used in diffraction data processing? Which number of shells to use 
when calculating the statistics? Do we have a consensus?

best wishes,

dusan turk





Re: [ccp4bb] Does ncs bias R-free? And if so, can it be avoided by special selection of the free set?

2019-06-03 Thread dusan turk
n the normal
>> R-free and the R-free in shells, with up to 20-fold NCS present. I can't
>> comment on twinning, but with NCS it would seem that the normal CCP4 way of
>> picking the R-free set is as good as anything else!
>> On Sunday, 26 May 2019, 14:02:50 BST, dusan turk 
>> wrote:
>> 
>> 
>> Dear colleagues,
>> 
>> 
>>> Does ncs bias R-free? And if so, can it be avoided by special selection
>> of
>>the free set?
>> 
>> It occurs to me that we tend to forget that the objective of structure
>> determination is not the model with the lowest model bias, but the model
>> which is closest to the true structure. The structure without model bias is
>> the structure without a model - which is not really helpful.
>> 
>> An angle on the NCS issue is provided by the work of Silva & Rossmann
>> (1985, Acta Cryst B41, 147-157), who discarded most of the data almost
>> proportionally to the level of NCS redundancy (using 1/7th for WORK set and
>> 6/7 for TEST set in the case of 10-fold NCS). They did it in the 1980s in order
>> to make refinement of their large structure computationally feasible:
>> “Despite the reduction in the number of variables imposed by the
>> non-crystallographic constraints, the problem remained a formidable one if
>> all 298615 crystallographically independent reflections were to be used in
>> the refinement. However, the reduction of size of the asymmetric unit in
>> real space should be equivalent to a corresponding reduction in reciprocal
>> space. Hence, one-tenth of refinement of the independent data might suffice
>> for refinement.” In conclusion they stated that “This is the first time
>> that the structure of a complete virus has been refined by a
>> reciprocal-space method.” To conclude, to select an independent data set to
>> refine against, one should take an n-th fraction of reflections from the
>> data set containing the n-fold NCS.
>> 
>> Now on the bias of the concept of R-free itself. As we know, each term in
>> the Fourier series is orthogonal to all other terms, hence the projection
>> of any two terms on each other is zero. We also know that the diffraction
>> pattern of a crystal structure is composed of Iobs, which reflect Fobs. Fobs
>> are a Fourier series of terms. From the measured set of Iobs we can directly
>> calculate |Fobs|, but not their phases. To calculate the phases in refinement
>> we use Fmodel structure factors, of which the most significant part are
>> Fcalc calculated from the atomic model. However, whenever the model is changed
>> during model building and refinement (atomic positions, B-factors and
>> occupancies), all Fmodel structure factors change in size and in phase
>> angle.
>> 
>> During refinement using a cross-validated maximum likelihood target
>> function, the atomic model is fitted against the selected subset of |Fobs|,
>> called the WORK set, using the corresponding subset of Fmodel. The remaining
>> part of the Fmodel structure factors, called the TEST set, is used to calculate
>> the weighted terms used in refinement, based on phase error
>> estimates. This Fmodel fraction equally depends on attributes of all atoms
>> of the model. As a consequence, the TEST fraction of Fmodel structure factors
>> is model dependent. Now comes the catch: if the TEST fraction of structure
>> factors (Fobs) was truly independent from the model, then it should remain
>> so also during the refinement. As a consequence and simultaneous proof of
>> this independence, the R-free should not be affected by refinement. As we
>> know, this holds only for incorrect structure solutions. Their atoms are
>> refined in directions that do not lead towards the true structure. As soon
>> as a structure solution is correct, its improvements will lower R-free
>> because the model is related to the true crystal structure. This is in my
>> opinion the only true value of the R-free gap criterion. The problems are
>> that use of the WORK subset makes refinement aim off the true target and
>> that the use of the TEST fraction for estimating phase error correctness is an
>> approximation not justified by the claim of independence of the TEST set. I
>> do not want to undermine the historical importance of the TEST set use for
>> refinement and structure validation, however we need and can do better.
>> 
>> As shown by Silva & Rossmann in 1985, the concept of independence of a TEST
>> subset fraction of Fobs structure factors is not true for structures
>> composed of equal copies of molecules present in the asymmetric unit of a
>> crystal (crystals with NCS). The same reasoni

Re: [ccp4bb] CCP4BB Digest - 23 Mar 2017 to 24 Mar 2017 (#2017-83)

2017-03-24 Thread Dusan Turk
Dear Alex,

try MAIN (http://www-bmb.ijs.si/).

It has a two-way interface to the Ramachandran plot: you can display 
“HIS_RAMA” of individual residues clicked in the displayed 3D model, and, the 
other way around, you can click on residues in the Ramachandran plot to center 
on them in the 3D model.

If you need some help on how this works, do not hesitate to contact me.

regards,
dusan
 
> On 25 Mar 2017, at 01:00, CCP4BB automatic digest system 
>  wrote:
> 
> Date:Thu, 23 Mar 2017 20:39:46 -0700
> From:Alex Lee 
> Subject: software or tool for Individual residue in a Ramachandran plot graph?
> 
> Dear All,
> 
> Is there a tool or software which can give Ramachandran information of
> individual residues in a plot?
> 
> I used Coot to check for Ramachandran plots, but it shows all the residues
> in a coordinate I put in Coot, not individual one. I also use "residue
> info" in coot, it tells Ramachandran "phi psi" angles of individual
> residue, but it does not show it in a plot, only numbers.
> 
> Thanks ahead for any input.
> 


Re: [ccp4bb] Calculation of RSRZ Score in PDB Validation Reports

2016-11-29 Thread dusan turk
electron density, which should in theory increase the RSRZ score.  While of
course every structure is different and the quality of it due to the rigor of
the person building the model, I was wondering if there were any general
trends related to resolution and RSRZ score.

Thanks,
Matt


--

Date:Mon, 28 Nov 2016 21:46:24 -0800
From:Ethan Merritt<merr...@u.washington.edu>
Subject: Re: Calculation of RSRZ Score in PDB Validation Reports

On Monday, 28 November 2016 08:35:44 PM Pavel Afonine wrote:

I find Lothar's comments regarding H and RSRZ excellent! I would think of
it as a pretty much bug report. I hope developers at that end listen. This
goes very well in line with Phoebe's comment earlier today.

I guess I'm a bit surprised that adding or subtracting hydrogens from the model
without re-refining or at least re-calculating Fc would affect RSRZ at all.
I had thought that RSRZ was obtained by comparing density in an Fc map
(or probably mFo-DFc) with the corresponding density in an Fo map.
I thought that the coordinates were used only to determine the per-residue
region of the map to be compared.

Going back to the 2004 Kleywegt paper that the PDB cites for calculation of
RSRZ I see that it's a bit ambiguous exactly what maps are being compared.
So maybe I'm wrong and the current coordinates are used directly to get
local "Fc density" by expanding 3D Gaussians without reference to a previously
calculated map from refined phases.

Can anyone clarify exactly what maps are being compared during wwPDB
validation?

Ethan



--
Dr. Dusan Turk, Prof.
Head of Structural Biology Group http://bio.ijs.si/sbl/
Head of Centre for Protein  and Structure Production
Centre of excellence for Integrated Approaches in Chemistry and Biology of 
Proteins, Scientific Director
http://www.cipkebip.org/
Professor of Structural Biology at IPS "Jozef Stefan"
e-mail: dusan.t...@ijs.si
phone: +386 1 477 3857   Dept. of Biochem.& Mol.& Struct. Biol.
fax:   +386 1 477 3984   Jozef Stefan Institute
Jamova 39, 1 000 Ljubljana,Slovenia
Skype: dusan.turk (voice over internet: www.skype.com)


Re: [ccp4bb] How many is too many free reflections?

2015-06-16 Thread dusan turk
 restraints in that 
 case) are used to reduce overfitting and obtain a more accurate structure. 
 Cross-validation made it possible to detect overfitting of the data when no 
 DEN restraints were used. We believe this should also apply when other types 
 of restraints are used (e.g., reference model restraints in phenix.refine, 
 REFMAC, or BUSTER).  
 
 In summary, we believe that cross-validation remains an important (and 
 conceptually simple) method to detect overfitting and for overall structure 
 validation.
 
 Axel
 
 Axel T. Brunger
 Professor and Chair, Department of Molecular and Cellular Physiology
 Investigator, HHMI
 Email: brun...@stanford.edu
 Phone: 650-736-1031
 Web: http://atbweb.stanford.edu
 
 Paul
 
 Paul Adams
 Deputy Division Director, Physical Biosciences Division, Lawrence Berkeley Lab
 Division Deputy for Biosciences, Advanced Light Source, Lawrence Berkeley Lab
 Adjunct Professor, Department of Bioengineering, U.C. Berkeley
 Vice President for Technology, the Joint BioEnergy Institute
 Laboratory Research Manager, ENIGMA Science Focus Area
 
 Tel: 1-510-486-4225, Fax: 1-510-486-5909
 
 http://cci.lbl.gov/paul
 On Jun 5, 2015, at 2:18 AM, Gerard Bricogne g...@globalphasing.com wrote:
 
 Dear Dusan,
 
 This is a nice paper and an interestingly different approach to
 avoiding bias and/or quantifying errors - and indeed there are all
 kinds of possibilities if you have a particular structure on which you
 are prepared to spend unlimited time and resources.
 
 The specific context in which Graeme's initial question led me to
 query instead "who should set the FreeR flags, at what stage and on
 what basis?" was that of the data analysis linked to high-throughput
 fragment screening, in which speed is of the essence at every step. 
 
 Creating FreeR flags afresh for each target-fragment complex
 dataset without any reference to those used in the refinement of the
 apo structure is by no means an irrecoverable error, but it will take
 extra computing time to let the refinement of the complex adjust to a
 new free set, starting from a model refined with the ignored one. It
 is in order to avoid the need for that extra time, or for a recourse
 to various debiasing methods, that the book-keeping faff described
 yesterday has been introduced. Operating without it is perfectly
 feasible, it is just likely to not be optimally direct.
 
 I will probably bow out here, before someone asks "How many
 [e-mails from me] is too many?" :-) .
 
 
 With best wishes,
 
  Gerard.
 
 --
 On Fri, Jun 05, 2015 at 09:14:18AM +0200, dusan turk wrote:
 Graeme,
 one more suggestion. You can avoid all the recipes by using all data for the WORK 
 set and 0 reflections for the TEST set, regardless of the amount of data, by 
 using the FREE KICK ML target. For an explanation see our recent paper: 
 Praznikar, J. & Turk, D. (2014) Free kick instead of cross-validation in 
 maximum-likelihood refinement of macromolecular crystal structures. Acta 
 Cryst. D70, 3124-3134. 
 
 You can find a link to the paper at 
 “http://www-bmb.ijs.si/doc/references.HTML”
 
 best,
 dusan
 
 
 
 On Jun 5, 2015, at 1:03 AM, CCP4BB automatic digest system 
 lists...@jiscmail.ac.uk wrote:
 
 Date:Thu, 4 Jun 2015 08:30:57 +
 From:Graeme Winter graeme.win...@gmail.com
 Subject: Re: How many is too many free reflections?
 
 Hi Folks,
 
 Many thanks for all of your comments - in keeping with the spirit of the BB
 I have digested the responses below. Interestingly I suspect that the
 responses to this question indicate the very wide range of resolution
 limits of the data people work with!
 
 Best wishes Graeme
 
 ===
 
 Proposal 1:
 
 10% reflections, max 2000
 
 Proposal 2: from wiki:
 
 http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Test_set
 
 including Randy Read recipe:
 
 So here's the recipe I would use, for what it's worth:
 1 reflections:set aside 10%
  1-2 reflections:  set aside 1000 reflections
  2-4 reflections:  set aside 5%
 4 reflections:set aside 2000 reflections
 
 Proposal 3:
 
 5% maximum 2-5k
 
 Proposal 4:
 
 3% minimum 1000
 
 Proposal 5:
 
 5-10% of reflections, minimum 1000
 
 Proposal 6:
 
 50 reflections per bin in order to get reliable ML parameter
 estimation, ideally around 150 / bin.
 
 Proposal 7:
 
 If lots of reflections (i.e. 800K unique) around 1% selected - 5% would be
 40k i.e. rather a lot. Referees question use of  5k reflections as test
 set.
 
 Comment 1 in response to this:
 
 Surely absolute # of test reflections is not relevant, percentage is.
 
 
 
 Approximate consensus (i.e. what I will look at doing in xia2) - probably
 follow Randy Read recipe from ccp4wiki as this seems to (probably) satisfy
 most of the criteria raised by everyone else.
 
 
 
 On Tue, Jun 2, 2015 at 11:26 AM Graeme Winter graeme.win...@gmail.com
 wrote:
 
 Hi Folks
 
 Had a vague comment handed my way

Re: [ccp4bb] How many is too many free reflections?

2015-06-05 Thread dusan turk
Graeme,
one more suggestion. You can avoid all the recipes by using all data for the 
WORK set and 0 reflections for the TEST set, regardless of the amount of data, 
with the FREE KICK ML target. For an explanation see our recent paper: 
Praznikar, J. & Turk, D. (2014). Free kick instead of cross-validation in 
maximum-likelihood refinement of macromolecular crystal structures. Acta 
Cryst. D70, 3124-3134. 

You can find a link to the paper at http://www-bmb.ijs.si/doc/references.HTML

best,
dusan

 

 On Jun 5, 2015, at 1:03 AM, CCP4BB automatic digest system 
 lists...@jiscmail.ac.uk wrote:
 
 Date:Thu, 4 Jun 2015 08:30:57 +
 From:Graeme Winter graeme.win...@gmail.com
 Subject: Re: How many is too many free reflections?
 
 Hi Folks,
 
 Many thanks for all of your comments - in keeping with the spirit of the BB
 I have digested the responses below. Interestingly I suspect that the
 responses to this question indicate the very wide range of resolution
 limits of the data people work with!
 
 Best wishes Graeme
 
 ===
 
 Proposal 1:
 
 10% reflections, max 2000
 
 Proposal 2: from wiki:
 
 http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Test_set
 
 including Randy Read recipe:
 
 So here's the recipe I would use, for what it's worth:
  <10000 reflections:        set aside 10%
   10000-20000 reflections:  set aside 1000 reflections
   20000-40000 reflections:  set aside 5%
  >40000 reflections:        set aside 2000 reflections
 
 Proposal 3:
 
 5% maximum 2-5k
 
 Proposal 4:
 
 3% minimum 1000
 
 Proposal 5:
 
 5-10% of reflections, minimum 1000
 
 Proposal 6:
 
 50 reflections per bin in order to get reliable ML parameter
 estimation, ideally around 150 / bin.
 
 Proposal 7:
 
 If lots of reflections (i.e. 800K unique) around 1% selected - 5% would be
 40k i.e. rather a lot. Referees question use of  5k reflections as test
 set.
 
 Comment 1 in response to this:
 
 Surely absolute # of test reflections is not relevant, percentage is.
 
 
 
 Approximate consensus (i.e. what I will look at doing in xia2) - probably
 follow Randy Read recipe from ccp4wiki as this seems to (probably) satisfy
 most of the criteria raised by everyone else.
 
 
 
 On Tue, Jun 2, 2015 at 11:26 AM Graeme Winter graeme.win...@gmail.com
 wrote:
 
 Hi Folks
 
 Had a vague comment handed my way that xia2 assigns too many free
 reflections - I have a feeling that by default it makes a free set of 5%
 which was OK back in the day (like I/sig(I) = 2 was OK) but maybe seems
 excessive now.
 
 This was particularly in the case of high resolution data where you have a
 lot of reflections, so 5% could be several thousand which would be more
 than you need to just check Rfree seems OK.
 
 Since I really don't know what is the right # reflections to assign to a
 free set thought I would ask here - what do you think? Essentially I need
 to assign a minimum %age or minimum # - the lower of the two presumably?
 
 Any comments welcome!
 
 Thanks & best wishes Graeme
 
 

Dr. Dusan Turk, Prof.
Head of Structural Biology Group http://bio.ijs.si/sbl/ 
Head of Centre for Protein  and Structure Production
Centre of excellence for Integrated Approaches in Chemistry and Biology of 
Proteins, Scientific Director
http://www.cipkebip.org/
Professor of Structural Biology at IPS Jozef Stefan
e-mail: dusan.t...@ijs.si
phone: +386 1 477 3857   Dept. of Biochem. Mol. Struct. Biol.
fax:   +386 1 477 3984   Jozef Stefan Institute
Jamova 39, 1 000 Ljubljana,Slovenia
Skype: dusan.turk (voice over internet: www.skype.com


[ccp4bb] crystal structures of enzymes with catalytic mutants

2015-04-13 Thread dusan turk
Hello,

I have a dispute with a grant referee about the scientific relevance of 
structures of catalytically inactive enzymes, including their complexes with 
substrates or analogues. I looked through the PDB, but keyword searches did 
not get me far. Could you please tell me 
- which structure of a catalytically inactive enzyme mutant was published 
first, and when
- which structure of a catalytically inactive enzyme in complex with a 
substrate, or part of one, as a ligand was published first, and when.

The help of the community would be highly appreciated,

best wishes,

dusan


Re: [ccp4bb] CCP4BB Digest - 1 Apr 2015 to 2 Apr 2015 (#2015-92)

2015-04-03 Thread dusan turk
Shane,

After you define in the MAIN environment which segments are related by proper 
or improper NCS, the NCS parameters are calculated and stored in macro files 
such as this one: 

$ cat mol_A_to_B.com 
! SAVING RMS FIT DATA: 
set matrix MAT_ROT number - 
  0.959598  -0.031620  -0.279592 - 
 -0.029443   0.976927  -0.211536 - 
  0.279830   0.211222   0.936526 
set vari XTRAN global real = 3.66 
set vari YTRAN global real =   105.12 
set vari ZTRAN global real =   -33.40 
return 

If you want to renormalize your matrices, you need to ensure not only that the 
length of each row or column is equal to 1, but also that they are orthogonal 
to each other. This is most easily achieved with cross products of the matrix 
rows (a, b, c): from a x b = c you first calculate c and then recalculate 
either a or b. I would not do it, however, because the limited precision has 
already distorted the transformation, and making the matrix orthogonal will 
further distort the accuracy of the superposition.
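
The cross-product route described above can be sketched in a few lines of Python. This is only an illustration applied to the matrix printed in the macro file, not what MAIN does internally; the function names are made up for the example.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def unit(u):
    """Scale a 3-vector to length 1."""
    n = math.sqrt(sum(x * x for x in u))
    return [x / n for x in u]


def reorthonormalize(m):
    """Keep the direction of row a, rebuild c = a x b, then b = c x a."""
    a = unit(m[0])
    c = unit(cross(a, m[1]))
    b = cross(c, a)          # already unit length, since c and a are orthonormal
    return [a, b, c]


MAT_ROT = [[0.959598, -0.031620, -0.279592],
           [-0.029443, 0.976927, -0.211536],
           [0.279830, 0.211222, 0.936526]]

if __name__ == "__main__":
    fixed = reorthonormalize(MAT_ROT)
    shift = max(abs(fixed[i][j] - MAT_ROT[i][j])
                for i in range(3) for j in range(3))
    print(f"largest change in any matrix element: {shift:.2e}")
```

The result is orthonormal by construction, but it is no longer the matrix that came out of the least-squares fit, which is the reason for the caution above.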

best,
dusan



 On Apr 3, 2015, at 1:00 AM, CCP4BB automatic digest system 
 lists...@jiscmail.ac.uk wrote:
 
 
 Date:Wed, 1 Apr 2015 20:47:25 -0400
 From:Shane Caldwell shane.caldwel...@gmail.com
 Subject: Re: Sortwater NCS Matrix input
 
 Alright, thanks! It's a good thing, then, I spent the afternoon brushing up
 on matrices.
 
 I guess the next, probably more general question for the bb is: which
 utilities export an NCS transformation matrix with more precision?
 *superpose* and *gesamt* only export three decimals, though I'm sure they
 use greater precision under the hood. I'm not opposed to exporting from
 coot or pymol either, I just haven't figured out how to do this yet - what
 would be the simplest way to calculate and export an NCS transformation
 matrix?
 
 Shane
 



Re: [ccp4bb] CCP4BB Digest - 25 Mar 2015 to 26 Mar 2015 (#2015-86)

2015-03-26 Thread dusan turk
Dear Herman,

I am not sure what you really want. Maybe this can help you.

You can get the following numbers out of MAIN:
1. The rms command compares a pair of selections. It makes sense only if the 
selections are equivalent.

 rms b-val  select my_selection_1 end select my_selection_2 end

2. The show command calculates statistics for a given selection. With the 
“bond” keyword it considers fluctuations between bonded atoms, whereas without 
it the selection is treated as one group. 

 show b-val bond select my_selection end
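
For readers without MAIN, here is a minimal sketch of one plausible definition of the r.m.s.d. ΔB over bonded atoms: the root mean square of the B-factor differences across all bonded pairs. The definition, the atom names and the B values are assumptions chosen for illustration; the sketch does not reproduce MAIN's output.

```python
import math

MAIN_CHAIN = {"N", "CA", "C", "O"}


def rms_delta_b(b_values, bonds):
    """Root mean square of B(i) - B(j) over the given bonded pairs."""
    squares = [(b_values[i] - b_values[j]) ** 2 for i, j in bonds]
    return math.sqrt(sum(squares) / len(squares))


# Hypothetical B-factors (A^2) for one residue and its covalent bonds.
b_values = {"N": 20.0, "CA": 22.0, "C": 21.0, "O": 24.0, "CB": 26.0}
bonds = [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB")]

# Split the bonds into the classes asked for in the referee's table.
mc_mc = [p for p in bonds if p[0] in MAIN_CHAIN and p[1] in MAIN_CHAIN]
mc_sc = [p for p in bonds if (p[0] in MAIN_CHAIN) != (p[1] in MAIN_CHAIN)]

if __name__ == "__main__":
    print(f"all bonds:               {rms_delta_b(b_values, bonds):.2f}")
    print(f"main chain - main chain: {rms_delta_b(b_values, mc_mc):.2f}")
    print(f"main chain - side chain: {rms_delta_b(b_values, mc_sc):.2f}")
```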

best regards,
dusan


 On Mar 27, 2015, at 1:00 AM, CCP4BB automatic digest system 
 lists...@jiscmail.ac.uk wrote:
 
 Date:Thu, 26 Mar 2015 11:32:43 +
 From:herman.schreu...@sanofi.com mailto:herman.schreu...@sanofi.com
 Subject: r.m.s.d. ΔB
 
 Dear Bulletin Board,
 
 A referee wants for the “Table 1” in the supplementary information the 
 following data:
 
 The r.m.s.d. ΔB (bonded atoms) (Å2)
 All protein atoms
 Main chain – Main chain
 Side chain – Side chain
 Main chain – Side chain
 
 r.m.s.d. ΔB (Non-bonded contacts) (Å2)
 All protein atoms
 
 Using Google I found out that some of these numbers could be calculated with 
 Moleman, although I am not sure to what extent this program is still 
 maintained.
 Older versions of Refmac would calculate r.m.s.d. ΔB’s for main chain and 
 side chain bonds, which I guess would be the “Main chain – Main chain” and 
 “Side chain – Side chain” values requested. However, what should I make of 
 the “Main chain – Side chain” values: differences between Calpha and Cbeta 
 atoms?
 
 What would be the use of these numbers? The standard CCP4 validation 
 programs, or any validation program I know, do not calculate these numbers, 
 so they do not seem to be extremely important. If somebody could point me to 
 a program which could calculate these number without too much effort, I would 
 be happy to do it.  Otherwise, I would still be willing to go the extra mile 
 if someone could convince me that it is useful to have these numbers.
 
 Thank you for your help!
 Herman
 


Re: [ccp4bb] CCP4BB Digest - 5 Feb 2015 to 6 Feb 2015 (#2015-40)

2015-02-06 Thread dusan turk
Fred,

as you know, a discontinuous lattice is not physically possible. Therefore 
first make sure to exclude errors, yours or in the deposition, such as a wrong 
space group or wrong cell constants. It may, however, be that some molecules 
are disordered and therefore absent from the structure solution. If none of 
these errors is present, it is possible that this is a case similar to the 
structure we published last year: 
Renko et al., 
Partial rotational lattice order–disorder in stefin B crystals. 
Acta Crystallogr D Biol Crystallogr. 2014 Apr 1; 70(Pt 4): 1015–1025.
Published online 2014 Mar 19. doi: 10.1107/S1399004714000091
PMCID: PMC3975888

best regards,
dusan



 On Feb 7, 2015, at 1:00 AM, CCP4BB automatic digest system 
 lists...@jiscmail.ac.uk wrote:
 
 
 Date:Fri, 6 Feb 2015 11:58:45 +0100
 From:Kerff Fred fke...@ulg.ac.be
 Subject: Absence of contact between layers in a crystal
 
 Hello,
 
 Looking at structure 2HR0 (« The structure of complement C3b provides insights 
 into complement activation and regulation », Abdul Ajees, A.,  Gunasekaran, 
 K.,  Volanakis, J.E.,  Narayana, S.V.,  Kotwal, G.J.,  Krishna Murthy, H.M.;  
 (2006) Nature 444: 221-225), I noticed the absence of contacts between layers 
 in the crystal. Is this something that has already been observed in other 
 crystals?
 
 Best regards,
 
 Fred
 -
 Frédéric Kerff
 Chercheur qualifié F.R.S.-FNRS
 Cristallographie des protéines
 Centre d'Ingénierie des Protéines
 Université de Liège



[ccp4bb] Cross-validation when test set is miniscule

2014-12-19 Thread dusan turk
Dear Derek,

I suggest you do not use cross-validation at all. With small data sets, 
refinement with cross-validation is very unstable and depends on the choice of 
the TEST set. We explained why and suggested an alternative target function, 
which can use all data in refinement.

Praznikar, J. & Turk, D. (2014). Free kick instead of cross-validation in 
maximum-likelihood refinement of macromolecular crystal structures. 
Acta Cryst. D70, 3124-3134. Online 22 November 2014.
doi:10.1107/S1399004714021336 (http://dx.doi.org/10.1107/S1399004714021336)

Synopsis: The maximum-likelihood free-kick target, which calculates model error 
estimates from the work set and a randomly displaced model, proved superior in 
the accuracy and consistency of refinement of crystal structures compared with 
the maximum-likelihood cross-validation target, which calculates error 
estimates from the test set and the unperturbed model.

best regards,
dusan


 On Dec 20, 2014, at 1:05 AM, CCP4BB automatic digest system 
 lists...@jiscmail.ac.uk wrote:
 
 Date:Fri, 19 Dec 2014 11:18:37 +
 From:Derek Logan derek.lo...@biochemistry.lu.se
 Subject: Cross-validation when test set is miniscule
 
 Hi everyone,
 
 Right now we have one of those very difficult Rfree situations where it's 
 impossible to generate a single meaningful Rfree set. Since we're in a bit of 
 a hurry with this structure it would be good if someone could point me in the 
 right direction. We have crystals with 1542 non-H atoms in the asymmetric 
 unit that diffract to only 3.6 Å in P65, which gives us a whopping 2300 
 reflections in total. 5% of this is only about 100 reflections. Luckily the 
 protein is only a single point mutation of a wild type that has been solved 
 to much better resolution, so we know what it should look like and I simply 
 want to investigate the effect of different levels of conservatism in the 
 refinement, e.g. NCS in xyz and B, group B-factors, reference model, 
 Ramachandran restraints etc. However since the quality criterion for this is 
 Rfree I'm not able to do this.
 
 I believe the correct approach is k-fold statistical cross-validation, but 
 can someone remind me of the correct way to do this? I've done a bit of 
 Googling without finding anything very helpful.
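
For reference, the bookkeeping behind k-fold cross-validation can be sketched as follows: the reflections are split into k disjoint test sets, so that every reflection is left out of refinement exactly once. This is a generic illustration with made-up names, not a recipe from any refinement program; the refinements themselves, one per fold, are not shown.

```python
import random


def kfold_test_sets(n_reflections, k, seed=0):
    """Split reflection indices 0..n-1 into k disjoint, near-equal test sets."""
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)          # reproducible random order
    return [sorted(indices[i::k]) for i in range(k)]


if __name__ == "__main__":
    folds = kfold_test_sets(2300, 20)
    print(f"{len(folds)} test sets of about {len(folds[0])} reflections each")
    # Each fold in turn would serve as the TEST set while the other
    # reflections form the WORK set; the free R values from the k
    # refinements are then combined.
```

With 2300 reflections and k = 20, each test set holds 115 reflections (5%), and all reflections contribute to the combined free statistics.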
 
 Thanks
 Derek
 
 Derek Logan                                    tel: +46 46 222 1443
 Associate Professor                            mob: +46 76 8585 707
 Dept. of Biochemistry and Structural Biology   www.cmps.lu.se
 Centre for Molecular Protein Science           www.maxlab.lu.se/crystal
 Lund University, Box 124, 221 00 Lund, Sweden  www.saromics.com


Re: [ccp4bb] Free Reflections as Percent and not a Number

2014-11-26 Thread dusan turk
Hello guys,

There is too much text in this discussion to respond to every part of it. 
Besides “jiggle”, certain software such as PHENIX and, I believe, the X-PLOR 
derivatives use the word “shake” for the same thing. In the “MAIN” environment 
I use the word “kick” for randomly distorting coordinates. Its first use, 
introduced in the early 90’s, was to improve the convergence of model 
refinement and minimization. I have seen it as a substitute for molecular 
dynamics under real or reciprocal space crystallographic restraints (we call 
this simulated annealing or slow cooling), as it is computationally much 
faster. The procedure in MAIN is called “fast cooling” because the atoms move 
only under the potential energy terms, with no kinetic energy present. The 
“fast cooled” structure is thus frozen: taken from a high energy state to the 
state with the lowest reachable potential energy. In order to reach the lowest 
possible point in the potential energy landscape, the kick at the beginning of 
each cooling cycle is lowered. The initial coordinate kick size is typically 
0.8 A and drops each cycle, down to 0. Experience shows that values beyond 0.8 
may not lead to recovery of a chemically reasonable structure in every part of 
it. Towards the end of refinement the starting kick is typically reduced to 
0.05. Apart from coordinates, B-factors can also be kicked. 
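
As a rough illustration of the idea, and not of the MAIN implementation, a kick with a shrinking size could be sketched like this. The uniform displacement per coordinate and the linear schedule are assumptions made for the example; the energy minimization that follows each kick is only indicated by a comment.

```python
import random


def kick(coords, size, rng):
    """Shift every x, y, z by a uniform random amount within +/- size."""
    return [tuple(x + rng.uniform(-size, size) for x in xyz) for xyz in coords]


def kick_schedule(start, n_cycles):
    """Kick sizes falling linearly from `start` to 0 over n_cycles cycles."""
    if n_cycles == 1:
        return [start]
    return [start * (1 - i / (n_cycles - 1)) for i in range(n_cycles)]


if __name__ == "__main__":
    rng = random.Random(42)
    coords = [(-31.334, -24.247, 23.250), (-29.650, -24.643, 23.839)]
    for size in kick_schedule(0.8, 5):
        coords = kick(coords, size, rng)
        # ... minimization against the refinement target would go here ...
        print(f"kick size {size:.2f} A applied")
```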

Are the structures after “kick” cooling refinement the same as without the 
kick? My experience of over two decades shows that kicking improves the 
convergence of refinement. The resulting structures can thus differ, as the 
repeated cooling cycles may shift them to a lower energy point. However, once 
the structure is refined (has converged), different refinements will converge 
to approximately the same coordinates, as Ian described. I assume that the 
remaining differences reflect the numerical error of the different procedures. 
As to the use of different TEST sets, we came to a different conclusion (see 
below).

As to the claim(s) that kicking/jiggling/shaking does or does not remove the 
model bias, the answer is not black and white, but grey. Kicking reduces the 
model bias, but does not eliminate it. We have shown this in our kick map 
paper: Praznikar J et al (2009) Averaged kick maps: less noise, more signal... 
and probably less bias. Acta Crystallogr D Biol Crystallogr. 921-31.  

As for the use of a percentage or a number of reflections for R-FREE and the 
TEST set, my suggestion is not to use the TEST set and the concept of R-free 
at all. Namely, 
- Excluding data from the target changes the target, because in refinement 
the information present in a missing reflection cannot be recovered from 
the rest of the data. (The Fourier series terms are orthogonal to each other, 
therefore the information from one reflection is not present in any other 
reflection.) The absence of certain data thus introduces a bias of its own.
- In addition, in ML refinement using cross-validation the shape of the ML 
function is calculated from the TEST set structure factors of a chemically 
reasonable structure, which is regularized by chemical energy terms and thus 
contains systematic error. Because this effect propagates to the whole model 
structure under refinement, cross-validation introduces model bias into the 
refinement.  
For a detailed explanation you are invited to read our paper “Free kick instead 
of cross-validation in maximum-likelihood refinement of macromolecular crystal 
structures”, which just appeared online in Acta Cryst. D (2014). 

best regards,
dusan
 



[ccp4bb] Betreff: Re: [ccp4bb] correlated alternate confs - validation?

2014-07-23 Thread dusan turk
Hello,

A solution also exists for PDB records:
 
As a consequence of a discussion with George which took place years ago, I 
took the liberty of extending the PDB record beyond its length of 80 
characters, using the additional characters to specify the group ID and the 
members list number, and provided an interface to manipulate them. The PDB 
limitations were thereby lifted to adopt the SHELX philosophy of treating 
combined groups of any composition.

ATOM  1  N   GLY A   1 -31.334 -24.247  23.250  0.54 34.05  1A   N   1   0
ATOM   2477  N   GLY A   1 -29.650 -24.643  23.839  0.46 41.88  1B   N   1   1

Records without the group extension are not members of groups with partial 
occupancy. The two records shown are members of group ID 1: the atom in the 
first record belongs to members list number 0 of group ID 1 and the atom in 
the second to members list number 1 of group ID 1. (The members lists begin 
with 0.) 
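
A small sketch of how such extended records could be collected into groups. It assumes, for illustration only, that the two extra fields are the last two whitespace-separated tokens of the record and that the occupancy is the sixth token from the end, as in the two records shown; the actual column positions used by MAIN are not given here, and a real reader should use fixed columns.

```python
from collections import defaultdict

RECORDS = [
    "ATOM  1  N   GLY A   1 -31.334 -24.247  23.250  0.54 34.05  1A   N   1   0",
    "ATOM   2477  N   GLY A   1 -29.650 -24.643  23.839  0.46 41.88  1B   N   1   1",
]


def parse_group_fields(record):
    """Return (group ID, members list number, occupancy) from one record."""
    tokens = record.split()
    return int(tokens[-2]), int(tokens[-1]), float(tokens[-6])


def group_records(records):
    """Map group ID -> members list number -> list of occupancies."""
    groups = defaultdict(lambda: defaultdict(list))
    for record in records:
        group_id, member, occupancy = parse_group_fields(record)
        groups[group_id][member].append(occupancy)
    return groups


if __name__ == "__main__":
    for group_id, members in group_records(RECORDS).items():
        for member, occupancies in sorted(members.items()):
            print(f"group {group_id}, members list {member}: {occupancies}")
```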



 
On Jul 24, 2014, at 1:02 AM, CCP4BB automatic digest system 
lists...@jiscmail.ac.uk wrote:

 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On behalf of
 Frances C. Bernstein
 Sent: Wednesday, 23 July 2014 17:20
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] correlated alternate confs - validation?
 
 I agree that it would be excellent to be able to associate alternate
 conformations (beyond the individual residue) but when we defined the PDB
 format we had an 80-column limitation per atom and so only one column was
 allowed for alternate conformations.  In an ASCII world only 36 characters
 were available to define alternate conformations.  This is inadequate to
 allow for many residues with independent alternate conformations - one
 residue with three conformations would use up 3 of the 36 characters.  Thus
 there was no way to say that alternate conformation A in one residue is or
 is not associated with alternate conformation A in another residue.
 
  Frances


Re: [ccp4bb] Rotamer Selection in Low Resolution Data

2014-06-18 Thread dusan turk
 to place “optimal-fit to electron
 density side-chain rotamers into my model?
 Preferably in an NCS-independant manner?
 
 
 naively assuming that one of the three software packages that you did
 not mention by name is Phenix:
 yes, you can do it in a number of different ways. Let me know if
 interested and I will list all options.
 
 Pavel
 
 
 
 
 --
 
 End of CCP4BB Digest - 17 Jun 2014 to 18 Jun 2014 (#2014-166)
 *


[ccp4bb] postdoc position at Jozef Stefan Institute, Ljubljana, Slovenia

2013-09-23 Thread dusan turk
Dear All,

We are looking for a postdoctoral researcher for our Structural
Biology Group at the Dept. of Biochemistry and Molecular and
Structural Biology at Jozef Stefan Institute, Ljubljana, Slovenia.  We
are studying human and mouse endosomal enzymes involved in protein
degradation and immune response and proteins at the surface of
pathogenic bacteria such as S. aureus and C. difficile.

We are looking for a self-motivated person with a background in
molecular biology, protein biochemistry, and macromolecular
crystallography. Management and communication skills for running a
scientific project in a group environment are expected. The tentative 
start date is January 1st, 2014. 

We offer excellent research opportunities and a stimulating
environment for structural biology. Our laboratory is a member of a
recently established Centre of Excellence for Integrated Approaches in
Chemistry and Biology of Proteins and is equipped with
state-of-the-art equipment for protein expression and biochemistry, X-ray
crystallography, and mass spectrometry.

Applications (including CV with publication list, research experience,
copy of PhD certificate, and two persons with contact information for
references) and contact information (E-mail, phone, or Skype) should
be sent by mail or E-mail to

Prof. Dr. Dusan Turk
Dept. of Biochemistry and Molecular and Structural Biology
Jozef Stefan Institute
Jamova 39
1000 Ljubljana, Slovenia

Dr. Dusan Turk, Prof.
Head of Structural Biology Group
Head of Centre for Protein  and Structure Production
Centre of excellence for Integrated Approaches in Chemistry and Biology of 
Proteins, Scientific Director
Professor of Structural Biology at IPS Jozef Stefan
e-mail: dusan.t...@ijs.si   http://bio.ijs.si/sbl/
phone: +386 1 477 3857   Dept. of Biochem. Mol. Struct. Biol.
fax:   +386 1 477 3984   Jozef Stefan Institute
Jamova 39, 1 000 Ljubljana,Slovenia
Skype: dusan.turk (voice over internet: www.skype.com








Re: [ccp4bb] Resolution and data/parameter ratio, which one is more important?

2013-03-16 Thread dusan turk
Dear Guangyu Zhu,

if this is not a hypothetical case, you can refine the structure in each 
crystal form separately, using whatever software, and compare the results 
later. The structure can also be refined in both crystal forms simultaneously, 
using the multi-crystal NCS refinement as implemented in MAIN 
(http://www-bmb.ijs.si/), and thereby double the data-to-parameter ratio 
compared with refinement against the data set of one crystal form alone.

best regards,
dusan

On Mar 16, 2013, at 1:00 AM, CCP4BB automatic digest system 
lists...@jiscmail.ac.uk wrote:

 Date:Thu, 14 Mar 2013 20:27:44 -0400
 From:Guangyu Zhu g...@hwi.buffalo.edu
 Subject: Resolution and data/parameter ratio, which one is more important?
 
 I have this question. For example, a protein could be crystallized in two 
 crystal forms. The two crystal forms have the same space group and 1 
 molecule/asymm. unit. One crystal form diffracts to 3A with 50% solvent, and 
 the other diffracts to 3.6A with 80% solvent. The cell volume of the 3.6A 
 crystal must be 5/2=2.5 times larger because of the higher solvent content. 
 If both data sets are collected to the same completeness (say 100%), the 3.6A 
 data actually have a higher data/parameter ratio, 5/2/(3.6/3)**3= 1.45 times 
 that of the 3A data. For refinement, a better data/parameter ratio should 
 give a more accurate structure, i.e. the 3.6A data are better. But higher 
 resolution should give a better resolved electron density map. So which 
 crystal form really gives a better (more reliable and accurate) protein 
 structure?
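
The arithmetic in the question can be checked with a short sketch. It assumes the same protein volume per asymmetric unit in both forms, so that the cell volume scales as 1/(1 - solvent fraction), and that the number of unique reflections to resolution d scales as V/d^3. The function name is made up for the example.

```python
def relative_reflection_count(solvent_a, d_a, solvent_b, d_b):
    """Ratio N_b / N_a of unique reflections for two crystal forms.

    Cell volume ~ 1 / (1 - solvent fraction); reflections ~ volume / d**3.
    """
    volume_ratio = (1 - solvent_a) / (1 - solvent_b)   # V_b / V_a
    return volume_ratio * (d_a / d_b) ** 3


if __name__ == "__main__":
    ratio = relative_reflection_count(0.50, 3.0, 0.80, 3.6)
    print(f"3.6 A / 80% solvent form has {ratio:.2f}x the reflections")
```

Since the number of model parameters is the same in both forms, this ratio is also the ratio of the data/parameter values, in agreement with the 1.45 quoted in the question.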




Re: [ccp4bb] Who invented PDB format?

2013-01-08 Thread dusan turk
Hi Teri,

MAIN can do it: http://www-bmb.ijs.si/

This script will help you:

read file xx.pdb coor pdb
set coor sele all end deortho
write over file xx_fract.pdb coor pdb standard

For the opposite conversion, replace deortho with ortho.

When the PDB file does not contain a CRYST1 record, read a cell constants file 
before the PDB file.

example of cell.dat file:

10.9 10 10 90 90 90.0
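
For those who prefer to script the conversion themselves, here is a minimal Python sketch of the underlying arithmetic. It assumes the common PDB convention for the orthogonal frame (a along x, b in the xy plane); whether MAIN's ortho/deortho use exactly this convention is an assumption, and the function names are made up for the example.

```python
import math


def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Matrix turning fractional into orthogonal coordinates (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    v = math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    return [[a, b * cg, c * cb],
            [0.0, b * sg, c * (ca - cb * cg) / sg],
            [0.0, 0.0, c * v / sg]]


def frac_to_orth(m, frac):
    """Multiply the orthogonalization matrix by a fractional coordinate."""
    return [sum(m[i][j] * frac[j] for j in range(3)) for i in range(3)]


def orth_to_frac(m, xyz):
    """Invert the upper-triangular matrix by back-substitution."""
    w = xyz[2] / m[2][2]
    v = (xyz[1] - m[1][2] * w) / m[1][1]
    u = (xyz[0] - m[0][1] * v - m[0][2] * w) / m[0][0]
    return [u, v, w]


if __name__ == "__main__":
    cell = (10.9, 10.0, 10.0, 90.0, 90.0, 90.0)   # the cell.dat example
    m = orthogonalization_matrix(*cell)
    print(orth_to_frac(m, [5.45, 5.0, 2.5]))
```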

best regards,
dusan

On Jan 8, 2013, at 1:00 AM, CCP4BB automatic digest system 
lists...@jiscmail.ac.uk wrote:

 Date:Mon, 7 Jan 2013 11:18:19 +0530
 From:Teri Arman teriar...@gmail.com
 Subject: Re: Who invented PDB format?
 
 Fractional Coordinates to Orthogonal Coordinates and Vice Versa
 
 Hi, I need help: how can I convert the coordinates of 100 PDB files of 
 different space groups to fractional coordinates and vice versa? I cannot 
 find a CCP4 program that does it. A Fortran or C program, or any suggestion, 
 would be helpful.
 Thank you.
 TA
 


[ccp4bb] Program for map rotation

2012-09-21 Thread dusan turk
Dear Niu,

MAIN can do it (http://www-bmb.ijs.si).

dusan

On Sep 21, 2012, at 1:06 AM, CCP4BB automatic digest system wrote:

 Date:Wed, 19 Sep 2012 15:25:50 -0700
 From:Niu Tou niutou2...@gmail.com
 Subject: Program for map rotation
 
 Dear colleagues,
 
 Is there any program that can rotate a density map (generated by FFT,
 readable in Coot and Pymol) given the matrix? I have tried
 Extension->Maps->Transform map by LSQ model fit, but it does not seem to work,
 while the program Edit/Rotate map & Mask in CCP4 gave an error message:
 mapmask:   mapextend - input map does not contain a whole ASU.
 
 Thanks!
 
 Niu
 


Re: [ccp4bb] Announcing a Web Server for the Grade ligand restraints generator.

2012-03-20 Thread dusan turk
Dear Gerard,

This is a marvelous political achievement. For years I have been trying to 
convince CCDC to allow us to make the PURY restraint database 
(http://pury.ijs.si/), which is a compilation of CSDB, available to the 
public without restriction via a web server interface. Now you have 
achieved this with GRADE. 

Do I understand correctly that the CSD license restriction has been removed 
for anyone providing such a service, or is the use of MOGUL the only way one 
can provide CSDB-derived parameters to the public?

best wishes,
dusan


On Mar 21, 2012, at 1:01 AM, CCP4BB automatic digest system wrote:

 
 Date:Tue, 20 Mar 2012 08:56:46 +0100
 From:George Sheldrick gshe...@shelx.uni-ac.gwdg.de
 Subject: Re: Announcing a Web Server for the Grade ligand restraints 
 generator.
 
 Dear Gerard,
 
 That will be an extremely useful facility. Is a 'regrading' of the 
 monomer library planned, at least for common cofactors and 
 crystallization additives?
 
 Best wishes, George
 
 On 03/19/2012 06:22 PM, Gerard Bricogne wrote:
 Dear all,
 
The generation of reliable restraints for novel small-molecule
 ligands in macromolecular complexes is of great importance for both ligand
 placement into density maps and subsequent refinement. This has led us to
 develop Grade, a ligand restraint generator whose main source of restraint
 information is the Cambridge Structural Database (CSD) of small-molecule
 crystal structures, queried using the MOGUL program developed by the CCDC.
 Where small-molecule information is lacking, Grade uses quantum chemical
 procedures to obtain the restraint values.
 
Grade was released to academic users as part of the BUSTER package in
 July 2011 and has proved popular. However, a problem for numerous academic
 users has been that, in order to get the best restraints from Grade, a CSD
 system licence is necessary to make use of MOGUL. Although many institutions
 already have CSD site licences, and otherwise licences are available at a
 reasonable cost, this has prevented the use of Grade by small groups and
 occasional users.
 
To provide easy access to Grade, the CCDC has kindly agreed that we
 can provide a public Web server that includes the use of MOGUL in its
 invocation of Grade. The first version of the server is now available, free
 of charge, at
 
http://grade.globalphasing.org
 
We hope this server will prove useful to academic users. We will be
 very grateful for any feedback you might be able to provide about this
 server, so that we can keep improving it to meet the needs of the community.
 Please send us your feedback and comments at
 
  buster-deve...@globalphasing.com
 
 rather than write to a specific developer.
 
 
With best wishes,
 
The Global Phasing developers: Gerard Bricogne, Claus Flensburg,
Peter Keller, Wlodek Paciorek, Andrew Sharff, Oliver Smart,
Clemens Vonrhein and Thomas Womack.
 
 









Re: [ccp4bb] unusual bond lengths in PRODRG cif file

2012-01-11 Thread dusan turk
Hi Guys,

I just want to make you aware that, apart from PRODRG etc., there is also the 
PURY server (http://pury.ijs.si/), which can create restraints for 
heteromolecules. It accepts PDB files and the SMILES format. Alternatively, 
the JME editor can be used to draw your compound.

PURY: a database of geometric restraints of hetero compounds for refinement in 
complexes with macromolecular structures
Miha Andrejašič, Jure Pražnikar and Dušan Turk
Acta Cryst. (2008). D64, 1093–1109

Its parameters are extracted from the CSD. Unlimited use is restricted to 
CSD license holders, whereas a small number of downloads is allowed without 
restriction.

dusan




Dr. Dusan Turk, Assoc. Prof.
Head of Structural Biology Group
Head of Centre for Protein and Structure Production
Centre of excellence for Integrated Approaches in Chemistry and Biology of 
Proteins, Scientific Director
Professor of Structural Biology at IPS Jozef Stefan
e-mail: dusan.t...@ijs.si    http://bio.ijs.si/sbl/
phone: +386 1 477 3857       Dept. of Biochem. Mol. Struct. Biol.
fax:   +386 1 477 3984       Jozef Stefan Institute
Jamova 39, 1000 Ljubljana, Slovenia
Skype: dusan.turk (voice over internet: www.skype.com)