Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Pavel Afonine
Hi,

yes, shifts depend on resolution indeed. See pages 75-77 here:

http://www.phenix-online.org/presentations/latest/pavel_refinement_general.pdf

Pavel

On Fri, Oct 14, 2011 at 7:34 PM, Ed Pozharski wrote:

> On Fri, 2011-10-14 at 23:41 +0100, Phil Evans wrote:
> > I just tried refining a "finished" structure turning off the FreeR
> > set, in Refmac, and I have to say I can barely see any difference
> > between the two sets of coordinates.
>
> The amplitude of the shift, I presume, depends on the resolution and
> data quality.  With a very good 1.2A dataset refined with anisotropic
> B-factors to R~14% what I see is ~0.005A rms shift.  Which is not much,
> however the reported ML DPI is ~0.02A, so perhaps the effect is not that
> small compared to the precision of the model.
>
> On the other hand, the more "normal" example at 1.7A (and very good data
> refining down to R~15%) shows ~0.03A general variation with a variable
> test set.  Again, not much, but the ML DPI in this case is ~0.06A -
> comparable to the variation induced by the choice of the test set.
>
> Cheers,
>
> Ed.
>
> --
> Hurry up, before we all come back to our senses!
>  Julian, King of Lemurs
>


[ccp4bb] How to calculate energy?

2011-10-14 Thread Huayue Li

Dear all,
I obtained 20 peptide models (with the lowest energy) calculated by the CNS program. 
Now I want to make a table of structure statistics, but I don't know how to 
calculate Ebond, Eangle, Eimproper, Evdw, ENOE, Ecdih, Etotal, the r.m.s. deviation from 
experimental constraints, and the r.m.s. deviations from idealized geometry.
Where can I get this information? Just from the peptide pdb file output by 
CNS, or by using other software? And what is idealized geometry?
 
Thanks!
 
 
Huayue Li, Ph. D
College of Pharmacy
Pusan National University
Geumjeong-gu, Jangjeon-dong
Busan 609-735, Korea
Tel: +82-51-510-2185


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Ed Pozharski
On Fri, 2011-10-14 at 23:41 +0100, Phil Evans wrote:
> I just tried refining a "finished" structure turning off the FreeR
> set, in Refmac, and I have to say I can barely see any difference
> between the two sets of coordinates.

The amplitude of the shift, I presume, depends on the resolution and
data quality.  With a very good 1.2A dataset refined with anisotropic
B-factors to R~14% what I see is ~0.005A rms shift.  Which is not much,
however the reported ML DPI is ~0.02A, so perhaps the effect is not that
small compared to the precision of the model.  

On the other hand, the more "normal" example at 1.7A (and very good data
refining down to R~15%) shows ~0.03A general variation with a variable
test set.  Again, not much, but the ML DPI in this case is ~0.06A -
comparable to the variation induced by the choice of the test set.
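
For anyone who wants to reproduce this kind of comparison, a minimal sketch of
the calculation is below -- plain Python, no crystallographic libraries; the
file names are hypothetical, and it assumes both PDB files list the same atoms
in the same order, as is usually the case when comparing the same model refined
with and without the free set.

# Sketch: rms coordinate shift between two refinements of the same model.
# "model_with_free.pdb" and "model_all_data.pdb" are placeholder file names;
# both files must list identical atoms in identical order.
import math

def read_coords(pdb_path):
    """Collect (x, y, z) for every ATOM/HETATM record in a PDB file."""
    coords = []
    with open(pdb_path) as handle:
        for line in handle:
            if line.startswith(("ATOM", "HETATM")):
                coords.append((float(line[30:38]),
                               float(line[38:46]),
                               float(line[46:54])))
    return coords

def rms_shift(coords_a, coords_b):
    """Root-mean-square displacement between two equal-length coordinate lists."""
    assert len(coords_a) == len(coords_b), "atom counts differ"
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

if __name__ == "__main__":
    shift = rms_shift(read_coords("model_with_free.pdb"),
                      read_coords("model_all_data.pdb"))
    print("rms shift: %.4f A (compare with the reported ML DPI)" % shift)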

Cheers,

Ed.

-- 
Hurry up, before we all come back to our senses!
  Julian, King of Lemurs


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread James Stroud
Each R-free flag corresponds to a particular HKL index. Redundancy refers to the 
number of times a reflection corresponding to a given HKL index is observed. 
The final structure factor of a given HKL can be thought of as an average of 
these redundant observations.
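
As a toy illustration of that averaging (a sketch only -- real merging programs
weight by the measurement sigmas, map symmetry mates to one index and reject
outliers), the bookkeeping looks roughly like this:

# Sketch: merge redundant observations of the same HKL by simple averaging.
# Real merging programs weight by sigma, apply symmetry and reject outliers;
# this only shows the grouping-by-index idea.  Numbers are invented.
from collections import defaultdict

observations = [  # (h, k, l, intensity)
    (1, 2, 3, 105.0), (1, 2, 3, 98.0), (1, 2, 3, 101.5),
    (2, 0, 4, 250.0), (2, 0, 4, 244.0),
]

by_index = defaultdict(list)
for h, k, l, intensity in observations:
    by_index[(h, k, l)].append(intensity)

for hkl, values in sorted(by_index.items()):
    print("hkl %s: redundancy %d, merged I = %.1f"
          % (hkl, len(values), sum(values) / len(values)))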

Related to your question, someone once mentioned that for each particular space 
group, there should be a preferred R-free assignment. As far as I know, nothing 
tangible ever came of that idea.

James



On Oct 14, 2011, at 5:34 PM, D Bonsor wrote:

> I may be missing something or someone could point out that I am wrong and why 
> as I am curious, but with a highly redundant dataset the difference between 
> refining the final model against the full dataset and against the working set 
> alone would be small based upon the random selection of reflections for Rfree? 



Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Edward A. Berry

Now it would be interesting to refine this structure to convergence,
with the original free set. If I understood correctly Ian Tickle has
done essentially this, and the Free R returns essentially to its
original value: the minimum arrived at is independent of starting
point, perhaps within the limitation that one might get caught in a
different false minimum (which is unlikely given the minuscule changes
you see). If that is the case, we should stop worrying about
"corrupting" the free set by refining against it or even using it
to make maps in which models will be adjusted.
This is a perennial discussion but I never saw the report that
in fact original free-R is _not_ recoverable by refining to
convergence.

Phil Evans wrote:

I just tried refining a "finished" structure turning off the FreeR set, in 
Refmac, and I have to say I can barely see any difference between the two sets of 
coordinates.

 From this n=1 trial, I can't see that it improves the model significantly, nor 
that it ruins the model irretrievably for future purposes.

I suspect we worry too much about these things

Phil Evans



Indeed, perhaps we worry too much about such things.



On 14 Oct 2011, at 21:35, Nat Echols wrote:


On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang  wrote:
Sorry, I don't quite understand your reasoning for how the structure is 
rendered useless if one refined it with all data.

"Useless" was too strong a word (it's Friday, sorry).  I guess simulated 
annealing can address the model-bias issue, but I'm not totally convinced that this 
solves the problem.  And not every crystallographer will run SA every time he/she solves 
an isomorphous structure, so there's a real danger of misleading future users of the PDB 
file.  The reported R-free, of course, is still meaningless in the context of the 
deposited model.

Would your argument also apply to all the structures that were refined before 
R-free existed?

Technically, yes - but how many proteins are there whose only representatives 
in the PDB were refined this way?  I suspect very few; in most cases, a more 
recent model should be available.

-Nat




Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Thomas C. Terwilliger
Dear Gerard,

I'm very happy for the discussion to be on the CCP4 list (or on the IUCR
forums, or both).  I was only trying to not create too much traffic.

All the best,
Tom T

>> Dear Tom,
>>
>>  I am not sure that I feel happy with your invitation that views on
>> such
>> crucial matters as these deposition issues be communicated to you
>> off-list.
>> It would seem much healthier if these views were aired out within the BB.
>> Again!, some will say ... but the difference is that there is now a forum
>> for them, set up by the IUCr, that may eventually turn opinions into some
>> form of action.
>>
>>  I am sure that many subscribers to this BB, and not just you as a
>> member of some committees, would be interested to hear the full variety of
>> views on the desirable and the feasible in these areas, and to express
>> their
>> own for everyone to read and discuss.
>>
>>  Perhaps John Helliwell can elaborate on this and on the newly created
>> forum.
>>
>>
>>  With best wishes,
>>
>>   Gerard.
>>
>> --
>> On Fri, Oct 14, 2011 at 04:56:20PM -0600, Thomas C. Terwilliger wrote:
>>> For those who have strong opinions on what data should be deposited...
>>>
>>> The IUCR is just starting a serious discussion of this subject. Two
>>> committees, the "Data Deposition Working Group", led by John Helliwell,
>>> and the Commission on Biological Macromolecules (chaired by Xiao-Dong
>>> Su)
>>> are working on this.
>>>
>>> Two key issues are (1) feasibility and importance of deposition of raw
>>> images and (2) deposition of sufficient information to fully reproduce
>>> the
>>> crystallographic analysis.
>>>
>>> I am on both committees and would be happy to hear your ideas
>>> (off-list).
>>> I am sure the other members of the committees would welcome your
>>> thoughts
>>> as well.
>>>
>>> -Tom T
>>>
>>> Tom Terwilliger
>>> terwilli...@lanl.gov
>>>
>>>
>>> >> This is a follow up (or a digression) to James comparing test set to
>>> >> missing reflections.  I also heard this issue mentioned before but
>>> was
>>> >> always too lazy to actually pursue it.
>>> >>
>>> >> So.
>>> >>
>>> >> The role of the test set is to prevent overfitting.  Let's say I have
>>> >> the final model and I monitored the Rfree every step of the way and
>>> can
>>> >> conclude that there is no overfitting.  Should I do the final
>>> refinement
>>> >> against complete dataset?
>>> >>
>>> >> IMCO, I absolutely should.  The test set reflections contain
>>> >> information, and the "final" model is actually biased towards the
>>> >> working set.  Refining using all the data can only improve the
>>> accuracy
>>> >> of the model, if only slightly.
>>> >>
>>> >> The second question is practical.  Let's say I want to deposit the
>>> >> results of the refinement against the full dataset as my final model.
>>> >> Should I not report the Rfree and instead insert a remark explaining
>>> the
>>> >> situation?  If I report the Rfree prior to the test set removal, it
>>> is
>>> >> certain that every validation tool will report a mismatch.  It does
>>> not
>>> >> seem that the PDB has a mechanism to deal with this.
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Ed.
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Oh, suddenly throwing a giraffe into a volcano to make water is
>>> crazy?
>>> >> Julian, King of
>>> Lemurs
>>> >>
>>
>> --
>>
>>  ===
>>  * *
>>  * Gerard Bricogne g...@globalphasing.com  *
>>  * *
>>  * Global Phasing Ltd. *
>>  * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
>>  * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
>>  * *
>>  ===
>>


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread D Bonsor
I may be missing something or someone could point out that I am wrong and why 
as I am curious, but with a highly redundant dataset the difference between 
refining the final model against the full dataset and against the working set 
alone would be small based upon the random selection of reflections for Rfree? 


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Gerard Bricogne
Dear Tom,

 I am not sure that I feel happy with your invitation that views on such
crucial matters as these deposition issues be communicated to you off-list.
It would seem much healthier if these views were aired out within the BB. 
Again!, some will say ... but the difference is that there is now a forum
for them, set up by the IUCr, that may eventually turn opinions into some
form of action.

 I am sure that many subscribers to this BB, and not just you as a
member of some committees, would be interested to hear the full variety of
views on the desirable and the feasible in these areas, and to express their
own for everyone to read and discuss.

 Perhaps John Helliwell can elaborate on this and on the newly created
forum.


 With best wishes,
 
  Gerard.

--
On Fri, Oct 14, 2011 at 04:56:20PM -0600, Thomas C. Terwilliger wrote:
> For those who have strong opinions on what data should be deposited...
> 
> The IUCR is just starting a serious discussion of this subject. Two
> committees, the "Data Deposition Working Group", led by John Helliwell,
> and the Commission on Biological Macromolecules (chaired by Xiao-Dong Su)
> are working on this.
> 
> Two key issues are (1) feasibility and importance of deposition of raw
> images and (2) deposition of sufficient information to fully reproduce the
> crystallographic analysis.
> 
> I am on both committees and would be happy to hear your ideas (off-list). 
> I am sure the other members of the committees would welcome your thoughts
> as well.
> 
> -Tom T
> 
> Tom Terwilliger
> terwilli...@lanl.gov
> 
> 
> >> This is a follow up (or a digression) to James comparing test set to
> >> missing reflections.  I also heard this issue mentioned before but was
> >> always too lazy to actually pursue it.
> >>
> >> So.
> >>
> >> The role of the test set is to prevent overfitting.  Let's say I have
> >> the final model and I monitored the Rfree every step of the way and can
> >> conclude that there is no overfitting.  Should I do the final refinement
> >> against complete dataset?
> >>
> >> IMCO, I absolutely should.  The test set reflections contain
> >> information, and the "final" model is actually biased towards the
> >> working set.  Refining using all the data can only improve the accuracy
> >> of the model, if only slightly.
> >>
> >> The second question is practical.  Let's say I want to deposit the
> >> results of the refinement against the full dataset as my final model.
> >> Should I not report the Rfree and instead insert a remark explaining the
> >> situation?  If I report the Rfree prior to the test set removal, it is
> >> certain that every validation tool will report a mismatch.  It does not
> >> seem that the PDB has a mechanism to deal with this.
> >>
> >> Cheers,
> >>
> >> Ed.
> >>
> >>
> >>
> >> --
> >> Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
> >> Julian, King of Lemurs
> >>

-- 

 ===
 * *
 * Gerard Bricogne g...@globalphasing.com  *
 * *
 * Global Phasing Ltd. *
 * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
 * Cambridge CB3 0AX, UK   Fax: +44-(0)1223-366889 *
 * *
 ===


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Thomas C. Terwilliger
For those who have strong opinions on what data should be deposited...

The IUCR is just starting a serious discussion of this subject. Two
committees, the "Data Deposition Working Group", led by John Helliwell,
and the Commission on Biological Macromolecules (chaired by Xiao-Dong Su)
are working on this.

Two key issues are (1) feasibility and importance of deposition of raw
images and (2) deposition of sufficient information to fully reproduce the
crystallographic analysis.

I am on both committees and would be happy to hear your ideas (off-list). 
I am sure the other members of the committees would welcome your thoughts
as well.

-Tom T

Tom Terwilliger
terwilli...@lanl.gov


>> This is a follow up (or a digression) to James comparing test set to
>> missing reflections.  I also heard this issue mentioned before but was
>> always too lazy to actually pursue it.
>>
>> So.
>>
>> The role of the test set is to prevent overfitting.  Let's say I have
>> the final model and I monitored the Rfree every step of the way and can
>> conclude that there is no overfitting.  Should I do the final refinement
>> against complete dataset?
>>
>> IMCO, I absolutely should.  The test set reflections contain
>> information, and the "final" model is actually biased towards the
>> working set.  Refining using all the data can only improve the accuracy
>> of the model, if only slightly.
>>
>> The second question is practical.  Let's say I want to deposit the
>> results of the refinement against the full dataset as my final model.
>> Should I not report the Rfree and instead insert a remark explaining the
>> situation?  If I report the Rfree prior to the test set removal, it is
>> certain that every validation tool will report a mismatch.  It does not
>> seem that the PDB has a mechanism to deal with this.
>>
>> Cheers,
>>
>> Ed.
>>
>>
>>
>> --
>> Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
>> Julian, King of Lemurs
>>


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Phil Evans
I just tried refining a "finished" structure turning off the FreeR set, in 
Refmac, and I have to say I can barely see any difference between the two sets 
of coordinates.

From this n=1 trial, I can't see that it improves the model significantly, nor 
that it ruins the model irretrievably for future purposes.   

I suspect we worry too much about these things

Phil Evans

On 14 Oct 2011, at 21:35, Nat Echols wrote:

> On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang  wrote:
> Sorry, I don't quite understand your reasoning for how the structure is 
> rendered useless if one refined it with all data.
> 
> "Useless" was too strong a word (it's Friday, sorry).  I guess simulated 
> annealing can address the model-bias issue, but I'm not totally convinced 
> that this solves the problem.  And not every crystallographer will run SA 
> every time he/she solves an isomorphous structure, so there's a real danger 
> of misleading future users of the PDB file.  The reported R-free, of course, 
> is still meaningless in the context of the deposited model.
> 
> Would your argument also apply to all the structures that were refined before 
> R-free existed?
> 
> Technically, yes - but how many proteins are there whose only representatives 
> in the PDB were refined this way?  I suspect very few; in most cases, a more 
> recent model should be available.
> 
> -Nat


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Ethan Merritt
On Friday, October 14, 2011 02:45:08 pm Ed Pozharski wrote:
> On Fri, 2011-10-14 at 13:07 -0700, Nat Echols wrote:
> > 
> > The benefit of including those extra 5% of data is always minimal 
> 
> And so is probably the benefit of excluding when all the steps that
> require cross-validation have already been performed.  My thinking is
> that excluding data from analysis should always be justified (and in the
> initial stages of refinement, it might be as it prevents overfitting),
> not the other way around.

A model with error bars is more useful than a marginally more
accurate model without error bars, not least because you are probably
taking it on faith that the second model is "more accurate".

Crystallographers were kind of late in realizing that a cross validation
test could be useful in assessing refinement.  What's more, we 
never really learned the whole lesson.  Rather than using the full
test, we use only one blade of the jackknife.  

http://en.wikipedia.org/wiki/Cross-validation_(statistics)#K-fold_cross-validation

The full test would involve running multiple parallel refinements, 
each one omitting a different disjoint set of reflections.  
The ccp4 suite is set up to do this, since Rfree flags by default run 
from 0 to 19 and refmac lets you specify which 5% subset is to be omitted
from the current run. Of course, evaluating the end point becomes more
complex than looking at a single number "Rfree".

Surely someone must have done this!  But I can't recall ever reading
an analysis of such a refinement protocol.  
Does anyone know of relevant reports in the literature?

Is there a program or script that will collect K-fold parallel output
models and their residuals to generate a net indicator of model quality?
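
In the absence of such a tool, a rough sketch of the bookkeeping is below
(Python). It assumes refmac5 is on the path, that the free-flag column carries
the values 0-19 that FREERFLAG writes by default, that refmac's FREE keyword
selects which flag value is held out (as described above), and that the free R
can be grepped from the log; file names and cycle count are placeholders, so
treat it as an illustration rather than a finished script.

# Sketch of a driver/collector for K-fold cross-validated refinement.
# Assumptions (not verified here): refmac5 is on the PATH, the free-flag
# column holds the values 0-19 that FREERFLAG writes by default, refmac's
# FREE keyword selects which flag value is held out, and the free R can be
# read back from the log.  File names and cycle count are placeholders.
import re
import statistics
import subprocess

MTZ, PDB = "data.mtz", "model_in.pdb"   # placeholder input files
NFOLD = 20                              # one fold per free-flag value

def last_free_r(log_text):
    """Pull the last free-R value printed in a refmac log.
    The log format differs between refmac versions; adjust the parsing if needed."""
    values = []
    for line in log_text.splitlines():
        low = line.lower()
        if "free r" in low or "r free" in low:
            values.extend(re.findall(r"\d+\.\d+", line))
    return float(values[-1])

def run_fold(fold):
    """Refine once, holding out the reflections whose free flag equals `fold`."""
    result = subprocess.run(
        ["refmac5", "HKLIN", MTZ, "HKLOUT", "fold%02d.mtz" % fold,
         "XYZIN", PDB, "XYZOUT", "fold%02d.pdb" % fold],
        input="FREE %d\nNCYC 10\nEND\n" % fold,
        capture_output=True, text=True, check=True)
    return last_free_r(result.stdout)

if __name__ == "__main__":
    rfree = [run_fold(fold) for fold in range(NFOLD)]
    print("free R over %d folds: mean %.4f, sd %.4f"
          % (NFOLD, statistics.mean(rfree), statistics.stdev(rfree)))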

Ethan

-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
University of Washington, Seattle 98195-7742


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Quyen Hoang

Thanks for the clear explanation. I understood that.
But I was trying to understand how this would negatively affect the  
initial model to render it useless or less useful.
In the scenario that you presented, I would expect a better result  
(better model) if the initial model was refined with all data, thus  
more useful.
Sure, again in your scenario, the "new" structure has seen R-free  
reflections in the equivalent indexes of its replacement model, but  
their intensities should be different anyway, so I am not sure how  
this is bad. Even if the bias is huge, let's say this bias results in  
1% reduction in initial R-free (exaggerating here), how would this  
make one's model bad or how would this be bad for one's science?
In the end, our objective is to build the best model possible and I  
think that more data would likely result in better model, not the  
other way around. If we can agree that refining a model with all data  
would result in a better model, then wouldn't not doing so constitute  
a compromise of model quality for a more "pure" statistic?


I had not refined a model with all data before (just to keep in line),  
but I wondered if I was doing the best thing.


Cheers,
Quyen
On Oct 14, 2011, at 5:27 PM, Phil Jeffrey wrote:

Let's say you have two isomorphous crystals of two different protein- 
ligand complexes.  Same protein different ligand, same xtal form.   
Conventionally you'd keep the same free set reflections (hkl values)  
between the two datasets to reduce biasing.  However if the first  
model had been refined against all reflections there is no longer a  
free set for that model, thus all hkl's have seen the atoms during  
refinement, and so your R-free in the second complex is initially  
biased to the model from the first complex. [*]


The tendency is to do less refinement in these sort of isomorphous  
cases than in molecular replacement solutions, because the  
structural changes are usually far less (it is isomorphous after  
all) so there's a risk that the R-free will not be allowed to fully  
float free of that initial bias.  That makes your R-free look better  
than it actually is.


This is rather strongly analogous to using different free sets in  
the two datasets.


However I'm not sure that this is as big of a deal as it is being  
made to sound.  It can be dealt with straightforwardly.  However  
refining against all the data weakens the use of R-free as a  
validation tool for that particular model so the people that like to  
judge structures based on a single number (i.e. R-free) are going to  
be quite put out.


It's also the case that the best model probably *is* the one based  
on a careful last round of refinement against all data, as long as  
nothing much changes.  That would need to be quantified in some  
way(s).


Phil Jeffrey
Princeton

[* Your R-free is also initially model-biased in cases where the  
data are significantly non-isomorphous or you're using two different  
xtal forms, to varying extents]





I still don't understand how a structure model refined with all data
would negatively affect the determination and/or refinement of an
isomorphous structure using a different data set (even without  
doing SA

first).

Quyen

On Oct 14, 2011, at 4:35 PM, Nat Echols wrote:


On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang  wrote:

   Sorry, I don't quite understand your reasoning for how the
   structure is rendered useless if one refined it with all data.


"Useless" was too strong a word (it's Friday, sorry). I guess
simulated annealing can address the model-bias issue, but I'm not
totally convinced that this solves the problem. And not every
crystallographer will run SA every time he/she solves an isomorphous
structure, so there's a real danger of misleading future users of  
the
PDB file. The reported R-free, of course, is still meaningless in  
the

context of the deposited model.

   Would your argument also apply to all the structures that were
   refined before R-free existed?


Technically, yes - but how many proteins are there whose only
representatives in the PDB were refined this way? I suspect very  
few;

in most cases, a more recent model should be available.

-Nat






Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Craig A. Bingman
We have obligations that extend beyond simply presenting a "best" model.  

In an ideal world, the PDB would accept two coordinate sets and two sets of 
statistics, one for the last step where the cross-validation set was valid, and 
a final model refined against all the data.  Until there is a clear way to do 
that, and an unambiguous presentation of them to the public, IMO, the gains won 
by refinement against all the data are outweighed by the confusion that it can 
cause when presenting model and associated statistics to the public.


On Oct 14, 2011, at 3:32 PM, Jan Dohnalek wrote:

> Regarding refinement against all reflections: the main goal of our work is to 
> provide the best possible representation of the experimental data in the form 
> of the structure model. Once the structure building and refinement process is 
> finished keeping the Rfree set separate does not make sense any more. Its 
> role finishes once the last set of changes have been done to the model and 
> verified ...
> 
> J. Dohnalek


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Ed Pozharski
On Fri, 2011-10-14 at 13:07 -0700, Nat Echols wrote:

> You should enter the statistics for the model and data that you
> actually deposit, not statistics for some other model that you might
> have had at one point but which the PDB will never see.  

If you read my post carefully, you'll see that I never suggested
reporting statistics for one model and depositing the other

> Not only does refining against R-free make it impossible to verify and
> validate your structure, it also means that any time you or anyone
> else wants to solve an isomorphous structure by MR using your
> structure as a search model, or continue the refinement with
> higher-resolution data, you will be starting with a model that has
> been refined against all reflections.  So any future refinements done
> with that model against isomorphous data are pre-biased, making your
> model potentially useless.

Frankly, I think you are exaggerating the magnitude of model bias in the
situation that I described.  You assume that the refinement will become
severely unstable after tossing in the test reflections.  Depending on
the resolution etc., the rms shift of the model may vary, but even if it
is, say, half an angstrom, the model hardly becomes useless (and that is
a huge overestimate).  And at least in theory, including *all the data*
should make the model more, not less accurate.

> The benefit of including those extra 5% of data is always minimal 

And so is probably the benefit of excluding when all the steps that
require cross-validation have already been performed.  My thinking is
that excluding data from analysis should always be justified (and in the
initial stages of refinement, it might be as it prevents overfitting),
not the other way around.

Cheers,

Ed.

-- 
"Hurry up before we all come back to our senses!"
   Julian, King of Lemurs


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Felix Frolow
Recently we (I mean WE - the community) frequently refine structures at around 1 
Angstrom resolution.
This is not what the Rfree was invented for. It was invented for 3.0-2.8 Angstrom data,
in times when people did not possess facilities good enough to look at the 
electron density maps….
We finish (WE - again I mean the community) the refinement of our structures too 
early.

Dr Felix Frolow   
Professor of Structural Biology and Biotechnology
Department of Molecular Microbiology
and Biotechnology
Tel Aviv University 69978, Israel

Acta Crystallographica F, co-editor

e-mail: mbfro...@post.tau.ac.il
Tel:  ++972-3640-8723
Fax: ++972-3640-9407
Cellular: 0547 459 608

On Oct 14, 2011, at 22:35 , Nat Echols wrote:

> On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang  wrote:
> Sorry, I don't quite understand your reasoning for how the structure is 
> rendered useless if one refined it with all data.
> 
> "Useless" was too strong a word (it's Friday, sorry).  I guess simulated 
> annealing can address the model-bias issue, but I'm not totally convinced 
> that this solves the problem.  And not every crystallographer will run SA 
> every time he/she solves an isomorphous structure, so there's a real danger 
> of misleading future users of the PDB file.  The reported R-free, of course, 
> is still meaningless in the context of the deposited model.
> 
> Would your argument also apply to all the structures that were refined before 
> R-free existed?
> 
> Technically, yes - but how many proteins are there whose only representatives 
> in the PDB were refined this way?  I suspect very few; in most cases, a more 
> recent model should be available.
> 
> -Nat



Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Phil Jeffrey
Let's say you have two isomorphous crystals of two different 
protein-ligand complexes.  Same protein different ligand, same xtal 
form.  Conventionally you'd keep the same free set reflections (hkl 
values) between the two datasets to reduce biasing.  However if the 
first model had been refined against all reflections there is no longer 
a free set for that model, thus all hkl's have seen the atoms during 
refinement, and so your R-free in the second complex is initially biased 
to the model from the first complex. [*]


The tendency is to do less refinement in these sort of isomorphous cases 
than in molecular replacement solutions, because the structural changes 
are usually far less (it is isomorphous after all) so there's a risk 
that the R-free will not be allowed to fully float free of that initial 
bias.  That makes your R-free look better than it actually is.


This is rather strongly analogous to using different free sets in the 
two datasets.


However I'm not sure that this is as big of a deal as it is being made 
to sound.  It can be dealt with straightforwardly.  However refining 
against all the data weakens the use of R-free as a validation tool for 
that particular model so the people that like to judge structures based 
on a single number (i.e. R-free) are going to be quite put out.


It's also the case that the best model probably *is* the one based on a 
careful last round of refinement against all data, as long as nothing 
much changes.  That would need to be quantified in some way(s).


Phil Jeffrey
Princeton

[* Your R-free is also initially model-biased in cases where the data 
are significantly non-isomorphous or you're using two different xtal 
forms, to varying extents]





I still don't understand how a structure model refined with all data
would negatively affect the determination and/or refinement of an
isomorphous structure using a different data set (even without doing SA
first).

Quyen

On Oct 14, 2011, at 4:35 PM, Nat Echols wrote:


On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang  wrote:

Sorry, I don't quite understand your reasoning for how the
structure is rendered useless if one refined it with all data.


"Useless" was too strong a word (it's Friday, sorry). I guess
simulated annealing can address the model-bias issue, but I'm not
totally convinced that this solves the problem. And not every
crystallographer will run SA every time he/she solves an isomorphous
structure, so there's a real danger of misleading future users of the
PDB file. The reported R-free, of course, is still meaningless in the
context of the deposited model.

Would your argument also apply to all the structures that were
refined before R-free existed?


Technically, yes - but how many proteins are there whose only
representatives in the PDB were refined this way? I suspect very few;
in most cases, a more recent model should be available.

-Nat




Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Quyen Hoang
I still don't understand how a structure model refined with all data  
would negatively affect the determination and/or refinement of an  
isomorphous structure using a different data set (even without doing  
SA first).


Quyen

On Oct 14, 2011, at 4:35 PM, Nat Echols wrote:

On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang   
wrote:
Sorry, I don't quite understand your reasoning for how the structure  
is rendered useless if one refined it with all data.


"Useless" was too strong a word (it's Friday, sorry).  I guess  
simulated annealing can address the model-bias issue, but I'm not  
totally convinced that this solves the problem.  And not every  
crystallographer will run SA every time he/she solves an isomorphous  
structure, so there's a real danger of misleading future users of  
the PDB file.  The reported R-free, of course, is still meaningless  
in the context of the deposited model.


Would your argument also apply to all the structures that were  
refined before R-free existed?


Technically, yes - but how many proteins are there whose only  
representatives in the PDB were refined this way?  I suspect very  
few; in most cases, a more recent model should be available.


-Nat




Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Nat Echols
On Fri, Oct 14, 2011 at 1:20 PM, Quyen Hoang  wrote:

> Sorry, I don't quite understand your reasoning for how the structure is
> rendered useless if one refined it with all data.
>

"Useless" was too strong a word (it's Friday, sorry).  I guess simulated
annealing can address the model-bias issue, but I'm not totally convinced
that this solves the problem.  And not every crystallographer will run SA
every time he/she solves an isomorphous structure, so there's a real danger
of misleading future users of the PDB file.  The reported R-free, of course,
is still meaningless in the context of the deposited model.

Would your argument also apply to all the structures that were refined
> before R-free existed?


Technically, yes - but how many proteins are there whose only
representatives in the PDB were refined this way?  I suspect very few; in
most cases, a more recent model should be available.

-Nat


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Jan Dohnalek
Regarding refinement against all reflections: the main goal of our work is
to provide the best possible representation of the experimental data in the
form of the structure model. Once the structure building and refinement
process is finished, keeping the Rfree set separate does not make sense any
more. Its role finishes once the last set of changes has been made to the
model and verified ...

J. Dohnalek


On Fri, Oct 14, 2011 at 10:23 PM, Craig A. Bingman <
cbing...@biochem.wisc.edu> wrote:

> Recent experience indicates that the PDB is checking these statistics very
> closely for new depositions.  The checks made by the PDB are intended to
> prevent accidents and oversights made by honest people from creeping into
> the database.  "Getting away" with something seems to imply some intention
> to deceive, and that is much more difficult to detect.
>
> On Oct 14, 2011, at 3:09 PM, Robbie Joosten wrote:
>
> The deposited R-free sets in the PDB are quite frequently 'unfree' or the
> wrong set was deposited (checking this is one of the recommendations in the
> VTF report in Structure). So at the moment you would probably get away with
> depositing an unfree R-free set ;)
>
>
>


-- 
Jan Dohnalek, Ph.D
Institute of Macromolecular Chemistry
Academy of Sciences of the Czech Republic
Heyrovskeho nam. 2
16206 Praha 6
Czech Republic

Tel: +420 296 809 390
Fax: +420 296 809 410


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Craig A. Bingman
Recent experience indicates that the PDB is checking these statistics very 
closely for new depositions.  The checks made by the PDB are intended to 
prevent accidents and oversights made by honest people from creeping into the 
database.  "Getting away" with something seems to imply some intention to 
deceive, and that is much more difficult to detect.

On Oct 14, 2011, at 3:09 PM, Robbie Joosten wrote:

> The deposited R-free sets in the PDB are quite frequently 'unfree' or the 
> wrong set was deposited (checking this is one of the recommendations in the 
> VTF report in Structure). So at the moment you would probably get away with 
> depositing an unfree R-free set ;)
> 



Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Quyen Hoang
Sorry, I don't quite understand your reasoning for how the structure  
is rendered useless if one refined it with all data.
Would your argument also apply to all the structures that were refined  
before R-free existed?


Quyen



You should enter the statistics for the model and data that you  
actually deposit, not statistics for some other model that you might  
have had at one point but which the PDB will never see.  Not only  
does refining against R-free make it impossible to verify and  
validate your structure, it also means that any time you or anyone  
else wants to solve an isomorphous structure by MR using your  
structure as a search model, or continue the refinement with higher- 
resolution data, you will be starting with a model that has been  
refined against all reflections.  So any future refinements done  
with that model against isomorphous data are pre-biased, making your  
model potentially useless.


I'm amazed that anyone is still depositing structures refined  
against all data, but the PDB does still get a few.  The benefit of  
including those extra 5% of data is always minimal in every paper  
I've seen that reports such a procedure, and far outweighed by  
having a reliable and relatively unbiased validation statistic that  
is preserved in the final deposition.  (The situation may be  
different for very low resolution data, but those structures are a  
tiny fraction of the PDB.)


-Nat


Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Robbie Joosten

Hi Ed,
 

> This is a follow up (or a digression) to James comparing test set to
> missing reflections. I also heard this issue mentioned before but was
> always too lazy to actually pursue it.
> 
> So.
> 
> The role of the test set is to prevent overfitting. Let's say I have
> the final model and I monitored the Rfree every step of the way and can
> conclude that there is no overfitting. Should I do the final refinement
> against complete dataset?
> 
> IMCO, I absolutely should. The test set reflections contain
> information, and the "final" model is actually biased towards the
> working set. Refining using all the data can only improve the accuracy
> of the model, if only slightly.
Hmm, if your R-free set is small the added value will also be small. If it is 
relatively big, then your previously established optimal weights may no longer 
be optimal. A more elegant thing to would be refine the model with, say, 20 
different 5% R-free sets, deposit the ensemble and report the average R(-free) 
plus a standard deviation. AFAIK, this is what the R-free set numbers that 
CCP4's FREERFLAG generates are for. Of course, in that case you should do 
enough refinement (and perhaps rebuilding) to make sure each R-free set is 
free. 
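
For the reporting step, the arithmetic is nothing more than this (a sketch with
invented numbers standing in for the per-set results):

# Sketch: average R-free and its spread over several independent free sets.
# The values below are invented placeholders for the per-set results.
import statistics

rfree_by_set = [0.231, 0.228, 0.235, 0.229, 0.233]
print("R-free = %.3f +/- %.3f over %d sets"
      % (statistics.mean(rfree_by_set),
         statistics.stdev(rfree_by_set),
         len(rfree_by_set)))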

> The second question is practical. Let's say I want to deposit the
> results of the refinement against the full dataset as my final model.
> Should I not report the Rfree and instead insert a remark explaining the
> situation? If I report the Rfree prior to the test set removal, it is
> certain that every validation tool will report a mismatch. It does not
> seem that the PDB has a mechanism to deal with this.
The deposited R-free sets in the PDB are quite frequently 'unfree' or the wrong 
set was deposited (checking this is one of the recommendations in the VTF 
report in Structure). So at the moment you would probably get away with 
depositing an unfree R-free set ;)
 
Cheers,
Robbie
 
 
> 
> Cheers,
> 
> Ed.
> 
> 
> 
> -- 
> Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
> Julian, King of Lemurs
  

Re: [ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Nat Echols
On Fri, Oct 14, 2011 at 12:52 PM, Ed Pozharski wrote:

> The second question is practical.  Let's say I want to deposit the
> results of the refinement against the full dataset as my final model.
> Should I not report the Rfree and instead insert a remark explaining the
> situation?  If I report the Rfree prior to the test set removal, it is
> certain that every validation tool will report a mismatch.  It does not
> seem that the PDB has a mechanism to deal with this.
>

You should enter the statistics for the model and data that you actually
deposit, not statistics for some other model that you might have had at one
point but which the PDB will never see.  Not only does refining against
R-free make it impossible to verify and validate your structure, it also
means that any time you or anyone else wants to solve an isomorphous
structure by MR using your structure as a search model, or continue the
refinement with higher-resolution data, you will be starting with a model
that has been refined against all reflections.  So any future refinements
done with that model against isomorphous data are pre-biased, making your
model potentially useless.

I'm amazed that anyone is still depositing structures refined against all
data, but the PDB does still get a few.  The benefit of including those
extra 5% of data is always minimal in every paper I've seen that reports
such a procedure, and far outweighed by having a reliable and relatively
unbiased validation statistic that is preserved in the final deposition.
 (The situation may be different for very low resolution data, but those
structures are a tiny fraction of the PDB.)

-Nat


[ccp4bb] should the final model be refined against full dataset

2011-10-14 Thread Ed Pozharski
This is a follow up (or a digression) to James comparing test set to
missing reflections.  I also heard this issue mentioned before but was
always too lazy to actually pursue it.

So.

The role of the test set is to prevent overfitting.  Let's say I have
the final model and I monitored the Rfree every step of the way and can
conclude that there is no overfitting.  Should I do the final refinement
against complete dataset?

IMCO, I absolutely should.  The test set reflections contain
information, and the "final" model is actually biased towards the
working set.  Refining using all the data can only improve the accuracy
of the model, if only slightly.

The second question is practical.  Let's say I want to deposit the
results of the refinement against the full dataset as my final model.
Should I not report the Rfree and instead insert a remark explaining the
situation?  If I report the Rfree prior to the test set removal, it is
certain that every validation tool will report a mismatch.  It does not
seem that the PDB has a mechanism to deal with this.

Cheers,

Ed.



-- 
Oh, suddenly throwing a giraffe into a volcano to make water is crazy?
Julian, King of Lemurs


Re: [ccp4bb] Ice rings...

2011-10-14 Thread James Holton


Automated outlier rejection in scaling will handle a lot of things, 
including ice.  Works better with high multiplicity.  Unless, of course, 
your ice rings are "even", then any integration error due to ice will be 
the same for all the symmetry mates and the scaling program will be none 
the wiser.  That said, the integration programs these days tend to have 
pretty sensible defaults for rejecting spots that have "weird" 
backgrounds.  Plenty of structures get solved from data that has 
horrible-looking ice rings using just the defaults.  In fact, I am 
personally unconvinced that ice rings are a significant problem in and 
of themselves.  More often, they are simply an indication that something 
else is wrong, like the crystal warmed up at some point.


  Nevertheless, if you suspect your ice rings are causing a problem, 
you can try to do something about them.  The "deice" program already 
mentioned sounds cool, but if you just want to try something quick, 
excluding the resolution ranges of your ice rings can be done in sftools 
like this:

select resol > 3.89
select resol < 3.93
absent col F SIGF DANO SIGDANO if col F > 0

and repeat this for each resolution range you want to exclude.  Best to 
get these ranges from your integration program's graphics display.
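
If you would rather do the same filtering in a throwaway script, the geometry
is simple; the sketch below assumes an orthorhombic-style cell (1/d^2 = h^2/a^2
+ k^2/b^2 + l^2/c^2 -- a general cell needs the full reciprocal metric tensor),
and the cell constants, hkl list and ring windows are made-up examples (the
first window matches the 3.89-3.93 A range above).

# Sketch: flag reflections whose d-spacing falls inside an ice-ring window.
# Assumes an orthorhombic-style cell, 1/d^2 = h^2/a^2 + k^2/b^2 + l^2/c^2;
# a general cell needs the full reciprocal metric tensor.  Cell constants,
# hkl list and ring windows are made-up examples.
import math

CELL = (50.0, 60.0, 70.0)                               # a, b, c in Angstrom
ICE_RINGS = [(3.93, 3.89), (3.70, 3.64), (3.46, 3.43)]  # (d_max, d_min) windows

def d_spacing(h, k, l, cell=CELL):
    a, b, c = cell
    return 1.0 / math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)

def in_ice_ring(h, k, l):
    d = d_spacing(h, k, l)
    return any(d_min <= d <= d_max for d_max, d_min in ICE_RINGS)

reflections = [(13, 0, 2), (5, 5, 5), (11, 7, 4)]       # toy hkl list
kept = [hkl for hkl in reflections if not in_ice_ring(*hkl)]
print("kept %d of %d reflections" % (len(kept), len(reflections)))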


In mosflm, you can put "EXCLUDE ICE" on either the "AUTOINDEX" or 
"RESOLUTION" keywords and have any spots on the canonical hexagonal ice 
spacings removed automatically.  The problem with excluding resolution 
ranges, of course, is that your particular "ice rings" may not be where 
they are supposed to be.  Either due to something physical, like the 
cooling rate, or something artificial, like an error in the camera 
parameters.  It is also possible that what you think are "ice rings" are 
actually "salt rings".  Some salts will precipitate out upon 
cryo-cooling.  Large ice/salt crystals can also produce a lot of 
non-Bragg scatter, which means that you can get sharp features far away 
from the resolution range you expect.  On the other hand, if you have 
cubic ice instead of hexagonal ice (very common in MX samples), then 
there are no rings at 3.91A, 3.45A, 2.68A and throwing out these 
resolution ranges would be a waste.


Another way to exclude ice is to crank up background-based rejection 
criteria.  In denzo/HKL2K, you do this with the "reject fraction" 
keyword, and in mosflm, REJECT MINBG does pretty much the same thing.  
There are lots of rejection options in integration programs, and which 
one works in your particular case depends on what your ice rings look 
like.  No one has written a machine-vision-type program that can 
recognize and handle all the cases. You will need to play with these 
options until the spots you "don't like" turn red in the display.


Of course, the best way to deal with ice rings would be to inspect each 
and every one of the spots you have near ice rings and decide on its 
intensity manually.  Then edit the hkl file.



Which brings me to perhaps a more important point: What, exactly, is the 
"problem" you are having that makes you think the ice rings are to 
blame?  Can't get an MR solution?  Can't get MAD/SAD phases?


Ice has a bad rep in MX, and an undeserved one IMHO.  In fact, by 
controlling either cryoprotectant concentration or cooling rate 
carefully, you can achieve a mixture of amorphous and cubic ice, and 
this mixture has a specific volume (density) intermediate between the 
two.  Many crystals diffract much better when you are able to match the 
specific volume of the stuff in the solvent channels to the specific 
volume the protein lattice is "trying" to achieve on its own.  A great deal 
of effort has gone into characterizing this phenomenon (authors: Juers, 
Weik, Warkentin, Thorne and many others), but I often meet frustrated 
cryo-screeners who seem to have never heard of any of it!


 In general, the automated "outlier rejection" protocols employed by 
modern software have taken care of most of the problems ice rings 
introduce.  For example, difference Pattersons are VERY sensitive to 
outliers, and all it takes is one bad spot to give you huge ripples that 
swamp all your peaks, but every heavy-atom finding program I am aware of 
calculates Pattersons only after first doing an "outlier rejection" 
step.  You might also think that ice rings would mess up your preciously 
subtle anomalous differences, but again, outlier rejection to the rescue.


Now, that said, depending on automated outlier rejection to save you is 
of course a questionable policy, but it is an equally bad idea to 
pretend that it doesn't exist either.  It is funny how in MX we are all 
ready to grab our torch and pitchfork if we hear of someone manually 
editing their hkl files to get rid of reflections they "don't like", but 
as long as "the software" does it, it is okay.  Plausible deniability 
runs deep.



-James Holton
MAD Scientist


On 10/11/2011 8:16 AM, Francis E Reyes wrote:

All,


So I have two intense ic

Re: [ccp4bb] Ice rings... [maps and missing reflections]

2011-10-14 Thread James Holton

On 10/11/2011 12:33 PM, Garib N Murshudov wrote:

We need better way of estimating "unobserved" reflections.


Indeed we do!  Because this appears to be the sum total of how the 
correctness of the structure is judged.  It is easy to forget I think 
that from the "point of view" of the refinement program, all reflections 
flagged as belonging to the "free" set are, in effect, "missing".   So 
Rfree is really just a score for how well DFc agrees with Fobs?
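
Spelled out as a toy calculation -- invented numbers, plain |Fcalc| in place of
the sigma-A weighted DFc, and no scaling -- the score is just:

# Sketch: R (and R-free, when restricted to the test set) as
#   sum| |Fobs| - |Fcalc| | / sum|Fobs|
# over the chosen reflections.  Numbers are invented.
reflections = [  # (|Fobs|, |Fcalc|, is_free)
    (120.0, 115.0, False), (80.0, 86.0, False), (45.0, 44.0, True),
    (200.0, 210.0, False), (60.0, 70.0, True),
]

def r_factor(subset):
    return sum(abs(fo - fc) for fo, fc, _ in subset) / sum(fo for fo, _, _ in subset)

work = [r for r in reflections if not r[2]]
free = [r for r in reflections if r[2]]
print("R-work = %.3f, R-free = %.3f" % (r_factor(work), r_factor(free)))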


-James Holton
MAD Scientist


Re: [ccp4bb] data processing problem with ice rings

2011-10-14 Thread James Holton
These rings are nanocrystalline cubic ice (ice Ic, as opposed to the 
"usual" ice Ih).  It is an interesting substance in that noone has ever 
prepared a large single crystal of it.  In fact, for very small crystals 
it can be hard to distinguish it from amorphous ice (or "glassy 
water").  The three main rings that you see from ice Ic coincide almost 
exactly with the centroids of the three main diffuse rings of glassy 
water, and as the ice Ic crystals get smaller, the rings get fatter 
(Scherrer broadening).  You can even measure the size of the 
crystallites by measuring the width of the rings.  At the limit of 1-2 
unit cells wide, the diffraction pattern of ice Ic powder looks almost 
exactly like that of glassy water, so I suppose one could say that there 
is a continuum of phases between the two.
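
As a rough worked example of that last point (all numbers invented; K is the
usual Scherrer shape factor of about 0.9, and beta must be the
instrument-corrected ring FWHM in 2-theta):

# Sketch: crystallite size from ice-ring width via the Scherrer equation,
#   t = K * lambda / (beta * cos(theta)),
# where beta is the ring FWHM in 2-theta (radians).  All numbers are invented.
import math

wavelength = 1.0        # Angstrom (placeholder beam wavelength)
d_ring = 3.67           # Angstrom, one of the ice-ring d-spacings
fwhm_2theta_deg = 0.5   # apparent ring width, degrees 2-theta (placeholder)
K = 0.9                 # Scherrer shape factor

theta = math.asin(wavelength / (2.0 * d_ring))    # Bragg's law
beta = math.radians(fwhm_2theta_deg)
size = K * wavelength / (beta * math.cos(theta))
print("approximate crystallite size: %.0f Angstrom" % size)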


And yes, there are crystals that "like" a certain mixture of cubic ice 
and amorphous water in their solvent channels.  Others don't like it at 
all.  But I agree with JS below that the problem here is not the ice 
rings.  Probably overlaps?  Best to look only at spots inside the 3.8A 
circle until you figure out what is going on.


-James Holton
MAD Scientist

On 10/13/2011 11:20 PM, James Stroud wrote:

First of all, are you sure those are ice rings? They do not look typical. I 
think you might have salt crystals from dehydration *before* freezing. 
Otherwise, I think your freezing went well. Maybe try a humidity controlled 
environment when you freeze.

Second, I'm not so sure the bad stats come from the contaminating rings. The 
lattice seems to have some sort of problem, like a split lattice. You might be 
able to tackle this problem by increasing your spot size or skewing its shape 
to compensate for the split. You need to investigate several images throughout 
the run to see whether and how to manipulate your spot size. Sometimes, the 
split lengthens the spots in the direction of the phi axis and you get lucky. 
But I think the phi axis might be horizontal in this picture, which makes 
things a little trickier. From one image, it is difficult to tell the pathology 
of this crystal.

In principle, if you can accurately measure the most high-resolution spots 
visible (which appear to be about 1.9 Å, guessing from your log file) then you 
will have a pretty good data set, even with the contaminating rings.

Personally, I'd use Denzo for this data, but I don't know what is in vogue with 
the community right now. I still use O, so my tastes might be somewhat 
antiquated.

James



On Oct 13, 2011, at 11:12 PM, ChenTiantian wrote:


Hi there,
I am processing a dataset which has bad ice rings (as you can see in the attached 
png file).
I tried both XDS and imosflm and got similar results; it seems that adding 
"EXCLUDE_RESOLUTION_RANGE" cannot get rid of the effects of the ice rings.
The following is part of the CORRECT.LP, which is the second attached file; you 
can find more details there.

   SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION

 RESOLUTION  NUMBER OF REFLECTIONS      COMPLETENESS  R-FACTOR  R-FACTOR  COMPARED  I/SIGMA   R-meas   Rmrgd-F  Anomal  SigAno    Nano
 LIMIT       OBSERVED  UNIQUE  POSSIBLE  OF DATA      observed  expected                                        Corr
   4.24        37152    5537     5545     99.9%        46.9%     52.7%     37150     2.48     50.8%    19.4%    -28%   0.513     5136
   3.01        55344    9002     9840     91.5%        62.7%     65.1%     55116     1.76     68.3%    48.1%    -28%   0.520     7760
   2.46        84636   12699    12703    100.0%        67.4%     84.7%     84634     1.55     73.0%    54.2%    -19%   0.513    12104
   2.13        97910   14743    14987     98.4%       254.5%    199.3%     97908     0.16    276.2%  4899.9%    -23%   0.473    14037
   1.90       110260   16846    16940     99.4%       299.2%    303.3%    110245     0.06    325.0%   -99.9%    -17%   0.422    15995
   1.74       118354   18629    18744     99.4%      1062.0%   1043.6%    118317    -0.20   1156.4%   -99.9%    -13%   0.380    17414
   1.61       122958   20193    20331     99.3%       967.5%   1571.1%    122868     0.10   1059.7%   987.3%     -2%   0.402    18348
   1.51       125075   21554    21794     98.9%       838.9%   1355.1%    124933     0.08    922.6%  1116.9%     -1%   0.402    18977
   1.42        72057   17042    23233     73.4%       640.8%    775.3%     70391     0.08    732.5%   826.7%     -8%   0.425    10003
  total       823746  136245   144117     94.5%       166.4%    166.7%    821562     0.40    181.1%   296.7%    -15%   0.435   119774

Note that the I/SIGMA of each resolution shell is < 2.5, so what should I do to 
process the dataset properly? Any suggestions about these super ice rings?
Thanks!

Tiantian

--
Shanghai Institute of Materia Medica, Chinese Academy of Sciences
Address: Room 101, 646 Songtao Road, Zhangjiang Hi-Tech Park,
Shanghai, 201203



[ccp4bb] Quips about "stunning" software and the first structure it helped solve

2011-10-14 Thread Gerard DVD Kleywegt

Hi all,

The Protein Data Bank in Europe (PDBe; pdbe.org) regularly produces Quips, 
short stories about QUite Interesting Pdb Structures (pdbe.org/quips). Quips 
address biologically interesting aspects of one or more PDB entries, coupled 
with interactive graphics views and often a mini-tutorial or suggestions for 
further exploration using PDBe services and resources.


Today another Quips episode was released. It looks back at the first crystal 
structure that was solved with the program Phaser and also tries to explain in 
(almost) layman's terms how Molecular Replacement works. The accompanying 
mini-tutorial shows you how to do multiple structure superimposition using 
PDBeFold (SSM).


The Quips story can be found here: http://pdbe.org/quips?story=Phaser

There is also an RSS feed that informs you whenever there is a new Quips 
article available. For links to this and several other feeds, see 
http://pdbe.org/rss


---

If you have an interesting structure whose story you would like to tell (with 
our help) in the form of a Quips episode, please contact us at p...@ebi.ac.uk


--Gerard

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
ger...@ebi.ac.uk . pdbe.org
Secretary: Pauline Haslam  pdbe_ad...@ebi.ac.uk


[ccp4bb] First contours of a vision for the future of validation at the PDB

2011-10-14 Thread Gerard DVD Kleywegt

(Posted on behalf of wwPDB)

The Worldwide Protein Data Bank (wwPDB; wwpdb.org) is pleased to direct PDB 
depositors and users to the recommendations of the wwPDB X-ray Validation Task 
Force (VTF) that were published in the journal Structure this week (2011, vol. 
19: 1395-1412; http://www.cell.com/structure/abstract/S0969-2126(11)00285-1).


The wwPDB X-ray VTF was convened in 2008 to collect expert recommendations and 
develop consensus on validation methods that should be applied to crystal 
structures (models and data) in the PDB and to identify software applications 
to perform these validation tasks. These recommendations are the basis of a 
new validation suite that will be part of the new Common Tool for Deposition 
and Annotation (D&A) that is currently being developed by the wwPDB partners. 
The D&A tool and the X-ray validation pipeline will go into production by the 
end of 2012 at all wwPDB deposition sites (RCSB PDB, PDBe, PDBj and BMRB). 
From that moment in time on, depositors of X-ray crystal structures at the PDB 
will be provided with a detailed validation report. Such reports can be 
submitted to journals to accompany manuscripts describing new structures, and 
several publishers are working towards making such reports mandatory. Once the 
D&A tool is in production, the wwPDB partners also plan to provide the 
validation pipeline as a server, allowing crystallographers to assess their 
models before deposition and publication. Additional VTFs have been convened 
for NMR (by wwPDB) and 3DEM (by EMDataBank).


The wwPDB greatly appreciates the efforts of the authors of the X-ray VTF 
report: Randy J. Read, Paul D. Adams, W. Bryan Arendall III, Axel T. Brunger, 
Paul Emsley, Robbie P. Joosten, Gerard J. Kleywegt, Eugene B. Krissinel, 
Thomas Lütteke, Zbyszek Otwinowski, Anastassis Perrakis, Jane S. Richardson, 
William H. Sheffler, Janet L. Smith, Ian J. Tickle, Gert Vriend and Peter H. 
Zwart.


--

--Gerard

---
Gerard J. Kleywegt, PDBe, EMBL-EBI, Hinxton, UK
ger...@ebi.ac.uk . pdbe.org
Secretary: Pauline Haslam  pdbe_ad...@ebi.ac.uk


[ccp4bb]

2011-10-14 Thread kavya
Respected Sir,

I am sorry, I wrote it wrongly; it's the resolution-
independent X-ray weight rather. I use ccp4i,
so the input for the weight is what I mentioned
previously -
"Refinement parameters - weighting term (when
auto weighting is turned off)" in refmac.

Thanking you
With Regards
M. Kavyashree
-Ian Tickle  wrote: -

To: ka...@rishi.serc.iisc.ernet.in
From: Ian Tickle 
Date: 10/14/2011 04:34PM
Cc: CCP4 bulletin board 
Subject: Re:

> Yes, the weight mentioned in the paper was
> the weight matrix, but the one I used was the
> option under "Refinement parameters - weighting
> term (when auto weighting is turned off)".
> But if I really want to change the weight matrix,
> where should I change it (in the code?)?

No, the weights referred to in the paper are definitely the ones given
as "WEIGHT AUTO x".  I can say that with confidence because I never
use "WEIGHT MATRIX x" for reasons I explained.  I think what I said is
that it's _related_ to the matrix weight, which it obviously is by
some constant but unknown factor.

You don't have to change any code (that has been done for you!), but
if you're using CCP4I (sorry I don't so I can't give you precise
instructions), you probably have to edit the script before submission
(there should be a button in the "Run" menu for that).

Cheers

-- Ian

> No, I didn't mean a big difference, not in the
> coordinates, but in the values of R-factors and
> other terms. I thought it was quite different.
> So you mean that it is not of much concern?

I would say that it's not a big difference, just tightening up of the
geometry, as I said.

Cheers

-- Ian




[ccp4bb]

2011-10-14 Thread Ian Tickle
> Yes, the weight mentioned in the paper was
> the weight matrix, but the one I used was the
> option under "Refinement parameters - weighting
> term (when auto weighting is turned off)".
> But if I really want to change the weight matrix,
> where should I change it (in the code?)?

No, the weights referred to in the paper are definitely the ones given
as "WEIGHT AUTO x".  I can say that with confidence because I never
use "WEIGHT MATRIX x" for reasons I explained.  I think what I said is
that it's _related_ to the matrix weight, which it obviously is by
some constant but unknown factor.

You don't have to change any code (that has been done for you!), but
if you're using CCP4I (sorry I don't so I can't give you precise
instructions), you probably have to edit the script before submission
(there should be a button in the "Run" menu for that).

Cheers

-- Ian

> No, I didn't mean a big difference, not in the
> coordinates, but in the values of R-factors and
> other terms. I thought it was quite different.
> So you mean that it is not of much concern?

I would say that it's not a big difference, just tightening up of the
geometry, as I said.

Cheers

-- Ian


[ccp4bb]

2011-10-14 Thread kavya
Respected Sir,

Yes, the weight mentioned in the paper was
the weight matrix, but the one I used was the
option under "Refinement parameters - weighting
term (when auto weighting is turned off)".
But if I really want to change the weight matrix,
where should I change it (in the code?)?

No, I didn't mean a big difference, not in the
coordinates, but in the values of R-factors and
other terms. I thought it was quite different.
So you mean that it is not of much concern?


Thanking you
With Regards
M. Kavyashree

-Ian Tickle  wrote: -

To: ka...@rishi.serc.iisc.ernet.in
From: Ian Tickle 
Date: 10/14/2011 04:00PM
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] Optimisation of weights

Hi, your X-ray weight of 0.08 seems very small; the optimal value is
normally in the range 1 to 4 (I usually set it initially at the
median, i.e. 2.5).  But which weight keyword did you use "WEIGHT
MATRIX .08" or "WEIGHT AUTO .08" (the latter is I think undocumented,
so I'm guessing the first)?  Anyway I would strongly advise the
latter: the difference is that the MATRIX weight is on a completely
arbitrary scale, whereas the AUTO weight is at least relative to the
theoretical value of 1 (even though the optimal value may not be 1 in
practice, at least your initial guess will be in the same ball park).
Note that what Refmac calls "automatic weighting" is not the same as
what X-PLOR, CNS & phenix call "automatic weighting" (at least that's
my understanding).  "WEIGHT AUTO" in Refmac is the same as "WEIGHT
AUTO 10", whereas auto-weighting in X-PLOR corresponds to "WEIGHT AUTO
1" in Refmac.  Not surprisingly these give quite different results!

The optimal B factor weight is also around 1, see the paper for
typical values.

I'm still not clear precisely what you meant by "there was quite a
difference".  I don't see that big a difference between the 2 runs,
just a slight tightening up of the geometry.  Are you saying you see
big differences in the refined co-ordinates?  That would be a cause
for concern.

Cheers

-- Ian




Re: [ccp4bb] "Insufficient virtual memory"

2011-10-14 Thread Ian Tickle
> Try the GNU (compiler) and see what it says. ;)

Hi Francois - I won't bore you with the long list of compiler errors
that gfortran gives with my code (ifort compiles the identical code
without error and up until now it has worked just fine on both 32 & 64
bit machines as long as I don't try to allocate > 2Gb).

I think we'll have to splash out on an Intel license for a 64-bit
machine (thanks for the low-down on the bugs, Harry).

Anyway thanks to all for the suggestions.

Cheers

-- Ian


Re: [ccp4bb] Optimisation of weights

2011-10-14 Thread Ian Tickle
Hi, your X-ray weight of 0.08 seems very small; the optimal value is
normally in the range 1 to 4 (I usually set it initially at the
median, i.e. 2.5).  But which weight keyword did you use "WEIGHT
MATRIX .08" or "WEIGHT AUTO .08" (the latter is I think undocumented,
so I'm guessing the first)?  Anyway I would strongly advise the
latter: the difference is that the MATRIX weight is on a completely
arbitrary scale, whereas the AUTO weight is at least relative to the
theoretical value of 1 (even though the optimal value may not be 1 in
practice, at least your initial guess will be in the same ball park).
Note that what Refmac calls "automatic weighting" is not the same as
what X-PLOR, CNS & phenix call "automatic weighting" (at least that's
my understanding).  "WEIGHT AUTO" in Refmac is the same as "WEIGHT
AUTO 10", whereas auto-weighting in X-PLOR corresponds to "WEIGHT AUTO
1" in Refmac.  Not surprisingly these give quite different results!

The optimal B factor weight is also around 1, see the paper for typical values.

I'm still not clear precisely what you meant by "there was quite a
difference".  I don't see that big a difference between the 2 runs,
just a slight tightening up of the geometry.  Are you saying you see
big differences in the refined co-ordinates?  That would be a cause
for concern.

Cheers

-- Ian
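
A minimal scripted version of this kind of weight scan, purely for
illustration (it is not the procedure from the paper): the sketch below
assumes refmac5 is on the path, that the input files are called in.mtz and
in.pdb, and that the free R factor can be pulled from the log with the
pattern shown; the trial grid and the exact log wording are assumptions that
may need adapting to your own files and Refmac version.

#!/usr/bin/env python
# Sketch: run refmac5 with a few trial "WEIGHT AUTO x" values and report the
# one giving the lowest free R.  File names, the weight grid and the
# log-parsing pattern are assumptions for illustration only.
import re
import subprocess

def run_refmac(weight):
    keywords = "NCYC 10\nWEIGHT AUTO %g\nEND\n" % weight
    cmd = ["refmac5", "HKLIN", "in.mtz", "HKLOUT", "out_w%g.mtz" % weight,
           "XYZIN", "in.pdb", "XYZOUT", "out_w%g.pdb" % weight]
    log = subprocess.run(cmd, input=keywords, capture_output=True,
                         text=True, check=True).stdout
    rfree = None
    for m in re.finditer(r"Free R factor\s*=\s*([0-9.]+)", log):
        rfree = float(m.group(1))          # keep the last (final-cycle) value
    return rfree

if __name__ == "__main__":
    trial_weights = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]   # grid spanning the 1-4 range mentioned above
    results = {w: run_refmac(w) for w in trial_weights}
    results = {w: r for w, r in results.items() if r is not None}
    for w in sorted(results):
        print("WEIGHT AUTO %-4g  Rfree = %.4f" % (w, results[w]))
    best = min(results, key=results.get)
    print("Lowest Rfree at WEIGHT AUTO %g" % best)

The same loop can be pointed at -LLfree instead of the free R by matching
whatever your Refmac log prints for the free-set likelihood, and a second
pass with a finer grid around the best value is usually sufficient.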

On Fri, Oct 14, 2011 at 11:19 AM,   wrote:
> Respected Sir,
>
> For one of the structures that I did optimisation
> had values - (resolution of the data - 2.35Ang)
>
> Before optimization- (Bfactor weight=1.0, X-ray Weight - auto)
> R factor  0.2362
> R free    0.2924
> -LLfree   7521.8
> rmsBOND   0.0160
> zBOND     0.660
>
> After optimisation- (B-factor weight=0.2, X-ray Weight - 0.08)
> R factor  0.2327
> R free    0.2882
> -LLfree   7495.7
> rmsBOND   0.0111
> zBOND     0.460
>
> Also, can you tell me over what range the B-factor weight can be varied.
>
> Thanking you
> With Regards
> M. Kavyashree
>
>
> Sorry, I just re-read your last email and realised I didn't read it
> properly the first time.  But what I said still stands: you can of
> course try to optimise the weights at an early stage (before adding
> waters say), there's no harm doing that, but there's also not much
> point since you'll have to do it all again with the complete model,
> since adding a lot of waters will undoubtedly change the optimal
> weights.  So I just leave the weight optimisation until the model is
> complete.  As long as the initial weights are "in the same ball park",
> so that your RMSZ(bonds) is around 0.5 for typical resolutions (a bit
> lower for low resolution, a bit higher for very high resolution) it
> won't affect interpretation of maps etc.
>
> Cheers
>
> -- Ian
>
> On Fri, Oct 14, 2011 at 9:37 AM, Ian Tickle  wrote:
>> It must be the same complete model that you refined previously, I
>> doubt that it will give the correct answer if you leave out the waters
>> for example.
>>
>> You say "there was quite a difference".  Could you be more specific:
>> what were the values of the weights, R factors and RMSZ(bonds/angles)
>> before and after weight optimisation?
>>
>> Cheers
>>
>> -- Ian
>>
>
>
>
>


Re: [ccp4bb] Optimisation of weights

2011-10-14 Thread kavya
Respected Sir,

For one of the structures that I did optimisation
had values - (resolution of the data - 2.35Ang)

Before optimization- (Bfactor weight=1.0, X-ray Weight - auto)
R factor  0.2362
R free    0.2924
-LLfree   7521.8
rmsBOND   0.0160
zBOND     0.660

After optimisation- (B-factor weight=0.2, X-ray Weight - 0.08)
R factor  0.2327
R free    0.2882
-LLfree   7495.7
rmsBOND   0.0111
zBOND     0.460

Also, can you tell me over what range the B-factor weight can be varied.

Thanking you
With Regards
M. Kavyashree


Sorry, I just re-read your last email and realised I didn't read it
properly the first time.  But what I said still stands: you can of
course try to optimise the weights at an early stage (before adding
waters say), there's no harm doing that, but there's also not much
point since you'll have to do it all again with the complete model,
since adding a lot of waters will undoubtedly change the optimal
weights.  So I just leave the weight optimisation until the model is
complete.  As long as the initial weights are "in the same ball park",
so that your RMSZ(bonds) is around 0.5 for typical resolutions (a bit
lower for low resolution, a bit higher for very high resolution) it
won't affect interpretation of maps etc.

Cheers

-- Ian

On Fri, Oct 14, 2011 at 9:37 AM, Ian Tickle  wrote:
> It must be the same complete model that you refined previously, I
> doubt that it will give the correct answer if you leave out the waters
> for example.
>
> You say "there was quite a difference".  Could you be more specific:
> what were the values of the weights, R factors and RMSZ(bonds/angles)
> before and after weight optimisation?
>
> Cheers
>
> -- Ian
>




Re: [ccp4bb] "Insufficient virtual memory"

2011-10-14 Thread Francois Berenger

On 10/14/2011 06:31 PM, Ian Tickle wrote:

Hello all, some Fortran developer out there must know the answer to
this one.  I'm getting a "forrtl: severe (41): insufficient virtual
memory" error when allocating dynamic memory from a F95 program
compiled with Intel Fortran v11.1.059.  The program was compiled on an
old ia-32 Linux box with 1Gb RAM + 2Gb swap (I only have one Intel
license to compile on this machine), but I'm running it on a brand new
x86-64 box with 12Gb RAM + 8Gb swap.  This should be ample: the
program's maximum total memory requirement (code + static data +
dynamic data) should be no more than 3Gb.

My question is: what do I have to do to make it work?  According to
the ifort man page I need to specify "-mcmodel=medium -shared-intel".

It says: "If your program has COMMON blocks and local data with a
total size smaller than 2GB -mcmodel=small is sufficient.  COMMONs
larger than 2GB require -mcmodel=medium or -mcmodel=large.  Allocation
of memory larger than 2GB can be done with any setting of -mcmodel."

I'm a bit confused about the difference here between COMMONS > 2Gb
(which I don't have) and "allocation of memory" > 2Gb (which I assume
I do).

When I try setting -mcmodel=medium (and -shared-intel) I get "ifort:
command line warning #10148: option '-mcmodel' not supported".  Is
this telling me that I have to compile on the 64-bit machine?
Whatever happened to cross-compilation?

All suggestions greatly appreciated!


Try the GNU (compiler) and see what it says. ;)


[ccp4bb] "Insufficient virtual memory"

2011-10-14 Thread Ian Tickle
Hello all, some Fortran developer out there must know the answer to
this one.  I'm getting a "forrtl: severe (41): insufficient virtual
memory" error when allocating dynamic memory from a F95 program
compiled with Intel Fortran v11.1.059.  The program was compiled on an
old ia-32 Linux box with 1Gb RAM + 2Gb swap (I only have one Intel
license to compile on this machine), but I'm running it on a brand new
x86-64 box with 12Gb RAM + 8Gb swap.  This should be ample: the
program's maximum total memory requirement (code + static data +
dynamic data) should be no more than 3Gb.

My question is: what do I have to do to make it work?  According to
the ifort man page I need to specify "-mcmodel=medium -shared-intel".

It says: "If your program has COMMON blocks and local data with a
total size smaller than 2GB -mcmodel=small is sufficient.  COMMONs
larger than 2GB require -mcmodel=medium or -mcmodel=large.  Allocation
of memory larger than 2GB can be done with any setting of -mcmodel."

I'm a bit confused about the difference here between COMMONS > 2Gb
(which I don't have) and "allocation of memory" > 2Gb (which I assume
I do).

When I try setting -mcmodel=medium (and -shared-intel) I get "ifort:
command line warning #10148: option '-mcmodel' not supported".  Is
this telling me that I have to compile on the 64-bit machine?
Whatever happened to cross-compilation?

All suggestions greatly appreciated!

-- Ian


[ccp4bb]

2011-10-14 Thread Ian Tickle
Sorry, I just re-read your last email and realised I didn't read it
properly the first time.  But what I said still stands: you can of
course try to optimise the weights at an early stage (before adding
waters say), there's no harm doing that, but there's also not much
point since you'll have to do it all again with the complete model,
since adding a lot of waters will undoubtedly change the optimal
weights.  So I just leave the weight optimisation until the model is
complete.  As long as the initial weights are "in the same ball park",
so that your RMSZ(bonds) is around 0.5 for typical resolutions (a bit
lower for low resolution, a bit higher for very high resolution) it
won't affect interpretation of maps etc.

Cheers

-- Ian

On Fri, Oct 14, 2011 at 9:37 AM, Ian Tickle  wrote:
> It must be the same complete model that you refined previously, I
> doubt that it will give the correct answer if you leave out the waters
> for example.
>
> You say "there was quite a difference".  Could you be more specific:
> what were the values of the weights, R factors and RMSZ(bonds/angles)
> before and after weight optimisation?
>
> Cheers
>
> -- Ian
>
> On Fri, Oct 14, 2011 at 8:42 AM,   wrote:
>> Respected Sir,
>>
>> Thank you for your clarification. I had adopted this
>> method recently. My doubt was if we have to optimize
>> these two parameters during refinement, should we have
>> the whole model along with water and ligands or only
>> protein with few water positioning is enough. The reason
>> why I am asking because there was quite a difference
>> when I refined the same structure without the optimization
>> and with optimization of these two parameters.
>>
>> Thanking you
>> With regards
>> M. Kavyashree
>>
>> -CCP4 bulletin board  wrote: -
>>
>>    To: CCP4BB@JISCMAIL.AC.UK
>>    From: Ian Tickle
>>    Sent by: CCP4 bulletin board
>>    Date: 10/14/2011 12:34PM
>>    Subject: Re: [ccp4bb] Optimisation of weights
>>
>>    Hi Kavya
>>
>>    The resolutions of the structures mentioned in the paper were only
>>    examples, the Rfree/-LLfree minimisation method (which are actually
>>    due to Axel Brunger & Gerard Bricogne respectively) does not depend on
>>    resolution.
>>
>>    If the structures are already solved & refined, you don't need to do
>>    any model building, it should be within the radius of convergence with
>>    the new weights - it's only a small adjustment after all.
>>
>>    Cheers
>>
>>    -- Ian
>>
>>    On Fri, Oct 14, 2011 at 6:12 AM,   wrote:
>>    > Dear users,
>>    >
>>    > Can the optimization of the X-ray weighting factor
>>    > and B-factor (overall wt) as mentioned in the paper
>>    > Acta Cryst. (2007). D63, 1274–1281 by Dr. Ian Tickle,
>>    > be used for the refinement of the data sets beyond
>>    > the resolution range mentioned in the paper: 1.33 -
>>    > 2.55 Ang?
>>    >
>>    > Also the structures that were used to optimize these
>>    > parameters were already solved and refined, so when
>>    > we are solving a new structure to what extent does the
>>    > model has to be built before starting the optimization?
>>    >
>>    > Thanking you
>>    > With Regards
>>    > M. Kavyashree
>>    >
>>    >
>>    >
>>
>>
>>
>>
>>
>


[ccp4bb]

2011-10-14 Thread Ian Tickle
It must be the same complete model that you refined previously, I
doubt that it will give the correct answer if you leave out the waters
for example.

You say "there was quite a difference".  Could you be more specific:
what were the values of the weights, R factors and RMSZ(bonds/angles)
before and after weight optimisation?

Cheers

-- Ian

On Fri, Oct 14, 2011 at 8:42 AM,   wrote:
> Respected Sir,
>
> Thank you for your clarification. I had adopted this
> method recently. My doubt was if we have to optimize
> these two parameters during refinement, should we have
> the whole model along with water and ligands or only
> protein with few water positioning is enough. The reason
> why I am asking because there was quite a difference
> when I refined the same structure without the optimization
> and with optimization of these two parameters.
>
> Thanking you
> With regards
> M. Kavyashree
>
> -CCP4 bulletin board  wrote: -
>
>    To: CCP4BB@JISCMAIL.AC.UK
>    From: Ian Tickle
>    Sent by: CCP4 bulletin board
>    Date: 10/14/2011 12:34PM
>    Subject: Re: [ccp4bb] Optimisation of weights
>
>    Hi Kavya
>
>    The resolutions of the structures mentioned in the paper were only
>    examples, the Rfree/-LLfree minimisation method (which are actually
>    due to Axel Brunger & Gerard Bricogne respectively) does not depend on
>    resolution.
>
>    If the structures are already solved & refined, you don't need to do
>    any model building, it should be within the radius of convergence with
>    the new weights - it's only a small adjustment after all.
>
>    Cheers
>
>    -- Ian
>
>    On Fri, Oct 14, 2011 at 6:12 AM,   wrote:
>    > Dear users,
>    >
>    > Can the optimization of the X-ray weighting factor
>    > and B-factor (overall wt) as mentioned in the paper
>    > Acta Cryst. (2007). D63, 1274–1281 by Dr. Ian Tickle,
>    > be used for the refinement of the data sets beyond
>    > the resolution range mentioned in the paper: 1.33 -
>    > 2.55 Ang?
>    >
>    > Also the structures that were used to optimize these
>    > parameters were already solved and refined, so when
>    > we are solving a new structure to what extent does the
>    > model has to be built before starting the optimization?
>    >
>    > Thanking you
>    > With Regards
>    > M. Kavyashree
>    >
>    >
>    >
>
>
>
>
>


Re: [ccp4bb] Akta Prime

2011-10-14 Thread Charles Allerston
Dear Michael,


An alternative company that services FPLC systems in the UK is LC Services
(http://www.lcservs.com/); they came highly recommended to us.


cheers

charlie


From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Michael 
Colaneri
Sent: 12 October 2011 19:29
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Akta Prime

Dear all,

We have an AktaPrime, and GE Lifesciences has stopped servicing these instruments 
because they are getting old.  Does anyone know of a third-party company that 
gives contracts to maintain these instruments?  Thank you.

Mike Colaneri


[ccp4bb] PhD position in Structural Immunology/Bacteriology at the Karolinska Institute in Stockholm, Sweden

2011-10-14 Thread Adnane Achour
Karolinska Institutet has a vacancy for a doctoral student with a doctoral grant 
in Structural Immunology/Bacteriology 
Department
Department of Medicine, Huddinge 
Description of research group and project
Center for Infectious Medicine (CIM), Department of Medicine, Huddinge is 
looking for a PhD student who will work with Associate Professor Adnane Achour 
and his colleagues in order to determine the three-dimensional structures of 
virulence-associated proteins derived from various pathogens. The Achour 
research group is a part of the Center for Infectious Medicine (CIM), 
Department of Medicine Huddinge, and is localized within the Department of 
Microbiology, Tumor and Cell biology (MTC) at Solna campus. The research 
projects of the Achour group are bridging immunology and structural biology. 
The PhD student candidate will actively participate in projects aiming to 
unravel the molecular basis of action of specific pathogen-associated proteins. 
Besides X-ray crystallography, techniques and approaches included within the 
proposed projects include molecular biology, biochemistry and biophysical 
approaches such as surface plasmon resonance. The overall aim of the projects 
is to assess the molecular basis for recognition of ligands by specific 
pathogen-associated proteins and their mode of action. 
Duties and responsibilities
The proposed projects represent a unique opportunity for a PhD student to 
acquire deep insights into immunology/microbiology, biochemistry and structural 
biology using a variety of techniques already well established within the 
Achour laboratory. The duties and responsibilities of the PhD candidate are to 
lead the proposed project(s) under the close supervision of his/her 
supervisors, and in close collaboration with his/her colleagues within the 
research group. By the end of the PhD study period, the candidate should have 
acquired a deep knowledge in molecular biology and biochemistry as well as a 
broad knowledge in immunology and structural biology. The PhD candidate should 
demonstrate excellent ability to design, perform, interpret, critically assess 
and contextualize generated data from experimental work. Furthermore, he/she 
should demonstrate aptitude to discuss and spread research results to both the 
national and international scientific community as well as to the 
non-scientific community. 
Eligibility/Qualifications
The eligible candidate must hold a Master's degree in immunology, biophysics, 
mathematics, biomedicine, bacteriology, cell biology or equivalent. Previous 
experience in molecular biology and/or biochemistry is desirable but not 
absolutely required. We are seeking a highly motivated, creative person with 
good technical skills. The successful applicant should be fluent in English, 
with excellent communication capacity combined with the ability to interact 
effectively and work productively in a team. Emphasis will be placed on 
personal suitability as well as genuine enthusiasm for the topic.

To be eligible, an applicant must 
1. hold a Master’s (second-cycle) degree 
2. have completed course requirements of at least 240 HECs, of which at least 
60 at second-cycle level, or 
3. have otherwise acquired essentially equivalent competence in Sweden or 
abroad. 

Those who meet the requirements for general eligibility before July 1st, 2007, 
i.e. had completed a programme of higher education for at least 180 HEC or the 
equivalent, will continue to do so until the end of July, 2015. 

Selection of qualified applicants is made on the grounds of their ability to 
benefit from doctoral (third-cycle) education. Karolinska Institutet applies 
the following assessment criteria: 
- Documented subject knowledge of significance to the research project
- Analytical skills confirmed by a scientific dissertation or the equivalent
- Other documented knowledge/experience of potential significance to doctoral 
(third-cycle) education in the subject 
An overall assessment of the applicants’ qualifications will be made. 

The vacancy will be filled on condition that the successful applicant is 
admitted to doctoral (third-cycle) education.

For further information on doctoral (third-cycle) education at Karolinska 
Institutet, see www.ki.se/doctoral.

Contact person: Assoc. Prof. Adnane Achour, phone: 08-5248 6216, 076-8090567, 
adnane.ach...@ki.se 

Union representatives: SACO: Ingrid Dahlman, ingrid.dahlman@ki., OFR: Eva 
Sjölin, eva.sjo...@ki.se 
Form of employment:
Doctoral student 
Notes
A doctoral student with a doctoral grant in years 1 and 2.
Temporary doctoral student in years 3 and 4.
Please send your application, marked with reference number 6187/2011, to reach 
us no later than November 6, 2011, via KI jobb or 
to Anna Maria Bernstein, M 54; Karolinska Institutet, 141 86 Stockholm, 
anna.maria.bernst...@ki.se. The following documents should be enclosed with 
your application in English or Swedish:
- Personal letter of introduction and CV
- Copy of 

Re: [ccp4bb] data processing problem with ice rings

2011-10-14 Thread vandana kukshal
Hello,
Can anyone send me a PDF of this paper? It is an old paper and not
accessible here.
   M.F. Perutz, Preparation of haemoglobin crystals. J. Cryst. Growth
2 (1968), pp. 54–56.


On Fri, Oct 14, 2011 at 10:42 AM, ChenTiantian
wrote:

> Hi there,
> I am processing a dataset which has bad ice rings (as you can see in the
> attached png file).
> I tried both XDS and imosflm, and got similar results; it seems that adding
> "EXCLUDE_RESOLUTION_RANGE" cannot get rid of the effects of the ice rings.
> the following is part of the CORRECT.LP which is the second attached file,
> you can find more details there.
>
>   SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
>  RESOLUTION   NUMBER OF REFLECTIONS     COMPLETENESS  R-FACTOR  R-FACTOR  COMPARED  I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno    Nano
>    LIMIT    OBSERVED  UNIQUE  POSSIBLE    OF DATA     observed  expected                                       Corr
>
>    4.24       37152    5537      5545       99.9%      46.9%     52.7%     37150     2.48     50.8%    19.4%    -28%   0.513    5136
>    3.01       55344    9002      9840       91.5%      62.7%     65.1%     55116     1.76     68.3%    48.1%    -28%   0.520    7760
>    2.46       84636   12699     12703      100.0%      67.4%     84.7%     84634     1.55     73.0%    54.2%    -19%   0.513   12104
>    2.13       97910   14743     14987       98.4%     254.5%    199.3%     97908     0.16    276.2%  4899.9%    -23%   0.473   14037
>    1.90      110260   16846     16940       99.4%     299.2%    303.3%    110245     0.06    325.0%   -99.9%    -17%   0.422   15995
>    1.74      118354   18629     18744       99.4%    1062.0%   1043.6%    118317    -0.20   1156.4%   -99.9%    -13%   0.380   17414
>    1.61      122958   20193     20331       99.3%     967.5%   1571.1%    122868     0.10   1059.7%   987.3%     -2%   0.402   18348
>    1.51      125075   21554     21794       98.9%     838.9%   1355.1%    124933     0.08    922.6%  1116.9%     -1%   0.402   18977
>    1.42       72057   17042     23233       73.4%     640.8%    775.3%     70391     0.08    732.5%   826.7%     -8%   0.425   10003
>   total      823746  136245    144117       94.5%     166.4%    166.7%    821562     0.40    181.1%   296.7%    -15%   0.435  119774
>
> Note that I/SIGMA of each resolution shell is <2.5, so what should I do to
> process the dataset properly? Any suggestions about these super ice rings?
> Thanks!
>
> Tiantian
>
> --
> Shanghai Institute of Materia Medica, Chinese Academy of Sciences
> Address: Room 101, 646 Songtao Road, Zhangjiang Hi-Tech Park,
> Shanghai, 201203
>



-- 
Vandana kukshal


[ccp4bb]

2011-10-14 Thread kavya
Respected Sir,

Thank you for your clarification. I had adopted this
method recently. My doubt was if we have to optimize
these two parameters during refinement, should we have
the whole model along with water and ligands or only
protein with few water positioning is enough. The reason
why I am asking because there was quite a difference
when I refined the same structure without the optimization
and with optimization of these two parameters.

Thanking you
With regards
M. Kavyashree

-CCP4 bulletin board  wrote: -

To: CCP4BB@JISCMAIL.AC.UK
From: Ian Tickle
Sent by: CCP4 bulletin board
Date: 10/14/2011 12:34PM
Subject: Re: [ccp4bb] Optimisation of weights

Hi Kavya

The resolutions of the structures mentioned in the paper were only
examples, the Rfree/-LLfree minimisation method (which are actually
due to Axel Brunger & Gerard Bricogne respectively) does not depend on
resolution.

If the structures are already solved & refined, you don't need to do
any model building, it should be within the radius of convergence with
the new weights - it's only a small adjustment after all.

Cheers

-- Ian

On Fri, Oct 14, 2011 at 6:12 AM,   wrote:
> Dear users,
>
> Can the optimization of the X-ray weighting factor
> and B-factor (overall wt) as mentioned in the paper
> Acta Cryst. (2007). D63, 1274–1281 by Dr. Ian Tickle,
> be used for the refinement of the data sets beyond
> the resolution range mentioned in the paper: 1.33 -
> 2.55 Ang?
>
> Also the structures that were used to optimize these
> parameters were already solved and refined, so when
> we are solving a new structure to what extent does the
> model has to be built before starting the optimization?
>
> Thanking you
> With Regards
> M. Kavyashree
>
>
>






Re: [ccp4bb] data processing problem with ice rings

2011-10-14 Thread Petri Kursula
Your main problem is not the ice rings but a wrong lattice/indexing solution. R
factors are very high even for the low-resolution shells and I/sigma is very low. To me
this suggests you are not finding your diffraction spots at all.

First thing to try: Take more images for the indexing step and use only the 
strongest spots. And do not refine distance during indexing, as you probably 
have a pretty high mosaicity. 

Petri
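
For the ice rings themselves, once the indexing is sorted out, the usual
approach is to mask the ice resolution shells, e.g. with the
EXCLUDE_RESOLUTION_RANGE keyword the original post already uses. Below is a
minimal sketch (Python) that prints such lines for XDS.INP; the d-spacings
are approximate textbook values for hexagonal ice and the 0.03 A half-width
of each window is an arbitrary assumption (widen it if your rings are broad).

# Sketch: generate EXCLUDE_RESOLUTION_RANGE lines for XDS.INP around the
# common hexagonal-ice rings.  The d-spacings are approximate and the
# window half-width is an assumed value, not an XDS default.
ICE_D_SPACINGS = [3.90, 3.67, 3.44, 2.67, 2.25, 2.07, 1.95, 1.92, 1.88, 1.72]
HALF_WIDTH = 0.03   # Angstrom

for d in ICE_D_SPACINGS:
    print("EXCLUDE_RESOLUTION_RANGE= %.2f %.2f   ! ice ring near %.2f A"
          % (d + HALF_WIDTH, d - HALF_WIDTH, d))

That said, excluding those shells will not rescue a dataset that is
mis-indexed in the first place; the lattice solution has to be right first.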

On Oct 14, 2011, at 7:12 AM, ChenTiantian wrote:

> Hi there,
> I am processing a dataset which has bad ice rings (as you can see in the 
> attached png file).
> I tried both XDS and imosflm, and got similar results; it seems that adding
> "EXCLUDE_RESOLUTION_RANGE" cannot get rid of the effects of the ice rings.
> the following is part of the CORRECT.LP which is the second attached file, 
> you can find more details there. 
> 
>   SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
>  RESOLUTION   NUMBER OF REFLECTIONS     COMPLETENESS  R-FACTOR  R-FACTOR  COMPARED  I/SIGMA   R-meas  Rmrgd-F  Anomal  SigAno    Nano
>    LIMIT    OBSERVED  UNIQUE  POSSIBLE    OF DATA     observed  expected                                       Corr
>
>    4.24       37152    5537      5545       99.9%      46.9%     52.7%     37150     2.48     50.8%    19.4%    -28%   0.513    5136
>    3.01       55344    9002      9840       91.5%      62.7%     65.1%     55116     1.76     68.3%    48.1%    -28%   0.520    7760
>    2.46       84636   12699     12703      100.0%      67.4%     84.7%     84634     1.55     73.0%    54.2%    -19%   0.513   12104
>    2.13       97910   14743     14987       98.4%     254.5%    199.3%     97908     0.16    276.2%  4899.9%    -23%   0.473   14037
>    1.90      110260   16846     16940       99.4%     299.2%    303.3%    110245     0.06    325.0%   -99.9%    -17%   0.422   15995
>    1.74      118354   18629     18744       99.4%    1062.0%   1043.6%    118317    -0.20   1156.4%   -99.9%    -13%   0.380   17414
>    1.61      122958   20193     20331       99.3%     967.5%   1571.1%    122868     0.10   1059.7%   987.3%     -2%   0.402   18348
>    1.51      125075   21554     21794       98.9%     838.9%   1355.1%    124933     0.08    922.6%  1116.9%     -1%   0.402   18977
>    1.42       72057   17042     23233       73.4%     640.8%    775.3%     70391     0.08    732.5%   826.7%     -8%   0.425   10003
>   total      823746  136245    144117       94.5%     166.4%    166.7%    821562     0.40    181.1%   296.7%    -15%   0.435  119774
> 
> Note that I/SIGMA of each resolution shell is <2.5, so what should I do to 
> process the dataset properly? Any suggestions about these super ice rings?
> Thanks!
> 
> Tiantian
> 
> -- 
> Shanghai Institute of Materia Medica, Chinese Academy of Sciences
> Address: Room 101, 646 Songtao Road, Zhangjiang Hi-Tech Park,
> Shanghai, 201203 
> 


---
Petri Kursula, PhD
Group Leader, Docent of Neurobiochemistry
Department of Biochemistry, University of Oulu, Finland
Department of Chemistry, University of Hamburg, Germany
Visiting Scientist (CSSB-HZI, DESY, Hamburg, Germany)
www.biochem.oulu.fi/kursula
www.desy.de/~petri
petri.kurs...@oulu.fi
petri.kurs...@desy.de
---



Re: [ccp4bb] Optimisation of weights

2011-10-14 Thread Ian Tickle
Hi Kavya

The resolutions of the structures mentioned in the paper were only
examples, the Rfree/-LLfree minimisation method (which are actually
due to Axel Brunger & Gerard Bricogne respectively) does not depend on
resolution.

If the structures are already solved & refined, you don't need to do
any model building, it should be within the radius of convergence with
the new weights - it's only a small adjustment after all.

Cheers

-- Ian
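
As a purely numerical illustration of that minimisation (not the procedure
from the paper, and using made-up numbers), once you have -LLfree from a
handful of trial refinements the minimum can be located by fitting a
quadratic in log(weight):

# Sketch: estimate the weight minimising -LLfree from a few trial points by
# fitting a parabola in log(weight).  The numbers below are placeholders,
# not real refinement output.
import numpy as np

weights = np.array([0.5, 1.0, 2.0, 4.0, 8.0])                  # trial WEIGHT AUTO values
llfree  = np.array([7530.1, 7512.4, 7505.9, 7509.8, 7521.3])   # -LLfree from each run

a, b, c = np.polyfit(np.log(weights), llfree, 2)   # -LLfree ~ a*x**2 + b*x + c, x = log(weight)
x_min = -b / (2.0 * a)                             # vertex of the parabola
print("estimated optimal weight: %.2f" % np.exp(x_min))

Simply taking the trial value with the lowest -LLfree (or free R) works just
as well in practice if the grid is reasonably fine.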

On Fri, Oct 14, 2011 at 6:12 AM,   wrote:
> Dear users,
>
> Can the optimization of the X-ray weighting factor
> and B-factor (overall wt) as mentioned in the paper
> Acta Cryst. (2007). D63, 1274–1281 by Dr. Ian Tickle,
> be used for the refinement of the data sets beyond
> the resolution range mentioned in the paper: 1.33 -
> 2.55 Ang?
>
> Also the structures that were used to optimize these
> parameters were already solved and refined, so when
> we are solving a new structure to what extent does the
> model has to be built before starting the optimization?
>
> Thanking you
> With Regards
> M. Kavyashree
>
>
>