Re: [ccp4bb] B-factor Space gr questions!

2007-06-05 Thread David Briggs

Hi Ibrahim,


On 04/06/07, Ibrahim M. Moustafa [EMAIL PROTECTED] wrote:

> Hi all,
>
> While reading a crystallographic paper describing the structure
> of an apo-protein and its complex I noticed that
> the authors described the space group as P6122 for the unit cell:
> a=141.9, b=143.9, c=380.4
> Could this be considered a typo, or am I missing something here?
> The requirement for hexagonal is a = b # c, right?



You are correct: for hexagonal, a=b, so it's got to be a typo. Most data
processing software wouldn't let you do this.
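
As a rough illustration of the a = b requirement, here is a minimal Python sketch of the kind of sanity check a processing program might apply. The function name, the default angles and the tolerances are assumptions made for this example (the paper's angles are not quoted in the post), and this is not taken from any particular package.

```python
import math

def is_consistent_with_hexagonal(a, b, c, alpha=90.0, beta=90.0, gamma=120.0,
                                 rel_tol=0.001, angle_tol=0.1):
    """Check the hexagonal metric constraints: a = b, alpha = beta = 90, gamma = 120.

    c is unconstrained, so it is accepted as given. The tolerances are
    hypothetical values chosen for illustration only.
    """
    lengths_ok = math.isclose(a, b, rel_tol=rel_tol)
    angles_ok = (abs(alpha - 90.0) <= angle_tol
                 and abs(beta - 90.0) <= angle_tol
                 and abs(gamma - 120.0) <= angle_tol)
    return lengths_ok and angles_ok

# Cell as quoted from the paper, then the same cell with b set equal to a.
print(is_consistent_with_hexagonal(141.9, 143.9, 380.4))
print(is_consistent_with_hexagonal(141.9, 141.9, 380.4))
```

A difference of 2 A between a and b is far outside any plausible rounding error, which is why a typo is the simplest explanation.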

> Another observation in that paper too: the B-factors for the 2.4 A
> and 3.2 A structures are 39 and 40?? Does this make sense to anyone?



They're quoting Wilson B-factors, I imagine. A small but rather important
difference - where was this published?
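
For reference, a Wilson B-factor is estimated from the fall-off of mean intensity with resolution, not from the refined atomic B-factors. The sketch below fits the usual Wilson relation ln(<I>/sum f^2) = ln K - 2B (sin(theta)/lambda)^2 by least squares. The function name and the synthetic shell data are assumptions for this example; in practice only the higher-resolution shells are normally fitted, and the scale convention for K varies between programs.

```python
import math

def wilson_b(shell_resolutions, intensity_ratios):
    """Least-squares fit of ln(ratio) = ln(K) - 2*B*s^2, with s = sin(theta)/lambda = 1/(2d).

    shell_resolutions: mean d-spacing of each resolution shell (Angstrom)
    intensity_ratios:  <I> / sum(f_j^2) for each shell
    Returns (B, K).
    """
    xs = [1.0 / (4.0 * d * d) for d in shell_resolutions]   # s^2 for each shell
    ys = [math.log(r) for r in intensity_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope / 2.0, math.exp(intercept)

# Synthetic shells generated with B = 40 and K = 2 (hypothetical values).
shells = [4.0, 3.5, 3.0, 2.7, 2.4]
ratios = [2.0 * math.exp(-40.0 / (2.0 * d * d)) for d in shells]
b_est, k_est = wilson_b(shells, ratios)
print(round(b_est, 3), round(k_est, 3))
```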


> The last question: In the same paper, for the complex structure R and
> Rfree are equal (30%). Is that an indication of improper refinement
> in these published structures? I'd love to hear your comments on that too.



Well, it certainly is a little suspicious looking. I've had similar
experiences to Ed regarding similar R & Rfrees from rigid body
refinement prior to positional refinement. Have the authors deposited the
structure factors? I would use EDS to check the maps out: eds.bmc.uu.se/eds/


> thanks,
>
> Ibrahim



HTH,  Dave



--
---
David Briggs, PhD.
Father & Crystallographer
www.dbriggs.talktalk.net
iChat AIM ID: DBassophile
---
Anyone who is capable of getting themselves made President should on no
account be allowed to do the job. - Douglas Adams


Re: [ccp4bb] B-factor Space gr questions!

2007-06-05 Thread Eleanor Dodson

Yes; a==b for P6i - probably a typo.

B factors at 3.2 A are hard to fix - they will depend on the scaling
convention to some extent.

Can you download the data and re-run refinement for your own satisfaction?

If R ==Rfree for the complex then I suspect they did not transfer the 
FreeR flags from the apo-protein data to the complex.

Again if the data is available you may be able to check this.
Eleanor






Re: [ccp4bb] B-factor Space gr questions!

2007-06-05 Thread Ibrahim M. Moustafa

Hi All,

   Thanks a lot for all the replies and valuable input. In my original 
post I meant that a = b does not equal c; I used # for "does not equal".


  Many asked where the paper is published. Actually, the paper is 
under revision. When reading it, I assumed the unit cell dimensions (or 
the space group) were a typo, as others thought.


  The low B value for the low resolution structure makes me 
suspicious that something is wrong. In my limited experience, and as 
others mentioned, the B-factor is expected to be around 70-80 for a 2.8 A 
structure and very likely higher for a 3.0 A structure. David Briggs 
suggested that they reported the Wilson B-factor; however, it is clearly 
reported as the B-factor of the refined structure. Also, 
Rwork = Rfree indicates that something is not right with the 
refinement protocol, but I was not sure what that could be. The 
suspicion that they did not transfer the FreeR flags sounds like a 
reasonable explanation.


   thanks a lot,
Ibrahim





--
Ibrahim M. Moustafa, Ph.D.
Biochemistry and Molecular Biology Dept.
201 Althouse Lab., University Park
Pennsylvania State University, PA16802

Tel.  (814)863-8703
Fax. (814)865-7927
--  


Re: [ccp4bb] B-factor Space gr questions!

2007-06-05 Thread Edward A Berry

You have a good point there and I would be interested in hearing
some other opinions, so I take the liberty of reposting-

My instinctive preference is that each structure should be
supported solely by the data that is deposited with it -
(one dataset one structure) but in terms of good science
we want to produce the best model we can, and that might be
the rigid-body-located structure from another dataset.
In particular the density for the ligand might be clearer
before overfitting with the low resolution data.

Even if the free-R set is not preserved for the new crystal,
R and R-free tend to diverge rapidly once any kind of
fitting with a low data/parameter ratio is performed, so I think
the new structure must not have been refined much beyond
rigid body (and over-all B which is included in any kind
of refinement).  And that choice may be well justified.
Ed
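
The data/parameter argument can be put in rough numbers. The sketch below estimates the number of unique reflections from the cell volume and resolution limit (reciprocal lattice points inside the resolution sphere, divided by the order of the Laue group) and compares individual-atom refinement with rigid-body refinement. The atom count and number of rigid bodies are hypothetical, the cell assumes b = a (that is, that b was the typo), and systematic absences and completeness are ignored.

```python
import math

def hexagonal_cell_volume(a, c):
    """Volume of a hexagonal cell (a = b, gamma = 120 degrees), in cubic Angstroms."""
    return a * a * c * math.sin(math.radians(120.0))

def approx_unique_reflections(volume, d_min, laue_order):
    """Lattice points inside the sphere of radius 1/d_min, divided by the Laue group order."""
    return (4.0 / 3.0) * math.pi * volume / d_min ** 3 / laue_order

def data_to_parameter_ratio(n_reflections, n_parameters):
    """Observations per refined parameter (restraints not counted)."""
    return n_reflections / n_parameters

volume = hexagonal_cell_volume(141.9, 380.4)          # assumes b = a
n_refl = approx_unique_reflections(volume, 3.2, 24)   # Laue group 6/mmm has order 24

n_atoms = 20000   # hypothetical size of the asymmetric unit
n_bodies = 4      # hypothetical number of rigid bodies

individual = data_to_parameter_ratio(n_refl, 4 * n_atoms)   # x, y, z, B per atom
rigid_body = data_to_parameter_ratio(n_refl, 6 * n_bodies)  # 3 rotations + 3 translations per body
print(round(n_refl), round(individual, 2), round(rigid_body, 1))
```

With so few parameters, rigid-body refinement has little freedom to fit noise, which is consistent with R and Rfree staying close together until atomic positions are released.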

cdekker wrote:

Hi,

Your reply to the ccp4bb has confused me a bit. I am currently refining 
a low res structure and realise that I don't know what to expect for 
final R and Rfree - it is definitely not what most people would publish. 
So the absolute values of R and Rfree are not telling me much, the only 
gauge I have is that as long as both R and Rfree are decreasing I am 
improving the model (and yes, at the moment that is only rigid body 
refinement).
In your email reply you suggest that even though a refinement to 
convergence that will lead to an increased Rfree (and lower R? - a 
classic case of overfitting!) would be a better model than the 
rigid-body-refined only model. This is what confuses me.
I can see your reasoning that starting with an atomic model to solve 
low-res data can lead to this behaviour, but then should the solution 
not be a modification of the starting model (maybe high B-factors?) to 
compensate for the difference in resolution of model and data?


Carien

On 4 Jun 2007, at 19:38, Edward A Berry wrote:


Ibrahim M. Moustafa wrote:
The last question: In the same paper, for the complex structure R and 
Rfree are equal (30%) is that an indication for improper refinement 
in these published structure? I'd love to hear your comments on that 
too.

Several times I solved low resolution structures using high resolution
models, and noticed that R-free increased during atomic positional
refinement.  This could be expected from the assertion that after
refinement to convergence, the final values should not depend on
the starting point: If I had started with a crude model and refined
against low resolution data, Rfree would not have gone as low as the
high-resolution model, so if I start with the high resolution model
and refine, Rfree should worsen to the same value as the structure
converges to the same point.

Thinking about the main purpose of the Rfree statistic, in a very
real way this tells me that the model was better before this step
of refinement, and it would be better to omit the minimization step.
Perhaps this is what the authors did.

   On the other hand it does not seem quite right to submit a model that
has simply been rigid-body-refined against the data- I would prefer to
refine to convergence and submit the best model that can be supported
by the data alone, rather than a better model which is really the model
from a better dataset repositioned in the new crystal.

Ed





Re: [ccp4bb] B-factor Space gr questions!

2007-06-05 Thread Phil Jeffrey
Wouldn't the desirability of this depend on the extent to which the 
molecule has moved between the high-resolution and low-resolution 
datasets?  I would have thought that there was an effective information 
transfer between R-work and R-free once the rigid body movements became 
too large, which might provide one with an over-optimistic idea of what 
the R-free would be with the high-resolution model with the 
low-resolution data.


Phil
Princeton NJ

Edward A Berry wrote:


Even if the free-R set is not preserved for the new crystal,
R and R-free tend to diverge rapidly once any kind of
fitting with a low data/param is performed, so I think
the new structure must not have been refined much beyond
rigid body (and over-all B which is included in any kind
of refinement).  And that choice may be well justified.
Ed


Re: [ccp4bb] B-factor Space gr questions!

2007-06-05 Thread Ethan Merritt
On Tuesday 05 June 2007 12:19, Edward A Berry wrote:
> You have a good point there and I would be interested in hearing
> some other opinions, so I take the liberty of reposting-
>
> My instinctive preference is that each structure should be
> supported solely by the data that is deposited with it -
> (one dataset one structure) but in terms of good science
> we want to produce the best model we can, and that might be
> the rigid-body-located structure from another dataset.

I don't think that is quite the right way to look at it.
In general we refine our model so that it both
 - agrees with the data
 - agrees with a priori knowledge

In maximum likelihood terms: we want to find the model that is
the most likely explanation for our observed data.
An inherently unlikely model is also an inherently unlikely
explanation. Therefore we focus on likely models.

We impose geometric restraints because we believe that we have
a better a priori expectation for bond lengths and angles than
can be determined de novo from the data in this one experiment.

Similarly we impose the known sequence of our protein on the
model, even if the maps are not sufficiently good to identify
each amino acid directly from the electron density.

If we have an a priori expectation for the conformation of
the whole protein, or large pieces of it, then we should 
account for this in the model, even if the data is not 
sufficiently good to reproduce this expectation de novo.

Therefore if you have a high-resolution structure available,
the best treatment of low-resolution data may well be to
place the known structure as a rigid body.  If you suspect
hinge motions or other large scale inter-domain shifts, you
might want to refine the hinge angle explicitly, but unfortunately
our usual refinement programs are not really set up for this.  


These are important issues, and are close to the heart
of the Maximum Likelihood approach to model refinement.


Ethan


[ccp4bb] B-factor Space gr questions!

2007-06-04 Thread Ibrahim M. Moustafa

Hi all,

   While reading a crystallographic paper describing the structure 
of an apo-protein and its complex I noticed that


  the authors described the space group as P6122 for the unit cell: 
a=141.9, b=143.9, c=380.4 !


 Could this be considered a typo, or am I missing something here? 
The requirement for hexagonal is a = b # c, right?


Another observation in that paper too: the B-factors for the 2.4 A 
and 3.2 A structures are 39 and 40?? Does this make sense to anyone??


The last question: In the same paper, for the complex structure R and 
Rfree are equal (30%). Is that an indication of improper refinement 
in these published structures? I'd love to hear your comments on that too.


  thanks,
 Ibrahim




--
Ibrahim M. Moustafa, Ph.D.
Biochemistry and Molecular Biology Dept.
201 Althouse Lab., University Park
Pennsylvania State University, PA16802

Tel.  (814)863-8703
Fax. (814)865-7927
--  


Re: [ccp4bb] B-factor Space gr questions!

2007-06-04 Thread Edward A Berry

Ibrahim M. Moustafa wrote:
The last question: In the same paper, for the complex structure R and 
Rfree are equal (30%). Is that an indication of improper refinement in 
these published structures? I'd love to hear your comments on that too.



Several times I solved low resolution structures using high resolution
models, and noticed that R-free increased during atomic positional
refinement.  This could be expected from the assertion that after
refinement to convergence, the final values should not depend on
the starting point: If I had started with a crude model and refined
against low resolution data, Rfree would not have gone as low as the
high-resolution model, so if I start with the high resolution model
and refine, Rfree should worsen to the same value as the structure
converges to the same point.

Thinking about the main purpose of the Rfree statistic, in a very
real way this tells me that the model was better before this step
of refinement, and it would be better to omit the minimization step.
Perhaps this is what the authors did.

   On the other hand it does not seem quite right to submit a model that
has simply been rigid-body-refined against the data- I would prefer to
refine to convergence and submit the best model that can be supported
by the data alone, rather than a better model which is really the model
from a better dataset repositioned in the new crystal.

Ed