Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree) [SEC=UNCLASSIFIED]

2010-10-28 Thread DUFF, Anthony
I reckon you could share hypothetical review comments for educational purposes.


-Original Message-
From: CCP4 bulletin board on behalf of Bernhard Rupp (Hofkristallrat a.D.)
Sent: Thu 10/28/2010 12:22 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)
 
Why not double open review? If I have something reasonable to say, I should
be able to sign it, particularly if the publicly purported point of review
is to make the manuscript better. And imagine what wonderful open hostility
we would enjoy instead of all these hidden grudges! You would never have to
preemptively condemn a paper on the mere suspicion that it is from someone
who might have reviewed you just as loathsomely earlier. You would actually
know that you are creaming the right bastard!

A more serious question for the editors amongst us: can I publish review
comments, or are they covered under some confidentiality rule? Some of these
gems are well worth public entertainment.

Best, BR 

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Jacob
Keller
Sent: Wednesday, October 27, 2010 6:02 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

What about the possibility of double-blind review? I have actually wondered
why the reviewers should be given the author info--does that determine the
quality of the work? Am I missing some obvious reason why reviewers should
know who the authors are?

JPK

On Wed, Oct 27, 2010 at 5:50 PM, Phoebe Rice pr...@uchicago.edu wrote:
 Journal editors need to know when the reviewer they trusted is completely
out to lunch. So please don't just silently knuckle under!
 It may make no difference for Nature, but my impression has been that
rigorous journals like JMB do care about review quality.
  Phoebe

 =
 Phoebe A. Rice
 Dept. of Biochemistry & Molecular Biology, The University of Chicago
 phone 773 834 1723
 http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
 http://www.rsc.org/shop/books/2008/9780854042722.asp


  Original message 
Date: Wed, 27 Oct 2010 15:13:03 -0700
From: CCP4 bulletin board CCP4BB@JISCMAIL.AC.UK (on behalf of 
Bernhard Rupp (Hofkristallrat a.D.) hofkristall...@gmail.com)
Subject: Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)
To: CCP4BB@JISCMAIL.AC.UK

> Surely the best model is the one that the referees for your paper are
> happy with?

That may be the sad and pragmatic wisdom, but it is certainly not a truth
we should accept...

> I have found referees to impose seemingly random and arbitrary standards

a) Reviewers are people belonging to a certain population, characterized by,
say, a property 'review quality' that follows a certain distribution.
Irrespective of the actual shape of that parent distribution, the central
limit theorem informs us that if you sample this distribution reasonably
often, the sampling distribution of the mean will be normal. That means that
half of the reviews will be below average review quality, and half above.

Unfortunately, the mean of that distribution is
b) a function of journal editor quality (they pick the reviewers after all) and
c) affected by systematic errors such as your reputation and the chance that
you yourself might sit on a reviewer's grant review panel.
By combining a, b, and c you can get a fairly good assessment of the joint
probability of what report you will receive. You will notice that model
quality is not a parameter in this model, because we can neglect marginal
second-order contributions.
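The central-limit-theorem part of (a) is easy to check numerically. A minimal
sketch with an entirely hypothetical 'review quality' distribution (the
exponential shape and sample sizes are my own assumptions, not anyone's data):

```python
import numpy as np

# Whatever the shape of the parent "review quality" distribution
# (here a strongly skewed exponential, purely hypothetical), the sampling
# distribution of the mean is approximately normal, so about half of the
# sampled means fall below the overall average.
rng = np.random.default_rng(1)
parent = rng.exponential(scale=1.0, size=(10000, 50))  # skewed parent
means = parent.mean(axis=1)   # 10000 samples of the mean of 50 "reviews"
frac_below = (means < means.mean()).mean()
print(round(frac_below, 2))   # close to 0.5
```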

> Mind you, discussions on this email list can be a useful resource for
> telling referees why you don't think you should comply with their rule of
> thumb.

I agree and sympathize with your optimism, but I am afraid that those who
might need this education are not the ones who seek it. That is, reading the
BB complicates matters (simplicity being one benefit of rules of thumb), and
you can't build an empire wasting time on such things.

Good luck with your reviews!

BR

Simon




Re: [ccp4bb] Against Method (R)

2010-10-28 Thread George M. Sheldrick
It is instructive to look at what happens for small molecules, where there
is often no solvent to worry about. They are often refined using SHELXL,
which does indeed print out the weighted R-value based on intensities (wR2),
the conventional unweighted R-value R1 (based on F), and sum(sigma(I))/sum(I),
which it calls R(sigma). For well-behaved crystals R1 is in the range 1-5%
and R(merge) (based on intensities) is in the range 3-9%. As you suggest,
0.5*R(sigma) could be regarded as the lower attainable limit for R1, and this
is indeed the case in practice (the factor 0.5 approximately converts from I
to F). Rpim gives similar results to R(sigma); both attempt to measure the
precision of the MERGED data, which are what one is refining against.
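The factor of 0.5 is just first-order error propagation from I to F = sqrt(I):
sigma(F) = sigma(I)/(2*sqrt(I)), so sigma(F)/F = 0.5*sigma(I)/I. A quick
numerical check with hypothetical values (I = 1000, sigma(I) = 50 are made up
for illustration):

```python
import numpy as np

# Simulate repeated measurements of one reflection and compare the relative
# error on the intensity I with the relative error on the amplitude F.
rng = np.random.default_rng(0)
I_true, sigma_I = 1000.0, 50.0
I_obs = rng.normal(I_true, sigma_I, 100000)
I_obs = I_obs[I_obs > 0]            # keep physically meaningful intensities
F_obs = np.sqrt(I_obs)

rel_I = sigma_I / I_true            # relative error on I: 0.05
rel_F = F_obs.std() / F_obs.mean()  # relative error on F: ~0.5 * rel_I
print(rel_I, round(rel_F, 3))
```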

George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582


On Wed, 27 Oct 2010, Ed Pozharski wrote:

 On Tue, 2010-10-26 at 21:16 +0100, Frank von Delft wrote:
  the errors in our measurements apparently have no 
  bearing whatsoever on the errors in our models 
 
 This would mean there is no point trying to get better crystals, right?
 Or am I also wrong to assume that the dataset with higher I/sigma in the
 highest resolution shell will give me a better model?
 
 On a related point - why is Rmerge considered to be the limiting value
 for R?  Isn't Rmerge itself a poorly defined measure that deteriorates
 at least in some circumstances (e.g. with increased redundancy)?
 Specifically, shouldn't the ideal R approximate 0.5*sigma(I)/I?
 
 Cheers,
 
 Ed.
 
 
 
 -- 
 I'd jump in myself, if I weren't so good at whistling.
Julian, King of Lemurs
 
 


Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

2010-10-28 Thread Eleanor Dodson

Oh cynic!
Eleanor



On 10/27/2010 09:01 PM, Simon Kolstoe wrote:

Surely the best model is the one that the referees for your paper are
happy with?

I have found referees to impose seemingly random and arbitrary standards
that sometimes require a lot of effort to comply with but result in little
to no impact on the biology being described. Mind you, discussions on this
email list can be a useful resource for telling referees why you don't think
you should comply with their rule of thumb.

Simon



On 27 Oct 2010, at 20:11, Bernhard Rupp (Hofkristallrat a.D.) wrote:


Dear Young and Impressionable readers:

I second-guess here that Robbie's intent - after re-refining many many PDB
structures, seeing dreadful things, and becoming a hardened cynic - is to
provoke more discussion in order to put in perspective - if not debunk -
almost all of these rules.

So it may be better to pretend you have never heard of these rules. Your
crystallographic life might be a happier and less biased one.

Just follow this simple procedure (not a rule):

The model that fits the primary evidence (minimally biased electron density)
best and is at the same time physically meaningful is the best model, i.e.,
all plausibly accountable electron density (and not more) is modeled.

This process of course does require a little work (like looking through all
of the model, not just the interesting parts, and thinking about what makes
sense) but may lead to additional and unexpected insights. And in almost all
cases, you will get a model with plausible statistics, without any reliance
on rules.

For some decisions regarding global parameterizations you have to apply more
sophisticated tests such as Ethan pointed out (HR tests) or Ian uses
(LL tests). And once you know how to do that, you do not need any rules of
thumb anyhow.

So I opt for a formal burial of these rules of thumb and a toast to evidence
and plausibility.

And, as Gerard B said in other words so nicely:

Si tacuisses, philosophus mansisses. (Had you kept silent, you would have
remained a philosopher.)

BR

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of
Robbie
Joosten
Sent: Tuesday, October 26, 2010 10:29 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

Dear Anthony,

That is an excellent question! I believe there are quite a lot of
'rules of
thumb' going around. Some of them seem to lead to very dogmatic
thinking and
have caused (refereeing) trouble for good structures and lack of
trouble for
bad structures. A lot of them were discussed at the CCP4BB so it may
be nice
to try to list them all.


Rule 1: If Rwork < 20%, you are done.
Rule 2: If Rfree - Rwork > 5%, your structure is wrong.
Rule 3: At resolution X, the bond-length rmsd should be less than Y (what is
the rmsd thing people keep talking about?)
Rule 4: If your resolution is lower than X, you should not use anisotropic
Bs / riding hydrogens.
Rule 5: You should not build waters/alternates at resolutions lower than X.
Rule 6: You should do the final refinement with ALL reflections.
Rule 7: No one cares about getting the carbohydrates right.


Obviously, this list is not complete. I may also have overstated some of the
rules to get the discussion going. Any additions are welcome.

Cheers,
Robbie Joosten
Netherlands Cancer Institute


Apologies if I have missed a recent relevant thread, but are there lists of
rules of thumb for model building and refinement?





Anthony



Anthony Duff Telephone: 02 9717 3493 Mob: 043 189 1076


=


Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

2010-10-28 Thread Vellieux Frederic
I do not know if that's really cynical: I've had the case of a referee
recommending manuscript rejection because the title of the manuscript was
not appropriate. The editor followed the advice of the referee. A proper
refereeing job would have been to suggest that the authors change the title
of the manuscript, not to suggest to the editor that the manuscript be
rejected!


So I think we can have different opinions on this. Sometimes referees do 
a good job in evaluating manuscripts, sometimes they do not.


Fred.

Eleanor Dodson wrote:

Oh cynic!
Eleanor





[ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Sebastiaan Werten

Dear all,

we have a his-tagged protein that shows a minor accompanying band in
SDS-PAGE, just above the main band. According to all other methods available
to us the material is homogeneous: the protein has the correct mass in
MALDI-TOF, epitopes are recognized, etc. etc.

I know that the additional band is a very common artifact with
his-tagged proteins, but I was wondering if anyone is aware of a paper
that formally describes the phenomenon, as we need to appease a couple
of rather bloody-minded referees.

Thanks very much for any suggestions, Seb.

--
Dr. Sebastiaan Werten
Institut für Biochemie
Universität Greifswald
Felix-Hausdorff-Str. 4
D-17489 Greifswald
Germany
Tel: +49 38 34 86 44 61
E-mail: sebastiaan.wer...@uni-greifswald.de


Re: [ccp4bb] Bug in c_truncate?

2010-10-28 Thread Tim Gruene
Dear Peter,

it seems to me that you are having trouble with f2mtz and not with ctruncate, so
I am confused by the subject.

Can you please post 
- the error message, 
- the first couple of lines of the hkl-file you are trying to import (including
  one or two reflections which are flagged for Rfree), 
- the version of ccp4 you are using 
- whether you are doing the conversion from the GUI or the command line - if the
  latter, please also post the script you are using.

Cheers, Tim

On Wed, Oct 27, 2010 at 09:14:36PM -0400, Peter Chan wrote:
 
 Hello,
 
 I've been struggling with F2MTZ and importing my hkl file into mtz while
 'keeping existing freeR data'. I keep getting the error "Problem with FREE
 column in input file. All flags apparently identical. Check input file."
 
 At the end of the day, it appears that this only happens with ctruncate and
 not with the old truncate. Has anyone experienced a similar problem?
 
 Peter
 
-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

phone: +49 (0)551 39 22149

GPG Key ID = A46BEE1A





Re: [ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Tim Gruene
Dear Sebastiaan,

isn't it the editor rather than the referees whom you have to convince? And if
the editor does not even understand how SDS-PAGE works and still considers
this a reason not to publish your article against your own expertise, maybe it
is worth changing the journal.

Finally, since referees know the names of the authors of the articles they are
refereeing, you may need a lot more than a couple of references to appease them
after you called them bloody-minded in a public email forum.

My two cents, Tim

On Thu, Oct 28, 2010 at 01:16:59PM +0200, Sebastiaan Werten wrote:
 Dear all,

 we have a his-tagged protein that shows a minor accompanying band in
 SDS-PAGE, just above the main band. According to all other methods
 available to us the material is homogeneous, the protein has the correct
 mass in MALDI-TOF, epitopes are recognized, etc. etc.

 I know that the additional band is a very common artifact with
 his-tagged proteins, but I was wondering if anyone is aware of a paper
 that formally describes the phenomenon, as we need to appease a couple
 of rather bloody-minded referees.

 Thanks very much for any suggestions, Seb.

 --
 Dr. Sebastiaan Werten
 Institut für Biochemie
 Universität Greifswald
 Felix-Hausdorff-Str. 4
 D-17489 Greifswald
 Germany
 Tel: +49 38 34 86 44 61
 E-mail: sebastiaan.wer...@uni-greifswald.de

-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

phone: +49 (0)551 39 22149

GPG Key ID = A46BEE1A





Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Jacob Keller
So I guess a consequence of what you say is that, since in cases where there
is no solvent the R values are often better than the precision of the actual
measurements (never true with macromolecular crystals involving solvent),
perhaps our real problem might be modelling solvent?
Alternatively/additionally, I wonder whether there might also be more
molecule-to-molecule variability in proteins, which we may not model well
either.


JPK


***
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-kell...@northwestern.edu
***


Re: [ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Skrzypczak-Jankun, Ewa
 

An additional band on a gel might not be caused by the his-tag. It is often
a result of a different conformation/molecular shape, so that the molecule
travels with a different speed in the gel. We may wish for a homogeneous
sample (chemically and structurally), but this is seldom true.

See this example:

Jankun et al., Int J Mol Med 2009, 23(1), 57: VLHL plasminogen activator
inhibitor spontaneously reactivates from the latent to active form.

I fully agree with Tim - making rude comments about reviewers (or anybody else) 
is not going to help you. Questioning an extra band is a legitimate remark that 
you should address.

Good luck - Ewa



Dr Ewa Skrzypczak-Jankun, Associate Professor
University of Toledo, Health Science Campus
Urology Department Mail Stop #1091
3000 Arlington Ave., Toledo OH 43614-2598
Office: Dowling Hall r.2257
Phone: 419-383-5414    Fax: 419-383-3785
e-mail: ewa.skrzypczak-jan...@utoledo.edu
web: http://golemxiv.dh.meduohio.edu/~ewa



 



Re: [ccp4bb] Bug in c_truncate?

2010-10-28 Thread Peter Chan




Dear Crystallographers,

Thank you all for the emails. Below are some details of the procedures I 
performed leading up to the problem.

The reflection file is my own data, processed in XDS, with FreeR flags then
assigned in XPREP in thin resolution shells. I am using CCP4i version 6.1.2.
I tried looking for known/resolved issues/updates in version 6.1.3 but could
not find any, so I assumed it uses the same version of
f2mtz/ctruncate/uniqueify.


I used the GUI version of F2MTZ, with the settings below:

- import file in SHELX format

- keep existing FreeR flags

- fortran format (3F4.0,2F8.3,F4.0)

- added data label I other integer // FreeRflag

The hkl file, in SHELX format, output by XPREP looks something like this:

 -26  -3   1  777.48   39.19
  26  -3  -1  800.83   36.31
 -26   3  -1  782.67   37.97
  27  -3   1  45.722  25.711  -1
 -27   3   1  -14.20   31.69  -1

Notice that the test set is flagged "-1" and the working set is not flagged
at all. This actually led to another error message in f2mtz about missing
FreeR flags. From my understanding, the SHELX flagging convention is "1" for
working and "-1" for test, so I manually tagged the working set with "1"
using vi:

 -26  -3   1  777.48   39.19   1
  26  -3  -1  800.83   36.31   1
 -26   3  -1  782.67   37.97   1
  27  -3   1  45.722  25.711  -1
 -27   3   1  -14.20   31.69  -1

This is the file which gives me the error message "Problem with FREE column
in input file. All flags apparently identical. Check input file." Apparently,
the import to mtz works OK when I use old-truncate instead of ctruncate.

Best,
Peter
  

Re: [ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Jan Schoepe
Hi Sebastiaan,

Under the assumption that the SDS in your assay does not completely unfold
the protein during electrophoresis (chemical impurity can be excluded because
of the MS experiments, right?), how about adding some urea to the SDS-PAGE,
or changing the SDS concentration?

GL Jan





Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

2010-10-28 Thread Simon Kolstoe
It can sometimes be a struggle to find the boundary between cynicism and
pragmatism!

I was, however, rather bemused by Dr Joosten's 7 rules of thumb, probably
all of which I use and have seen used by referees. Of course I wouldn't want
to blindly advocate any of them, but their use does make life somewhat easier
for those of us who use crystallography to discover things about biology,
compared with their use by the (rather impressive!) members of this community
who are involved in theoretical/methodological development. A time comes on
my projects where you have to say two things: 1) "my structure is telling me
x, and although I can spend the next six months performing minor tweaks,
these will not add to (or subtract from) the conclusions I am interested in",
and 2) "when I submit this to referees, will they think my structure is
appropriate for drawing these conclusions?". It is whilst asking these two
questions that rules of thumb become somewhat handy, especially when they
coincide with the rules of thumb used by the referees.


Simon


On 28 Oct 2010, at 10:28, Eleanor Dodson wrote:




Re: [ccp4bb] Bug in c_truncate?

2010-10-28 Thread Tim Gruene
Hello Peter,

I faintly remember a similar kind of problem, and I think that if you
replace "-1" with "0", the problem should go away. It seems that -1 is not
an allowed flag for (some) CCP4 programs.

Please let us know if this resolves the issue.

Tim
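A minimal sketch of that fix as a hypothetical one-off script (the function
name is made up, and it assumes the fixed-width (3F4.0,2F8.3,F4.0) layout
quoted earlier in this thread; check your own column widths before using
anything like it):

```python
# Hypothetical converter for the free-R column of a SHELX-format .hkl file:
# 3 x 4-char h,k,l fields, 2 x 8-char F/sigma fields, optional 4-char flag.
# Maps the XPREP "-1 = test set" convention to "0 = test, 1 = working".
def convert_flags(src, dst):
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            if not line.strip():
                continue                      # skip blank / terminator lines
            hkl = line[0:12]                  # h, k, l (3 x 4 chars)
            f_sig = line[12:28].rstrip("\n")  # F and sigma(F) (2 x 8 chars)
            flag = line[28:32].strip()        # existing flag, possibly empty
            new_flag = 0 if flag == "-1" else 1
            fout.write(f"{hkl}{f_sig:<16s}{new_flag:4d}\n")
```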

On Thu, Oct 28, 2010 at 10:21:20AM -0400, Peter Chan wrote:
 
 
 
 
 Dear Crystallographers,
 
 Thank you all for the emails. Below are some details of the procedures I 
 performed leading up to the problem.
 
 The reflection file is my own data, processed in XDS and then flagging 
 FreeR's in XPREP in thin resolution shells. I am using CCP4i version 6.1.2. I 
 tried looking for known/resolved issues/updates in version 6.1.3 but could 
 not find any so I assumed it is the same version of f2mtz/ctruncate/uniqueify.
 
 
 I used the GUI version of F2MTZ, with the settings below:
 
 - import file in SHELX format
 
 - keep existing FreeR flags
 
 - fortran format (3F4.0,2F8.3,F4.0)
 
 - added data label I other integer // FreeRflag
 
 The hkl file, in SHELX format, output by XPREP look something like this:
 
  -26  -3   1  777.48   39.19
   26  -3  -1  800.83   36.31
  -26   3  -1  782.67   37.97
   27  -3   1  45.722  25.711  -1
  -27   3   1  -14.20   31.69  -1
 
 Notice the test set is flagged -1 and the working set is not flagged at 
 all. This actually led to another error message in f2mtz about missing FreeR 
 flags. From my understanding, the SHELX flagging convention is "1" for 
 working and "-1" for test. So I manually tagged the working set with "1" 
 using vi:
 
  -26  -3   1  777.48   39.19   1
   26  -3  -1  800.83   36.31   1
  -26   3  -1  782.67   37.97   1
   27  -3   1  45.722  25.711  -1
  -27   3   1  -14.20   31.69  -1
 
 This is the file which gives me the error message: "Problem with FREE column 
 in input file. All flags apparently identical. Check input file." 
 Apparently, import to mtz works ok when I use old-truncate instead of 
 c-truncate.
 
 Best,
 Peter
 
-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

phone: +49 (0)551 39 22149

GPG Key ID = A46BEE1A





Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Ed Pozharski
In addition to bulk solvent, the other well recognized problem with
macromolecular structures is the inadequate description of disorder.
With small molecules, the Debye-Waller works much better because the
harmonic oscillator is indeed a good model there.  Note that the problem
is not anisotropy (which we can model if resolution is sufficiently
high), but rather anharmonic motion and multiple conformations that go
undetected.
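The multi-conformer point can be made concrete with a toy calculation (my own illustration, with an assumed site offset): an atom split over two half-occupied sites modulates its scattering as a cosine, which a harmonic (Gaussian) Debye-Waller factor reproduces only at low resolution.

```python
import numpy as np

# Illustrative sketch, not from this thread: an atom disordered over two
# half-occupied sites at +/-d modulates its scattering by cos(2*pi*s*d).
# Matching the low-s expansion of the cosine to exp(-B*s^2/4) gives an
# effective B = 8*pi^2*d^2, but at high s the two curves diverge badly.
d = 0.5                          # assumed site offset in Angstrom
B = 8 * np.pi**2 * d**2          # harmonic B matching the low-s expansion
s = np.linspace(0.0, 2.0, 201)   # scattering vector magnitude, 1/Angstrom

split = np.cos(2 * np.pi * s * d)   # exact two-site modulation
harmonic = np.exp(-B * s**2 / 4)    # Gaussian (harmonic) model

low = s < 0.1
assert np.allclose(split[low], harmonic[low], atol=1e-2)  # agree at low s
assert abs(split[-1] - harmonic[-1]) > 0.5                # fail at high s
```

So even a generous B-factor cannot mimic two discrete conformations at high resolution, which is exactly the kind of undetected disorder that inflates R.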

On Thu, 2010-10-28 at 08:00 -0500, Jacob Keller wrote:
 So I guess a consequence of what you say is that since in cases where there 
 is no solvent the R values are often better than the precision of the actual 
 measurements (never true with macromolecular crystals involving solvent), 
 perhaps our real problem might be modelling solvent? 
 Alternatively/additionally, I wonder whether there also might be more 
 variability molecule-to-molecule in proteins, which we may not model well 
 either.
 
 JPK
 
 - Original Message - 
 From: George M. Sheldrick gshe...@shelx.uni-ac.gwdg.de
 To: CCP4BB@JISCMAIL.AC.UK
 Sent: Thursday, October 28, 2010 4:05 AM
 Subject: Re: [ccp4bb] Against Method (R)
 
 
  It is instructive to look at what happens for small molecules where
  there is often no solvent to worry about. They are often refined
  using SHELXL, which does indeed print out the weighted R-value based
  on intensities (wR2), the conventional unweighted R-value R1 (based
  on F) and sigmaI/I, which it calls R(sigma). For well-behaved
  crystals R1 is in the range 1-5% and R(merge) (based on intensities)
  is in the range 3-9%. As you suggest, 0.5*R(sigma) could be regarded
  as the lower attainable limit for R1 and this is indeed the case in
  practice (the factor 0.5 approximately converts from I to F). Rpim
  gives similar results to R(sigma), both attempt to measure the
  precision of the MERGED data, which are what one is refining against.
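The redundancy point can be seen in a toy simulation (my own sketch with made-up numbers, not from this thread): Rmerge does not improve with multiplicity N, while the sqrt(1/(N-1)) weight in Rpim makes it track the improving precision of the merged data.

```python
import numpy as np

# Toy Monte-Carlo comparison of Rmerge and Rpim versus multiplicity.
# Rmerge = sum|I_i - <I>| / sum I_i over all observations; Rpim carries
# an extra sqrt(1/(N-1)) per reflection. All numbers are invented.
rng = np.random.default_rng(1)

def r_merge_and_pim(n_obs, n_refl=2000, i_true=100.0, sigma=10.0):
    I = rng.normal(i_true, sigma, size=(n_refl, n_obs))
    dev = np.abs(I - I.mean(axis=1, keepdims=True)).sum()
    rmerge = dev / I.sum()
    rpim = np.sqrt(1.0 / (n_obs - 1)) * dev / I.sum()
    return rmerge, rpim

rm2, rp2 = r_merge_and_pim(2)   # multiplicity 2
rm8, rp8 = r_merge_and_pim(8)   # multiplicity 8
assert rm8 > rm2   # Rmerge gets slightly worse with redundancy
assert rp8 < rp2   # Rpim keeps improving with redundancy
```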
 
  George
 
  Prof. George M. Sheldrick FRS
  Dept. Structural Chemistry,
  University of Goettingen,
  Tammannstr. 4,
  D37077 Goettingen, Germany
  Tel. +49-551-39-3021 or -3068
  Fax. +49-551-39-22582
 
 
  On Wed, 27 Oct 2010, Ed Pozharski wrote:
 
  On Tue, 2010-10-26 at 21:16 +0100, Frank von Delft wrote:
   the errors in our measurements apparently have no
   bearing whatsoever on the errors in our models
 
  This would mean there is no point trying to get better crystals, right?
  Or am I also wrong to assume that the dataset with higher I/sigma in the
  highest resolution shell will give me a better model?
 
  On a related point - why is Rmerge considered to be the limiting value
  for the R?  Isn't Rmerge a poorly defined measure itself that
  deteriorates at least in some circumstances (e.g. increased redundancy)?
  Specifically, shouldn't ideal R approximate 0.5*sigmaI/I?
 
  Cheers,
 
  Ed.
 
 
 
  -- 
  I'd jump in myself, if I weren't so good at whistling.
 Julian, King of Lemurs
 
 
 
 
 ***
 Jacob Pearson Keller
 Northwestern University
 Medical Scientist Training Program
 Dallos Laboratory
 F. Searle 1-240
 2240 Campus Drive
 Evanston IL 60208
 lab: 847.491.2438
 cel: 773.608.9185
 email: j-kell...@northwestern.edu
 ***

-- 
I'd jump in myself, if I weren't so good at whistling.
   Julian, King of Lemurs


Re: [ccp4bb] Against Method (R)

2010-10-28 Thread George M. Sheldrick
Not quite. I was trying to say that for good small molecule data, R1 is 
usually significantly less than Rmerge, but never less than the precision
of the experimental data measured by 0.5*sigmaI/I = 0.5*Rsigma 
(or the very similar 0.5*Rpim).
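The factor of 0.5 is just first-order error propagation from I = F^2: dI = 2F dF, so sigma(I)/I = 2*sigma(F)/F. A quick Monte-Carlo check (illustrative numbers, not from the thread):

```python
import numpy as np

# Since I = F^2, first-order propagation gives sigma(I)/I = 2*sigma(F)/F,
# i.e. the relative precision of F is half that of I. Invented numbers.
rng = np.random.default_rng(0)
F = rng.normal(100.0, 1.0, size=100_000)  # amplitudes with ~1% error
I = F**2

rel_F = F.std() / F.mean()
rel_I = I.std() / I.mean()
assert abs(rel_I / rel_F - 2.0) < 0.05    # ratio is ~2, hence the 0.5
```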

George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-22582


On Thu, 28 Oct 2010, Jacob Keller wrote:

 So I guess a consequence of what you say is that since in cases where there is
 no solvent the R values are often better than the precision of the actual
 measurements (never true with macromolecular crystals involving solvent),
 perhaps our real problem might be modelling solvent?
 Alternatively/additionally, I wonder whether there also might be more
 variability molecule-to-molecule in proteins, which we may not model well
 either.
 
 JPK
 
 - Original Message - From: George M. Sheldrick
 gshe...@shelx.uni-ac.gwdg.de
 To: CCP4BB@JISCMAIL.AC.UK
 Sent: Thursday, October 28, 2010 4:05 AM
 Subject: Re: [ccp4bb] Against Method (R)
 
 
  It is instructive to look at what happens for small molecules where
  there is often no solvent to worry about. They are often refined
  using SHELXL, which does indeed print out the weighted R-value based
  on intensities (wR2), the conventional unweighted R-value R1 (based
  on F) and sigmaI/I, which it calls R(sigma). For well-behaved
  crystals R1 is in the range 1-5% and R(merge) (based on intensities)
  is in the range 3-9%. As you suggest, 0.5*R(sigma) could be regarded
  as the lower attainable limit for R1 and this is indeed the case in
  practice (the factor 0.5 approximately converts from I to F). Rpim
  gives similar results to R(sigma), both attempt to measure the
  precision of the MERGED data, which are what one is refining against.
 
  George
 
  Prof. George M. Sheldrick FRS
  Dept. Structural Chemistry,
  University of Goettingen,
  Tammannstr. 4,
  D37077 Goettingen, Germany
  Tel. +49-551-39-3021 or -3068
  Fax. +49-551-39-22582
 
 
  On Wed, 27 Oct 2010, Ed Pozharski wrote:
 
   On Tue, 2010-10-26 at 21:16 +0100, Frank von Delft wrote:
the errors in our measurements apparently have no
bearing whatsoever on the errors in our models
  
   This would mean there is no point trying to get better crystals, right?
   Or am I also wrong to assume that the dataset with higher I/sigma in the
   highest resolution shell will give me a better model?
  
   On a related point - why is Rmerge considered to be the limiting value
   for the R?  Isn't Rmerge a poorly defined measure itself that
   deteriorates at least in some circumstances (e.g. increased redundancy)?
   Specifically, shouldn't ideal R approximate 0.5*sigmaI/I?
  
   Cheers,
  
   Ed.
  
  
  
   -- 
   I'd jump in myself, if I weren't so good at whistling.
  Julian, King of Lemurs
  
  
 
 
 ***
 Jacob Pearson Keller
 Northwestern University
 Medical Scientist Training Program
 Dallos Laboratory
 F. Searle 1-240
 2240 Campus Drive
 Evanston IL 60208
 lab: 847.491.2438
 cel: 773.608.9185
 email: j-kell...@northwestern.edu
 ***
 
 


Re: [ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Van Den Berg, Bert
Hi Seb,

I'm not aware of the notion (and neither are your reviewers, apparently) that a 
His tag often results in two bands on a lane in SDS-PAGE. Why would that be? 
Extra SDS binding to the positive patch?
Just wondering if there's any truth to your statement.

Also, since in this case there seems to be consensus (!) among reviewers 
regarding the double band, they may have a legitimate point... my 2 cents.

Regards, Bert


On 10/28/10 7:16 AM, Sebastiaan Werten sebastiaan.wer...@uni-greifswald.de 
wrote:

Dear all,

we have a his-tagged protein that shows a minor accompanying band in
SDS-PAGE, just above the main band. According to all other methods
available to us the material is homogeneous, the protein has the
correct
mass in MALDI-TOF, epitopes are recognized, etc. etc.

I know that the additional band is a very common artifact with
his-tagged proteins, but I was wondering if anyone is aware of a paper
that formally describes the phenomenon, as we need to appease a couple
of rather bloody-minded referees.

Thanks very much for any suggestions, Seb.

--
Dr. Sebastiaan Werten
Institut für Biochemie
Universität Greifswald
Felix-Hausdorff-Str. 4
D-17489 Greifswald
Germany
Tel: +49 38 34 86 44 61
E-mail: sebastiaan.wer...@uni-greifswald.de




Re: [ccp4bb] Babinet solvent correction [WAS: R-free flag problem]

2010-10-28 Thread Dirk Kostrewa

Hi Tim,

sorry for my late reply - I just came back to the lab.

In the Babinet bulk solvent correction, no bulk solvent phases are used, 
it is entirely based on amplitudes and strictly only valid if the phases 
of the bulk solvent are opposite to the ones of the protein. And as 
Sasha Urzhumtsev pointed out, this assumption is only valid at very low 
resolution.


The mask bulk solvent correction is a vector sum including the phases of 
the bulk solvent mask, which makes a difference at medium resolution (up 
to ~4.5 A, or so).


As far as I can see, your formulas given below do not distinguish 
between amplitude (modulus) and vector bulk solvent corrections.


Personally, I really don't see any physical sense in using both 
corrections together, except for compensating any potential scaling 
problems at low resolution.


If the model is basically complete and correct, the mask bulk solvent 
correction is usually superior to the Babinet bulk solvent correction 
(see, for example, my old and small CCP4 Newsletter contribution 
http://www.ccp4.ac.uk/newsletters/newsletter34/bsdk_text.html).


However, there are also good reasons for using the Babinet bulk solvent 
correction (it should be an option in ALL refinement programs!):

- it requires only two parameters and can be used in any case
- in rigid body refinement, the mask lags behind; here, I always use the 
Babinet BS correction
- channels could show false positive density, because the mask left them 
empty - this depends heavily on the choice of radii to determine/shrink 
the bulk solvent mask; in such cases, I always calculate a Babinet BS 
correction as a control


Best regards,

Dirk.

Am 23.10.10 22:14, schrieb Tim Fenn:

On Sat, 23 Oct 2010 10:05:15 -0700
Pavel Afoninepafon...@gmail.com  wrote:


Hi Tim,

  ...but I hope this answers the question:
Babinet's vs. the flat model?  Use them together!  ;)


thanks a lot for your reply.

Could you please explain the *physical* meaning of using both
models together?

I can try!  Typically, we model the bulk solvent using a real space
mask that is set to 1 in the bulk solvent region and 0 in the protein.
This gets Fourier transformed, symmetrized and added in to the
scattering factors from the molecule (Equation 1 in the paper, page 6
in your presentation):

Ftot = Fc + ks*Fs*exp(-Bs*s^2/4)

which works great and is how things are usually coded in most
macromolecular software, no problems or arguments there.  However,
we can come from the opposite - but equivalent! - direction of
Babinet's principle, which tells us the bulk solvent can also be
modeled by inverting everything: set the bulk solvent region to 0 and
the protein region to 1 in the real space mask, apply a Fourier
transform to that and then invert the phase:

Ftot = Fc - ks*Fm*exp(-Bs*s^2/4)

(I'm using Fm to distinguish it from Fs, due to the inversion of 0's
and 1's in the real space mask)  This is equation 2 in the paper.

So we're still using the flat model to compute Fm, and we're using
Babinet's principle to add it in to the structure factors - although
its better described as adding the inverse (thus the minus sign in the
second equation) of the complement (Fm rather than Fs). These two
equations are exactly equivalent, without any loss of generality. So, I
would argue the flat model and Babinet's are very much congruous.  Also
take a look at the description/discussion in the paper regarding Figure
2 (which helped me think about things at first).
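The exact cancellation behind Babinet's principle is easy to verify numerically: for any binary mask, the Fourier transform of the solvent region equals minus the transform of the complementary protein region at every non-zero frequency, since the two masks sum to a constant. A minimal 1D NumPy sketch (my own illustration, not from the paper):

```python
import numpy as np

# 1D illustration of Babinet's principle for a binary mask: the FFT of
# the solvent mask (1 in solvent, 0 in protein) is exactly minus the
# FFT of the complementary protein-shaped mask away from the origin.
n = 64
protein = np.zeros(n)
protein[10:30] = 1.0        # "protein" region of the cell
solvent = 1.0 - protein     # complementary "solvent" region

Fs = np.fft.fft(solvent)    # transform of the solvent mask
Fm = np.fft.fft(protein)    # transform of the protein-shaped mask

# Away from the origin term (h = 0), Fs = -Fm, so adding ks*Fs to Fc
# or subtracting ks*Fm from Fc gives identical totals.
assert np.allclose(Fs[1:], -Fm[1:])
```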

The big difference is that Babinet's is usually applied as:

Ftot = Fc - ks*Fc*exp(-Bs*s^2/4)

which, I would argue, isn't quite right - the bulk solvent doesn't
scatter like protein, but it does get the shape right.  Which I think
is why Fokine and Urzhumtsev point out that at high resolution this
form would start to show disagreement with the data.  I haven't looked
at this explicitly though, so we still haven't answered that question!
We didn't want to spend much time on it in the paper, our main goal was
to try out the differentiable models we describe.  The Babinet trick
was a convenient way to make coding easier.

Anyway, I hope this helps explain it a bit more, and again: sorry for
the long-windedness.

Regards,
Tim



--

***
Dirk Kostrewa
Gene Center Munich, A5.07
Department of Biochemistry
Ludwig-Maximilians-Universität München
Feodor-Lynen-Str. 25
D-81377 Munich
Germany
Phone:  +49-89-2180-76845
Fax:+49-89-2180-76999
E-mail: kostr...@genzentrum.lmu.de
WWW:www.genzentrum.lmu.de
***



Re: [ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Sebastiaan Werten
Please let me clarify that it is by no means my intention to be rude to
any referees, nor to round up alternative explanations for the extra
band. The only thing I am after is a proper reference for the
phenomenon of his-tagged proteins producing an extra band at slightly
higher apparent molecular weight.

I just unearthed a 2007 thread from the CCP4 archives where this very
subject appears to have been discussed already
(https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind0711&L=CCP4BB&D=0&P=38996)
but unfortunately I didn't find any references to published literature
there either.

Seb.


-- 
Dr. Sebastiaan Werten
Institut für Biochemie
Universität Greifswald
Felix-Hausdorff-Str. 4
D-17489 Greifswald
Germany
Tel: +49 38 34 86 44 61
E-mail: sebastiaan.wer...@uni-greifswald.de


On Thu, 28 Oct 2010 10:09:37 -0400, Skrzypczak-Jankun, Ewa
ewa.skrzypczak-jan...@utoledo.edu wrote:
 Additional band on a gel might not be caused by the his-tag. It is often
 a result of a different conformation/molecular shape, so the molecule
 travels with a different speed through the gel. We may wish for a homogeneous
 sample (chemically and structurally) but this is seldom true.
 
 See example: 
 
 Int J Mol Med 2009, 23(1), 57  Jankun et al, 
 
 VLHL plasminogen activator inhibitor spontaneously reactivates from
 the latent to active form. 
 
 I fully agree with Tim - making rude comments about reviewers (or
 anybody else) is not going to help you. Questioning an extra band is a
 legitimate remark that you should address. 
 
 Good luck - Ewa 
 
  
 
 Dr Ewa Skrzypczak-Jankun, Associate Professor
 University of Toledo, Health Science Campus
 Urology Department Mail Stop #1091
 Office: Dowling Hall r.2257
 3000 Arlington Ave., Toledo OH 43614-2598
 Phone: 419-383-5414   Fax: 419-383-3785
 e-mail: ewa.skrzypczak-jan...@utoledo.edu
 web: http://golemxiv.dh.meduohio.edu/~ewa
 
  
 
 -Original Message-
 From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Tim Gruene
 Sent: Thursday, October 28, 2010 7:51 AM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Additional band on gel due to his-tag: any references?
 
 Dear Sebastiaan,
 
 isn't it the editor rather than the referees whom you have to convince? And if
 the editor does not even understand how SDS-PAGE works and still considers
 this a reason not to publish your article against your own expertise, maybe it
 is worth changing the journal.
 
 Finally, since referees know the names of the authors of the articles they are
 refereeing, you may need a lot more than a couple of references to appease them
 after you called them bloody-minded in a public email forum.
 
 My two cents, Tim
 
 On Thu, Oct 28, 2010 at 01:16:59PM +0200, Sebastiaan Werten wrote: 
 
 Dear all,
 
 we have a his-tagged protein that shows a minor accompanying band in
 SDS-PAGE, just above the main band. According to all other methods
 available to us the material is homogeneous, the protein has the correct
 mass in MALDI-TOF, epitopes are recognized, etc. etc.
 
 I know that the additional band is a very common artifact with
 his-tagged proteins, but I was wondering if anyone is aware of a paper
 that formally describes the phenomenon, as we need to appease a couple
 of rather bloody-minded referees.
 
 Thanks very much for any suggestions, Seb.
 
 --
 Dr. Sebastiaan Werten
 Institut für Biochemie
 Universität Greifswald
 Felix-Hausdorff-Str. 4
 D-17489 Greifswald
 Germany
 Tel: +49 38 34 86 44 61
 E-mail: sebastiaan.wer...@uni-greifswald.de
 --


Re: [ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Artem Evdokimov
Dear Sebastian,

Having personally purified upwards of 500 (I lost count really) of
His-tagged proteins, I can't say that I have the same awareness as you with
respect to the additional band being 'very common'. Depending on the kind of
expression system, size of your protein, conditions of purification, and
most importantly the kind of IMAC resin you're using there can be anywhere
between zero and five *extra* bands resulting from contaminants binding to
the resin. In the case of E. coli expression there usually are two main
contaminants (SlyD and a 61 kDa protein) but can be more than that if the
ratio of your protein to resin is unfavorable. Without going through a huge
PITA you can't be sure that the extra band is even related to your protein
of interest - for example, you could do an in-gel digest and get peptides
identified by MS however if your extra band is close enough to the main band
there will always be a lingering concern that your main band is 'bleeding'
into the extra band and therefore the in-gel digest results are biased
towards the expected.

What sort of a concern are these reviewers trying to address? Is there some
unique biological activity in the sample (like a novel enzyme activity
previously unobserved for your target protein) - in which case the concerns
might be justified, or is it just the level of purity with respect to
structure being solved - in which case it's not much of a concern at all?

Artem

On Thu, Oct 28, 2010 at 6:16 AM, Sebastiaan Werten 
sebastiaan.wer...@uni-greifswald.de wrote:

 Dear all,

 we have a his-tagged protein that shows a minor accompanying band in
 SDS-PAGE, just above the main band. According to all other methods
 available to us the material is homogeneous, the protein has the correct
 mass in MALDI-TOF, epitopes are recognized, etc. etc.

 I know that the additional band is a very common artifact with
 his-tagged proteins, but I was wondering if anyone is aware of a paper
 that formally describes the phenomenon, as we need to appease a couple
 of rather bloody-minded referees.

 Thanks very much for any suggestions, Seb.

 --
 Dr. Sebastiaan Werten
 Institut für Biochemie
 Universität Greifswald
 Felix-Hausdorff-Str. 4
 D-17489 Greifswald
 Germany
 Tel: +49 38 34 86 44 61
 E-mail: sebastiaan.wer...@uni-greifswald.de



Re: [ccp4bb] Additional band on gel due to his-tag: any references?

2010-10-28 Thread Daniel Bonsor
Thermotoga maritima IscU is a structured iron-sulfur cluster assembly protein. 
June 14, 2002, The Journal of Biological Chemistry, 277, 21397-21404. 

His-tagged iron-sulfur cluster assembly protein that runs as a doublet. Mass spec 
shows the two bands are the same species. They concluded that the protein binds 
SDS in two stoichiometries and therefore runs as a doublet, as seen for the OmpA 
protein (references are in the paper).


Hope that helps.


Dan


Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Jacob Keller
So I guess there is never a case in crystallography in which our
models predict the data to within the errors of data collection? I
guess the situation might be similar to fitting a Michaelis-Menten
curve, in which the fitted line often misses the error bars of the
individual points, but gets the overall pattern right. In that case,
though, I don't think we say that we are inadequately modelling the
data. I guess there the error bars are actually too small (are
underestimated). Maybe our intensity errors are also underestimated?

JPK

On Thu, Oct 28, 2010 at 9:50 AM, George M. Sheldrick
gshe...@shelx.uni-ac.gwdg.de wrote:

 Not quite. I was trying to say that for good small molecule data, R1 is
 usually significantly less than Rmerge, but never less than the precision
 of the experimental data measured by 0.5*sigmaI/I = 0.5*Rsigma
 (or the very similar 0.5*Rpim).

 George

 Prof. George M. Sheldrick FRS
 Dept. Structural Chemistry,
 University of Goettingen,
 Tammannstr. 4,
 D37077 Goettingen, Germany
 Tel. +49-551-39-3021 or -3068
 Fax. +49-551-39-22582


 On Thu, 28 Oct 2010, Jacob Keller wrote:

 So I guess a consequence of what you say is that since in cases where there 
 is
 no solvent the R values are often better than the precision of the actual
 measurements (never true with macromolecular crystals involving solvent),
 perhaps our real problem might be modelling solvent?
 Alternatively/additionally, I wonder whether there also might be more
 variability molecule-to-molecule in proteins, which we may not model well
 either.

 JPK

 - Original Message - From: George M. Sheldrick
 gshe...@shelx.uni-ac.gwdg.de
 To: CCP4BB@JISCMAIL.AC.UK
 Sent: Thursday, October 28, 2010 4:05 AM
 Subject: Re: [ccp4bb] Against Method (R)


  It is instructive to look at what happens for small molecules where
  there is often no solvent to worry about. They are often refined
  using SHELXL, which does indeed print out the weighted R-value based
  on intensities (wR2), the conventional unweighted R-value R1 (based
  on F) and sigmaI/I, which it calls R(sigma). For well-behaved
  crystals R1 is in the range 1-5% and R(merge) (based on intensities)
  is in the range 3-9%. As you suggest, 0.5*R(sigma) could be regarded
  as the lower attainable limit for R1 and this is indeed the case in
  practice (the factor 0.5 approximately converts from I to F). Rpim
  gives similar results to R(sigma), both attempt to measure the
  precision of the MERGED data, which are what one is refining against.
 
  George
 
  Prof. George M. Sheldrick FRS
  Dept. Structural Chemistry,
  University of Goettingen,
  Tammannstr. 4,
  D37077 Goettingen, Germany
  Tel. +49-551-39-3021 or -3068
  Fax. +49-551-39-22582
 
 
  On Wed, 27 Oct 2010, Ed Pozharski wrote:
 
   On Tue, 2010-10-26 at 21:16 +0100, Frank von Delft wrote:
the errors in our measurements apparently have no
bearing whatsoever on the errors in our models
  
   This would mean there is no point trying to get better crystals, right?
   Or am I also wrong to assume that the dataset with higher I/sigma in the
   highest resolution shell will give me a better model?
  
   On a related point - why is Rmerge considered to be the limiting value
   for the R?  Isn't Rmerge a poorly defined measure itself that
   deteriorates at least in some circumstances (e.g. increased redundancy)?
   Specifically, shouldn't ideal R approximate 0.5*sigmaI/I?
  
   Cheers,
  
   Ed.
  
  
  
   --
   I'd jump in myself, if I weren't so good at whistling.
                                  Julian, King of Lemurs
  
  


 ***
 Jacob Pearson Keller
 Northwestern University
 Medical Scientist Training Program
 Dallos Laboratory
 F. Searle 1-240
 2240 Campus Drive
 Evanston IL 60208
 lab: 847.491.2438
 cel: 773.608.9185
 email: j-kell...@northwestern.edu
 ***





Re: [ccp4bb] Bug in c_truncate?

2010-10-28 Thread Peter Chan

Hello Tim,

Thank you for the suggestion. I have now tagged the working set as "1" and the 
test set as "0". Unfortunately, it still gives the same error about all FreeR 
flags being the same, and only in c-truncate, not old-truncate. Perhaps I should 
install 6.1.3 and see if the problem still persists.

Best,
Peter

 Date: Thu, 28 Oct 2010 16:29:31 +0200
 From: t...@shelx.uni-ac.gwdg.de
 Subject: Re: [ccp4bb] Bug in c_truncate?
 To: CCP4BB@JISCMAIL.AC.UK
 
 Hello Peter,
 
 I faintly rememeber a similar kind of problem, and think that if you replace
 -1 with 0, the problem should go away. It seemed that -1 is not an 
 allowed
 flag for (some) ccp4 programs.
 
 Please let us know if this resolves the issue.
 
 Tim
 
 On Thu, Oct 28, 2010 at 10:21:20AM -0400, Peter Chan wrote:
  
  
  
  
  Dear Crystallographers,
  
  Thank you all for the emails. Below are some details of the procedures I 
  performed leading up to the problem.
  
  The reflection file is my own data, processed in XDS and then flagging 
  FreeR's in XPREP in thin resolution shells. I am using CCP4i version 6.1.2. 
  I tried looking for known/resolved issues/updates in version 6.1.3 but 
  could not find any so I assumed it is the same version of 
  f2mtz/ctruncate/uniqueify.
  
  
  I used the GUI version of F2MTZ, with the settings below:
  
  - import file in SHELX format
  
  - keep existing FreeR flags
  
  - fortran format (3F4.0,2F8.3,F4.0)
  
  - added data label I other integer // FreeRflag
  
  The hkl file, in SHELX format, output by XPREP look something like this:
  
   -26  -3   1  777.48   39.19
26  -3  -1  800.83   36.31
   -26   3  -1  782.67   37.97
27  -3   1  45.722  25.711  -1
   -27   3   1  -14.20   31.69  -1
  
  Notice the test set is flagged -1 and the working set is not flagged at 
  all. This actually lead to another error message in f2mtz about missing 
  FreeR flags. From my understanding, the SHELX flagging convention is 1 
  for working and -1 for test. So I manually tagged the working set with 
  1 using vi:
  
   -26  -3   1  777.48   39.19   1
26  -3  -1  800.83   36.31   1
   -26   3  -1  782.67   37.97   1
27  -3   1  45.722  25.711  -1
   -27   3   1  -14.20   31.69  -1
  
  This is the file which gives me the error message: Problem with FREE 
  column in input file. All flags apparently identical. Check input file.. 
  Apparently, import to mtz works ok when I use old-truncate instead of 
  c-truncate.
  
  Best,
  Peter

 -- 
 --
 Tim Gruene
 Institut fuer anorganische Chemie
 Tammannstr. 4
 D-37077 Goettingen
 
 phone: +49 (0)551 39 22149
 
 GPG Key ID = A46BEE1A
 
  

Re: [ccp4bb] Bug in c_truncate?

2010-10-28 Thread Phil Evans
Why are you running [c]truncate? This is used to convert I -> F, and I would be 
surprised if it recognised or preserved a FreeR column.

Phil

On 28 Oct 2010, at 17:48, Peter Chan wrote:

 Hello Tim,
 
 Thank you for the suggestion. I have now tagged the working set as 1 and 
 test set as 0. Unfortunately, it still gives the same error about all Rfree 
 being the same, and only in c-truncate but not old-truncate. Perhaps I should 
 install 6.1.3 and see if the problem still persist.
 
 Best,
 Peter
 
  Date: Thu, 28 Oct 2010 16:29:31 +0200
  From: t...@shelx.uni-ac.gwdg.de
  Subject: Re: [ccp4bb] Bug in c_truncate?
  To: CCP4BB@JISCMAIL.AC.UK
  
  Hello Peter,
  
  I faintly rememeber a similar kind of problem, and think that if you replace
  -1 with 0, the problem should go away. It seemed that -1 is not an 
  allowed
  flag for (some) ccp4 programs.
  
  Please let us know if this resolves the issue.
  
  Tim
  
  On Thu, Oct 28, 2010 at 10:21:20AM -0400, Peter Chan wrote:
   
   
   
   
   Dear Crystallographers,
   
   Thank you all for the emails. Below are some details of the procedures I 
   performed leading up to the problem.
   
   The reflection file is my own data, processed in XDS and then flagging 
   FreeR's in XPREP in thin resolution shells. I am using CCP4i version 
   6.1.2. I tried looking for known/resolved issues/updates in version 6.1.3 
   but could not find any so I assumed it is the same version of 
   f2mtz/ctruncate/uniqueify.
   
   
   I used the GUI version of F2MTZ, with the settings below:
   
   - import file in SHELX format
   
   - keep existing FreeR flags
   
   - fortran format (3F4.0,2F8.3,F4.0)
   
   - added data label I other integer // FreeRflag
   
   The hkl file, in SHELX format, output by XPREP look something like this:
   
    -26  -3   1  777.48   39.19
     26  -3  -1  800.83   36.31
    -26   3  -1  782.67   37.97
     27  -3   1  45.722  25.711  -1
    -27   3   1  -14.20   31.69  -1
   
   Notice the test set is flagged -1 and the working set is not flagged at 
   all. This actually lead to another error message in f2mtz about missing 
   FreeR flags. From my understanding, the SHELX flagging convention is 1 
   for working and -1 for test. So I manually tagged the working set with 
   1 using vi:
   
    -26  -3   1  777.48   39.19   1
     26  -3  -1  800.83   36.31   1
    -26   3  -1  782.67   37.97   1
     27  -3   1  45.722  25.711  -1
    -27   3   1  -14.20   31.69  -1
   
   This is the file which gives me the error message: Problem with FREE 
   column in input file. All flags apparently identical. Check input file.. 
   Apparently, import to mtz works ok when I use old-truncate instead of 
   c-truncate.
   
   Best,
   Peter
   
  -- 
  --
  Tim Gruene
  Institut fuer anorganische Chemie
  Tammannstr. 4
  D-37077 Goettingen
  
  phone: +49 (0)551 39 22149
  
  GPG Key ID = A46BEE1A
  


Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

2010-10-28 Thread Bernhard Rupp (Hofkristallrat a.D.)
2) "When I submit this to referees, will they think my structure is
appropriate to draw these conclusions?"

Particularly question 2) can rarely be answered w/o looking at electron
density. Binding site details, ligand geometry, all have practically no
effect on the Rules of Thumb. 

 It is whilst asking these two questions that rules of thumb become
somewhat handy, especially when they coincide with the rules of thumb used
by the referees.

Keep also in mind that every atom contributes to every reflection. So fixing
ANY error WHEREVER will improve your refinement - also in the part you are
actually interested in.

Your point is correct, however, that once your question is answered you could
stop. That, however, does not necessarily deliver a publishable structure that
might be used by others for purposes you initially did not intend. The pharma
industry does this all the time, quickly looking for ligands and abandoning
pointless structures early - but they don't publish the unfinished models.

br


On 28 Oct 2010, at 10:28, Eleanor Dodson wrote:

 Oh cynic!
 Eleanor



 On 10/27/2010 09:01 PM, Simon Kolstoe wrote:
 Surely the best model is the one that the referees for your paper 
 are happy with?

 I have found referees to impose seemingly random and arbitrary 
 standards that sometimes require a lot of effort to comply with but 
 result in little to no impact on the biology being described. Mind 
 you, discussions on this email list can be a useful resource for 
 telling referees why you don't think you should comply with their 
 rule of thumb.

 Simon



 On 27 Oct 2010, at 20:11, Bernhard Rupp (Hofkristallrat a.D.) wrote:

 Dear Young and Impressionable readers:

 I second-guess here that Robbie's intent - after re-refining many 
 many PDB structures, seeing dreadful things, and becoming a hardened 
 cynic - is to provoke more discussion in order to put in perspective 
 - if not debunk - almost all of these rules.

 So it may be better to pretend you have never heard of these rules. 
 Your crystallographic life might be a happier and less biased one.

 If you follow this simple procedure (not a rule)

 The model that fits the primary evidence (minimally biased electron 
 density) best and is at the same time physically meaningful is the 
 best model, i.e., all plausibly accountable electron density (and 
 not more) is modeled.

 This process of course does require a little work (like looking 
 through all of the model, not just the interesting parts, and 
 thinking what makes
 sense)
 but may lead to additional and unexpected insights. And in almost 
 all cases, you will get a model with plausible statistics, without 
 any reliance on rules.

 For some decisions regarding global parameterizations you have to
 apply more sophisticated tests such as Ethan pointed out (HR tests)
 or Ian uses (LL-tests). And once you know how to do that, you do not
 need any rules of thumb anyhow.

 So I opt for a formal burial of these rules of thumb and a toast to 
 evidence and plausibility.

 And, as Gerard B said in other words so nicely:

 Si tacuisses, philosophus mansisses. (Had you remained silent, you
 would have remained a philosopher.)

 BR

 -Original Message-
 From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf 
 Of Robbie Joosten
 Sent: Tuesday, October 26, 2010 10:29 PM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

 Dear Anthony,

 That is an excellent question! I believe there are quite a lot of 
 'rules of thumb' going around. Some of them seem to lead to very 
 dogmatic thinking and have caused (refereeing) trouble for good 
 structures and lack of trouble for bad structures. A lot of them 
 were discussed at the CCP4BB so it may be nice to try to list them 
 all.


 Rule 1: If Rwork < 20%, you are done.
 Rule 2: If Rfree - Rwork > 5%, your structure is wrong.
 Rule 3: At resolution X, the bond length rmsd should be less than Y
 (what is the rmsd thing people keep talking about?)
 Rule 4: If your resolution is lower than X, you should not
 use_anisotropic_Bs/riding_hydrogens
 Rule 5: You should not build waters/alternates at resolutions lower
 than X
 Rule 6: You should do the final refinement with ALL reflections
 Rule 7: No one cares about getting the carbohydrates right


 Obviously, this list is not complete. I may also have overstated
 some of the rules to get the discussion going. Any additions are
 welcome.

 Cheers,
 Robbie Joosten
 Netherlands Cancer Institute

 Apologies if I have missed a recent relevant thread, but are there
 lists of rules of thumb for model building and refinement?





 Anthony



 Anthony Duff Telephone: 02 9717 3493 Mob: 043 189 1076


 =


Re: [ccp4bb] Against Method (R)

2010-10-28 Thread James Holton
It is important to remember that if you have Gaussian-distributed errors and
you plot error bars between +1 sigma and -1 sigma (where "sigma" is the rms
error), then you expect the "right" curve to miss the error bars about 30%
of the time.  This is just a property of the Gaussian distribution: you
expect a certain small number of the errors to be large.  If the curve
passes within the bounds of every single one of your error bars, then your
error estimates are either too big, or the errors have a non-Gaussian
distribution.
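(As a quick numerical aside, not part of the original post: the ~30% figure is just 1 - erf(1/sqrt(2)), which a few lines of Python can confirm by simulation.)

```python
import math
import random

# Fraction of Gaussian-distributed errors falling outside +/- 1 sigma.
# Analytically this is 1 - erf(1/sqrt(2)) ~ 0.317, i.e. the "right"
# curve should miss roughly 30% of the +/-1-sigma error bars.
random.seed(42)
n = 200_000
outside = sum(abs(random.gauss(0.0, 1.0)) > 1.0 for _ in range(n)) / n
expected = 1.0 - math.erf(1.0 / math.sqrt(2.0))
print(round(expected, 3))  # 0.317
```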

For example, if the noise in the data somehow had a uniform distribution
(always between +1 and -1), then no data point will ever be "kicked" further
than "1" away from the "right" curve.  In this case, a data point more than
"1" away from the curve is evidence that you either have the wrong model
(curve), or there is some other kind of noise around (wrong "error model").

As someone who has spent a lot of time looking into how we measure
intensities, I think I can say with some considerable amount of confidence
that we are doing a pretty good job of estimating the errors.  At least,
they are certainly not off by an average of 40% (20% in F).  You could do
better than that estimating the intensities by eye!

Everybody seems to have their own favorite explanation for what I call the
"R factor gap": solvent, multi-conformer structures, absorption effects,
etc.  However, if you go through the literature (old and new) you will find
countless attempts to include more sophisticated versions of each of these
hypothetically "important" systematic errors, and in none of these cases has
anyone ever presented a physically reasonable model that explained the
observed spot intensities from a protein crystal to within experimental
error.  Or at least, if there is such a paper, I haven't seen it.

Since there are so many possible things to "correct", what I would like to
find is a structure that represents the transition between the "small
molecule" and the "macromolecule" world.  Lysozyme does not qualify!  Even
the famous 0.6 A structure of lysozyme (2vb1) still has a "mean absolute
chi": |Iobs-Icalc|/sig(I) = 4.5.  Also, the 1.4 A structure of the
tetrapeptide QQNN (2olx) is only a little better at |chi| = 3.5.  I
realize that the "chi" I describe here is not a "standard" crystallographic
statistic, and perhaps I need a statistics lesson, but it seems to me there
ought to be a case where it is close to 1.
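(Editorial aside: the statistic described above is simple to write down; the function name below is mine, not a standard crystallographic routine. For purely Gaussian errors and a perfect model its expected value is sqrt(2/pi) ~ 0.80, so values of 3.5-4.5 sit well outside the measurement errors.)

```python
def mean_abs_chi(iobs, icalc, sig_iobs):
    # Mean of |Iobs - Icalc| / sigma(Iobs), the non-standard "chi"
    # statistic described in the post above.  A hypothetical helper
    # for illustration; takes three equal-length sequences.
    chis = [abs(o - c) / s for o, c, s in zip(iobs, icalc, sig_iobs)]
    return sum(chis) / len(chis)
```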

-James Holton
MAD Scientist

On Thu, Oct 28, 2010 at 9:04 AM, Jacob Keller 
j-kell...@fsm.northwestern.edu wrote:

 So I guess there is never a case in crystallography in which our
 models predict the data to within the errors of data collection? I
 guess the situation might be similar to fitting a Michaelis-Menten
 curve, in which the fitted line often misses the error bars of the
 individual points, but gets the overall pattern right. In that case,
 though, I don't think we say that we are inadequately modelling the
 data. I guess there the error bars are actually too small (are
 underestimated.) Maybe our intensity errors are also underestimated?

 JPK

 On Thu, Oct 28, 2010 at 9:50 AM, George M. Sheldrick
 gshe...@shelx.uni-ac.gwdg.de wrote:
 
  Not quite. I was trying to say that for good small molecule data, R1 is
  usually significantly less than Rmerge, but never less than the precision
  of the experimental data measured by 0.5*sigma(I)/I = 0.5*Rsigma
  (or the very similar 0.5*Rpim).
 
  George
 
  Prof. George M. Sheldrick FRS
  Dept. Structural Chemistry,
  University of Goettingen,
  Tammannstr. 4,
  D37077 Goettingen, Germany
  Tel. +49-551-39-3021 or -3068
  Fax. +49-551-39-22582
 
 
  On Thu, 28 Oct 2010, Jacob Keller wrote:
 
  So I guess a consequence of what you say is that since in cases where
 there is
  no solvent the R values are often better than the precision of the
 actual
  measurements (never true with macromolecular crystals involving
 solvent),
  perhaps our real problem might be modelling solvent?
  Alternatively/additionally, I wonder whether there also might be more
  variability molecule-to-molecule in proteins, which we may not model
 well
  either.
 
  JPK
 
  - Original Message - From: George M. Sheldrick
  gshe...@shelx.uni-ac.gwdg.de
  To: CCP4BB@JISCMAIL.AC.UK
  Sent: Thursday, October 28, 2010 4:05 AM
  Subject: Re: [ccp4bb] Against Method (R)
 
 
   It is instructive to look at what happens for small molecules where
   there is often no solvent to worry about. They are often refined
   using SHELXL, which does indeed print out the weighted R-value based
   on intensities (wR2), the conventional unweighted R-value R1 (based
   on F) and sigma(I)/I, which it calls R(sigma). For well-behaved
   crystals R1 is in the range 1-5% and R(merge) (based on intensities)
   is in the range 3-9%. As you suggest, 0.5*R(sigma) could be regarded
   as the lower attainable limit for R1 and this is indeed the case in
   practice (the factor 0.5 approximately converts from I to F). Rpim
   gives 
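(Editorial aside: George's factor of 0.5 converting from I to F follows from error propagation: since I = F^2, sigma(F) = sigma(I)/(2F), so sigma(F)/F = 0.5*sigma(I)/I for every reflection. A toy check with made-up numbers, not real data:)

```python
import math

# Made-up intensities and sigmas purely for illustration.
I = [100.0, 400.0, 900.0]
sig_I = [10.0, 16.0, 24.0]
F = [math.sqrt(i) for i in I]
sig_F = [s / (2.0 * f) for s, f in zip(sig_I, F)]  # dF = dI / (2F)
ratios_I = [s / i for s, i in zip(sig_I, I)]
ratios_F = [s / f for s, f in zip(sig_F, F)]
# Per reflection, sigma(F)/F is exactly half of sigma(I)/I.
```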

Re: [ccp4bb] Bug in c_truncate?

2010-10-28 Thread Martyn Winn
The GUI task has the option to run (c)truncate after f2mtz (if you have
intensities in the input hkl file), and then uniqueify after that.

I can reproduce this problem. ctruncate is losing the freeR column. At
the moment, I don't know if this is a bug or a feature.

As a workaround, you can run ctruncate for the analyses, and re-run
with truncate for the MTZ file.

Tim is right, you need to use 0 instead of -1 in the CCP4 convention.

HTH
Martyn

PS Refmac is moving towards using intensities, so that you can avoid
this step. But I believe 5.5 only uses intensities for twin refinement.

On Thu, 2010-10-28 at 17:55 +0100, Phil Evans wrote:
 Why are you running [c]truncate? This is used to convert I to F and I would 
 be surprised if it recognised or preserved a FreeR column
 
 Phil
 
 On 28 Oct 2010, at 17:48, Peter Chan wrote:
 
  Hello Tim,
  
  Thank you for the suggestion. I have now tagged the working set as 1 and 
  test set as 0. Unfortunately, it still gives the same error about all 
  Rfree being the same, and only in c-truncate but not old-truncate. Perhaps 
  I should install 6.1.3 and see if the problem still persists.
  
  Best,
  Peter
  
   Date: Thu, 28 Oct 2010 16:29:31 +0200
   From: t...@shelx.uni-ac.gwdg.de
   Subject: Re: [ccp4bb] Bug in c_truncate?
   To: CCP4BB@JISCMAIL.AC.UK
   
   Hello Peter,
   
   I faintly remember a similar kind of problem, and think that if you 
   replace
   -1 with 0, the problem should go away. It seemed that -1 is not an 
   allowed
   flag for (some) ccp4 programs.
   
   Please let us know if this resolves the issue.
   
   Tim
   
   On Thu, Oct 28, 2010 at 10:21:20AM -0400, Peter Chan wrote:




Dear Crystallographers,

Thank you all for the emails. Below are some details of the procedures 
I performed leading up to the problem.

The reflection file is my own data, processed in XDS and then flagging 
FreeR's in XPREP in thin resolution shells. I am using CCP4i version 
6.1.2. I tried looking for known/resolved issues/updates in version 
6.1.3 but could not find any so I assumed it is the same version of 
f2mtz/ctruncate/uniqueify.


I used the GUI version of F2MTZ, with the settings below:

- import file in SHELX format

- keep existing FreeR flags

- fortran format (3F4.0,2F8.3,F4.0)

- added data label I other integer // FreeRflag

The hkl file, in SHELX format, output by XPREP looks something like this:

-26 -3 1 777.48 39.19
26 -3 -1 800.83 36.31
-26 3 -1 782.67 37.97
27 -3 1 45.722 25.711 -1
-27 3 1 -14.20 31.69 -1

Notice the test set is flagged -1 and the working set is not flagged 
at all. This actually led to another error message in f2mtz about 
missing FreeR flags. From my understanding, the SHELX flagging 
convention is 1 for working and -1 for test. So I manually tagged 
the working set with 1 using vi:

-26 -3 1 777.48 39.19 1
26 -3 -1 800.83 36.31 1
-26 3 -1 782.67 37.97 1
27 -3 1 45.722 25.711 -1
-27 3 1 -14.20 31.69 -1

This is the file which gives me the error message: "Problem with FREE 
column in input file. All flags apparently identical. Check input 
file." Apparently, import to mtz works ok when I use old-truncate 
instead of c-truncate.
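(Editorial aside: Tim's fix of replacing -1 with 0 can be scripted instead of done by hand in vi. The sketch below uses whitespace splitting for clarity; a real fixed-format (3F4.0,2F8.3,F4.0) file would need column-preserving output, so treat this as illustrative only.)

```python
def fix_shelx_flag(line):
    # Rewrite the trailing free-R flag of a SHELX-style hkl record:
    # SHELX marks the test set with -1, but (per the suggestion in
    # this thread) CCP4 programs expect 0 for the test set instead.
    parts = line.split()
    if parts and parts[-1] == "-1":
        parts[-1] = "0"
    return " ".join(parts)
```

For example, fix_shelx_flag("27 -3 1 45.722 25.711 -1") rewrites the flag to 0 while leaving working-set records untouched.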

Best,
Peter

   -- 
   --
   Tim Gruene
   Institut fuer anorganische Chemie
   Tammannstr. 4
   D-37077 Goettingen
   
   phone: +49 (0)551 39 22149
   
   GPG Key ID = A46BEE1A
   

-- 
***
* *
*   Dr. Martyn Winn   *
* *
*   STFC Daresbury Laboratory, Daresbury, Warrington, WA4 4AD, U.K.   *
*   Tel: +44 1925 603455E-mail: martyn.w...@stfc.ac.uk*
*   Fax: +44 1925 603634Skype name: martyn.winn   * 
* URL: http://www.ccp4.ac.uk/martyn/  *
***


[ccp4bb] Free R with doubled cell edge

2010-10-28 Thread Thomas Edwards
Dear BB Sages,

I have a problem where I think I could very easily do the wrong thing.
And I don't really want to do that...

We have solved a new structure using zinc SAD phases (1 zinc in 27kD, 2 Zn/AU - 
Shelx, RESOLVE, ARPwARP. Cool.).
In P21 (30 109 65 90 105 90) at 2.5A

However, we have now collected 1.9A data.
In P21...
60 109 65 90 107 90

4 chains per AU instead of 2 with a doubling of a.

Self rotations with the new data suggest 2 two-folds, one quite near 
crystallographic.
It seems that the doubling of the a edge is adding an NCS two-fold that is 
almost crystallographic.

Now, having refined against the 2.5A data to R/Rfree of about 25/30 we would 
like to use that model to do MR against the new high res data (We didn't 
collect Zn peak data for the new crystal - didn't think we'd need it.). I 
have done that and found 4 mols with Phaser in about 60 seconds. Still cool.

So, we would like to transfer Free R flags to the new data to avoid refining 
against what had been labelled as Free R.
My problem is - how do I do that properly?
I am worried that some of the working data in the bigger cell will be 
correlated with Free data via the near crystallographic NCS.
I clearly don't want to just copy them from the old mtz file with a=30

I recall some discussion about this from years ago on the BB but can't find the 
right threads.

Can anybody point me to the correct way to do this please - I presumably want 
to label with Free R flags symmetry related Free R labelled reflections from 
the old data that are related by the new NCS 2-fold (that is close to 
crystallographic) in the new data. Right?? If I have worded that correctly...
I am hoping that will make sense to somebody.

I think that the solutions that were recently suggested for lower vs higher 
symmetry in the same unit cell do not apply here.



One suggestion has been to do the MolRep, choose new free Rs, give it all a 
good hard shake with high-temp simulated annealing and hope that any bias is 
gone.

I'm not sure that I am comfortable with the word "hope" here...
But, if the consensus of opinion of the wise folk at the BB is that this will 
pass muster at the point where the charming and delightful referees are 
commenting on the extremely high impact (obviously :-) manuscript, then I will 
quote you all!


I await your wise words.

Free R. Again. Sorry.


Cheers
Ed


__
T.Edwards Ph.D.
Garstang 8.53d
Astbury Centre for Structural Molecular Biology
University of Leeds, Leeds, LS2 9JT
Telephone: 0113 343 3031
http://www.bmb.leeds.ac.uk/staff/tae/
-- A new scientific truth does not triumph by convincing opponents and making 
them see the light, but rather because its opponents eventually die, and a new 
generation grows up that is familiar with it.  ~Max Planck


[ccp4bb] oligomer ligand building

2010-10-28 Thread Changyi Xue

Hi, all
   I am trying to build an oligomer ligand. I have obtained all the cif files 
for the monomers from HIC-UP. I tried to build several monomers in Coot. 
Then I modified the pdb file and stated the link in the pdb header (LINKR  X ABC 
1   Y ABC 2X-Y). However, when using Refmac to refine, it always 
separated the monomers and broke the bond, which I could fix back in Coot 
using 'real space refine zone'.

  I am wondering, what is the general practice to build such an oligomer ligand? 

  All suggestions are welcome.

thank you!

Changyi

  

Re: [ccp4bb] oligomer ligand building

2010-10-28 Thread Paul Holland
Hi Changyi,

I'm not sure if this will answer your question, but it is likely that the cif 
file you are reading into Refmac is not in the correct format from HIC-UP.  I 
would suggest reading your coordinate file for the ligand into the Dundee 
server, which will output cif formats for refinement programs.  Scroll down to 
the Refmac output and save it as a .cif, then read this file into your next 
round of refinement with Refmac.  After this round, Refmac will generate a new 
cif library containing this ligand.  I hope this helps.

Cheers,

Paul


Re: [ccp4bb] Help with Optimizing Crystals

2010-10-28 Thread Matthew Bratkowski
Hi.

Thanks for all of the helpful advice.  Below is a summary of the
suggestions, along with some things that I have tried and the results thus
far.

1) Make sure that the crystals are protein and not salt.

My crystals absorb Izit dye well and shooting some initial crystals did not
produce any diffraction.  I have yet to see them on a gel though, possibly
due to too few crystals run, loss of crystals while washing, etc.

2) Try streak seeding using the hit conditions as well as new conditions.

Streak seeding using two hit conditions and half of the protein
concentration has thus far only produced crystals that are smaller than the
initial crystals used to make the seed stock.  This indicates that self
nucleation is still occurring.  I will try significantly lowering the
precipitant conc. as well.  Seeding into a few random conditions has
produced some crystals in new conditions, but they look the same or worse
than the previous ones.

3) Make sure that the protein is very pure.  Include an additional
purification step in your purification scheme, try thermal denaturation of
contaminating proteins, or crystallize out the protein and then redissolve the
crystals and use them to set up new trays.

I will try some of these.

4) Modify the protein to make it more amenable to crystallization.  Try
truncations, mutations, or lysine methylation.

I tried the FL version from two different organisms and neither
crystallizes.  When enzymatically proteolyzed, the protein from one organism
crystallizes.  However, when proteins containing similar truncations are
purified recombinantly, a ladder of bands appears after the major one,
indicating protein instability.  I have not tried any mutations or
methylation yet.

Thanks for all of the helpful suggestions, and if there are any more, please
let me know.

Matt

On Wed, Oct 27, 2010 at 9:28 AM, Annie Hassell annie.m.hass...@gsk.com wrote:

  Matt—



 You might want to try heating your protein to get rid of
 unfolded/improperly folded protein.  We have used 37C for  10 min with good
 success, but a time course at different temperatures is the best way to
 determine which parameters are optimum for your protein.  Heat—chill it on
 ice—centrifuge-- then set up your crystallization trays.  It’s a pretty
 quick test to see if this will work for your protein.



 Do you have any ligands for your protein?  These have often been the key to
 getting good crystals in our lab.  If you do have good ligands, you may want
 to express and/or purify your protein in the presence of these compounds.



 Good Luck!

 annie



 From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Jürgen Bosch
 Sent: Tuesday, October 26, 2010 5:46 PM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Help with Optimizing Crystals






 Hi.

 Here is some additional information.

 1.  The purification method that I used included Ni, tag cleavage, and SEC
 as a final step.  I have tried samples from three different purification
 batches that range in purity, and even the batch with the worst purity seems
 to produce crystals.

 Resource Q? Two or more species perhaps? Does it run as a monomer, dimer, or
 multimer on your SEC?




 2. The protein is a proteolyzed fragment since the full length version did
 not crystallize.  Mutagenesis and methylation, however, may be techniques to
 consider since the protein contains quite a few lysines.

 3. There are not any detergents in the buffer, so these are not detergent
 crystals.  The protein buffer just contains Tris at pH 8, NaCl, and DTT.

 4. Some experiments that I have done thus far seem to suggest that the
 crystals are protein.  Izit dye soaks well into the crystals, and the few
 crystals that I shot previously did not produce any diffraction pattern
 whatsoever.  However, I have had difficulty seeing them on a gel and they
 are a bit tough to break.

 Do they float or do they sink quickly when you try to mount them ?


 5.  I tried seeding previously as follows: I broke some crystals, made a
 seed stock, dipped in a hair, and did serial streak seeding.  After seeding,
 I usually saw small disks or clusters along the path of the hair but nothing
 larger or better looking.

 I also had one more question.  Has anyone had an instance where changing
 the precipitation condition or including an additive improved diffraction
 but did not drastically change the shape of the protein?  If so, I may just
 try further optimization with the current conditions and shoot some more
 crystals.



 The additive screen from Hampton is not bad and can make a big difference.





 A different topic: is it a direct cryo you are using as a condition?
 If not, what do you use as a cryo? Have you tried the old-fashioned way of
 shooting at crystals at room temperature using capillaries (WTHIT ?)



 You might be killing your crystal by trying to cryo it is what I'm trying
 to say here.



 Jürgen





  Thanks for all the helpful advice thus far,
 Matt





Re: [ccp4bb] Babinet solvent correction [WAS: R-free flag problem]

2010-10-28 Thread Tim Fenn
On Thu, 28 Oct 2010 16:56:42 +0200
Dirk Kostrewa kostr...@genzentrum.lmu.de wrote:

 
 In the Babinet bulk solvent correction, no bulk solvent phases are
 used, it is entirely based on amplitudes and strictly only valid if
 the phases of the bulk solvent are opposite to the ones of the
 protein. And as Sasha Urzhumtsev pointed out, this assumption is only
 valid at very low resolution.
 
 The mask bulk solvent correction is a vector sum including the phases
 of the bulk solvent mask, which makes a difference at medium
 resolution (up to ~4.5 A, or so).
 
 As far as I can see, your formulas given below do not distinguish 
 between amplitude (modulus) and vector bulk solvent corrections.
 

Sorry - I didn't make that clear.  The formulas all use complex
structure factors, as in the paper.

 Personally, I really don't see any physical sense in using both 
 corrections together, except for compensating any potential scaling 
 problems at low resolution.
 

We're not using both corrections together - the Babinet *method* is
used to add in the bulk solvent contribution computed using the flat
mask *model* (or the polynomial/Gaussian model in the paper).  The
protein structure factors (Fc) are not used in the bulk solvent
correction - nor, in my opinion, should they be (as I attempted to
point out in my previous email).
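(Editorial aside: the Babinet-type amplitude correction under discussion is often written in the textbook form F_model = F_protein * (1 - k_sol * exp(-B_sol * s^2 / 4)). The sketch below shows that scale factor only; the parameter values are illustrative defaults, not anyone's refined constants, and the exact parameterization varies between programs.)

```python
import math

def babinet_factor(s2, k_sol=0.75, b_sol=200.0):
    # Babinet-type amplitude scale applied to protein structure factors:
    # strong damping at low resolution (small s^2, in 1/A^2), vanishing
    # at high resolution.  Textbook form for illustration only.
    return 1.0 - k_sol * math.exp(-b_sol * s2 / 4.0)
```

At s^2 = 0 the factor is 1 - k_sol (strong damping); at high resolution it tends to 1, which is the expected behaviour of a purely low-resolution correction.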

Regards,
Tim

-- 
-

Tim Fenn
f...@stanford.edu
Stanford University, School of Medicine
James H. Clark Center
318 Campus Drive, Room E300
Stanford, CA  94305-5432
Phone:  (650) 736-1714
FAX:  (650) 736-1961

-


Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Bart Hazes




There are many cases where people use a structure refined at high
resolution as a starting molecular replacement structure for a closely
related/same protein with a lower resolution data set and get
substantially better R statistics than you would expect for that
resolution. So one factor in the "R factor gap" is many small errors
that are introduced during model building and not recognized and fixed
later due to limited resolution. In a perfect world, refinement would
find the global minimum but in practice all these little errors get
stuck in local minima with distortions in neighboring atoms
compensating for the initial error and thereby hiding their existence.

Bart

On 10-10-28 11:33 AM, James Holton wrote:

Re: [ccp4bb] oligomer ligand building

2010-10-28 Thread Paul Emsley

On 28/10/10 20:12, Changyi Xue wrote:

Hi, all
   I am trying to build an oligomer ligand. I have obtained all the 
cif files for the monomers from HIC-UP. I tried to build several 
monomers in using coot. Then I modified the pdb file and stated the 
link in the pdb head (LINKR  X ABC 1   Y ABC 2X-Y). 
However, when using refmac to refine, it always separated the monomers 
and broke the bond, which I could fix it back in coot using 'real 
space refine zone'.


  I am wondering, what is the general practice to build such oligomer 
ligand?


Having a LINK in the PDB will show the link in the graphics.

Having a link description in the cif file will allow you to refine it 
(presuming that such a link is not already in the dictionary).  This 
applies to any refinement program.  If you want to refine the link with 
Coot, you will need to use Sphere Refinement. (IMHO) cif link 
descriptions are not straightforward (not for the first few times that 
you make one, that is).


Paul.


Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Jacob Keller
So let's say I take a 0.6 Ang structure, artificially introduce noise into 
corresponding Fobs to make the resolution go down to 2 Ang, and refine using 
the 0.6 Ang model--do I actually get R's better than the artificially-inflated 
sigmas? Or let's say I experimentally decrease I/sigma by attenuating the beam 
and collect another data set--same situation?

JPK

  - Original Message - 
  From: Bart Hazes 
  To: CCP4BB@JISCMAIL.AC.UK 
  Sent: Thursday, October 28, 2010 4:13 PM
  Subject: Re: [ccp4bb] Against Method (R)


  There are many cases where people use a structure refined at high resolution 
as a starting molecular replacement structure for a closely related/same 
protein with a lower resolution data set and get substantially better R 
statistics than you would expect for that resolution. So one factor in the R 
factor gap is many small errors that are introduced during model building and 
not recognized and fixed later due to limited resolution. In a perfect world, 
refinement would find the global minimum but in practice all these little 
errors get stuck in local minima with distortions in neighboring atoms 
compensating for the initial error and thereby hiding their existence.

  Bart

  On 10-10-28 11:33 AM, James Holton wrote: 
It is important to remember that if you have Gaussian-distributed errors 
and you plot error bars between +1 sigma and -1 sigma (where sigma is the rms 
error), then you expect the right curve to miss the error bars about 30% of 
the time.  This is just a property of the Gaussian distribution: you expect a 
certain small number of the errors to be large.  If the curve passes within the 
bounds of every single one of your error bars, then your error estimates are 
either too big, or the errors have a non-Gaussian distribution.  

For example, if the noise in the data somehow had a uniform distribution 
(always between +1 and -1), then no data point will ever be kicked further 
than 1 away from the right curve.  In this case, a data point more than 1 
away from the curve is evidence that you either have the wrong model (curve), 
or there is some other kind of noise around (wrong error model).

As someone who has spent a lot of time looking into how we measure 
intensities, I think I can say with some considerable amount of confidence that 
we are doing a pretty good job of estimating the errors.  At least, they are 
certainly not off by an average of 40% (20% in F).  You could do better than 
that estimating the intensities by eye!

Everybody seems to have their own favorite explanation for what I call the 
R factor gap: solvent, multi-confomer structures, absorption effects, etc.  
However, if you go through the literature (old and new) you will find countless 
attempts to include more sophisticated versions of each of these hypothetically 
important systematic errors, and in none of these cases has anyone ever 
presented a physically reasonable model that explained the observed spot 
intensities from a protein crystal to within experimental error.  Or at least, 
if there is such a paper, I haven't seen it.

Since there are so many possible things to correct, what I would like to 
find is a structure that represents the transition between the small molecule 
and the macromolecule world.  Lysozyme does not qualify!  Even the famous 0.6 
A structure of lysozyme (2vb1) still has a mean absolute chi: 
|Iobs-Icalc|/sig(I) = 4.5.  Also, the 1.4 A structure of the tetrapeptide 
QQNN (2olx) is only a little better at |chi| = 3.5.  I realize that the chi 
I describe here is not a standard crystallographic statistic, and perhaps I 
need a statistics lesson, but it seems to me there ought to be a case where it 
is close to 1.

-James Holton
MAD Scientist


On Thu, Oct 28, 2010 at 9:04 AM, Jacob Keller 
j-kell...@fsm.northwestern.edu wrote:

  So I guess there is never a case in crystallography in which our
  models predict the data to within the errors of data collection? I
  guess the situation might be similar to fitting a Michaelis-Menten
  curve, in which the fitted line often misses the error bars of the
  individual points, but gets the overall pattern right. In that case,
  though, I don't think we say that we are inadequately modelling the
  data. I guess there the error bars are actually too small (are
  underestimated). Maybe our intensity errors are also underestimated?

  JPK


  On Thu, Oct 28, 2010 at 9:50 AM, George M. Sheldrick
  gshe...@shelx.uni-ac.gwdg.de wrote:
  
   Not quite. I was trying to say that for good small molecule data, R1 is
   usually significantly less than Rmerge, but never less than the precision
   of the experimental data measured by 0.5*sigmaI/I = 0.5*Rsigma
   (or the very similar 0.5*Rpim).
  
   George
  
   Prof. George M. Sheldrick FRS
   Dept. Structural Chemistry,
   University of Goettingen,
   

Re: [ccp4bb] Against Method (R)

2010-10-28 Thread James Holton
Yes, but even the high-resolution structures cannot explain THEIR data to
within experimental error.  You can see this if you download the CIF file
for one of the highest-resolution structures there is: 2vb1 (triclinic
lysozyme at 0.6 A), which contains both I and FC:
http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=STRUCTFACT&compression=NO&structureId=2VB1
I even had Z. Dauter send me the original image files for this one, and I
don't think it will surprise anyone to hear that I think he did it right.
Nevertheless, the average value of |Iobs-Icalc| / sigma(Iobs) is 4.5 for
this structure.

Also, if I take data from MLFSOM-simulated diffraction images (including
anomalous scattering, absorption, shutter jitter, background, etc.) and set
ARP/wARP to work building the model, starting from the SAD-phased map, it
inevitably converges to R/Rfree of around 6-7%, even at 2 A resolution.  For
the real data, however, it never gets below 18%.

This is actually not all that remarkable a result, because the Fobs from
the fake data is actually only ~5% different from Fcalc from the PDB file I
put into the simulation. (I did not provide this PDB to ARP/wARP!)  Add this
to the fact that if your model is close, but missing a few bits, then
those missing bits light up in a Fo-Fc map (like the tail on Kevin Cowtan's
cat).  These difference features get BETTER as the model becomes more
complete, and in small molecule structures adding in difference features
eventually leads to R1 < R(merge) (using Sheldrick's notation from below).
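For reference, the two residuals being compared (in Sheldrick's notation) can be sketched as follows. Here `groups` holds the repeated observations of each unique hkl, and the input arrays are hypothetical toy values:

```python
import numpy as np

def r1(f_obs, f_calc):
    """R1 = sum ||Fobs| - |Fcalc|| / sum |Fobs| (small-molecule convention)."""
    f_obs = np.abs(np.asarray(f_obs, dtype=float))
    f_calc = np.abs(np.asarray(f_calc, dtype=float))
    return float(np.sum(np.abs(f_obs - f_calc)) / np.sum(f_obs))

def r_merge(groups):
    """Rmerge = sum_hkl sum_i |I_i - <I>_hkl| / sum_hkl sum_i I_i,
    summed over repeated observations i of each unique reflection hkl."""
    num = den = 0.0
    for g in groups:
        g = np.asarray(g, dtype=float)
        num += float(np.sum(np.abs(g - g.mean())))
        den += float(np.sum(g))
    return num / den

# Toy data: a well-converged small-molecule refinement shows r1 < r_merge.
print(r1([10.0, 20.0, 30.0], [10.5, 19.5, 30.0]))  # 1/60 ~ 0.017
print(r_merge([[10.0, 12.0], [20.0, 22.0]]))       # 4/64 = 0.0625
```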

What I don't understand is why protein structures don't converge like
this.  Yes, there are low-occupancy features:
Fraser et al. 2009: http://dx.doi.org/10.1038/nature08615
Lang et al. 2010: http://dx.doi.org/10.1002/pro.423
but even if you model these in, the R factor only drops a few percent:
van den Bedem et al. 2009: http://dx.doi.org/10.1107/S0907444909030613

-James Holton
MAD Scientist

On Thu, Oct 28, 2010 at 2:13 PM, Bart Hazes bart.ha...@ualberta.ca wrote:

  There are many cases where people use a structure refined at high
 resolution as a starting molecular replacement structure for a closely
 related/same protein with a lower resolution data set and get substantially
 better R statistics than you would expect for that resolution. So one factor
 in the "R factor gap" is many small errors that are introduced during model
 building and not recognized and fixed later due to limited resolution. In a
 perfect world, refinement would find the global minimum but in practice all
 these little errors get stuck in local minima with distortions in
 neighboring atoms compensating for the initial error and thereby hiding
 their existence.

 Bart


 On 10-10-28 11:33 AM, James Holton wrote:

 It is important to remember that if you have Gaussian-distributed errors
 and you plot error bars between +1 sigma and -1 sigma (where "sigma" is the
 rms error), then you expect the "right" curve to miss the error bars about
 30% of the time.  This is just a property of the Gaussian distribution: you
 expect a certain small number of the errors to be large.  If the curve
 passes within the bounds of every single one of your error bars, then your
 error estimates are either too big, or the errors have a non-Gaussian
 distribution.


Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Ethan Merritt
Bart Hazes wrote  
   There are many cases where people use a structure refined at high 
 resolution as a starting molecular replacement structure for a closely 
 related/same protein with a lower resolution data set and get substantially 
 better R statistics than you would expect for that resolution. So one factor 
 in the "R factor gap" is many small errors that are introduced during model 
 building and not recognized and fixed later due to limited resolution. In a 
 perfect world, refinement would find the global minimum but in practice all 
 these little errors get stuck in local minima with distortions in neighboring 
 atoms compensating for the initial error and thereby hiding their existence.

Excellent point.

On Thursday, October 28, 2010 02:49:11 pm Jacob Keller wrote:
 So let's say I take a 0.6 Ang structure, artificially introduce noise into 
 corresponding Fobs to make the resolution go down to 2 Ang, and refine using 
 the 0.6 Ang model--do I actually get R's better than the 
 artificially-inflated sigmas?
 Or let's say I experimentally decrease I/sigma by attenuating the beam and 
 collect another data set--same situation?

This I can answer based on experience.  One can take the coordinates from a
structure refined at near atomic resolution (~1.0A), including multiple
conformations, partial occupancy waters, etc, and use it to calculate R factors
against a lower resolution (say 2.5A) data set collected from an isomorphous
crystal.  The R factors from this total-rigid-body replacement will be better
than anything you could get from refinement against the lower resolution data.
In fact, refinement from this starting point will just make the R factors
worse.

What this tells us is that the crystallographic residuals can recognize a
better model when they see one.  But our refinement programs are not good
enough to produce such a better model in the first place.  Worse, they are not
even good enough to avoid degrading the model.

That's essentially the same thing Bart said, perhaps a little more pessimistic :-)

cheers,

Ethan



 

Re: [ccp4bb] oligomer ligand building

2010-10-28 Thread Garib N Murshudov
Hi

As I see it, you want to use a link between ligands. You need to create this 
link description first. It can be done using JLigand, which is available from:

www.ysbl.york.ac.uk/mxstat/

There are tutorials on how to create ligands and links; they should help you 
create the links you need.

We are updating JLigand at the moment, so it would be good if you download it 
after this Sunday, when we will have a newer version.

regards
Garib

On 28 Oct 2010, at 20:12, Changyi Xue wrote:

 Hi, all
   I am trying to build an oligomer ligand. I have obtained all the cif files 
 for the monomers from HIC-UP. I tried to build in several monomers using 
 Coot. Then I modified the pdb file and stated the link in the pdb header (LINKR 
  X ABC 1   Y ABC 2X-Y). However, when using refmac to refine, it 
 always separated the monomers and broke the bond, which I could fix back 
 in Coot using 'real space refine zone'.
 
   I am wondering, what is the general practice to build such an oligomer ligand? 
 
   All suggestions are welcome.
 
 thank you!
 
 Changyi
 



Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Bart Hazes




Your second suggestion would be a good test, because you are dealing
with data from the same crystal and can thus assume the structures are
identical (radiation damage excluded).
So, take a highly diffracting crystal and collect a short-exposure low
resolution data set and a long-exposure high resolution data set, let's
say with I/sigma = 2 in the 2.0 and 1.2 A high-resolution shells. Give the data
to two equally capable students to determine the structure by molecular
replacement from a, let's say, 30% sequence identity starting model. You
could also use automated model building to be more objective and avoid
becoming unpopular with your students.

Proceed until each model is fully refined against its own data. Now run
some more refinement, without manual rebuilding, of the lowres model
versus the highres data (and perhaps some rigid body or other minimal
refinement of the highres model versus the lowres data; make sure R &
Rfree go down). I predict the highres model will fit the lowres
data noticeably better than the lowres model did and the lowres model,
even after refinement with the highres data, will not reach the same
quality as the highres model. Looking at Fo-Fc maps in the latter case
may give some hints as to which model errors were not recognized at 2A
resolution. You'll probably find peptide flips, mis-modeled leucine and
other side chains, dual conformations not recognized at 2A resolution,
more realistic B values, more waters ...

Bart

On 10-10-28 03:49 PM, Jacob Keller wrote:

  
  
  
  So let's say I take a 0.6 Ang
structure, artificially introduce noise into corresponding Fobs to make
the resolution go down to 2 Ang, and refine using the 0.6 Ang model--do
I actually get R's better than the artificially-inflated sigmas? Or
let's say I experimentally decrease I/sigma by attenuating the beam and
collect another data set--same situation?
  
  JPK
  


Re: [ccp4bb] Against Method (R)

2010-10-28 Thread Bart Hazes

On 10-10-28 04:09 PM, Ethan Merritt wrote:

This I can answer based on experience.  One can take the coordinates from a
structure refined at near atomic resolution (~1.0A), including multiple
conformations, partial occupancy waters, etc, and use it to calculate R factors
against a lower resolution (say 2.5A) data set collected from an isomorphous
crystal.  The R factors from this total-rigid-body replacement will be better
than anything you could get from refinement against the lower resolution data.
In fact, refinement from this starting point will just make the R factors
worse.

What this tells us is that the crystallographic residuals can recognize a
better model when they see one.  But our refinement programs are not good
enough to produce such a better model in the first place.  Worse, they are not
even good enough to avoid degrading the model.

That's essentially the same thing Bart said, perhaps a little more pessimistic :-)

cheers,

Ethan
   


Not pessimistic at all, just realistic and perhaps even optimistic for 
methods developers as apparently there is still quite a bit of progress 
that can be made by improving the search strategy during refinement.


During manual refinement I normally tell students not to bother about 
translating/rotating/torsioning atoms by just a tiny bit to make them fit 
better. Likewise, there is no point in moving atoms a little bit to 
correct a distorted bond angle or bond length. If it needed to move that 
little bit, the refinement program would have done it for you. Look for 
discrete errors in the problematic residue or its neighbors: peptide 
flips, 120 degree changes in side chain dihedrals, etc. If you can find 
and fix one of those errors, a lot of the stereochemical distortions and 
non-ideal fit to density surrounding that residue will suddenly 
disappear as well.


The benefit of high resolution is that it is much easier to pick up and 
fix such errors (or not make them in the first place)


Bart

--



Bart Hazes (Associate Professor)
Dept. of Medical Microbiology & Immunology
University of Alberta
1-15 Medical Sciences Building
Edmonton, Alberta
Canada, T6G 2H7
phone:  1-780-492-0042
fax: 1-780-492-7521




Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

2010-10-28 Thread Jim Pflugrath
Zbyszek, 

Since you mention I/sigmaI in your PDF, do you mean <I/sigmaI> or
<I>/<sigmaI>?
Do you mean I/sigmaI (in whatever rendition you choose) for the averaged
unique reflections or the I/sigmaI for the observations?
Also, since one can adjust sigmaI in your scalepack program through the use
of the Error Scale Factor or the Error Model, how can a reviewer believe any
of the I/sigmaI values that are reported by authors?

Thanks for any insights into these questions, 

Jim

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of
Zbyszek Otwinowski
Sent: Thursday, October 28, 2010 7:41 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

Feel free to use it as you wish.

--



--
Zbyszek Otwinowski
UT Southwestern Medical Center  
5323 Harry Hines Blvd., Dallas, TX 75390-8816
(214) 645 6385 (phone) (214) 645 6353 (fax) zbys...@work.swmed.edu


Re: [ccp4bb] Rules of thumb (was diverging Rcryst and Rfree)

2010-10-28 Thread Mischa Machius
Lake Wobegon!!!

For those outside the US and/or otherwise not familiar with that small town, 
check out: http://en.wikipedia.org/wiki/Lake_Wobegon

"Lake Wobegon, where all the women are strong, all the men are good looking, 
and all the children are above average."

The best use of modern statistical concepts in a rebuttal (or in any paper, for 
that matter) I have seen in a long time!

I totally support starting a collection of 'hilarious' reviewers' comments and 
rebuttals. Our resident K.u.K. Hofkristallograf is probably correct in trying to 
establish first whether publishing such Schmaeh (Viennese for banter) is legal. If 
it is, let the floodgates burst!

MM



On Oct 28, 2010, at 8:40 PM, Zbyszek Otwinowski wrote:

 Feel free to use it as you wish.
 
 REVIEW_CRITERIA.pdf



[ccp4bb] Post-Doctoral Position in Structural Biology

2010-10-28 Thread Thirumananseri Kumarevel
Dear Friends:

One post-doctoral position to work on a structural biology project is available
immediately at the RIKEN SPring-8 Center, Harima Institute, Japan:
http://www.riken.go.jp/engn/r-world/info/recruit/k101029_s_rsc.html

With regards,

Kumarevel

 

**

Dr Thirumananseri KUMAREVEL

Senior Research Scientist

Biometal Science Laboratory 

RIKEN Harima Institute at SPring-8

1-1-1 Kouto, Sayo-cho

Sayo-gun, Hyogo 679-5148 JAPAN

 

Tel: (Office) +81-791-58-2838

(Direct) +81-791-58-0802 ext.7894

Fax: +81-791-58-2826

 

Email: tsk...@spring8.or.jp

Web: http://www.riken.jp/biometal or
   https://sites.google.com/site/Kumarevel

  

**