[ccp4bb] AW: [ccp4bb] Rfree below Rwork

2015-07-02 Thread Herman . Schreuder
You are right. After I sent the email to the bulletin board, I realized that in 
R32 there must be more than unit cells but did not send a correction.
Next time, I will check the space group before sending an email.
Best regards,
Herman

From: Oganesyan, Vaheh [mailto:oganesy...@medimmune.com]
Sent: Thursday, 2 July 2015 15:48
To: Schreuder, Herman RD/DE; CCP4BB@JISCMAIL.AC.UK
Subject: RE: [ccp4bb] Rfree below Rwork

Hi Herman,

While you're correct regarding the increase in the number of entities in the asu 
upon lowering the symmetry, you're not correct for the specific case of R32. One 
molecule per asu in R32 equals 18 molecules per asu in P1.

Regards,

Vaheh Oganesyan
www.medimmune.com

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
herman.schreu...@sanofi.com
Sent: Wednesday, July 01, 2015 7:34 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AW: [ccp4bb] Rfree below Rwork

Dear Boaz,

One can equally well describe an R32 crystal with one molecule in the asymmetric 
unit as P1 and 6 molecules in the asymmetric unit. In this case, the NCS in P1 
is identical to the crystallographic symmetry in R32.

Best,
Herman

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Boaz 
Shaanan
Sent: Wednesday, 1 July 2015 12:10
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rfree below Rwork

Just wondering about Eleanor's interesting remark: would the Rf and Rw go as low 
as reported by Wolfram (0.22) in case of a wrong space group?

 Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710




From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Eleanor Dodson 
[eleanor.dod...@york.ac.uk]
Sent: Tuesday, June 30, 2015 8:55 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rfree below Rwork

I suppose if I was the referee for this structure and your FreeR is so close to 
the Rfactor I would ask you to ensure you had the right space group: is the 
6-fold NCS actually 2-fold NCS with a crystallographic 3-fold?
Cases occur where R32 is indexed as C2.

Certainly if the Rfree set is assigned randomly to reflections which are 
symmetry equivalents then you see this phenomenon of Rfree = Rfactor.

Eleanor

On 30 June 2015 at 18:26, Gerard Bricogne 
g...@globalphasing.com wrote:
Dear Wolfram,

 I have a perhaps optimistic view of the effect of high-order NCS
on Rfree, in the sense that I don't view it as a problem. People
have agonised to extreme degrees over the difficulty of choosing a
free set of reflections that would produce the expected gap between
Rwork and Rfree, and some of the conclusions were that you would need
to hide almost half of your data in some cases!

 I think it is best to remember that the idea of cross-validation
by Rfree is to prevent overfitting, i.e. ending up with a model that
fits the amplitudes too well compared to how well it determines the
phases. In the case of high-order NCS (in your case, the U/V ratio
that the old papers on NCS identified as the key quantity to measure
the phasing power of NCS would be less than 0.1!) the phases and the
amplitudes are so tightly coupled that it is simply impossible to fit
the amplitudes without delivering phases of an equally good quality.
In other words there is no overfitting problem (provided you do have
good and complete data) and the difference between Rfree and Rwork is
simply within the bounds of the statistical spread of Rfree depending
on the free set chosen.

 You are lucky to have 6-fold NCS, so don't let any reviewer
convince you that it is a curse, and make you suffer for it :-) .


 With best wishes,

  Gerard.

--
On Tue, Jun 30, 2015 at 12:58:44PM -0400, wtempel wrote:
 Hello,
 my question concerns refinement of a structure with 6-fold NCS (local
 automatic restraints in REFMAC) against 2.8 A data. The size of my free set
 is 1172 selected in thin resolution shells (SFTOOLS) and corresponding to
 4.3 % of reflections.
 A refmac run of 10 cycles of TLS and 10 cycles of CGMAT starts out at
 Rfree/Rcryst 0.271/0.272. After the 10th TLS cycle I have 0.227/0.224. Yes,
 Rfree < Rcryst. At the end of CGMAT I have 0.2072/0.2071.
 I understand that NCS stresses the independence assumption of the free set.
 Am I correct in believing that Rfree *may* be smaller than Rcryst even in
 the absence of a major mistake? My hope is that the combined wisdom of
 ccp4bb followers can point out my possible mistake,  suggest tests that I
 may perform to avoid them and, possibly, arguments in defense of a
 crystallographic model with Rfree < Rcryst.
 Many thanks,
 Wolfram Tempel
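
A quick back-of-the-envelope check of Gerard's "statistical spread" remark, using only the numbers Wolfram quotes. This is a minimal sketch: the estimate sigma(Rfree) ~ Rfree / sqrt(N_free) is a commonly quoted rule of thumb that is assumed here, not something stated in the thread. The code also assumes each quoted pair is in the order Rfree/Rcryst, and the variable and function names are illustrative.

```python
import math

# Numbers quoted in Wolfram's message.
n_free = 1172            # reflections in the free set
free_fraction = 0.043    # 4.3 % of all reflections
r_pairs = {              # assumed order: (Rfree, Rcryst) at each stage
    "start": (0.271, 0.272),
    "after TLS": (0.227, 0.224),
    "after CGMAT": (0.2072, 0.2071),
}

# Implied size of the full data set.
n_total = n_free / free_fraction

def rfree_spread(r_free, n):
    """Assumed rule of thumb: sigma(Rfree) ~ Rfree / sqrt(N_free)."""
    return r_free / math.sqrt(n)

print(f"total reflections ~ {n_total:.0f}")
for stage, (r_free, r_cryst) in r_pairs.items():
    gap = abs(r_free - r_cryst)
    sigma = rfree_spread(r_free, n_free)
    print(f"{stage:12s} |gap| = {gap:.4f}  sigma ~ {sigma:.4f}  within spread: {gap < sigma}")
```

Under that assumed estimate, every quoted gap between the two R values is smaller than the expected spread of Rfree for a free set of this size, which is at least consistent with Gerard's reading. It does not by itself rule out the free-set assignment issue Eleanor raises.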

Re: [ccp4bb] AW: [ccp4bb] Rfree below Rwork

2015-07-02 Thread Dirk Kostrewa

Hi Herman and Boaz,

in the trigonal setting R32 (not in the hexagonal setting H32), the 
unit cell in R32 contains 6 copies. If you take the whole R32 unit cell 
as a P1 cell, you would have 6 copies in the asymmetric unit, as 
Herman wrote.


Best regards,

Dirk.
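
The 6-versus-18 disagreement in this thread comes down to which cell is used. As a minimal sketch (the names below are illustrative, not from any crystallographic library): the number of asymmetric units per cell is the order of the point group multiplied by the number of lattice points in the chosen cell. Point group 32 has order 6; the primitive rhombohedral cell holds one lattice point and the triple hexagonal cell holds three.

```python
# Asymmetric units per unit cell = (point-group order) x (lattice points per cell).
POINT_GROUP_ORDER_32 = 6  # point group 32: three rotations about the 3-fold, times the 2-fold

# Lattice points per cell for the two common descriptions of an R lattice.
LATTICE_POINTS = {
    "rhombohedral axes (primitive cell)": 1,
    "hexagonal axes (triple cell, 'H32')": 3,
}

def asu_per_cell(point_group_order, lattice_points):
    """Number of symmetry-related copies of the asymmetric unit in one cell."""
    return point_group_order * lattice_points

for setting, n_points in LATTICE_POINTS.items():
    copies = asu_per_cell(POINT_GROUP_ORDER_32, n_points)
    print(f"{setting}: {copies} copies, so 1 molecule/asu becomes {copies} in P1")
```

So both figures quoted in the thread are right for their own choice of cell: 6 copies when the primitive rhombohedral cell is taken as the P1 cell, 18 when the hexagonal cell is.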


Re: [ccp4bb] Rfree below Rwork

2015-07-02 Thread Oganesyan, Vaheh
Hi Herman,

While you're correct regarding the increase in the number of entities in the asu 
upon lowering the symmetry, you're not correct for the specific case of R32. One 
molecule per asu in R32 equals 18 molecules per asu in P1.

Regards,

Vaheh Oganesyan
www.medimmune.com


Re: [ccp4bb] AW: [ccp4bb] Rfree below Rwork

2015-07-02 Thread Oganesyan, Vaheh
Dirk, you're right. In the rhombohedral setting there are only six copies of 
the asymmetric unit in the unit cell. So, technically, Herman was not wrong.

Regards,

Vaheh Oganesyan
www.medimmune.com


Re: [ccp4bb] Rfree below Rwork

2015-07-02 Thread Tim Gruene
Dear Smith,

when you expand to P1, pointless should suggest the space group you
expanded from, unless you fiddled with the data after expansion.

Regards,
Tim
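
To illustrate why the original symmetry is still recoverable from data expanded to P1, here is a toy sketch of the general idea of scoring a candidate symmetry operator by how well the intensities it relates agree. This is not pointless's actual algorithm or code. The synthetic data, the operator matrices and all function names are assumptions made for the example, and the toy data deliberately ignore Friedel's law.

```python
import math
import random

# Candidate operators acting on Miller indices (both are their own inverse).
TWOFOLD_B = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))   # (h, k, l) -> (-h, k, -l)
TWOFOLD_A = ((1, 0, 0), (0, -1, 0), (0, 0, -1))   # (h, k, l) -> (h, -k, -l)

def apply_op(op, hkl):
    """Apply a 3x3 integer operator to a Miller index triple."""
    return tuple(sum(op[i][j] * hkl[j] for j in range(3)) for i in range(3))

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def operator_cc(intensities, op):
    """Correlation between I(hkl) and I(op * hkl) over all pairs present in the data."""
    xs, ys = [], []
    for hkl, value in intensities.items():
        mate = apply_op(op, hkl)
        if mate != hkl and mate in intensities:
            xs.append(value)
            ys.append(intensities[mate])
    return pearson(xs, ys)

def toy_p1_data(true_op, hmax=4, seed=1):
    """Synthetic 'expanded to P1' data: mates under true_op share one intensity."""
    rng = random.Random(seed)
    data = {}
    indices = range(-hmax, hmax + 1)
    for hkl in ((h, k, l) for h in indices for k in indices for l in indices):
        mate = apply_op(true_op, hkl)
        data[hkl] = data[mate] if mate in data else rng.expovariate(1.0)
    return data

data = toy_p1_data(TWOFOLD_B)
print("CC for the true twofold (b): ", round(operator_cc(data, TWOFOLD_B), 3))
print("CC for a false twofold (a):  ", round(operator_cc(data, TWOFOLD_A), 3))
```

The operator that was really used for the expansion relates identical intensities and scores close to 1, while an operator that is not a symmetry of the data scores far lower. That is why the information survives the expansion unless the data were altered afterwards, as Tim notes.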

On 07/01/2015 04:43 AM, Smith Liu wrote:
 If both the PDB and the mtz file for it have been assigned to space group P1 
 for some reason, can this lead to Rwork higher than Rfree during refinement?
 
 If, after converting my PDB and mtz to space group P1, I have forgotten 
 the original space group they had before the conversion, is there any 
 method that can recover the original space group for my PDB and mtz, so 
 that in the following refinement Rwork would be lower than Rfree?
 
 
 Smith

-- 
--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
phone: +49 (0)551 39 22149

GPG Key ID = A46BEE1A






Re: [ccp4bb] Rfree below Rwork

2015-07-01 Thread Boaz Shaanan



Just wondering about Eleanor's interesting remark: would the Rf and Rw go as low as reported by Wolfram (0.22) in case of a wrong space group?


Boaz



Boaz Shaanan, Ph.D.

Dept. of Life Sciences 
Ben-Gurion University of the Negev 
Beer-Sheva 84105 
Israel 
 
E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan 
Fax: 972-8-647-2992 or 972-8-646-1710

Re: [ccp4bb] AW: [ccp4bb] Rfree below Rwork

2015-07-01 Thread Eleanor Dodson
I wasn't suggesting the space group was wrong, just a lower-symmetry
equivalent of the true SG. E.g. all structures can be solved in P1, but
several of the molecules in the cell will be related by crystal symmetry
operators. The same is true for the associated P1 intensities. So IF you
had assigned FreeR flags as for P1, it is more than likely that, say, h k l
and -h k -l will have different FreeR status.

And then the FreeR is not really free, and it is very likely that R and FreeR
will refine to much the same value.

However, if you had been careful to assign the FreeRs according to the highest
possible Laue symmetry and then expand them to cover P1, you usually find that
the FreeR and R differ as expected.
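
The procedure Eleanor describes, assigning the flag once per symmetry-unique reflection and then expanding to P1, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of any CCP4 program: the twofold along b is taken from her h k l / -h k -l example, the helper names are hypothetical, and the free fraction and index range are arbitrary.

```python
import random

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
TWOFOLD_B = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))   # (h, k, l) -> (-h, k, -l)

def apply_op(op, hkl):
    """Apply a 3x3 integer operator to a Miller index triple."""
    return tuple(sum(op[i][j] * hkl[j] for j in range(3)) for i in range(3))

def laue_mates(hkl, ops):
    """All reflections equivalent to hkl under the rotations plus Friedel's law."""
    mates = set()
    for op in ops:
        mate = apply_op(op, hkl)
        mates.add(mate)
        mates.add(tuple(-x for x in mate))
    return mates

def assign_free_flags(reflections, ops, fraction=0.05, seed=0):
    """Draw one free/work flag per Laue-unique reflection and copy it to every mate."""
    rng = random.Random(seed)
    flag_of_rep = {}
    flags = {}
    for hkl in reflections:
        rep = max(laue_mates(hkl, ops))          # deterministic representative
        if rep not in flag_of_rep:
            flag_of_rep[rep] = rng.random() < fraction
        flags[hkl] = flag_of_rep[rep]
    return flags

def split_pairs(flags, op):
    """Count reflections whose mate under op carries a different flag."""
    return sum(1 for hkl, flag in flags.items()
               if apply_op(op, hkl) in flags and flags[apply_op(op, hkl)] != flag)

indices = range(-3, 4)
reflections = [(h, k, l) for h in indices for k in indices for l in indices]

laue_flags = assign_free_flags(reflections, [IDENTITY, TWOFOLD_B])  # true Laue group
p1_flags = assign_free_flags(reflections, [IDENTITY])               # flags "as for P1"

print("split mates, Laue-aware assignment:", split_pairs(laue_flags, TWOFOLD_B))
print("split mates, P1-style assignment:  ", split_pairs(p1_flags, TWOFOLD_B))
```

With the Laue-aware assignment no symmetry mate ever ends up on the other side of the free/work divide. With the P1-style assignment nothing prevents it, and any split pair means a "free" reflection has an equivalent in the working set.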



[ccp4bb] AW: [ccp4bb] Rfree below Rwork

2015-07-01 Thread Herman . Schreuder
Dear Boaz,

One can equally well describe an R32 crystal with one molecule in the asymmetric 
unit as P1 with 6 molecules in the asymmetric unit. In this case, the NCS in P1 
is identical to the crystallographic symmetry in R32.

Best,
Herman
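
As a rough check on the counting above: the number of copies in the equivalent P1 description equals the number of symmetry operators of the original cell, i.e. the point-group order times the number of lattice points per cell. The short Python sketch below only illustrates that arithmetic. The function name and the small centring table are assumptions made for this example and are not taken from any crystallographic library. Note that the answer depends on which cell is used for the P1 description: the primitive rhombohedral cell or the triply primitive hexagonal cell.

```python
# Lattice points per cell for a few centring types (illustrative table).
# "R_hex" is the rhombohedral lattice described on hexagonal axes,
# "R_rhomb" the same lattice described with its primitive rhombohedral cell.
LATTICE_POINTS = {"P": 1, "C": 2, "I": 2, "F": 4, "R_hex": 3, "R_rhomb": 1}


def copies_in_p1(point_group_order: int, centring: str, molecules_per_asu: int = 1) -> int:
    """Molecules per unit cell, i.e. per asymmetric unit once the cell is treated as P1."""
    return point_group_order * LATTICE_POINTS[centring] * molecules_per_asu


# Point group 32 has order 6 (a threefold combined with perpendicular twofolds).
print("R32, hexagonal cell:   ", copies_in_p1(6, "R_hex"))
print("R32, rhombohedral cell:", copies_in_p1(6, "R_rhomb"))
# Point group 3 has order 3.
print("R3,  hexagonal cell:   ", copies_in_p1(3, "R_hex"))
```

With one molecule per asymmetric unit this gives 18 copies for R32 on hexagonal axes and 6 for the primitive rhombohedral cell.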

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Boaz 
Shaanan
Sent: Wednesday, 1 July 2015 12:10
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rfree below Rwork

Just wondering about Eleanor's interesting remark: would the Rf & Rw go as low 
as reported by Wolfram (0.22) in case of a wrong space group?

 Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710







Re: [ccp4bb] Rfree below Rwork

2015-07-01 Thread wtempel
A valid concern particularly in this case, and related to a question
someone asked off-list. The model came from a collaborator when it was at
early stages of refinement (Rwork/Rfree 0.244/0.296). I followed up with
steps I knew would additionally confound cross-validation:

   1. I assigned a new free flag in thin shells to allow for the effect of
   NCS on free set independence (Kleywegt & Jones, 1996)
   http://dx.doi.org/10.1107/S0907444995014983.
   2. I selected one of the NCS mates from coordinates I received and used
   PHASER to position 6 copies of it in the asymmetric unit.

Afterward, I annealed the model (PHENIX.REFINE) to “remove the memory”
(Brunger, 1993) http://dx.doi.org/10.1107/S0907444992007352 of free
reflections, to the extent possible given the inherent interdependencies
between reflections and NCS.
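
The thin-shell idea in step 1 can be sketched as follows. This is a minimal Python illustration only, not the algorithm SFTOOLS actually uses: the function name, the equal-count shells and the default of 200 shells are all assumptions made for the example. Whole shells are flagged as free, so reflections of nearly equal resolution (which is where NCS-related reflections lie) end up on the same side of the work/free split.

```python
import random


def thin_shell_free_flags(d_spacings, fraction=0.05, n_shells=200, seed=0):
    """Return one boolean per reflection; True marks a free (test-set) reflection.

    Reflections are sorted by resolution and cut into equal-count thin shells;
    randomly chosen whole shells are flagged until at least `fraction` of all
    reflections is free. (Hypothetical helper, for illustration only.)
    """
    n = len(d_spacings)
    n_shells = max(1, min(n_shells, n))
    # Rank reflections from low resolution (large d) to high resolution (small d).
    order = sorted(range(n), key=lambda i: -d_spacings[i])
    shell_of = [0] * n
    sizes = [0] * n_shells
    for rank, i in enumerate(order):
        shell = rank * n_shells // n  # equal-count binning
        shell_of[i] = shell
        sizes[shell] += 1
    # Pick whole shells at random until the target size is reached.
    shells = list(range(n_shells))
    random.Random(seed).shuffle(shells)
    target = fraction * n
    free_shells, n_free = set(), 0
    for shell in shells:
        if n_free >= target:
            break
        free_shells.add(shell)
        n_free += sizes[shell]
    return [shell_of[i] in free_shells for i in range(n)]


# Toy usage with made-up d-spacings (in Angstrom):
d = [2.8 + 0.01 * i for i in range(2000)]
flags = thin_shell_free_flags(d, fraction=0.043, n_shells=100)
print(f"free fraction: {sum(flags) / len(flags):.3f}")
```

Because whole shells are taken, the realised free fraction overshoots the target by up to one shell. Reflections with exactly equal d-spacing can still be split at a shell boundary in this simple version.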
Further to several suggestions about the space group, I am refining the
structure in R3 (hexagonal setting). XDS’s CORRECT.LP “SYMMETRY OF
REFLECTION INTENSITIES” table shows Rmeas of 21% for space group 146 (R3)
versus 53% for space group 155 (R32). The fact that I have experimented with
both SCALEPACK and XDS intensities throughout my refinement should not
matter in this context, as the MTZ files have consistent indices and the
same FREE flag.
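
To make the Rmeas comparison concrete, here is a small Python sketch of the usual multiplicity-corrected definition, Rmeas = sum_hkl sqrt(n/(n-1)) sum_i |I_i - <I>| / sum_hkl sum_i I_i. The function name and the toy intensities are invented for illustration, and nothing here reproduces how XDS computes its table. The point is only that merging observations under a symmetry the intensities do not obey inflates Rmeas.

```python
from math import sqrt


def r_meas(groups):
    """Rmeas for observations grouped by unique reflection.

    `groups` is an iterable of lists; each list holds the intensities that are
    treated as symmetry equivalents of one unique reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue  # a single observation says nothing about agreement
        mean = sum(obs) / n
        numerator += sqrt(n / (n - 1)) * sum(abs(i - mean) for i in obs)
        denominator += sum(obs)
    return numerator / denominator


# Toy data: two pairs of observations that agree within each pair.
lower_symmetry = [[100.0, 102.0], [50.0, 52.0]]
# The same four observations merged as if an additional symmetry operator
# related the two pairs, which the intensities do not support.
higher_symmetry = [[100.0, 102.0, 50.0, 52.0]]

print(f"Rmeas, lower symmetry:  {r_meas(lower_symmetry):.3f}")
print(f"Rmeas, higher symmetry: {r_meas(higher_symmetry):.3f}")
```

A large jump in Rmeas on going to the higher-symmetry group, as in the 21% versus 53% quoted above, argues against the extra symmetry being crystallographic.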
Additional testing has shown that mere omission of TLS parameterization
will give Rwork/Rfree of 0.209/0.215 (versus 0.207/0.206 with 10 cycles TLS
refinement). For that comparison, I had to adjust REFMAC’s WEIGHT MATRIX to
achieve similar bond/angle RMSDs. The latter is relevant here since, as
another off-list respondent suggested, Rfree positively correlates with
those RMSDs. I further suspect (but have not tested yet) that omission of
explicit NCS restraints would also cause Rfree to rise.
I tend toward following Gerard Bricogne’s intuition about sample variance
of Rfree. It is mentioned briefly by Brunger (1993). Would this be the
variance corresponding to Cruickshank’s expected value
http://dx.doi.org/10.1107/S0907444995010638?



[ccp4bb] AW: [ccp4bb] Rfree below Rwork

2015-07-01 Thread Herman . Schreuder
You should go back to the output you got during data processing to see which 
space groups had been proposed by the data processing software and reprocess 
in the correct space group. Alternatively, you could use Zanuda (validate space 
group button in ccp4i) to find the correct space group. It will also reindex 
your mtz.

Best,
Herman





Re: [ccp4bb] Rfree below Rwork

2015-06-30 Thread Smith Liu
If both the PDB and the mtz for that PDB have been assigned to the P1 space
group for some reason, can this lead to Rwork being higher than Rfree during
refinement?

If, after converting my PDB and mtz to P1, I have forgotten what the original
space group was before the conversion, is there any method that can recover
the original space group for my PDB and mtz, so that in subsequent refinement
Rwork would be lower than Rfree?

Smith





Re: [ccp4bb] Rfree below Rwork

2015-06-30 Thread Gerard Bricogne
Dear Wolfram,

 I have a perhaps optimistic view of the effect of high-order NCS
on Rfree, in the sense that I don't view it as a problem. People
have agonised to extreme degrees over the difficulty of choosing a
free set of reflections that would produce the expected gap between
Rwork and Rfree, and some of the conclusions were that you would need
to hide almost half of your data in some cases!

 I think it is best to remember that the idea of cross-validation
by Rfree is to prevent overfitting, i.e. ending up with a model that
fits the amplitudes too well compared to how well it determines the
phases. In the case of high-order NCS (in your case, the U/V ratio
that the old papers on NCS identified as the key quantity to measure
the phasing power of NCS would be less than 0.1!) the phases and the
amplitudes are so tightly coupled that it is simply impossible to fit
the amplitudes without delivering phases of an equally good quality.
In other words there is no overfitting problem (provided you do have
good and complete data) and the difference between Rfree and Rwork is
simply within the bounds of the statistical spread of Rfree depending
on the free set chosen.
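
To put a number on the "statistical spread" mentioned above, the Python sketch below uses a commonly quoted rule of thumb, sigma(Rfree) ~ Rfree / sqrt(N_free). That rule is an assumption brought in for this illustration and is not stated anywhere in this thread. The free-set size and the R values are the ones quoted in the original post, and the function name is made up.

```python
from math import sqrt


def rfree_sigma(rfree: float, n_free: int) -> float:
    """Rough standard deviation of Rfree over different choices of free set.

    ASSUMPTION: rule of thumb sigma ~ Rfree / sqrt(n_free); not from the thread.
    """
    return rfree / sqrt(n_free)


N_FREE = 1172  # free-set size quoted in the original post

# The two pairs of R values quoted after the TLS cycles and after CGMAT.
for a, b in [(0.227, 0.224), (0.2072, 0.2071)]:
    sigma = rfree_sigma(max(a, b), N_FREE)
    gap = abs(a - b)
    print(f"gap = {gap:.4f}   sigma ~ {sigma:.4f}   gap/sigma = {gap / sigma:.2f}")
```

Under this assumption the expected spread is roughly 0.006 to 0.007, so both quoted gaps are well under one sigma and the sign of the difference carries little information.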

 You are lucky to have 6-fold NCS, so don't let any reviewer
convince you that it is a curse, and make you suffer for it :-) .


 With best wishes,
 
  Gerard.


-- 

 Gerard Bricogne                    g...@globalphasing.com
 Global Phasing Ltd.
 Sheraton House, Castle Park        Tel: +44-(0)1223-353033
 Cambridge CB3 0AX, UK              Fax: +44-(0)1223-366889


Re: [ccp4bb] Rfree below Rwork

2015-06-30 Thread Robbie Joosten
Hi Wolfram,

You didn’t tell us where your model came from, but 10 cycles of TLS and 10 
cycles of restrained refinement are not enough for a refinement to converge if 
you have just picked your test set. Try resetting your B-factors and running 
30-40 cycles of refinement in REFMAC.

Cheers,

Robbie




Re: [ccp4bb] Rfree below Rwork

2015-06-30 Thread Eleanor Dodson
I suppose if I were the referee for this structure and your FreeR is so
close to the Rfactor, I would ask you to ensure you had the right space
group: is the 6-fold NCS actually 2-fold NCS with a crystallographic
3-fold?
Cases occur where R32 is indexed as C2.

Certainly if the Rfree set is assigned randomly to reflections which are
symmetry equivalents, then you see this phenomenon of Rfree = Rfactor.
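
The second point can be illustrated with a small Python sketch: if the true symmetry is higher than the one used for processing, flags assigned reflection by reflection put symmetry mates on both sides of the work/free split. One way around this is to decide the flag once per set of equivalents. The code below assumes, purely for the example, a threefold along c on hexagonal axes plus Friedel mates. All function names are hypothetical, and this is not how any particular CCP4 program assigns its flags.

```python
import random


def threefold_mates(hkl):
    """Equivalents of (h, k, l) under a threefold along c (hexagonal axes) plus Friedel mates."""
    h, k, l = hkl
    i = -h - k
    rotations = [(h, k, l), (k, i, l), (i, h, l)]
    friedel = [(-a, -b, -c) for a, b, c in rotations]
    return rotations + friedel


def canonical(hkl, mates=threefold_mates):
    """Pick one representative per set of equivalents (here simply the largest tuple)."""
    return max(mates(hkl))


def symmetry_consistent_flags(hkls, fraction=0.05, seed=0, mates=threefold_mates):
    """Assign free flags so that all symmetry mates share the same flag."""
    rng = random.Random(seed)
    flag_of = {}
    flags = []
    for hkl in hkls:
        key = canonical(hkl, mates)
        if key not in flag_of:
            flag_of[key] = rng.random() < fraction  # decided once per unique reflection
        flags.append(flag_of[key])
    return flags


# Toy usage: the first four indices are mutual equivalents, the last one is not.
hkls = [(1, 2, 3), (2, -3, 3), (-3, 1, 3), (-1, -2, -3), (4, 0, 1)]
print(symmetry_consistent_flags(hkls, fraction=0.5, seed=1))
```

Thin-shell selection achieves much the same thing indirectly, since symmetry-related reflections have identical resolution.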

Eleanor




Re: [ccp4bb] Rfree below Rwork

2015-06-30 Thread Keller, Jacob
Regarding what Eleanor said:

The program “labelit”, available within Phenix from the command line, can 
check automatically for higher-symmetry space groups.

The labelit documentation/homepage mentions the command:

labelit.check_pdb_symmetry [pdb coordinate file] [data=mtz file]


JPK




[ccp4bb] Rfree below Rwork

2015-06-30 Thread wtempel
Hello,
my question concerns refinement of a structure with 6-fold NCS (local
automatic restraints in REFMAC) against 2.8 A data. The size of my free set
is 1172 selected in thin resolution shells (SFTOOLS) and corresponding to
4.3 % of reflections.
A refmac run of 10 cycles of TLS and 10 cycles of CGMAT starts out at
Rfree/Rcryst 0.271/0.272. After the 10th TLS cycle I have 0.227/0.224. Yes,
Rfree < Rcryst. At the end of CGMAT I have 0.2072/0.2071.
I understand that NCS stresses the independence assumption of the free set.
Am I correct in believing that Rfree *may* be smaller than Rcryst even in
the absence of a major mistake? My hope is that the combined wisdom of
ccp4bb followers can point out my possible mistake, suggest tests that I
may perform to avoid them and, possibly, arguments in defense of a
crystallographic model with Rfree < Rcryst.
Many thanks,
Wolfram Tempel