[ccp4bb] AW: [ccp4bb] Rfree below Rwork
You are right. After I sent the email to the bulletin board, I realized that in R32 there must be more than unit cells but did not send a correction. Next time, I will check the space group before sending an email.

Best regards,
Herman

From: Oganesyan, Vaheh [mailto:oganesy...@medimmune.com]
Sent: Thursday, 2 July 2015 15:48
To: Schreuder, Herman RD/DE; CCP4BB@JISCMAIL.AC.UK
Subject: RE: [ccp4bb] Rfree below Rwork

Hi Herman,

While you're correct regarding increase in number of entities in the asu upon lowering the symmetry, you're not correct for specific case of R32. One molecule per asu in R32 equals 18 molecules per asu in P1.

Regards,
Vaheh Oganesyan
www.medimmune.com

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of herman.schreu...@sanofi.com
Sent: Wednesday, July 01, 2015 7:34 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] AW: [ccp4bb] Rfree below Rwork

Dear Boaz,

One can equally well describe an R32 crystal with one molecule in the asymmetric unit as P1 and 6 molecules in the asymmetric unit. In this case, the NCS in P1 is identical to the crystallographic symmetry in R32.

Best,
Herman

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Boaz Shaanan
Sent: Wednesday, 1 July 2015 12:10
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rfree below Rwork

Just wondering about Eleanor's interesting remark: would the Rf Rw go as low as reported by Wolfram (0.22) in case of a wrong space group?

Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel
E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220
Skype: boaz.shaanan
Fax: 972-8-647-2992 or 972-8-646-1710

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Eleanor Dodson [eleanor.dod...@york.ac.uk]
Sent: Tuesday, June 30, 2015 8:55 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rfree below Rwork

I suppose if I was the referee for this structure and your FreeR is so close to the Rfactor I would ask you to ensure you had the right space group - is the 6 fold NCS actually 2 fold NCS with a crystallographic 3 fold? Cases occur where R32 is indexed as C2.

Certainly, if the Rfree set is assigned randomly to reflections which are symmetry equivalents, then you see this phenomenon of Rfree = Rfactor.

Eleanor

On 30 June 2015 at 18:26, Gerard Bricogne g...@globalphasing.com wrote:

Dear Wolfram,

I have a perhaps optimistic view of the effect of high-order NCS on Rfree, in the sense that I don't view it as a problem. People have agonised to extreme degrees over the difficulty of choosing a free set of reflections that would produce the expected gap between Rwork and Rfree, and some of the conclusions were that you would need to hide almost half of your data in some cases!

I think it is best to remember that the idea of cross-validation by Rfree is to prevent overfitting, i.e. ending up with a model that fits the amplitudes too well compared to how well it determines the phases. In the case of high-order NCS (in your case, the U/V ratio that the old papers on NCS identified as the key quantity to measure the phasing power of NCS would be less than 0.1!) the phases and the amplitudes are so tightly coupled that it is simply impossible to fit the amplitudes without delivering phases of an equally good quality. In other words there is no overfitting problem (provided you do have good and complete data) and the difference between Rfree and Rwork is simply within the bounds of the statistical spread of Rfree depending on the free set chosen.

You are lucky to have 6-fold NCS, so don't let any reviewer convince you that it is a curse, and make you suffer for it :-) .

With best wishes,
Gerard.

On Tue, Jun 30, 2015 at 12:58:44PM -0400, wtempel wrote:

Hello,

my question concerns refinement of a structure with 6-fold NCS (local automatic restraints in REFMAC) against 2.8 Å data. The size of my free set is 1172 selected in thin resolution shells (SFTOOLS) and corresponding to 4.3 % of reflections. A refmac run of 10 cycles of TLS and 10 cycles of CGMAT starts out at Rfree/Rcryst 0.271/0.272. After the 10th TLS cycle I have 0.227/0.224. Yes, Rfree < Rcryst. At the end of CGMAT I have 0.2072/0.2071.

I understand that NCS stresses the independence assumption of the free set. Am I correct in believing that Rfree *may* be smaller than Rcryst even in the absence of a major mistake? My hope is that the combined wisdom of ccp4bb followers can point out my possible mistake, suggest tests that I may perform to avoid them and, possibly, arguments in defense of a crystallographic model with Rfree < Rcryst.

Many thanks,
Wolfram Tempel
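Gerard's remark about the statistical spread of Rfree can be put next to the numbers in Wolfram's post with a few lines of arithmetic. The sketch below is only an illustration: the rule of thumb sigma(Rfree) ≈ Rfree / sqrt(n_free) is an assumption brought in from outside this thread, and the function names are hypothetical.

```python
import math


def estimated_total_reflections(n_free: int, free_fraction: float) -> float:
    """Total reflection count implied by the free-set size and its fraction."""
    return n_free / free_fraction


def rfree_sigma(r_free: float, n_free: int) -> float:
    """Rough spread of Rfree over different free-set choices.

    ASSUMPTION (not from the thread): sigma ~ Rfree / sqrt(n_free).
    """
    return r_free / math.sqrt(n_free)


# Numbers quoted in the original post (end of the CGMAT cycles).
n_free = 1172
free_fraction = 0.043
r_free, r_cryst = 0.2072, 0.2071

total = estimated_total_reflections(n_free, free_fraction)
sigma = rfree_sigma(r_free, n_free)
gap = r_free - r_cryst

print(f"implied total reflections: {total:.0f}")
print(f"estimated sigma(Rfree):    {sigma:.4f}")
print(f"Rfree - Rcryst:            {gap:.4f}")
print("gap within one sigma:", abs(gap) < sigma)
```

Under that assumed rule of thumb, a gap of a few units in the fourth decimal place is much smaller than the scatter expected from simply choosing a different free set of this size.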
Re: [ccp4bb] AW: [ccp4bb] Rfree below Rwork
Hi Herman and Boaz,

in the trigonal setting R32 (not in the hexagonal setting H32), the unit cell in R32 contains 6 copies. If you take the whole R32 unit cell as a P1 cell, you would have 6 copies in the asymmetric unit, as Herman wrote.

Best regards,
Dirk.
Re: [ccp4bb] Rfree below Rwork
Hi Herman,

While you're correct regarding increase in number of entities in the asu upon lowering the symmetry, you're not correct for specific case of R32. One molecule per asu in R32 equals 18 molecules per asu in P1.

Regards,
Vaheh Oganesyan
www.medimmune.com
Re: [ccp4bb] AW: [ccp4bb] Rfree below Rwork
Dirk, you're right. With rhombohedral setting there are only six copies of asymmetric units in the unit cell. So, technically, Herman was not wrong.

Regards,
Vaheh Oganesyan
www.medimmune.com
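The two copy counts discussed for R32 (6 and 18) differ only by the number of lattice points in the chosen cell. A minimal sketch, assuming the rotation parts of the R32 general positions in hexagonal axes are generated by the 3-fold (-y, x-y, z) and the 2-fold (y, x, -z); the helper names are hypothetical:

```python
# Rotation matrices in hexagonal axes (assumed generators of point group 32).
THREE_FOLD = ((0, -1, 0), (1, -1, 0), (0, 0, 1))  # x,y,z -> -y, x-y, z
TWO_FOLD = ((0, 1, 0), (1, 0, 0), (0, 0, -1))     # x,y,z -> y, x, -z
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def matmul(a, b):
    """Product of two 3x3 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def generate_group(generators):
    """Close a set of generators under multiplication (finite group)."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        current = frontier.pop()
        for gen in generators:
            candidate = matmul(gen, current)
            if candidate not in group:
                group.add(candidate)
                frontier.append(candidate)
    return group


POINT_GROUP_32 = generate_group([THREE_FOLD, TWO_FOLD])


def copies_per_cell(n_lattice_points: int) -> int:
    """Asymmetric units per cell = point-group order x lattice points per cell."""
    return len(POINT_GROUP_32) * n_lattice_points


print("point-group order:", len(POINT_GROUP_32))
print("primitive rhombohedral cell (1 lattice point):", copies_per_cell(1))
print("hexagonal cell (3 lattice points):", copies_per_cell(3))
```

The primitive rhombohedral cell holds one lattice point, while the triply centred hexagonal cell holds three, which is where the factor of three between the two counts comes from.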
Re: [ccp4bb] Rfree below Rwork
Dear Smith,

when you expand to P1, pointless should suggest the space group you expanded from, unless you fiddled with the data after expansion.

Regards,
Tim

On 07/01/2015 04:43 AM, Smith Liu wrote:

If both the PDB and mtz for the pdb have been assigned to P1 space group for some reason, can this lead to Rwork higher than Rfree during refinement?

If after converting my PDB and mtz to P1 space group, and I have forgotten what is the original space group for my PDB and mtz before conversion to P1 space group, is there any method which can recover the original space group for my PDB and mtz, so that in the following refine Rwork would be lower than Rfree?

Smith

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen
phone: +49 (0)551 39 22149
GPG Key ID = A46BEE1A
Re: [ccp4bb] Rfree below Rwork
Just wondering about Eleanor's interesting remark: would the Rf Rw go as low as reported by Wolfram (0.22) in case of a wrong space group?

Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel
E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220
Skype: boaz.shaanan
Fax: 972-8-647-2992 or 972-8-646-1710
Re: [ccp4bb] AW: [ccp4bb] Rfree below Rwork
I wasn't suggesting the space group was wrong - just a lower symmetry equivalent of the true SG.

E.g. all structures can be solved in P1, but several of the molecules in the cell will be related by crystal symmetry operators. The same is true for the associated P1 intensities.

So IF you had assigned FreeR flags as for P1, it is more than likely that, say, h k l and -h k -l will have different FreeR status. And then the FreeR is not really free and it is very likely they will refine to much the same value.

However, if you had been careful to assign the FreeRs to the highest possible Laue symmetry then expand them to cover P1, you usually find that the FreeR and R differ as expected.
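Eleanor's recipe (assign the free flags in the highest possible Laue symmetry, then expand them to P1) can be sketched in a few lines. This is an illustration only, not how any CCP4 program does it: the Laue group 2/m with unique axis b is assumed so that it matches her h k l / -h k -l example, and all function names are hypothetical.

```python
import random

# ASSUMPTION: Laue group 2/m, unique axis b, written as sign changes on (h, k, l).
LAUE_2M_SIGNS = [(1, 1, 1), (-1, 1, -1), (-1, -1, -1), (1, -1, 1)]


def equivalents(hkl, signs=LAUE_2M_SIGNS):
    """All Laue-equivalent indices of one reflection."""
    return [tuple(s * i for s, i in zip(sign, hkl)) for sign in signs]


def canonical(hkl, signs=LAUE_2M_SIGNS):
    """Pick one representative per set of equivalents (here: the largest tuple)."""
    return max(equivalents(hkl, signs))


def assign_free_flags(reflections, free_fraction=0.05, seed=0):
    """Choose free reflections among unique representatives, then expand to P1.

    Every symmetry mate inherits the flag of its representative, so a free
    reflection never has an equivalent in the working set.
    """
    rng = random.Random(seed)
    unique = sorted({canonical(hkl) for hkl in reflections})
    n_free = max(1, round(free_fraction * len(unique)))
    free_representatives = set(rng.sample(unique, n_free))
    return {hkl: canonical(hkl) in free_representatives for hkl in reflections}


# A small P1-style list of indices (full sphere, origin excluded).
reflections = [
    (h, k, l)
    for h in range(-3, 4)
    for k in range(-3, 4)
    for l in range(-3, 4)
    if (h, k, l) != (0, 0, 0)
]
flags = assign_free_flags(reflections)
print("free reflections in P1 list:", sum(flags.values()))
print("flag of (1, 2, 3) equals flag of (-1, 2, -3):",
      flags[(1, 2, 3)] == flags[(-1, 2, -3)])
```

Assigning the flags directly on the P1 list with an independent random draw per reflection would instead let h k l and -h k -l fall into different sets, which is the situation Eleanor describes.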
[ccp4bb] AW: [ccp4bb] Rfree below Rwork
Dear Boaz,

One can equally well describe an R32 crystal with one molecule in the asymmetric unit as P1 with 6 molecules in the asymmetric unit. In this case, the NCS in P1 is identical to the crystallographic symmetry in R32.

Best,
Herman

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On behalf of Boaz Shaanan
Sent: Wednesday, 1 July 2015 12:10
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Rfree below Rwork

Just wondering about Eleanor's interesting remark: would the Rf Rw go as low as reported by Wolfram (0.22) in case of a wrong space group?

Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105 Israel
E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220
Skype: boaz.shaanan
Fax: 972-8-647-2992 or 972-8-646-1710
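The R32-versus-P1 bookkeeping in this message can be checked by counting symmetry operators. The sketch below is only an illustration: it assumes the standard generators for R32 in the hexagonal setting (a 3-fold along c, a 2-fold, and the R-centering translation), and the helper names `compose` and `close_group` are made up for this example. Without the centering it should find the 6 operators of point group 32, which corresponds to the primitive rhombohedral cell. With the centering it should find the operators of the hexagonal cell, which is the count behind the "18 molecules per asu in P1" correction given elsewhere in the thread.

```python
from fractions import Fraction
from itertools import product

# Symmetry operators as (rotation, translation) pairs in fractional coordinates.
# Assumed generators for R32, hexagonal setting:
#   3-fold along c:  (x, y, z) -> (-y, x-y, z)
#   2-fold:          (x, y, z) -> (y, x, -z)
#   R-centering:     translation (2/3, 1/3, 1/3)
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
ZERO = (Fraction(0),) * 3
THREEFOLD = (((0, -1, 0), (1, -1, 0), (0, 0, 1)), ZERO)
TWOFOLD = (((0, 1, 0), (1, 0, 0), (0, 0, -1)), ZERO)
CENTERING = (IDENTITY, (Fraction(2, 3), Fraction(1, 3), Fraction(1, 3)))


def compose(a, b):
    """Return the operator 'apply b, then a', with the translation reduced mod 1."""
    (ra, ta), (rb, tb) = a, b
    rot = tuple(
        tuple(sum(ra[i][k] * rb[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )
    trans = tuple(
        (sum(ra[i][k] * tb[k] for k in range(3)) + ta[i]) % 1 for i in range(3)
    )
    return rot, trans


def close_group(generators):
    """Multiply by the generators repeatedly until no new operators appear."""
    ops = {(IDENTITY, ZERO)}
    frontier = list(ops)
    while frontier:
        new = []
        for a, g in product(frontier, generators):
            c = compose(a, g)
            if c not in ops:
                ops.add(c)
                new.append(c)
        frontier = new
    return ops


primitive_ops = close_group([THREEFOLD, TWOFOLD])             # point-group part only
hexagonal_ops = close_group([THREEFOLD, TWOFOLD, CENTERING])  # with R-centering

print(len(primitive_ops), "operators without centering")
print(len(hexagonal_ops), "operators in the hexagonal cell")
```

Each operator generates one copy of the molecule in the cell, so the operator count equals the number of molecules that a P1 description of the same cell would need.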
Re: [ccp4bb] Rfree below Rwork
A valid concern, particularly in this case, and related to a question someone asked off-list. The model came from a collaborator when it was at an early stage of refinement (Rwork/Rfree 0.244/0.296). I followed up with steps I knew would additionally confound cross-validation:

1. I assigned a new free flag in thin shells to allow for the effect of NCS on free set independence (Kleywegt & Jones, 1996) http://dx.doi.org/10.1107/S0907444995014983.
2. I selected one of the NCS mates from the coordinates I received and used PHASER to position 6 copies of it in the asymmetric unit.

Afterward, I annealed the model (PHENIX.REFINE) to “remove the memory” (Brunger, 1993) http://dx.doi.org/10.1107/S0907444992007352 of free reflections, to the extent possible given the inherent interdependencies between reflections and NCS.

Further to several suggestions about the space group: I am refining the structure in R3 (hexagonal setting). XDS’s CORRECT.LP “SYMMETRY OF REFLECTION INTENSITIES” shows Rmeas(146) as 21% versus Rmeas(155) as 53%. The fact that I have experimented with both SCALEPACK and XDS intensities throughout my refinement should not matter in this context, as the MTZ files have consistent indices and the same FREE flag.

Additional testing has shown that mere omission of TLS parameterization gives Rwork/Rfree of 0.209/0.215 (versus 0.207/0.206 with 10 cycles of TLS refinement). For that comparison, I had to adjust REFMAC’s WEIGHT MATRIX to achieve similar bond/angle RMSDs. The latter is relevant here since, as another off-list respondent suggested, Rfree positively correlates with those RMSDs. I further suspect (but have not tested yet) that omission of explicit NCS restraints would also cause Rfree to rise.

I tend toward following Gerard Bricogne’s intuition about the sample variance of Rfree. It is mentioned briefly by Brunger (1993). Would this be the variance corresponding to Cruickshank’s expected value http://dx.doi.org/10.1107/S0907444995010638?
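The idea of a thin-shell free set can be sketched in a few lines. This is not the SFTOOLS algorithm; it is a minimal illustration, assuming that whole resolution shells of roughly equal reciprocal-space volume are flagged at random until the target fraction is reached. The function name `thin_shell_free_flags`, its parameters, and the synthetic d-spacings are all hypothetical.

```python
import random


def thin_shell_free_flags(d_spacings, free_fraction=0.05, n_shells=200, seed=0):
    """Flag whole thin resolution shells as 'free' until roughly
    free_fraction of all reflections is flagged. Returns a list of booleans."""
    # Bin on 1/d^3 so that shells enclose roughly equal reciprocal-space volume.
    s3 = [1.0 / d ** 3 for d in d_spacings]
    lo, hi = min(s3), max(s3)
    width = (hi - lo) / n_shells or 1.0
    shell_of = [min(int((v - lo) / width), n_shells - 1) for v in s3]

    counts = [0] * n_shells
    for s in shell_of:
        counts[s] += 1

    # Pick shells in random order until the target count is reached.
    rng = random.Random(seed)
    order = list(range(n_shells))
    rng.shuffle(order)
    target = free_fraction * len(d_spacings)
    free_shells, n_free = set(), 0
    for s in order:
        if n_free >= target:
            break
        free_shells.add(s)
        n_free += counts[s]
    return [s in free_shells for s in shell_of]


# Synthetic d-spacings between 40 A and 2.8 A, purely for illustration.
rng = random.Random(1)
d_spacings = [
    rng.uniform(1 / 40.0 ** 3, 1 / 2.8 ** 3) ** (-1.0 / 3.0) for _ in range(27000)
]
flags = thin_shell_free_flags(d_spacings, free_fraction=0.043)
print(sum(flags), "free reflections out of", len(flags))
```

Because every reflection in a chosen shell is free, reflections related by an NCS rotation (which preserves resolution) tend to end up on the same side of the work/free split. The price is that the achieved fraction can overshoot the target by up to one shell.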
[ccp4bb] AW: [ccp4bb] Rfree below Rwork
You should go back to the output you got during data processing to see which space groups were proposed by the data processing software, and reprocess in the correct space group. Alternatively, you could use Zanuda (the "validate space group" button in ccp4i) to find the correct space group. It will also reindex your mtz.

Best,
Herman
Re: [ccp4bb] Rfree below Rwork
If both the PDB and the mtz have been assigned to space group P1 for some reason, can this lead to Rwork higher than Rfree during refinement? If, after converting my PDB and mtz to P1, I have forgotten the original space group, is there any method that can recover the original space group for my PDB and mtz, so that in subsequent refinement Rwork would be lower than Rfree?

Smith
Re: [ccp4bb] Rfree below Rwork
Dear Wolfram,

I have a perhaps optimistic view of the effect of high-order NCS on Rfree, in the sense that I don't view it as a problem. People have agonised to extreme degrees over the difficulty of choosing a free set of reflections that would produce the expected gap between Rwork and Rfree, and some of the conclusions were that you would need to hide almost half of your data in some cases!

I think it is best to remember that the idea of cross-validation by Rfree is to prevent overfitting, i.e. ending up with a model that fits the amplitudes too well compared to how well it determines the phases. In the case of high-order NCS (in your case, the U/V ratio that the old papers on NCS identified as the key quantity to measure the phasing power of NCS would be less than 0.1!) the phases and the amplitudes are so tightly coupled that it is simply impossible to fit the amplitudes without delivering phases of an equally good quality. In other words there is no overfitting problem (provided you do have good and complete data) and the difference between Rfree and Rwork is simply within the bounds of the statistical spread of Rfree depending on the free set chosen.

You are lucky to have 6-fold NCS, so don't let any reviewer convince you that it is a curse, and make you suffer for it :-) .

With best wishes,
Gerard.

--
Gerard Bricogne    g...@globalphasing.com
Global Phasing Ltd.
Sheraton House, Castle Park    Tel: +44-(0)1223-353033
Cambridge CB3 0AX, UK          Fax: +44-(0)1223-366889
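The "statistical spread of Rfree depending on the free set chosen" can be given a rough scale with a small simulation. The sketch below uses entirely synthetic amplitudes (a Wilson-like distribution and a fixed relative error chosen to give an R factor near 0.2), so it says nothing about the actual data set. It only illustrates how much an R factor computed over 1172 reflections out of roughly 27,000 can move when a different subset is drawn. No refinement is involved, so this is pure sampling variance, not a measure of overfitting.

```python
import math
import random
import statistics

rng = random.Random(0)

N_FREE = 1172                      # free-set size quoted in the thread
N_TOTAL = round(N_FREE / 0.043)    # 4.3 % of reflections are free

# Synthetic amplitudes: Wilson-like |Fobs| and a model with ~26 % relative error.
f_obs = [math.sqrt(-math.log(1.0 - rng.random())) for _ in range(N_TOTAL)]
f_calc = [abs(f * (1.0 + rng.gauss(0.0, 0.26))) for f in f_obs]


def r_factor(indices):
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs| over the given reflections."""
    num = sum(abs(f_obs[i] - f_calc[i]) for i in indices)
    den = sum(f_obs[i] for i in indices)
    return num / den


# Draw many alternative "free sets" from the same data and the same model.
r_free_values = [
    r_factor(rng.sample(range(N_TOTAL), N_FREE)) for _ in range(200)
]

r_all = r_factor(range(N_TOTAL))
spread = statistics.stdev(r_free_values)
print(f"R over all reflections: {r_all:.4f}")
print(f"std. dev. of R over random free sets: {spread:.4f}")
```

Comparing the printed spread with the Rwork/Rfree differences reported in the thread (between 0.0001 and 0.003) gives a feel for whether a slightly negative gap needs any explanation beyond the choice of free set.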
Re: [ccp4bb] Rfree below Rwork
Hi Wolfram,

You didn't tell us where your model came from, but 10 cycles of TLS and 10 cycles of restrained refinement are not enough for a refinement to converge if you have just picked your test set. Try resetting your B-factors and doing 30-40 cycles of refinement in REFMAC.

Cheers,
Robbie
Re: [ccp4bb] Rfree below Rwork
I suppose if I were the referee for this structure and your FreeR is so close to the Rfactor, I would ask you to ensure you had the right space group: is the 6-fold NCS actually 2-fold NCS with a crystallographic 3-fold? Cases occur where R32 is indexed as C2. Certainly if the Rfree set is assigned randomly to reflections which are symmetry equivalents, then you see this phenomenon of Rfree = Rfactor.

Eleanor
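The point about free flags being assigned randomly across symmetry equivalents can be made concrete. The following is a minimal sketch, not taken from any CCP4 program: it assumes the possibly missed symmetry is a 3-fold along c in hexagonal indexing plus Friedel's law, and it gives every member of a symmetry orbit the same flag. The function names `equivalents` and `assign_free_flags` and the index list are hypothetical.

```python
import random


def equivalents(hkl):
    """Reflections related by a 3-fold along c (hexagonal indexing) and by
    Friedel's law. Assumption: this is the symmetry that may have been missed."""
    h, k, l = hkl
    i = -h - k
    rotated = [(h, k, l), (k, i, l), (i, h, l)]
    return rotated + [(-a, -b, -c) for a, b, c in rotated]


def assign_free_flags(reflections, free_fraction=0.05, seed=0):
    """Give every member of a symmetry orbit the same free/work flag."""
    rng = random.Random(seed)
    flag_of_orbit = {}
    flags = {}
    for hkl in reflections:
        key = min(equivalents(hkl))          # canonical representative of the orbit
        if key not in flag_of_orbit:
            flag_of_orbit[key] = rng.random() < free_fraction
        flags[hkl] = flag_of_orbit[key]
    return flags


# Small synthetic index list, for illustration only.
reflections = [
    (h, k, l) for h in range(-6, 7) for k in range(-6, 7) for l in range(0, 7)
]
flags = assign_free_flags(reflections)
print(sum(flags.values()), "free reflections out of", len(flags))
```

If flags were instead drawn independently per reflection, a free reflection would usually have symmetry mates in the working set carrying the same information, which is the situation in which Rfree collapses onto the R factor.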
Re: [ccp4bb] Rfree below Rwork
Regarding what Eleanor said: the program “labelit,” available within Phenix from the command line, can check automatically for higher-symmetry space groups. The labelit documentation/homepage mentions the command:

labelit.check_pdb_symmetry [pdb coordinate file] [data=mtz file]

JPK
[ccp4bb] Rfree below Rwork
Hello,

my question concerns refinement of a structure with 6-fold NCS (local automatic restraints in REFMAC) against 2.8 Å data. The size of my free set is 1172, selected in thin resolution shells (SFTOOLS) and corresponding to 4.3 % of reflections. A refmac run of 10 cycles of TLS and 10 cycles of CGMAT starts out at Rfree/Rcryst 0.271/0.272. After the 10th TLS cycle I have 0.227/0.224. Yes, Rfree < Rcryst. At the end of CGMAT I have 0.2072/0.2071.

I understand that NCS stresses the independence assumption of the free set. Am I correct in believing that Rfree *may* be smaller than Rcryst even in the absence of a major mistake? My hope is that the combined wisdom of ccp4bb followers can point out my possible mistake, suggest tests that I may perform to avoid them and, possibly, arguments in defense of a crystallographic model with Rfree < Rcryst.

Many thanks,
Wolfram Tempel