Re: [ccp4bb] Automated refinement convergence
Thank you all for the replies. My goal was to learn about practical ways to achieve convergence during refinement, even if the supplied model will never be able to account for the density adequately. I may experiment with Refmac's ability to terminate refinement when a criterion has been met, such as delta_Rfree [...]

To unsubscribe from the CCP4BB list, click the following link: https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/
Re: [ccp4bb] Automated refinement convergence
That is an interesting point, for those using big clusters to do extensive computing work. I assume universities are 'going green' but that it will take a while for this to happen. I just use my home computer and have my own solar panels and battery to keep things running.

cheers,
tom

From: CCP4 bulletin board on behalf of Guillaume Gaullier
Sent: Friday, January 19, 2024 9:58 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Automated refinement convergence

[...] in this day and age I think we should all reflect on whether this value is worth it, given that computing is also contributing to cooking the planet. [...]
Re: [ccp4bb] Automated refinement convergence
Hello all,

While there is definitely scientific value in reaching perfect convergence by "cooking" refinement jobs "à point", in this day and age I think we should all reflect on whether this value is worth it, given that computing is also contributing to cooking the planet. I am not advocating for poorly refined models, but for finding a reasonable baseline with no more cooking than is necessary to achieve satisfactory models. I know, define "necessary" and "satisfactory"... they are not the same if you want a reasonable model to answer a biological question or if you want to benchmark refinement strategies...

Here is an interesting resource if you worry about the sustainability of your computing: https://www.green-algorithms.org/
The calculator is nice too: https://calculator.green-algorithms.org/

Cheers,
Guillaume

---
Guillaume Gaullier, PhD
Researcher, Blikstad group
Molecular Biomimetics / Microbial Chemistry
Department of Chemistry - Ångström
Uppsala University
Lägerhyddsvägen 1
752 37 Uppsala
Sweden

From: CCP4 bulletin board on behalf of Nigel Moriarty
Sent: Friday, January 19, 2024 2:10:42 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Automated refinement convergence

I, too, love the smell of cooking (or cooked?) jobs in the morning. [...]
Re: [ccp4bb] Automated refinement convergence
I, too, love the smell of cooking (or cooked?) jobs in the morning.

Cheers

Nigel

---
Nigel W. Moriarty
Building 33R0349, Molecular Biophysics and Integrated Bioimaging
Lawrence Berkeley National Laboratory
Berkeley, CA 94720-8235
Email : nwmoria...@lbl.gov
Web : CCI.LBL.gov
ORCID : orcid.org/-0001-8857-9464

On Thu, Jan 18, 2024 at 1:16 PM James Holton wrote:
> [...]
> That's what I do. Maybe I just want to feel like I've got something
> cooking while I sleep...
> [...]
Re: [ccp4bb] Automated refinement convergence
Hey there Robert,

Refmac has a keyword called "kill" that I think is what you are looking for. It is documented here:

https://www2.mrc-lmb.cam.ac.uk/groups/murshudov/content/refmac/refmac_keywords.html

You can specify a conditional exit based on R factor, etc. Or you can just create a specified file containing "stop Y" from an external process. I use it when running refmac on a cluster that has run time limits but difficult-to-predict CPU speeds.

Phenix, I don't think has a checkpointing feature. Not that I know of.

Amber does support checkpointing and now counts as a refinement program since support for structure factor restraints was added in 22.

Personally, when I do refinements I do dozens to hundreds of macro-macro-cycles. As in, take the pdb file output by one run and feed it into another run. There is an instantiation overhead to doing this, as you note, but I like my models to be super converged. I define convergence as the x,y,z,B and occ values in the pdb file are not changed by the refinement program. This does not happen quickly, but it does eventually happen. Yes, you can get oscillations, but one way to deal with those is to add a bit more damping, or to adjust the x-ray weight down and then up and then back to auto again. This "weight snap" tends to take things that were dangling from a cliff in the energy landscape and knock them to the ground. After that, the oscillations are less common.

And like an equilibrated chromatography column, an xyz-converged model is the best way to know that when you edit and re-refine, everything you see is due to the edit, and not some other process that just wasn't finished yet.

That's what I do. Maybe I just want to feel like I've got something cooking while I sleep...

Cheers,

-James Holton
MAD Scientist

On 1/18/2024 3:04 AM, Robert Oeffner wrote:
> [...] the ability of refinement programs to automatically terminate once
> the refinement has reached convergence. [...]
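The convergence definition given above (x, y, z, B and occupancy no longer changed by the refinement program) can be checked mechanically between two successive runs. What follows is only a minimal sketch, assuming standard fixed-column PDB ATOM/HETATM records and that both files list the same atoms in the same order. The function names and the tolerance argument are hypothetical and are not part of Refmac, Phenix or Amber.

```python
from typing import List, Tuple

# x, y, z, occupancy, B factor for one atom
Params = Tuple[float, float, float, float, float]


def read_atom_params(pdb_text: str) -> List[Params]:
    """Extract x, y, z, occupancy and B from ATOM/HETATM records.

    Assumes the standard fixed-column PDB layout (columns 31-66).
    """
    params = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            params.append((
                float(line[30:38]),  # x
                float(line[38:46]),  # y
                float(line[46:54]),  # z
                float(line[54:60]),  # occupancy
                float(line[60:66]),  # B factor
            ))
    return params


def max_shift(before: str, after: str) -> float:
    """Largest absolute change in any refined value between two models."""
    a, b = read_atom_params(before), read_atom_params(after)
    if len(a) != len(b):
        raise ValueError("models differ in atom count")
    return max(
        (abs(p - q) for pa, pb in zip(a, b) for p, q in zip(pa, pb)),
        default=0.0,
    )


def is_converged(before: str, after: str, tol: float = 0.0) -> bool:
    """True if no value moved by more than tol (hypothetical threshold).

    tol=0.0 means the values written to the two files are identical.
    """
    return max_shift(before, after) <= tol
```

In a chained "macro-macro-cycle" setup, the text of the input and output PDB files of each run would be passed to `is_converged`, and the chain would stop once it returns True. A non-zero `tol` is a looser criterion than the one described in the message.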
Re: [ccp4bb] Automated refinement convergence
Hi Robert,

I see your point, but extending the number of cycles to reach convergence has a big risk of going into infinite loops (which you point out). In the case of Refmac, stopping early is not really needed as it is very fast anyway; a few unnecessary cycles won't take that long. Generally it is better to err in the direction of having too many cycles than too few, especially in your 'final' refinement. This is also the logic applied to pdb-redo: it does 20 cycles by default (more than the typical number for Refmac) and then extends the number of cycles when it uses options that slow down convergence (jelly body, anisotropic B-factors, new data from paired refinement). This (almost) always leads to something that we can call "convergence", at least for models that were at the final stages of model building.

That said, I have only really achieved convergence in Refmac (i.e. gradients are '0') once in twenty years, and that was after more than 500 cycles of jelly body refinement.* Apparently, there is a large step between "things don't change a lot anymore" and real convergence.

Cheers,
Robbie

* Refmac crashed at that point. A division by zero, if I remember correctly.

> -----Original Message-----
> From: CCP4 bulletin board On Behalf Of Robert Oeffner
> Sent: Thursday, January 18, 2024 12:05
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: [ccp4bb] Automated refinement convergence
>
> [...] There could be cases where alternate conformations of a structure will
> repeatedly be oscillating in and out of density thus causing the refinement
> metrics also to oscillate. [...]
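The infinite-loop risk raised above is the usual reason to pair any "extend until converged" rule with a hard cap on the number of cycles. Below is a minimal sketch of that pattern, not a description of how Refmac or pdb-redo work internally. `run_cycle` is a hypothetical callable standing in for one refinement cycle and returning the largest parameter shift. The default values only echo numbers mentioned in the message (20 and 500) and are illustrative, as is the shift tolerance.

```python
from typing import Callable, Tuple


def refine_with_cap(run_cycle: Callable[[int], float],
                    min_cycles: int = 20,
                    max_cycles: int = 500,
                    shift_tol: float = 1e-3) -> Tuple[int, bool]:
    """Run cycles until the shift falls below shift_tol, with a hard cap.

    run_cycle(n) is a hypothetical stand-in for refinement cycle n and
    must return the largest parameter shift produced by that cycle.
    Returns (cycles_run, converged).
    """
    if not 0 < min_cycles <= max_cycles:
        raise ValueError("need 0 < min_cycles <= max_cycles")
    for cycle in range(1, max_cycles + 1):
        shift = run_cycle(cycle)
        # Always run at least min_cycles, erring toward too many cycles.
        if cycle >= min_cycles and shift < shift_tol:
            return cycle, True
    # Cap reached: an oscillating model ends here instead of looping forever.
    return max_cycles, False
```

The minimum keeps the behaviour of a fixed default when convergence is quick, and the maximum bounds the cost when the shifts never settle.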
[ccp4bb] Automated refinement convergence
Hi,

I am wondering if authors of refinement programs would like to consider putting on their users' wish list the ability of refinement programs to automatically terminate once the refinement has reached convergence. Various refinement metrics such as R factors, CC or RMS values typically will reach a plateau once the refinement of a macromolecular structure with X-ray or EM data has converged, and further macro-cycles of refinement will no longer improve the structure. The default number of macro-cycles in programs such as Phenix-refine and Refmac is probably sensible for most cases, but in some cases it would be nice if the programs automatically extended the number of macro-cycles as needed (or decreased the number).

The user can of course examine log files from refinement themselves and decide whether to continue refinement. But since starting a new session of refinement appears to always create an initial fluctuation in the refinement metrics before they align with the values of the last macro-cycles in the previous refinement session, the user is compelled to do at least, say, 3 or more macro-cycles in addition to whatever may be needed for reaching convergence. I guess it would therefore be more efficient if this was implemented directly in the refinement programs and presented as an option for the user to choose.

There could be cases where alternate conformations of a structure will repeatedly be oscillating in and out of density, thus causing the refinement metrics also to oscillate. Hopefully such cases could be covered by gauging the level of fluctuations of the refinement metrics and terminating the refinement accordingly.

Many thanks,

Robert
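One way to picture the plateau and fluctuation test described above is a small classifier over the per-macro-cycle values of a metric such as R-free. This is a minimal sketch under stated assumptions. The function name, the window size and the tolerance are hypothetical choices and are not taken from Phenix or Refmac.

```python
from typing import Sequence


def classify_trend(values: Sequence[float],
                   window: int = 4,
                   tol: float = 0.001) -> str:
    """Classify a metric series as 'running', 'plateau' or 'oscillating'.

    plateau:     the last `window` values span no more than tol.
    oscillating: the last `window` values span more than tol, but their
                 mean is within tol of the mean of the preceding window,
                 so the metric fluctuates without net progress.
    running:     anything else, including series that are too short.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    if len(values) < window:
        return "running"
    recent = values[-window:]
    if max(recent) - min(recent) <= tol:
        return "plateau"
    if len(values) >= 2 * window:
        previous = values[-2 * window:-window]
        drift = abs(sum(recent) / window - sum(previous) / window)
        if drift <= tol:
            return "oscillating"
    return "running"
```

A refinement driver could stop on either "plateau" or "oscillating" and keep going on "running". The default window is even so that a simple two-cycle oscillation averages out between consecutive windows; slower oscillations would need a wider window.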