Re: [ccp4bb] Automated refinement convergence

2024-01-23 Thread Robert Oeffner
Thank you all for the replies. My goal was to learn about practical ways to 
achieve convergence during refinement even when the supplied model will never 
be able to account for the density adequately. I may experiment with refmac's 
ability to terminate refinement when a criterion has been met, such as 
delta_Rfree

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] Automated refinement convergence

2024-01-19 Thread Tom Peat
That is an interesting point, for those using big clusters to do extensive 
computing work.
I assume universities are 'going green' but that it will take a while for this 
to happen.
I just use my home computer and have my own solar panels and battery to keep 
things running.
cheers, tom



Re: [ccp4bb] Automated refinement convergence

2024-01-19 Thread Guillaume Gaullier
Hello all,


While there is definitely scientific value in reaching perfect convergence by 
"cooking" refinement jobs "à point", in this day and age I think we should all 
reflect on whether this value is worth it, given that computing is also 
contributing to cooking the planet.

I am not advocating for poorly refined models, but for finding a reasonable 
baseline with no more cooking than is necessary to achieve satisfactory models. 
I know, define "necessary" and "satisfactory"... they are not the same if you 
want a reasonable model to answer a biological question or if you want to 
benchmark refinement strategies...

Here is an interesting resource if you worry about the sustainability of your 
computing: https://www.green-algorithms.org/
The calculator is nice too: https://calculator.green-algorithms.org/

Cheers,


Guillaume


---

Guillaume Gaullier, PhD
Researcher, Blikstad group
Molecular Biomimetics / Microbial Chemistry
Department of Chemistry - Ångström
Uppsala University
Lägerhyddsvägen 1
752 37 Uppsala
Sweden




Re: [ccp4bb] Automated refinement convergence

2024-01-18 Thread Nigel Moriarty
I, too, love the smell of cooking (or cooked?) jobs in the morning.

Cheers

Nigel

---
Nigel W. Moriarty
Building 33R0349, Molecular Biophysics and Integrated Bioimaging
Lawrence Berkeley National Laboratory
Berkeley, CA 94720-8235
Email : nwmoria...@lbl.gov
Web  : CCI.LBL.gov
ORCID : orcid.org/-0001-8857-9464



Re: [ccp4bb] Automated refinement convergence

2024-01-18 Thread James Holton

Hey there Robert,

Refmac has a keyword called "kill" that I think is what you are looking 
for.  It is documented here:

https://www2.mrc-lmb.cam.ac.uk/groups/murshudov/content/refmac/refmac_keywords.html

  You can specify a conditional exit based on R factor, etc. Or you can 
just create a specified file containing "stop Y" from an external 
process.  I use it when running refmac on a cluster that has run time 
limits but difficult-to-predict CPU speeds.
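
A minimal sketch of the external-process route, in Python. It does nothing more than write the text "stop Y" to a file once a wall-clock budget has been used up. The helper names, the file path and the time budget are assumptions made for illustration; the path has to match whatever file the refmac "kill" keyword was pointed at, and the exact keyword syntax should be taken from the documentation linked above rather than from this sketch.

```python
import time
from pathlib import Path


def request_stop(stop_file: Path) -> None:
    """Write the "stop Y" marker that the running job is expected to pick up."""
    stop_file.write_text("stop Y\n")


def stop_after(stop_file: Path, budget_seconds: float,
               poll_seconds: float = 1.0) -> None:
    """Wait until the wall-clock budget is spent, then request a clean stop.

    The wait is split into short sleeps so the loop could later be extended
    with other checks (for example, a test that the job is still alive).
    """
    deadline = time.monotonic() + budget_seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        time.sleep(min(poll_seconds, remaining))
    request_stop(stop_file)
```

On a cluster with a 4 hour limit one might call `stop_after(Path("refmac_stop.txt"), budget_seconds=3.5 * 3600)` from a background process started alongside the refinement, where `refmac_stop.txt` is a hypothetical file name.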


I don't think Phenix has a checkpointing feature, at least not that I know of.

Amber does support checkpointing and now counts as a refinement program 
since support for structure factor restraints was added in 22.


Personally, when I do refinements I do dozens to hundreds of 
macro-macro-cycles. As in, take the pdb file output by one run and feed 
it into another run. There is an instantiation overhead to doing this, 
as you note, but I like my models to be super converged. I define 
convergence as the point where the x, y, z, B and occ values in the pdb file 
are no longer changed by the refinement program. This does not happen quickly, but it 
does eventually happen. Yes, you can get oscillations, but one way to 
deal with those is to add a bit more damping, or to adjust the x-ray 
weight down and then up and then back to auto again. This "weight snap" 
tends to take things that were dangling from a cliff in the energy 
landscape and knock them to the ground. After that, the oscillations are 
less common.
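
The convergence test described here can be sketched in Python as a comparison of the x, y, z, occupancy and B columns of two PDB files, wrapped in a loop that feeds each output back in as the next input. The function names, the `tol` parameter and the `run_once` callable (a stand-in for launching one refinement run and returning the output pdb text) are all hypothetical; only the fixed column positions come from the PDB format itself. This is an illustration of the idea, not the script actually used.

```python
from typing import Callable, Dict, Tuple

Params = Tuple[float, ...]  # x, y, z, occ, B


def read_atom_params(pdb_text: str) -> Dict[str, Params]:
    """Map each ATOM/HETATM record to its (x, y, z, occ, B) values.

    The key is columns 13-27 of the record: atom name, altloc, residue
    name, chain ID, residue number and insertion code.
    """
    atoms: Dict[str, Params] = {}
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms[line[12:27]] = (
                float(line[30:38]),  # x
                float(line[38:46]),  # y
                float(line[46:54]),  # z
                float(line[54:60]),  # occupancy
                float(line[60:66]),  # B factor
            )
    return atoms


def max_shifts(before: str, after: str) -> Params:
    """Largest absolute change per parameter over atoms found in both files."""
    a, b = read_atom_params(before), read_atom_params(after)
    common = a.keys() & b.keys()
    if not common:
        raise ValueError("the two models have no atoms in common")
    return tuple(max(abs(a[k][i] - b[k][i]) for k in common) for i in range(5))


def is_converged(before: str, after: str, tol: float = 0.0) -> bool:
    """True when no parameter moved by more than tol between two runs."""
    return all(shift <= tol for shift in max_shifts(before, after))


def refine_until_converged(model: str, run_once: Callable[[str], str],
                           max_runs: int = 100,
                           tol: float = 0.0) -> Tuple[str, int]:
    """Feed each output model into the next run until nothing changes.

    Returns the final model text and the number of runs performed.
    """
    for n in range(1, max_runs + 1):
        new_model = run_once(model)
        if is_converged(model, new_model, tol):
            return new_model, n
        model = new_model
    return model, max_runs
```

With `tol=0.0` the loop stops only when the printed values are identical, which matches the strict definition above; a small positive `tol` would stop earlier. The `max_runs` cap is there because oscillations can otherwise keep the loop going indefinitely.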


 And like an equilibrated chromatography column, an xyz-converged model 
is the best way to know that when you edit and re-refine, everything you 
see is due to the edit, and not some other process that just wasn't 
finished yet.


That's what I do. Maybe I just want to feel like I've got something 
cooking while I sleep...


Cheers,

-James Holton
MAD Scientist


Re: [ccp4bb] Automated refinement convergence

2024-01-18 Thread Robbie Joosten
Hi Robert,

I see your point, but extending the number of cycles to reach convergence 
carries a big risk of going into infinite loops (which you point out). In the 
case of Refmac, stopping early is not really needed as it is very fast anyway; 
a few unnecessary cycles won't take that long. Generally it is better to err in 
the direction of having too many cycles than too few, especially in your 
'final' refinement. This is also the logic applied in pdb-redo: it does 20 
cycles by default (more than the typical number for Refmac) and then extends 
the number of cycles when it uses options that slow down convergence (jelly 
body, anisotropic B-factors, new data from paired refinement). This (almost) 
always leads to something that we can call "convergence", at least for models 
that were at the final stages of model building.

That said, I have only really achieved convergence in Refmac (i.e. gradients 
are '0') once in twenty years and that was after more than 500 cycles of jelly 
body refinement.* Apparently, there is a large step between "things don't 
change a lot anymore" and real convergence.   

Cheers,
Robbie

* Refmac crashed at that point. A division by zero, if I remember correctly.



[ccp4bb] Automated refinement convergence

2024-01-18 Thread Robert Oeffner
Hi,

I am wondering if authors of refinement programs would like to consider putting 
on their users' wish list the ability of refinement programs to automatically 
terminate once the refinement has reached convergence. Various refinement 
metrics such as R factors, CC or RMS values will typically reach a plateau once 
the refinement of a macromolecular structure with X-ray or EM data has 
converged, and further macro-cycles of refinement will no longer improve the 
structure. The default number of macro-cycles in programs such as Phenix-refine 
and Refmac is probably sensible for most cases, but in some cases it would be 
nice if the programs automatically extended the number of macro-cycles as 
needed (or decreased the number).

The user can of course examine log files from refinement themselves and decide 
whether to continue refinement. But since starting a new session of refinement 
appears to always create an initial fluctuation in the refinement metrics 
before they align with the values of the last macro-cycles in the previous 
refinement session, the user is compelled to do at least, say, 3 or more 
macro-cycles in addition to whatever may be needed for reaching convergence. I 
guess it would therefore be more efficient if this was implemented directly in 
the refinement programs and presented as an option for the user to choose.

There could be cases where alternate conformations of a structure will 
repeatedly oscillate in and out of density, thus causing the refinement 
metrics to oscillate as well. Hopefully such cases could be covered by gauging 
the level of fluctuation of the refinement metrics and terminating the 
refinement accordingly.
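
To make the idea concrete, here is a rough Python sketch of how a program might classify the recent history of one metric (for example R-free after each macro-cycle) as still improving, on a plateau, or oscillating. The function name, the window length, the tolerance and the "drift smaller than half the spread" rule are all assumptions chosen for illustration; none of them come from Phenix or Refmac.

```python
from typing import Sequence


def classify_history(values: Sequence[float], window: int = 5,
                     tol: float = 1e-4) -> str:
    """Classify the tail of a metric history.

    Returns one of:
      "plateau"      the last `window` values differ by at most `tol`
      "oscillating"  the values still scatter by more than `tol`, but their
                     mean has barely moved compared with the window before
      "running"      anything else (too little data, or a clear trend)
    """
    if len(values) < window:
        return "running"
    recent = list(values[-window:])
    spread = max(recent) - min(recent)
    if spread <= tol:
        return "plateau"
    if len(values) >= 2 * window:
        previous = list(values[-2 * window:-window])
        drift = abs(sum(recent) / window - sum(previous) / window)
        # Scatter without net movement is treated as oscillation.
        if drift < 0.5 * spread:
            return "oscillating"
    return "running"
```

A refinement driver could call this after every macro-cycle and stop on "plateau", stop with a warning on "oscillating", and otherwise continue up to some maximum number of cycles.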

Many thanks,

Robert


