That summarises the differences between the use of the 'full_analysis.py' script and Modelfree4 using the FAST-Modelfree interface quite concisely. I'll just expand or explain a few of those points. There are really four important differences here: model-free model selection; model-free model elimination; model-free optimisation; and the strategy for obtaining the global description of the Brownian rotational diffusion tensor together with all model-free models and parameters.
1 Model selection In the 'full_analysis.py' script AIC model selection is employed. The reason for using this criterion is because the global problem is sought by minimising the Kullback-Leibler discrepancy, more about this later. In the FAST-Modelfree interface to Modelfree4, the ANOVA step-up hypothesis testing of Mandel et al., 1995 is used. I've shown in d'Auvergne and Gooley, JBNMR, 2003, 25(1), 25-36 that there are significant deficiencies in the hypothesis testing model selection. Specifically there are two flaws: not selecting a model when one ought to be selected and under-fitting. If no model is selected (when one should be!) then there will be segments of the macromolecule which cannot be dynamically described (but which should be). The consequences of under-fitting are that S2 is overestimated and te and Rex parameters are underestimated by being dropped from the final model. These two flaws cause the molecule to appear more rigid than reality. This is what you are seeing Doug with the higher proportion of models m3 to m5. 2 Model elimination This may or may not be causing differences between the results. Essentially if a model-free model has failed, the 'full_analysis.py' script will kick it out prior to model selection. See d'Auvergne and Gooley, JBNMR, 2006, 35(2), 117-135 for more details. 3 Optimisation This point will make a major difference to the results. For the optimisation of the model-free models (ignore the optimisation of the diffusion tensor for now) there are 4 optimisation issues: optimisation precision; failure of the Levenberg-Marquardt minimisation (in relax, Modelfree4, and Dasha); failure of the limits algorithm; and a bug in Modelfree4. I have a paper in submission which fully explores each of these issues. The difference in the precision of optimisation between the default in relax and those values hard-coded into model-free is 20 orders of magnitude! For more details see the archived post located at https://mail.gna.org/public/relax-devel/2006-10/msg00122.html (Message-id: <[EMAIL PROTECTED]>). There are a few more details inter-dispersed in the thread to which that post belongs starting at https://mail.gna.org/public/relax-devel/2006-10/msg00114.html (Message-id: <[EMAIL PROTECTED]>). relax can easily be set to the low precision of Modelfree4 however I wouldn't recommend it as the convolution of the model-free space will mean that early termination of optimisation due to low precision will result in parameter values far from the true values. The Levenberg-Marquart algorithm which is the only optimisation algorithm in Modelfree4, one of two in Dasha, or one of many in relax is also an issue. The problem is described in the fine print of the algorithm - the singular matrix failure of the Levenberg-Marquardt matrix. This is often described as being rarely encountered. Yet in model-free analysis the singular matrix failure is actually quite common. It occurs when ever an internal correlation time parameter becomes undefined - i.e. when the corresponding order parameter is equal to one. In this case changing the correlation time has no effect. There are two things which amplify the issue, both the grid search and the limits algorithm significantly increase the probability of having an S2 value of 1. This issue is a hidden issue as those models in which the Levenberg-Marquardt algorithm has failed are often not selected by the model selection algorithm as their optimised chi-squared value is overestimated. The limits algorithm used in Modelfree4 is another point of failure. This can be pictured as follows (taken from a submitted paper). Say minimisation is constrained within a cube arbitrarily placed within a space. Let there be a single minimum located towards one face of the cube. It is simultaneously a local and global minimum within the cube. If the minimum is much narrower than the length between points of the grid search it is conceivable that a moderate curvature of the space will cause the grid search algorithm to select a position distant from the minimum. This often occurs within the model-free space because of the shallow, curved valley which starts at infinite correlation times and heads down to the minimum. Assuming only one minimum within the entire space, optimisation without constraints will follow a trajectory determined by the curvature of the space from the initial position to the minimum. If the trajectory is contained within the cube, constraints should not influence optimisation. However if part of the trajectory lies outside the cube the constraint algorithm will influence whether the minimum will be found. Where the trajectory traverses the surface of the cube if, between the exit and reentry points, there is a downhill path where the gradient is always negative, then this path should be followed to allow the minimum to be found. The constrained trajectory should be similar to the unconstrained trajectory for those parts within the cube. The parts outside the cube should be replaced by a trajectory along the face of the cube between the exit and entry points. Within the model-free space this hypothetical situation does occur due to the convoluted nature of the space. However Modelfree4 does not follow the downhill path along the constraint and optimisation is terminated far from the minimum. The last difference is caused by a bug in the Modelfree4 Levenberg-Marquardt algorithm whereby optimisation is terminated early. In a paper that has been submitted, I've shown that between 13 to 45% of residues or spin systems are affected by this issue dependant on the model-free model. 4 Optimisation of the global model This one is quite complex and is in another manuscript I have submitted for publication. Essentially in Modelfree4 using the FAST-Modelfree interface you are forced to follow the paradigm of starting the analysis using an initial estimate of the diffusion tensor first used in Kay et al., Biochem, 1989, 28(23), 8972-8979. Using this estimate you then optimise the model-free models. The 'full_analysis.py' script takes a completely different approach to solving the simultaneous optimisation and model selection global problem (the diffusion tensor + all model-free models for all spin systems). For details, see the post at https://mail.gna.org/public/relax-users/2006-10/msg00009.html (Message-id: <[EMAIL PROTECTED]>) and all the other messages following from Sebastien Morin's post at https://mail.gna.org/public/relax-users/2006-10/msg00007.html (Message-id: <[EMAIL PROTECTED]>). I hope that that sufficiently describes the differences in the results! Cheers, Edward On 12/21/06, Chris MacRaild <[EMAIL PROTECTED]> wrote:
Hi Doug, I've done similar comparisons and come to similar results. There are a few things to keep in mind when trying to rationalise these differences. First, the approach coded in full_analysis.py makes a serious attempt to optimise both the rotational diffusion tensor, as well as the local dynamic parameters. Modelfree, on the other hand, relies on you having a good estimate of the tensor before you start. So the first thing to check is whether the diffusion tensor relax gets agrees with the one you gave Modelfree - if not, all bets are off with respect to the dynamic parameters. Second, the model selection used by relax is different to that used by Modelfree, so relax will in some cases pick different models, even with everything else being equal. Edward can elaborate on why the relax approach is superior, I'm sure... Third, the optimisation code in relax is much more up-to-date, so is better at finding the true best fit for any given model to your data. Finally, its worth keeping in mind that in many cases, dynamic parameters are poorly defined, even by good data. Even very big differences in tau_e, eg. are not always significant. The difference that would concern me is if there are dramatic differences in order parameters - S2 is generally fairly robust to the above issues, within reason. Cheers, Chris On Wed, 2006-12-20 at 16:18 -0500, Douglas Kojetin wrote: > Hi All, > > Has anyone compared runs of relax (m1 through m5; full_analysis.py > script) vs. a traditional fastmodelfree/modelfree run using the > binary provided by the Palmer group? I have ... I think I'm using > similar parameters for both runs, and I'm seeing a drastic difference > in results (models chosen). > > Thanks in advance for the input, > Doug > > _______________________________________________ > relax (http://nmr-relax.com) > > This is the relax-users mailing list > [email protected] > > To unsubscribe from this list, get a password > reminder, or change your subscription options, > visit the list information page at > https://mail.gna.org/listinfo/relax-users > _______________________________________________ relax (http://nmr-relax.com) This is the relax-users mailing list [email protected] To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-users
_______________________________________________ relax (http://nmr-relax.com) This is the relax-users mailing list [email protected] To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-users

