Hi Edward, No problem. I am working on this now, and will try to submit the bug report later tonight or tomorrow.
Doug On Sep 17, 2007, at 4:12 PM, Edward d'Auvergne wrote: > Hi, > > In your previous post > (https://mail.gna.org/public/relax-users/2007-09/msg00011.html, > Message-id: <[EMAIL PROTECTED]>) I think > you were spot on with the diagnosis. The reading of the results files > with None in all model positions will create variables called 'model' > with the value set to None. Then the string comparison will fail > unless these are skipped as well. Well done on picking up the exact > cause of the failure. This important change will need to go into > relax before a new release with the more advanced 'full_analysis.py' > script. > > If you would like to post the changes, I can add these to the > repository. I'll need to check the change carefully first though, and > it may only require two lines in the script to be changed. The best > way would be to attach a patch to a bug report. If you could create a > bug report with a simple summary, that would be appreciated. Then how > you report the changes is up to you. If you change a checked out copy > of the SVN repository and type 'svn diff > patch', you'll get the > changes in a patch file which I can then check and commit to the > repository with your name attached. > > Thanks, > > Edward > > > On 9/17/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote: >> As a followup, my changes to full_analysis.py solved my problem. I >> will clean up my code and post it within the next day or so. Would >> you prefer that I attach the script as an attachment, or inline in an >> email, or provide a patch, or change the CVS code myself? >> >> Doug >> >> >> On Sep 17, 2007, at 11:48 AM, Edward d'Auvergne wrote: >> >>> Hi, >>> >>> The problem is likely to be due to a circular looping around very >>> similar solutions close to the universal solution, as I have defined >>> in: >>> >>> d'Auvergne EJ, Gooley PR. Set theory formulation of the model-free >>> problem and the diffusion seeded model-free paradigm. Mol Biosyst. >>> 2007 Jul;3(7):483-94. >>> >>> If you can't get the paper, have a look at Chapter 5 of my PhD >>> thesis at >>> http://eprints.infodiv.unimelb.edu.au/archive/00002799/. The >>> problem >>> is the interlinked mathematical optimisation and statistical model >>> selection where we are trying to minimise different quantities. For >>> mathematical optimisation this is the chi-squared value. For model >>> selection this is the quantity known as the descrepancy. These can >>> not be optimised together as mathematical optimisation works in a >>> single fixed-dimension space or universe whereas model selection >>> operates across multiple spaces with different dimensions. See the >>> paper for a more comprehensive description of the issue. >>> >>> You should be able to see this if you look at the end of iterations. >>> If you have 160 iterations, look after iteration 20 (or maybe >>> even 30 >>> or further). Until then, you will not have reached the circular >>> loop. >>> After that point you will be able to exactly quatify this circular >>> loop. You'll be able to determine its periodicity, which >>> residues are >>> involved (probably only 1), and whether the diffusion tensor changes >>> as model selection changes. >>> >>> I mentioned all of this already in my post at >>> https://mail.gna.org/public/relax-users/2007-07/msg00001.html >>> (Message-id: >>> <[EMAIL PROTECTED]>) >>> in response to your original post >>> (https://mail.gna.org/public/relax-users/2007-06/msg00004.html, >>> Message-id: <[EMAIL PROTECTED]>). >>> >>> I have a few more points about the tests you have done but to >>> work out >>> what is happening with the printouts, it would be very useful to >>> have >>> your modified 'full_analysis.py' script attached. >>> >>> >>> >>> On 9/17/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote: >>>> Hi, >>>> >>>> I'm unsure if this is a bug in full_analysis.py, in the internal >>>> relax code, or user error. The optimization of the 'sphere' model >>>> will not converge, now after 160+ rounds. The chi-squared test >>>> has >>>> converged (long, long ago): >>> >>> See above. >>> >>> >>>> "" from output >>>> Chi-squared test: >>>> chi2 (k-1): 100.77647517006251 >>>> chi2 (k): 100.77647517006251 >>>> The chi-squared value has converged. >>>> "" >>>> >>>> However, the identical model-free models test does has not >>>> converged: >>>> >>>> "" from output >>>> Identical model-free models test: >>>> The model-free models have not converged. >>>> >>>> Identical parameter test: >>>> The model-free models haven't converged hence the >>>> parameters haven't converged. >>>> "" >>>> >>>> Something that confuses me is that the output files in the >>>> round_??/ >>>> aic directory suggest that, for example, the round_160 and >>>> round_161 >>>> AIC model selections are equivalent. Here are the models for the >>>> first few residues: >>> >>> Between these 2 rounds, are you sure that all models for all >>> residues >>> are identical? From your data that you posted at >>> https://mail.gna.org/public/relax-users/2007-06/msg00017.html >>> (Message-id: <[EMAIL PROTECTED]>), I >>> would guess that this is not the case and one or two residues >>> actually >>> do change in their model selections. >>> >>> >>>> "" >>>> 1 None None >>>> 2 None None >>>> 3 None None >>>> 4 m2 m2 >>>> 5 m2 m2 >>>> 6 m2 m2 >>>> 7 m2 m2 >>>> 8 m2 m2 >>>> 9 m4 m4 >>>> 10 m1 m1 >>>> 11 None None >>>> 12 m2 m2 >>>> 13 m2 m2 >>>> 14 m1 m1 >>>> 15 m2 m2 >>>> 16 m3 m3 >>>> 17 m3 m3 >>>> 18 None None >>>> "" >>>> >>>> However, I modified the full_analysis.py protocol to print the >>>> differences in the model selection, within the 'Identical model- >>>> free >>>> model test' section of the 'convergence' definition. Here is the >>>> beginning of the output (which only contains differences between >>>> the >>>> previous and current rounds): >>>> >>>> "" >>>> residue 1: prev=None curr=m2 >>>> residue 2: prev=None curr=m2 >>>> residue 3: prev=None curr=m2 >>>> residue 6: prev=m2 curr=m4 >>>> residue 7: prev=m2 curr=m1 >>>> residue 9: prev=m4 curr=m2 >>>> residue 11: prev=None curr=m2 >>>> residue 12: prev=m2 curr=m3 >>>> residue 13: prev=m2 curr=m3 >>>> residue 15: prev=m2 curr=m1 >>>> residue 16: prev=m3 curr=m2 >>>> residue 17: prev=m3 curr=m1 >>>> residue 18: prev=None curr=m3 >>>> "" >>> >>> This output is quite strange. I would need to see the >>> full_analysis.py script to do more with this. >>> >>> >>>> There should be no data for residues 1-3, 11 and 18 (None), however >>>> the 'Identical model-free model test' seems as if it ignores >>>> residues >>>> for which 'None' was selected in the curr_model call in the >>>> following >>>> code: >>>> >>>> "" >>>> # Create a string representation of the model-free >>>> models of >>>> the previous run. >>>> prev_models = '' >>>> for i in xrange(len(self.relax.data.res['previous'])): >>>> if hasattr(self.relax.data.res['previous'][i], >>>> 'model'): >>>> #prev_models = prev_models + self.relax.data.res >>>> ['previous'][i].model >>>> prev_models = prev_models + ' ' + >>>> self.relax.data.res >>>> ['previous'][i].model >>>> >>>> # Create a string representation of the model-free >>>> models of >>>> the current run. >>>> curr_models = '' >>>> for i in xrange(len(self.relax.data.res[run])): >>>> if hasattr(self.relax.data.res[run][i], 'model'): >>>> #curr_models = curr_models + self.relax.data.res >>>> [run] >>>> [i].model >>>> curr_models = curr_models + ' ' + >>>> self.relax.data.res >>>> [run][i].model >>>> "" >>> >>> As residues 1-3, 11 and 18 are deselected, then they will not >>> have the >>> attribute 'model' and hence will not be placed in the prev_models or >>> curr_models string (which are then compared). >>> >>> >>>> For what it's worth, I have residues 1,2,3,11 and 18 in the file >>>> 'unresolved' which is read by the full_analysis.py protocol. I >>>> created a separate sequence file (variable = SEQUENCE) that >>>> contains >>>> all residues (those with data and those without), instead of >>>> using a >>>> data file (noe data, in the default full_analysis.py file). >>>> However, >>>> these residues are not specified in the data (r1, r2 and noe) >>>> files, >>>> as I did not have data for them. Should I add them but place >>>> 'None' >>>> in the data and error columns? Could that be causing the problems? >>>> Or should I create a bug report for this? >>> >>> I'm not sure what you mean by the file '(variable = SEQUENCE)' >>> statement. I would need to see the full_analysis.py script to >>> understand this. I would assume a file called the value in the >>> variable 'SEQUENCE'. In which case, this should not be a >>> problem. As >>> the data is missing from the files containing the relaxation data, >>> these residues will not be used in model-free analysis. They >>> will be >>> automatically deselected. There is not need to add empty data for >>> these spin systems into the relaxation data files. As I said before >>> at the top of this message and at >>> https://mail.gna.org/public/relax-users/2007-07/msg00001.html >>> (Message-id: >>> <[EMAIL PROTECTED]>), >>> the problem is almost guaranteed to be a circular loop of equivalent >>> solutions circling around the universal solution - the solution >>> defined within the universal set (the union of all global models >>> (diffusion tensor + all model-free models of all residues)). If you >>> have hit this circular problem, then I suggest you stop running >>> relax. >>> What I would then do is identify the spin system (or systems) >>> causing >>> the loop, and what the differences are between all members of the >>> loop. This you can do by plotting the data and maybe using the >>> program diff on uncompressed versions of the results files. It's >>> likely that the differences are small and inconsequential. I hope >>> this helps. >>> >>> Regards, >>> >>> Edward >> >> _______________________________________________ relax (http://nmr-relax.com) This is the relax-users mailing list [email protected] To unsubscribe from this list, get a password reminder, or change your subscription options, visit the list information page at https://mail.gna.org/listinfo/relax-users

