Hi Edward,

I uploaded another patch (file #2965) with the simpler solution you mentioned. I guess I must have been extremely lucky for the dictionary comparison to work on the first try, unless something else was happening that resulted in a sorted key list.
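For what it's worth, here is a minimal sketch of the difference, with plain dictionaries standing in for the real relax data structures (so the names and values below are only illustrative): a comparison that joins the values in iteration order only works if both dictionaries happen to iterate in the same order, whereas dictionary equality or a sorted key list does not depend on ordering at all.

""
# Hypothetical stand-ins for the per-residue model selections of two rounds;
# the real script keeps these on self.relax.data.res, not in plain dicts.
prev_models = {1: None, 4: 'm2', 5: 'm2', 9: 'm4', 10: 'm1'}
curr_models = {10: 'm1', 9: 'm4', 5: 'm2', 4: 'm2', 1: None}

# Fragile: joining the values in iteration order is only meaningful if both
# dictionaries iterate in the same order, which the language does not
# guarantee (it is an implementation detail).
fragile_equal = ''.join([str(prev_models[k]) for k in prev_models]) == \
                ''.join([str(curr_models[k]) for k in curr_models])

# Robust: dictionary equality compares key/value pairs regardless of order,
# and sorting the keys gives a deterministic order for any printout.
robust_equal = (prev_models == curr_models)
for key in sorted(prev_models.keys()):
    print('residue %s: prev=%s curr=%s' % (key, prev_models[key], curr_models.get(key)))
""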
Doug

On Sep 18, 2007, at 4:41 AM, Edward d'Auvergne wrote:

> Hi,
>
> I've reviewed the patch attached to bug #10022 (https://gna.org/bugs/?10022), and have found an issue. The problem is with the use of two dictionaries for the previous and current run. The issue is that the order in a dictionary is not guaranteed, and hence the comparison may not work. Maybe a simple test such as "if self.relax.data.res[run][i].model == None", or its negative, after testing for the presence of the 'model' attribute, together with removal of the dictionary, would remove the problem. Significant simplifications to the code after the comment "# The test." could also be made.
>
> As for the code in the section "NOTE: the following code has not been extensively tested", this does not directly address the bug itself and it will have problems with the dictionary ordering. I would prefer that this section not be present in the patch. If someone does have different residues in two different iterations of full_analysis.py then this error is much more severe and belongs elsewhere other than in the convergence tests.
>
> Cheers,
>
> Edward
>
> On 9/18/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote:
>> Hi Edward,
>>
>> I submitted this as a bug report. I modified the full_analysis.py file after an SVN refresh. Unless you have a quick way of doing so, I will test the cleaned-up version (submitted as a patch to the bug report) tomorrow.
>>
>> Doug
>>
>> On Sep 17, 2007, at 4:12 PM, Edward d'Auvergne wrote:
>>
>>> Hi,
>>>
>>> In your previous post (https://mail.gna.org/public/relax-users/2007-09/msg00011.html, Message-id: <[EMAIL PROTECTED]>) I think you were spot on with the diagnosis. The reading of the results files with None in all model positions will create variables called 'model' with the value set to None. Then the string comparison will fail unless these are skipped as well. Well done on picking up the exact cause of the failure. This important change will need to go into relax before a new release with the more advanced 'full_analysis.py' script.
>>>
>>> If you would like to post the changes, I can add these to the repository. I'll need to check the change carefully first though, and it may only require two lines in the script to be changed. The best way would be to attach a patch to a bug report. If you could create a bug report with a simple summary, that would be appreciated. Then how you report the changes is up to you. If you change a checked-out copy of the SVN repository and type 'svn diff > patch', you'll get the changes in a patch file which I can then check and commit to the repository with your name attached.
>>>
>>> Thanks,
>>>
>>> Edward
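As a toy illustration of the failure mode Edward describes above (this is not the original relax code, just plain Python): concatenating a None 'model' onto a string raises a TypeError, so residues whose model is None have to be skipped in the same way as residues with no 'model' attribute at all.

""
# Toy reconstruction of the old string-building loop; the model list below is
# made up for illustration.
models = ''
for model in ['m2', 'm4', None, 'm1']:
    try:
        models = models + ' ' + model
    except TypeError:
        # This is what happens for every None entry unless it is skipped.
        print('skipping a residue with model == None')
print(models.strip())    # prints: m2 m4 m1
""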
>>> On 9/17/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote:
>>>> As a followup, my changes to full_analysis.py solved my problem. I will clean up my code and post it within the next day or so. Would you prefer that I attach the script as an attachment, or inline in an email, or provide a patch, or change the CVS code myself?
>>>>
>>>> Doug
>>>>
>>>> On Sep 17, 2007, at 11:48 AM, Edward d'Auvergne wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> The problem is likely to be due to a circular looping around very similar solutions close to the universal solution, as I have defined in:
>>>>>
>>>>> d'Auvergne EJ, Gooley PR. Set theory formulation of the model-free problem and the diffusion seeded model-free paradigm. Mol Biosyst. 2007 Jul;3(7):483-94.
>>>>>
>>>>> If you can't get the paper, have a look at Chapter 5 of my PhD thesis at http://eprints.infodiv.unimelb.edu.au/archive/00002799/. The problem is the interlinked mathematical optimisation and statistical model selection, where we are trying to minimise different quantities. For mathematical optimisation this is the chi-squared value. For model selection this is the quantity known as the discrepancy. These cannot be optimised together, as mathematical optimisation works in a single fixed-dimension space or universe, whereas model selection operates across multiple spaces with different dimensions. See the paper for a more comprehensive description of the issue.
>>>>>
>>>>> You should be able to see this if you look at the end of the iterations. If you have 160 iterations, look after iteration 20 (or maybe even 30 or further). Until then, you will not have reached the circular loop. After that point you will be able to exactly quantify this circular loop. You'll be able to determine its periodicity, which residues are involved (probably only 1), and whether the diffusion tensor changes as model selection changes.
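As a rough way of quantifying the loop once it has been entered, something like the sketch below would give the periodicity and the residues involved. It is only a sketch: it assumes the per-round model selections have already been extracted from the results files into a simple round-number-to-tuple mapping (the extraction is not shown), and the function names are just illustrative.

""
# 'selections' maps round number -> tuple of selected models, one entry per
# residue, with None for deselected residues.
def find_periodicity(selections):
    rounds = sorted(selections.keys())
    last = rounds[-1]
    # Search backwards for the most recent earlier round with an identical
    # model selection; the gap between the two rounds is the period of the loop.
    for r in reversed(rounds[:-1]):
        if selections[r] == selections[last]:
            return last - r
    return None    # no loop detected yet

def changing_residues(selections):
    rounds = sorted(selections.keys())
    changed = set()
    for prev, curr in zip(rounds[:-1], rounds[1:]):
        for i, (a, b) in enumerate(zip(selections[prev], selections[curr])):
            if a != b:
                changed.add(i + 1)    # 1-based residue numbering
    return sorted(changed)
""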
>>>>> I mentioned all of this already in my post at https://mail.gna.org/public/relax-users/2007-07/msg00001.html (Message-id: <[EMAIL PROTECTED]>) in response to your original post (https://mail.gna.org/public/relax-users/2007-06/msg00004.html, Message-id: <[EMAIL PROTECTED]>).
>>>>>
>>>>> I have a few more points about the tests you have done, but to work out what is happening with the printouts it would be very useful to have your modified 'full_analysis.py' script attached.
>>>>>
>>>>> On 9/17/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm unsure if this is a bug in full_analysis.py, in the internal relax code, or user error. The optimization of the 'sphere' model will not converge, now after 160+ rounds. The chi-squared test has converged (long, long ago):
>>>>>
>>>>> See above.
>>>>>
>>>>>> "" from output
>>>>>> Chi-squared test:
>>>>>>     chi2 (k-1): 100.77647517006251
>>>>>>     chi2 (k):   100.77647517006251
>>>>>> The chi-squared value has converged.
>>>>>> ""
>>>>>>
>>>>>> However, the identical model-free models test has not converged:
>>>>>>
>>>>>> "" from output
>>>>>> Identical model-free models test:
>>>>>>     The model-free models have not converged.
>>>>>>
>>>>>> Identical parameter test:
>>>>>>     The model-free models haven't converged hence the parameters haven't converged.
>>>>>> ""
>>>>>>
>>>>>> Something that confuses me is that the output files in the round_??/aic directory suggest that, for example, the round_160 and round_161 AIC model selections are equivalent. Here are the models for the first few residues:
>>>>>
>>>>> Between these 2 rounds, are you sure that all models for all residues are identical? From your data that you posted at https://mail.gna.org/public/relax-users/2007-06/msg00017.html (Message-id: <[EMAIL PROTECTED]>), I would guess that this is not the case and one or two residues actually do change in their model selections.
>>>>>
>>>>>> ""
>>>>>> 1   None  None
>>>>>> 2   None  None
>>>>>> 3   None  None
>>>>>> 4   m2    m2
>>>>>> 5   m2    m2
>>>>>> 6   m2    m2
>>>>>> 7   m2    m2
>>>>>> 8   m2    m2
>>>>>> 9   m4    m4
>>>>>> 10  m1    m1
>>>>>> 11  None  None
>>>>>> 12  m2    m2
>>>>>> 13  m2    m2
>>>>>> 14  m1    m1
>>>>>> 15  m2    m2
>>>>>> 16  m3    m3
>>>>>> 17  m3    m3
>>>>>> 18  None  None
>>>>>> ""
>>>>>>
>>>>>> However, I modified the full_analysis.py protocol to print the differences in the model selection, within the 'Identical model-free model test' section of the 'convergence' definition. Here is the beginning of the output (which only contains differences between the previous and current rounds):
>>>>>>
>>>>>> ""
>>>>>> residue 1: prev=None curr=m2
>>>>>> residue 2: prev=None curr=m2
>>>>>> residue 3: prev=None curr=m2
>>>>>> residue 6: prev=m2 curr=m4
>>>>>> residue 7: prev=m2 curr=m1
>>>>>> residue 9: prev=m4 curr=m2
>>>>>> residue 11: prev=None curr=m2
>>>>>> residue 12: prev=m2 curr=m3
>>>>>> residue 13: prev=m2 curr=m3
>>>>>> residue 15: prev=m2 curr=m1
>>>>>> residue 16: prev=m3 curr=m2
>>>>>> residue 17: prev=m3 curr=m1
>>>>>> residue 18: prev=None curr=m3
>>>>>> ""
>>>>>
>>>>> This output is quite strange. I would need to see the full_analysis.py script to do more with this.
>>>>>
>>>>>> There should be no data for residues 1-3, 11 and 18 (None); however, the 'Identical model-free model test' seems as if it ignores residues for which 'None' was selected in the curr_model call in the following code:
>>>>>>
>>>>>> ""
>>>>>> # Create a string representation of the model-free models of the previous run.
>>>>>> prev_models = ''
>>>>>> for i in xrange(len(self.relax.data.res['previous'])):
>>>>>>     if hasattr(self.relax.data.res['previous'][i], 'model'):
>>>>>>         #prev_models = prev_models + self.relax.data.res['previous'][i].model
>>>>>>         prev_models = prev_models + ' ' + self.relax.data.res['previous'][i].model
>>>>>>
>>>>>> # Create a string representation of the model-free models of the current run.
>>>>>> curr_models = ''
>>>>>> for i in xrange(len(self.relax.data.res[run])):
>>>>>>     if hasattr(self.relax.data.res[run][i], 'model'):
>>>>>>         #curr_models = curr_models + self.relax.data.res[run][i].model
>>>>>>         curr_models = curr_models + ' ' + self.relax.data.res[run][i].model
>>>>>> ""
>>>>>
>>>>> As residues 1-3, 11 and 18 are deselected, they will not have the attribute 'model' and hence will not be placed in the prev_models or curr_models strings (which are then compared).
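Going back to Edward's review at the top of this thread, the simpler per-residue test he suggests could look roughly like the sketch below. It is only a sketch: the Residue class stands in for the entries of self.relax.data.res[run], and the real script would pull the two runs from the relax data store rather than take two lists.

""
class Residue(object):
    """Minimal stand-in for an entry of self.relax.data.res[run]."""
    def __init__(self, model=None):
        if model is not None:
            self.model = model    # deselected residues simply lack 'model'

def models_converged(previous, current):
    # A residue counts as deselected if it has no 'model' attribute or if the
    # attribute is None; such residues are treated as equal between rounds.
    if len(previous) != len(current):
        return False
    for prev, curr in zip(previous, current):
        if getattr(prev, 'model', None) != getattr(curr, 'model', None):
            return False
    return True

# Example: residue 2 switches from m2 to m4, so the rounds have not converged.
previous_run = [Residue(), Residue('m2'), Residue('m1')]
current_run = [Residue(), Residue('m4'), Residue('m1')]
print(models_converged(previous_run, current_run))    # prints: False
""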
>>>>>> For what it's worth, I have residues 1, 2, 3, 11 and 18 in the file 'unresolved', which is read by the full_analysis.py protocol. I created a separate sequence file (variable = SEQUENCE) that contains all residues (those with data and those without), instead of using a data file (noe data, in the default full_analysis.py file). However, these residues are not specified in the data (r1, r2 and noe) files, as I did not have data for them. Should I add them but place 'None' in the data and error columns? Could that be causing the problems? Or should I create a bug report for this?
>>>>>
>>>>> I'm not sure what you mean by the '(variable = SEQUENCE)' statement. I would need to see the full_analysis.py script to understand this. I would assume it is a file named by the value of the variable 'SEQUENCE'. In which case, this should not be a problem. As the data is missing from the files containing the relaxation data, these residues will not be used in the model-free analysis. They will be automatically deselected. There is no need to add empty data for these spin systems to the relaxation data files.
>>>>>
>>>>> As I said before, at the top of this message and at https://mail.gna.org/public/relax-users/2007-07/msg00001.html (Message-id: <[EMAIL PROTECTED]>), the problem is almost guaranteed to be a circular loop of equivalent solutions circling around the universal solution - the solution defined within the universal set (the union of all global models (diffusion tensor + all model-free models of all residues)). If you have hit this circular problem, then I suggest you stop running relax. What I would then do is identify the spin system (or systems) causing the loop, and what the differences are between all members of the loop. This you can do by plotting the data and maybe by using the program diff on uncompressed versions of the results files. It's likely that the differences are small and inconsequential. I hope this helps.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Edward
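Following up on the diff suggestion, a small sketch of that comparison in Python (the paths are only examples of the round directories mentioned above, and the results files would need to be uncompressed first):

""
import difflib

def diff_results(old_path, new_path):
    # Read both uncompressed results files and print only the lines that
    # differ, with a little context, in unified diff format.
    old_lines = open(old_path).readlines()
    new_lines = open(new_path).readlines()
    for line in difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_path, tofile=new_path):
        print(line.rstrip())

# Example paths only; point these at two successive rounds of the analysis.
diff_results('round_160/aic/results', 'round_161/aic/results')
""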

