Hi,

Sorry for the delay, I've been flat out after coming back from
holidays. I've now finally had a chance to look at and apply your
patch. The patch labelled 'patch2' attached to bug #10022
(https://gna.org/bugs/?10022) has been applied to the 1.2 line (and
manually ported to the 1.3 line). Thank you again for fixing the
problem.

Regards,

Edward


On 9/19/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I uploaded what I think could be the last version of the patch, where
> references to None were changed to 'None'.
>
> Doug
>
>
> On Sep 18, 2007, at 4:41 AM, Edward d'Auvergne wrote:
>
> > Hi,
> >
> > I've reviewed the patch attached to bug #10022
> > (https://gna.org/bugs/?10022), and have found an issue. The problem
> > is with the use of two dictionaries for the previous and current run.
> > The issue is that order in a dictionary is not guaranteed, and hence
> > the comparison may not work. Maybe a simple test such as "if
> > self.relax.data.res[run][i].model == None" or its negative after
> > testing for the presence of the 'model' attribute, and removal of the
> > dictionary would remove the problem. Significant simplifications to
> > the code after the comment "# The test." could also be made.
> >
> > As for the code in the section "NOTE: the following code has not been
> > extenstively tested", this does not directly address the bug itself
> > and it will have problems with the dictionary ordering. I would
> > prefer that this section not be present in the patch. If someone does
> > have different residues in two different iterations of
> > full_analysis.py then this error is much more severe and belongs
> > elsewhere other than in the convergence tests.
> >
> > Cheers,
> >
> > Edward
> >
> >
> >
> > On 9/18/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote:
> >> Hi Edward,
> >>
> >> I submitted this as a bug report. I modified the full_analysis.py
> >> file after a SVN refresh. Unless you have a quick way of doing so, I
> >> will test the cleaned up version (submitted as a patch to the bug
> >> report) tomorrow.
> >>
> >> Doug
> >>
> >>
> >> On Sep 17, 2007, at 4:12 PM, Edward d'Auvergne wrote:
> >>
> >>> Hi,
> >>>
> >>> In your previous post
> >>> (https://mail.gna.org/public/relax-users/2007-09/msg00011.html,
> >>> Message-id: <[EMAIL PROTECTED]>) I think you were spot on with the
> >>> diagnosis. The reading of the results files with None in all model
> >>> positions will create variables called 'model' with the value set
> >>> to None. Then the string comparison will fail unless these are
> >>> skipped as well. Well done on picking up the exact cause of the
> >>> failure. This important change will need to go into relax before a
> >>> new release with the more advanced 'full_analysis.py' script.
> >>>
> >>> If you would like to post the changes, I can add these to the
> >>> repository. I'll need to check the change carefully first though,
> >>> and it may only require two lines in the script to be changed. The
> >>> best way would be to attach a patch to a bug report. If you could
> >>> create a bug report with a simple summary, that would be
> >>> appreciated. Then how you report the changes is up to you. If you
> >>> change a checked out copy of the SVN repository and type
> >>> 'svn diff > patch', you'll get the changes in a patch file which I
> >>> can then check and commit to the repository with your name
> >>> attached.
> >>>
> >>> Thanks,
> >>>
> >>> Edward
> >>>
> >>>
> >>> On 9/17/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote:
> >>>> As a followup, my changes to full_analysis.py solved my problem. I
> >>>> will clean up my code and post it within the next day or so. Would
> >>>> you prefer that I attach the script as an attachment, or inline in
> >>>> an email, or provide a patch, or change the CVS code myself?
> >>>>
> >>>> Doug
> >>>>
> >>>>
> >>>> On Sep 17, 2007, at 11:48 AM, Edward d'Auvergne wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> The problem is likely to be due to a circular looping around very
> >>>>> similar solutions close to the universal solution, as I have
> >>>>> defined in:
> >>>>>
> >>>>> d'Auvergne EJ, Gooley PR. Set theory formulation of the model-free
> >>>>> problem and the diffusion seeded model-free paradigm. Mol Biosyst.
> >>>>> 2007 Jul;3(7):483-94.
> >>>>>
> >>>>> If you can't get the paper, have a look at Chapter 5 of my PhD
> >>>>> thesis at
> >>>>> http://eprints.infodiv.unimelb.edu.au/archive/00002799/. The
> >>>>> problem is the interlinked mathematical optimisation and
> >>>>> statistical model selection where we are trying to minimise
> >>>>> different quantities. For mathematical optimisation this is the
> >>>>> chi-squared value. For model selection this is the quantity known
> >>>>> as the discrepancy. These cannot be optimised together as
> >>>>> mathematical optimisation works in a single fixed-dimension space
> >>>>> or universe whereas model selection operates across multiple
> >>>>> spaces with different dimensions. See the paper for a more
> >>>>> comprehensive description of the issue.
> >>>>>
> >>>>> You should be able to see this if you look at the end of
> >>>>> iterations. If you have 160 iterations, look after iteration 20
> >>>>> (or maybe even 30 or further). Until then, you will not have
> >>>>> reached the circular loop. After that point you will be able to
> >>>>> exactly quantify this circular loop. You'll be able to determine
> >>>>> its periodicity, which residues are involved (probably only 1),
> >>>>> and whether the diffusion tensor changes as model selection
> >>>>> changes.
> >>>>>
> >>>>> I mentioned all of this already in my post at
> >>>>> https://mail.gna.org/public/relax-users/2007-07/msg00001.html
> >>>>> (Message-id: <[EMAIL PROTECTED]>)
> >>>>> in response to your original post
> >>>>> (https://mail.gna.org/public/relax-users/2007-06/msg00004.html,
> >>>>> Message-id: <[EMAIL PROTECTED]>).
> >>>>>
> >>>>> I have a few more points about the tests you have done but to
> >>>>> work out what is happening with the printouts, it would be very
> >>>>> useful to have your modified 'full_analysis.py' script attached.
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 9/17/07, Douglas Kojetin <[EMAIL PROTECTED]> wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm unsure if this is a bug in full_analysis.py, in the internal
> >>>>>> relax code, or user error. The optimization of the 'sphere'
> >>>>>> model will not converge, now after 160+ rounds. The chi-squared
> >>>>>> test has converged (long, long ago):
> >>>>>
> >>>>> See above.
> >>>>>
> >>>>>
> >>>>>> "" from output
> >>>>>> Chi-squared test:
> >>>>>>     chi2 (k-1): 100.77647517006251
> >>>>>>     chi2 (k):   100.77647517006251
> >>>>>>     The chi-squared value has converged.
> >>>>>> ""
> >>>>>>
> >>>>>> However, the identical model-free models test has not converged:
> >>>>>>
> >>>>>> "" from output
> >>>>>> Identical model-free models test:
> >>>>>>     The model-free models have not converged.
> >>>>>>
> >>>>>> Identical parameter test:
> >>>>>>     The model-free models haven't converged hence the
> >>>>>>     parameters haven't converged.
> >>>>>> ""
> >>>>>>
> >>>>>> Something that confuses me is that the output files in the
> >>>>>> round_??/aic directory suggest that, for example, the round_160
> >>>>>> and round_161 AIC model selections are equivalent. Here are the
> >>>>>> models for the first few residues:
> >>>>>
> >>>>> Between these 2 rounds, are you sure that all models for all
> >>>>> residues are identical?
> >>>>> From your data that you posted at
> >>>>> https://mail.gna.org/public/relax-users/2007-06/msg00017.html
> >>>>> (Message-id: <[EMAIL PROTECTED]>), I would guess that this is not
> >>>>> the case and one or two residues actually do change in their
> >>>>> model selections.
> >>>>>
> >>>>>
> >>>>>> ""
> >>>>>> 1 None None
> >>>>>> 2 None None
> >>>>>> 3 None None
> >>>>>> 4 m2 m2
> >>>>>> 5 m2 m2
> >>>>>> 6 m2 m2
> >>>>>> 7 m2 m2
> >>>>>> 8 m2 m2
> >>>>>> 9 m4 m4
> >>>>>> 10 m1 m1
> >>>>>> 11 None None
> >>>>>> 12 m2 m2
> >>>>>> 13 m2 m2
> >>>>>> 14 m1 m1
> >>>>>> 15 m2 m2
> >>>>>> 16 m3 m3
> >>>>>> 17 m3 m3
> >>>>>> 18 None None
> >>>>>> ""
> >>>>>>
> >>>>>> However, I modified the full_analysis.py protocol to print the
> >>>>>> differences in the model selection, within the 'Identical
> >>>>>> model-free model test' section of the 'convergence' definition.
> >>>>>> Here is the beginning of the output (which only contains
> >>>>>> differences between the previous and current rounds):
> >>>>>>
> >>>>>> ""
> >>>>>> residue 1: prev=None curr=m2
> >>>>>> residue 2: prev=None curr=m2
> >>>>>> residue 3: prev=None curr=m2
> >>>>>> residue 6: prev=m2 curr=m4
> >>>>>> residue 7: prev=m2 curr=m1
> >>>>>> residue 9: prev=m4 curr=m2
> >>>>>> residue 11: prev=None curr=m2
> >>>>>> residue 12: prev=m2 curr=m3
> >>>>>> residue 13: prev=m2 curr=m3
> >>>>>> residue 15: prev=m2 curr=m1
> >>>>>> residue 16: prev=m3 curr=m2
> >>>>>> residue 17: prev=m3 curr=m1
> >>>>>> residue 18: prev=None curr=m3
> >>>>>> ""
> >>>>>
> >>>>> This output is quite strange. I would need to see the
> >>>>> full_analysis.py script to do more with this.
> >>>>>
> >>>>>
> >>>>>> There should be no data for residues 1-3, 11 and 18 (None),
> >>>>>> however the 'Identical model-free model test' seems as if it
> >>>>>> ignores residues for which 'None' was selected in the curr_model
> >>>>>> call in the following code:
> >>>>>>
> >>>>>> ""
> >>>>>> # Create a string representation of the model-free models of the previous run.
> >>>>>> prev_models = ''
> >>>>>> for i in xrange(len(self.relax.data.res['previous'])):
> >>>>>>     if hasattr(self.relax.data.res['previous'][i], 'model'):
> >>>>>>         #prev_models = prev_models + self.relax.data.res['previous'][i].model
> >>>>>>         prev_models = prev_models + ' ' + self.relax.data.res['previous'][i].model
> >>>>>>
> >>>>>> # Create a string representation of the model-free models of the current run.
> >>>>>> curr_models = ''
> >>>>>> for i in xrange(len(self.relax.data.res[run])):
> >>>>>>     if hasattr(self.relax.data.res[run][i], 'model'):
> >>>>>>         #curr_models = curr_models + self.relax.data.res[run][i].model
> >>>>>>         curr_models = curr_models + ' ' + self.relax.data.res[run][i].model
> >>>>>> ""
> >>>>>
> >>>>> As residues 1-3, 11 and 18 are deselected, then they will not
> >>>>> have the attribute 'model' and hence will not be placed in the
> >>>>> prev_models or curr_models string (which are then compared).
> >>>>>
> >>>>>
> >>>>>> For what it's worth, I have residues 1,2,3,11 and 18 in the file
> >>>>>> 'unresolved' which is read by the full_analysis.py protocol. I
> >>>>>> created a separate sequence file (variable = SEQUENCE) that
> >>>>>> contains all residues (those with data and those without),
> >>>>>> instead of using a data file (noe data, in the default
> >>>>>> full_analysis.py file).
> >>>>>> However, these residues are not specified in the data (r1, r2
> >>>>>> and noe) files, as I did not have data for them. Should I add
> >>>>>> them but place 'None' in the data and error columns? Could that
> >>>>>> be causing the problems? Or should I create a bug report for
> >>>>>> this?
> >>>>>
> >>>>> I'm not sure what you mean by the 'file (variable = SEQUENCE)'
> >>>>> statement. I would need to see the full_analysis.py script to
> >>>>> understand this. I would assume you mean a file whose name is the
> >>>>> value of the variable 'SEQUENCE'. In which case, this should not
> >>>>> be a problem. As the data is missing from the files containing
> >>>>> the relaxation data, these residues will not be used in
> >>>>> model-free analysis. They will be automatically deselected. There
> >>>>> is no need to add empty data for these spin systems into the
> >>>>> relaxation data files. As I said before at the top of this
> >>>>> message and at
> >>>>> https://mail.gna.org/public/relax-users/2007-07/msg00001.html
> >>>>> (Message-id: <[EMAIL PROTECTED]>),
> >>>>> the problem is almost guaranteed to be a circular loop of
> >>>>> equivalent solutions circling around the universal solution - the
> >>>>> solution defined within the universal set (the union of all
> >>>>> global models (diffusion tensor + all model-free models of all
> >>>>> residues)). If you have hit this circular problem, then I suggest
> >>>>> you stop running relax. What I would then do is identify the spin
> >>>>> system (or systems) causing the loop, and what the differences
> >>>>> are between all members of the loop. This you can do by plotting
> >>>>> the data and maybe using the program diff on uncompressed
> >>>>> versions of the results files. It's likely that the differences
> >>>>> are small and inconsequential. I hope this helps.
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Edward
> >>>>
> >>>>
> >>
> >>
> >

_______________________________________________
relax (http://nmr-relax.com)

This is the relax-users mailing list
[email protected]

To unsubscribe from this list, get a password reminder, or change your
subscription options, visit the list information page at
https://mail.gna.org/listinfo/relax-users
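
For readers following the convergence-test discussion in this thread,
here is a minimal sketch of a residue-by-residue model comparison that
uses ordered lists rather than dictionaries and treats both a missing
'model' attribute and a 'model' set to None as "no model". It is not
the patch attached to bug #10022. The function names model_of and
compare_models are hypothetical, plain SimpleNamespace objects stand in
for the entries of self.relax.data.res[run], and the code is written
for Python 3 rather than the Python 2 of the quoted script.

```python
from types import SimpleNamespace


def model_of(residue):
    """Return the residue's model name, or None if it is deselected or unset."""
    return getattr(residue, "model", None)


def compare_models(prev_res, curr_res):
    """Compare model selections residue by residue, in sequence order.

    Returns a list of (index, previous model, current model) tuples, one for
    each residue whose selection differs.  An empty list means the model-free
    models are identical between the two runs.
    """
    if len(prev_res) != len(curr_res):
        raise ValueError("the two runs hold different numbers of residues")

    diffs = []
    for i, (prev, curr) in enumerate(zip(prev_res, curr_res)):
        p, c = model_of(prev), model_of(curr)
        if p != c:
            diffs.append((i, p, c))
    return diffs


# Stand-in data: a deselected residue (no attribute), one read back from a
# results file with model None, and two residues with real selections.
previous = [SimpleNamespace(), SimpleNamespace(model=None),
            SimpleNamespace(model="m2"), SimpleNamespace(model="m4")]
current = [SimpleNamespace(), SimpleNamespace(model=None),
           SimpleNamespace(model="m2"), SimpleNamespace(model="m2")]

print(compare_models(previous, current))
```

Because the comparison walks two lists in step, it does not depend on
dictionary ordering, and a None model never reaches a string
concatenation.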

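
The circular loop described in this thread (its periodicity and the
residues involved) could be examined along the following lines. This is
only a sketch under stated assumptions: the model selections of each
iteration are assumed to have already been collected into a tuple of
model names per round, the function names find_cycle and
residues_in_cycle are hypothetical, and the diffusion tensor parameters
are not considered, so two rounds with the same models but different
tensors would be counted as the same state.

```python
def find_cycle(history):
    """Find the first iteration whose state repeats an earlier one.

    history is a list of hashable per-iteration states, e.g. tuples of
    model names.  Returns (start, period), or None if no state repeats.
    """
    seen = {}
    for k, state in enumerate(history):
        if state in seen:
            return seen[state], k - seen[state]
        seen[state] = k
    return None


def residues_in_cycle(history, start, period):
    """Return the indices of residues whose model changes within the cycle."""
    cycle = history[start:start + period]
    return [i for i in range(len(cycle[0]))
            if len({state[i] for state in cycle}) > 1]


# Made-up selections for three residues over four rounds.
history = [
    ("m1", "m1", "m1"),
    ("m2", "m4", "m1"),
    ("m2", "m2", "m1"),
    ("m2", "m4", "m1"),
]

start, period = find_cycle(history)
print(start, period)
print(residues_in_cycle(history, start, period))
```

A repeated state only implies a true loop if the state fully determines
the next round, which is why leaving out the diffusion tensor is a
simplification worth keeping in mind.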
