Re: r27203 - /trunk/specific_analyses/relax_disp/optimisation.py

2015-01-19 Thread Edward d'Auvergne
Hi Troels,

Could you rename spin.sos to spin.sse?  This is the acronym used in
the field and by other software - the sum of squared errors
(https://en.wikipedia.org/wiki/Residual_sum_of_squares,
http://www.palmer.hs.columbia.edu/software/modelfree_manual.pdf).  If
each residual is divided by the experimental error sigma_i before being
squared and summed, then this is the chi2 value.  The SSE and chi2 statistics
are related, and are identical in the case of unit errors.  Other
acronyms, much less used in the NMR field, are SSR or RSS.  I don't
think I've ever encountered SOS before, outside of emergencies
(https://en.wikipedia.org/wiki/SOS).
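
As a minimal sketch of the relationship between the two statistics (the function names and the numbers are made up for illustration, and are not part of the relax API):

```python
def sse(measured, back_calc):
    """Sum of squared errors: the unweighted sum of squared residuals."""
    return sum((m - b) ** 2 for m, b in zip(measured, back_calc))


def chi2(measured, back_calc, errors):
    """Chi-squared: each residual is divided by its error sigma_i, then squared and summed."""
    return sum(((m - b) / e) ** 2 for m, b, e in zip(measured, back_calc, errors))


# Made-up illustrative values, not real R2eff data.
measured = [10.0, 12.0, 15.0]
back_calc = [10.5, 11.0, 15.5]
errors = [0.5, 0.5, 0.5]
unit_errors = [1.0, 1.0, 1.0]

print("SSE:               ", sse(measured, back_calc))
print("chi2:              ", chi2(measured, back_calc, errors))
print("chi2 (unit errors):", chi2(measured, back_calc, unit_errors))
```

With unit errors the last two numbers coincide, which is the sense in which SSE and chi2 are identical in that case.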

Cheers,

Edward

On 16 January 2015 at 23:19,  tlin...@nmr-relax.com wrote:
 Author: tlinnet
 Date: Fri Jan 16 23:19:50 2015
 New Revision: 27203

 URL: http://svn.gna.org/viewcvs/relax?rev=27203&view=rev
 Log:
 Implemented storing of sum of squares and the standard deviation of these for 
 relaxation dispersion, when doing a point calculation.

 Task #7882 (https://gna.org/task/?7882): Implement Monte-Carlo simulation, 
 where errors are generated with width of standard deviation or residuals.

 Modified:
 trunk/specific_analyses/relax_disp/optimisation.py

 Modified: trunk/specific_analyses/relax_disp/optimisation.py
 URL: 
 http://svn.gna.org/viewcvs/relax/trunk/specific_analyses/relax_disp/optimisation.py?rev=27203&r1=27202&r2=27203&view=diff
 ==
 --- trunk/specific_analyses/relax_disp/optimisation.py  (original)
 +++ trunk/specific_analyses/relax_disp/optimisation.py  Fri Jan 16 23:19:50 2015
 @@ -119,7 +119,7 @@
      @type spin_lock_nu1:        list of lists of numpy rank-1 float arrays
      @keyword relax_times_new:   The interpolated experiment specific fixed time period for relaxation (in seconds).  The dimensions are {Ei, Mi, Oi, Di, Ti}.
      @type relax_times_new:      rank-4 list of floats
 -    @keyword store_chi2:        A flag which if True will cause the spin specific chi-squared value to be stored in the spin container.
 +    @keyword store_chi2:        A flag which if True will cause the spin specific chi-squared value to be stored in the spin container together with the sum of squares of the residuals and the standard deviation of the sum of squares of the residuals.
      @type store_chi2:           bool
      @return:                    The back-calculated R2eff/R1rho value for the given spin.
      @rtype:                     numpy rank-1 float array
 @@ -215,10 +215,15 @@
      # Make a single function call.  This will cause back calculation and the data will be stored in the class instance.
      chi2 = model.func(param_vector)

 -    # Store the chi-squared value.
 +    # Get the sum of squares 'sos' of the residuals between the fitted values and the measured values. Get the std deviation of these, std_sos.
 +    sos, sos_std = model.get_sum_of_squares()
 +
 +    # Store the chi-squared value, sums of squares of residual and the standard deviation of sums of squares of residual.
      if store_chi2:
          for spin in spins:
              spin.chi2 = chi2
 +            spin.sos = sos
 +            spin.sos_std = sos_std

      # Return the structure.
      return model.get_back_calc()


 ___
 relax (http://www.nmr-relax.com)

 This is the relax-commits mailing list
 relax-comm...@gna.org

 To unsubscribe from this list, get a password
 reminder, or change your subscription options,
 visit the list information page at
 https://mail.gna.org/listinfo/relax-commits

___
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
relax-devel@gna.org

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel


Re: r27203 - /trunk/specific_analyses/relax_disp/optimisation.py

2015-01-19 Thread Edward d'Auvergne
Ah!  It could be what I've seen with the model-free analysis.  It
could be that the high precision optimisation in relax avoids the
problems of parameter error overestimation due to the real minimum
being located in a broad region or tunnel in the space.  Do you know
the reason for the large kex errors in the original analysis?  Is it
possible to investigate this?  How were the errors calculated, and do
you know the exact implementation details?  For example what was the
optimisation starting point for each error simulation?  I spent my
entire PhD time solving such problems, reading 3 statistics books
cover-to-cover in the maths library, so I may be able to help.  Or at
least point you in a useful direction.

Regards,

Edward


On 19 January 2015 at 10:33, Troels Emtekær Linnet
tlin...@nmr-relax.com wrote:
 Hi Edward.

 I actually think that I have created local minima in my dataset, which is
 not caught.

 I am looking into it.

 2015-01-19 10:30 GMT+01:00 Edward d'Auvergne edw...@nmr-relax.com:

 Hi,

 Maybe we should discuss on the original thread the problem in detail
 and see if there is a solution.  I wonder why the kex errors are so
 different?

 Regards,

 Edward



 On 19 January 2015 at 09:51, Troels Emtekær Linnet
 tlin...@nmr-relax.com wrote:
  Hi Edward.
 
  I went through sor (sum of residuals), sos (sum of squares), and now
  sse (sum of squared errors).
 
  I agree that sse is the best, but I have reverted all my commits, and
  found a solution through the API.
 
  Just using the chi2 value, and finding the degrees of freedom with the API.
 
  If one wants .sse, one can just quickly set unit errors, so that chi2
  equals the sse:
 
  value.set(val=1.0, param='r2eff', error=True)
  minimise.calculate(verbosity=1)
 
  Anyway, in the end, the new method did not solve my problem.
  STD_fit = sqrt(chi2 / dof)
 
  Since dof is so big (many datapoints, few parameters for
  clustered fitting), STD_fit becomes close to 1.
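
The STD_fit expression above can be sketched as follows. The helper name std_fit and the input numbers are hypothetical, chosen only to show the behaviour when the number of datapoints is much larger than the number of parameters:

```python
import math


def std_fit(chi2, n_points, n_params):
    """Return sqrt(chi2 / dof), with dof = n_points - n_params."""
    dof = n_points - n_params
    if dof <= 0:
        raise ValueError("The number of datapoints must exceed the number of parameters.")
    return math.sqrt(chi2 / dof)


# Hypothetical clustered fit: many datapoints, few parameters, chi2 of similar size to dof.
value = std_fit(chi2=2000.0, n_points=2000, n_params=10)
print("STD_fit = %.4f" % value)
```

When chi2 is of the same size as dof, the ratio chi2 / dof is close to 1 and so is its square root, almost independently of the number of parameters.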
 
 
  Best
  Troels
 
 

Re: r27203 - /trunk/specific_analyses/relax_disp/optimisation.py

2015-01-19 Thread Troels Emtekær Linnet
Hi Edward.

I am now trying to follow pages 109-111 of
http://www.graphpad.com/faq/file/Prism4RegressionBook.pdf,
"Generating confidence intervals via model comparison".

Here, I have locked all values except kex.
# The number of parameters to check is 1 (kex).
P = 1
# Number of datapoints
N = 1952
# The degrees of freedom for this confidence interval
dof_conf = N - P
# The critical value of the F distribution with a p-value of 0.05 for 95% confidence.
# Can be calculated with Microsoft Excel (decimal-comma locale):
# F=FINV(0,05; P; dof_conf), i.e. F=FINV(0,05; 1; 1951)=3,846229551
# Can also be calculated with: import scipy.stats; scipy.stats.f.isf(0.05, 1, 1951)=3.8462295505435562
F = 3.8462295505435562
scale = F*P/dof_conf + 1    # = 1.00197141443

Then I vary kex from 1000 to 5000, and take the values of kex where
the calculated chi2 is less than:
chi2_test = 1.00197141443 * 2324.5 = 2329.082
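
The threshold and the scan described above can be sketched as follows. The critical F value and the minimum chi2 of 2324.5 are the numbers quoted in this message; the function names and the parabolic chi2 profile are made up for illustration, standing in for a real scan in which chi2 is recalculated at each fixed kex:

```python
# Critical F value quoted above, i.e. scipy.stats.f.isf(0.05, 1, 1951).
F_CRIT = 3.8462295505435562


def chi2_threshold(chi2_min, n_points, n_params, f_crit):
    """Largest chi2 still inside the confidence region: chi2_min * (F*P/(N-P) + 1)."""
    dof_conf = n_points - n_params
    return chi2_min * (f_crit * n_params / dof_conf + 1.0)


def confidence_interval(kex_values, chi2_values, threshold):
    """Return (lowest, highest) scanned kex with chi2 below the threshold, or None."""
    accepted = [k for k, c in zip(kex_values, chi2_values) if c < threshold]
    if not accepted:
        return None
    return min(accepted), max(accepted)


threshold = chi2_threshold(chi2_min=2324.5, n_points=1952, n_params=1, f_crit=F_CRIT)

# Hypothetical scan: a made-up parabolic chi2 profile centred on kex = 3000.
kex_scan = list(range(1000, 5001, 125))
chi2_scan = [2324.5 + 1e-5 * (k - 3000) ** 2 for k in kex_scan]

print("chi2 threshold: %.3f" % threshold)
print("kex interval:  ", confidence_interval(kex_scan, chi2_scan, threshold))
```

Taking the lowest and highest accepted kex assumes that the accepted region is one connected stretch.  If the chi2 profile has local minima, the accepted values can fall into separate stretches, and these should be inspected individually.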

For the 100% dataset, the chi2 curve has a nice shape.

But this goes against what I have seen for the 50% dataset:
i_sort  dw_sort  pA_sort  kex_sort  chi2_sort
471     4.5      0.99375  2125.0    4664.31083
470     4.5      0.99375  1750.0    4665.23872

If I am unlucky, I have created local minima in the space.

So, I am now trying to do this for the 50% dataset.

This method is interesting, since I can force kex out of its local minimum.
But it will be close to impossible to extend to more than one parameter.

Best
Troels


Re: r27203 - /trunk/specific_analyses/relax_disp/optimisation.py

2015-01-19 Thread Troels Emtekær Linnet
Hi Edward.

I actually think that I have created local minima in my dataset, which are
not caught.

I am looking into it.
