Re: [bug #23644] monte_carlo.error_analysis() does not update the mean value/expectation value from simulations

Edward d'Auvergne Mon, 15 Jun 2015 06:46:16 -0700

On 15 June 2015 at 15:33, Edward d'Auvergne <edw...@nmr-relax.com> wrote:
> On 15 June 2015 at 15:28, Troels Emtekær Linnet <tlin...@nmr-relax.com> wrote:
>> Hi Edward.
>>
>> What do you think about this bug report?
>>
>> I added some figures, showing that the parameter values do not represent
>> the expectation value of the Monte-Carlo simulation distribution.
>
> Did you see my response at ...  Oh, it was not reply-to-all and it
> went to the <no-reply.invalid-addr...@gna.org> email address only!  My
> email from 3 hours ago was:
>
> """
> This is actually the definition of Monte Carlo simulations.  The
> parameter value is the optimised value and the parameter error is the
> standard deviation of the back-calculated distribution.  There are two
> opposite but closely related values which do not have much
> statistical meaning: the mean of the back-calculated
> distribution and the standard deviation of the non-back-calculated
> distribution.  These are unused for good reason.  You can create the
> non-back-calculated distribution by using the bootstrapping in relax -
> the mean of this will equal the optimised parameter value, but the
> standard deviation will not match the MC standard deviation.  I
> suggest looking at the Numerical Recipes books as they have a great
> diagram of the Monte Carlo simulation setup and how the parameter
> value and error are calculated.
> """
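
As a rough illustration of the procedure described above (not the relax
implementation), here is a minimal sketch using a straight-line model.
The names fit_line() and monte_carlo_errors(), the linear model, and the
single shared error sigma are all assumptions made for this example.
The reported value is the optimised parameter, and the reported error is
the standard deviation of the parameters refitted to randomised
back-calculated data.  The mean of the simulations is never used.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b, returning (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def monte_carlo_errors(x, y, sigma, n_sim=500, seed=0):
    """Hypothetical helper: Monte Carlo error estimate for a line fit.

    Returns (optimised values, errors, list of simulated parameter pairs).
    """
    rng = random.Random(seed)

    # Step 1: optimise against the measured data.  This is the value reported.
    a_opt, b_opt = fit_line(x, y)

    # Step 2: back-calculate noise-free data from the optimised parameters.
    back_calc = [a_opt * xi + b_opt for xi in x]

    # Step 3: randomise the back-calculated data with Gaussian noise of
    # width sigma, and refit each randomised data set.
    sims = []
    for _ in range(n_sim):
        y_sim = [rng.gauss(yc, sigma) for yc in back_calc]
        sims.append(fit_line(x, y_sim))

    # Step 4: the parameter error is the standard deviation of the
    # simulated parameters.
    a_err = statistics.stdev([a for a, _ in sims])
    b_err = statistics.stdev([b for _, b in sims])
    return (a_opt, b_opt), (a_err, b_err), sims
```

Calling monte_carlo_errors(x, y, sigma) on measured points returns the
optimised (a, b) unchanged, together with the simulation-derived errors.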


In essence, you have stumbled upon a very important statistics
concept.  You'll see this written up in my PhD thesis (
https://minerva-access.unimelb.edu.au/handle/11343/39174 ),
specifically the section "2.2.1 Model selection theory for NMR
relaxation", and the paragraph "The four relaxation data sets".  I'll
reproduce the text for reference:

"""
For a single nucleus four different types of relaxation data sets
exist, the true set Rtrue, the sample set R, the true back calculated
set Rtrue(θ), and the back calculated set R(θ). A relaxation data set
is defined as the collection of all the relaxation values which
influence the model. θ is the vector whose elements are the parameters
of the model. The true set is the true relaxation data underlying the
measured data. It can never be observed due to noise. The sample set
is the experimentally available or measured relaxation data set and is
the true set plus noise. The true back calculated and back calculated
sets are determined from the model-free parameters which are fitted
using the true or sample sets respectively. The differences between
the models are reflected in the two back calculated sets whereas the
true and sample sets remain constant. For each of the four data sets
there is a corresponding error set with the same dimension. By
assuming Gaussian errors the data and error sets together describe a
set of normal probability distribution functions (pdfs) with one
normal pdf for each data point. It is assumed that all four error sets
are identical and therefore the one error set σ will be used in
association with all four data sets.
"""
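
To make the four sets concrete, here is a small sketch on synthetic
data.  It is only an illustration under stated assumptions: the true set
is invented here (in practice it can never be observed), and the helper
names fit_constant(), fit_line() and four_data_sets() and the two toy
models are hypothetical, not model-free models or relax functions.

```python
import random


def fit_constant(x, y):
    """Toy model 1: fit a constant, return the back-calculated values."""
    c = sum(y) / len(y)
    return [c for _ in x]


def fit_line(x, y):
    """Toy model 2: least-squares line, return the back-calculated values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return [a * xi + b for xi in x]


def four_data_sets(x, r_true, sigma, fit, seed=0):
    """Build the four data sets for one model given a synthetic true set."""
    rng = random.Random(seed)
    # Sample set: the true set plus Gaussian noise of width sigma.
    r_sample = [rng.gauss(r, sigma) for r in r_true]
    return {
        "R_true": list(r_true),              # true set (unobservable in practice)
        "R": r_sample,                       # sample (measured) set
        "R_true(theta)": fit(x, r_true),     # true back calculated set
        "R(theta)": fit(x, r_sample),        # back calculated set
    }
```

Running four_data_sets() with the same seed but different fit functions
leaves "R_true" and "R" unchanged, while the two back calculated sets
change with the model.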

You are seeing two of these four distributions: the sample set and the
back calculated set, not the true set or the true back calculated set.
Keep reading the "Maximum likelihood" and the full text of
"Discrepancies", and then hopefully you'll be a master of these
frequentist statistics concepts ;)

Regards,

Edward

_______________________________________________
relax (http://www.nmr-relax.com)

This is the relax-devel mailing list
relax-devel@gna.org

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-devel
