I recently estimated a normal mixture, 2 components, mean 0, different variances. I had computational difficulties until I took the time to avoid under and overflows, preserving numerical precision. I got strange answers in my application, which convinced me that I had a 3-component mixture, not 2, but it was not sufficiently important for me to modify my code to allow more than 2 components.
The code I used follows, in case it might be of interest to someone.
hope this helps. spencer graves
#######################
dnorm2.0 <-
function(x, log.sd1=0.5*log(var(x)/10),
log.sd2=0.5*log(10*var(x)), logit.w2=0,
output.all=F, log.p=T){
# log(dnorm(...)) for both densities
lg.dn1 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd1))^2)-log.sd1)
lg.dn2 <- ((-0.5)*(log(2*pi)+(x/exp(log.sd2))^2)-log.sd2)
# Adjust one to use a negative exponential
# in the denominator for
# w2 = exp(logit.w2)/(1+exp(logit.w2))
# = 1/(1+exp(-logit.w2))
if(logit.w2>=0){
lg.dn1 <- (lg.dn1-logit.w2)
logit.w2 <- (-logit.w2)
}
else
lg.dn2 <- (lg.dn2+logit.w2)
# Now write log(exp(lg.dn1)+exp(lg.dn2))
# = lg12 + log(1+exp(lg.12-lg12))
lg12 <- pmax(lg.dn1, lg.dn2)
lg.12 <- pmin(lg.dn1, lg.dn2)
# Put it all together
lg.d2 <- (lg12 + log(1+exp(lg.12-lg12))
+ logit.w2)
if(log.p)lg.d2 else exp(lg.d2)
}lglk.dn2 <-
function(x=c(log.sd1=log(1e-15), log.sd2=0, logit.w2=log(35/(127-35))),
data.){
#
dn <- dnorm2.0(data., log.sd1=x[1],
log.sd2=x[2], logit.w2=x[3])
(-sum(dn))
}Carlos J. Gil Bellosta wrote:
Well,
If k is known, you can use maximun likelihood to fit the weights, means, and sd's. The EM algorithm can be of help to solve the optimization problem. You would have to implement it yourself for your particular case, but I do not think it is big trouble.
Then you could estimate k using Bayesian formalism: from a reasonable a priory distribution on k=1, 2,... compute the posterior distributions using the densities obtained above, etc.
Carlos J. Gil Bellosta Sigma Consultores Estad�sticos http://www.consultoresestadisticos.com
Joke Allemeersch wrote:
Hello,
I have a concrete statistical question:
I have a sample of an univariate mixture of an unknown number (k) of normal distributions, each time with an unknown mean `m_i' and a standard deviation `k * m_i', where k is known factor constant for all the normal distributions. (The `i' is a subscript.)
Is there a function in R that can estimate the number of normal distributions k and the means `m_i' for the different normal distributions from a sample? Or evt. a function that can estimate the `m_i', when the number of distributions `k' is known?
So far I only found a package, called `normix'. But at first sight it only provides methods to sample from such distributions and to estimate the densities; but not to fit such a distribution.
Can someone indicate where I can find an elegant solution?
Thank you in advance
Joke Allemeersch
Katholieke universiteit Leuven. Belgium.
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
