[R] plotting dnorm() issued from mclust models
Dear all,

I have a problem overlaying lines() for the normal distributions identified by Mclust on a histogram or an mclust1Dplot. Here is some sample code to explain:

    library(mclust)
    set.seed(22)
    foo <- c(rnorm(400, 10, 2), rnorm(500, 17, 4))
    mcl <- Mclust(foo, G = 2)
    mcl.sd <- sqrt(mcl$parameters$variance$sigmasq)
    mcl.size <- c(length(mcl$classification[mcl$classification == 2]),
                  length(mcl$classification[mcl$classification == 1]))
    x <- pretty(c(0:44), 100)

My plot of the histogram and the lines of the normal distributions SEEMS OK (or am I wrong?) when using frequencies:

    histA <- hist(foo, breaks = c(0:44), ylim = c(0, 100))
    lines(x, dnorm(x, mcl$parameters$mean[1], mcl.sd[1]) * mcl.size[1], col = 2, lwd = 2)
    lines(x, dnorm(x, mcl$parameters$mean[2], mcl.sd[2]) * mcl.size[2], col = 2, lwd = 2)

My plot of the histogram and the lines of the normal distributions IS wrong when using prob:

    mclust1Dplot(foo, parameters = mcl$parameters, z = mcl$z, what = "density")
    histA <- hist(foo, breaks = c(0:44), prob = TRUE, add = TRUE)
    lines(x, dnorm(x, mcl$parameters$mean[2], mcl.sd[2]), col = 2, lwd = 2)
    lines(x, dnorm(x, mcl$parameters$mean[1], mcl.sd[1]), col = 2, lwd = 2)

In the second plot, the bell-shaped curves are obviously too high, and it seems that I am missing something obvious in scaling the dnorm() curves when building it. I tried different things, such as scaling dnorm() by the proportion of individuals belonging to clusters 1 and 2 respectively, but with no success.

Could someone help point out my errors?

Many thanks in advance,

Fred J.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
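One possible explanation, offered as a sketch rather than a confirmed diagnosis: each dnorm() curve integrates to 1 on its own, whereas a histogram drawn with prob = TRUE integrates to 1 over both clusters together. On the density scale each component curve would therefore need to be multiplied by its mixing proportion, which a fitted Mclust object stores in mcl$parameters$pro. The example below uses base R only. The vectors pro, mu and sdev are hypothetical stand-ins (the values used in the simulation) for the fitted parameters, and the breaks are derived from the data range instead of being fixed at 0:44.

```r
# Hypothetical stand-ins for the fitted values (assumption: with mclust one
# would use mcl$parameters$pro, mcl$parameters$mean and mcl.sd instead).
pro  <- c(400, 500) / 900   # mixing proportions, sum to 1
mu   <- c(10, 17)           # component means
sdev <- c(2, 4)             # component standard deviations

set.seed(22)
foo <- c(rnorm(400, mu[1], sdev[1]), rnorm(500, mu[2], sdev[2]))
x <- seq(0, 44, by = 0.1)

# Density scale: weight each dnorm() curve by its mixing proportion.
comp1 <- pro[1] * dnorm(x, mu[1], sdev[1])
comp2 <- pro[2] * dnorm(x, mu[2], sdev[2])
mixture <- comp1 + comp2    # the overall mixture density

# Unit-width bins spanning the data.
brk <- seq(floor(min(foo)), ceiling(max(foo)), by = 1)
hist(foo, breaks = brk, prob = TRUE)
lines(x, comp1, col = 2, lwd = 2)
lines(x, comp2, col = 2, lwd = 2)
lines(x, mixture, col = 4, lwd = 2)

# Frequency scale: multiply the weighted densities by n * bin width.
binwidth <- 1
freq1 <- length(foo) * binwidth * comp1
freq2 <- length(foo) * binwidth * comp2
```

When pairing a scaling factor with a curve, the indices need to refer to the same cluster. In the post above, mcl.size lists the size of cluster 2 first, so mcl.size[1] is paired with the mean and standard deviation of cluster 1.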
[R] significant anova but no distinct groups ?
Dear all,

I am studying a dataset using the aov() function. The independent variable 'cds' is a factor() with 8 levels, and here is the result for the dependent variable 'rta' with aov():

    summary(aov(rta ~ cds))
                Df  Sum Sq Mean Sq F value  Pr(>F)
    cds          7 0.34713 0.04959  2.3807 0.02777
    Residuals   92 1.91635 0.02083

The dependent variable 'rta' is normally distributed and the variances are homogeneous. But when examining the result with TukeyHSD, no differences in 'rta' are seen among the groups of 'cds':

    TukeyHSD(aov(rta ~ cds), which = "cds")
      Tukey multiple comparisons of means
        95% family-wise confidence level

    Fit: aov(formula = rta ~ cds)

    $cds
                 diff        lwr        upr     p adj
    1-0 -0.1046092796 -0.4331100 0.22389141 0.9751178
    2-0  0.0359991860 -0.1371359 0.20913425 0.9980970
    3-0  0.0261665235 -0.1348524 0.18718540 0.9996165
    4-0  0.0004502442 -0.1805448 0.18144531 1.000
    5-0 -0.1438949939 -0.3104752 0.02268526 0.1422670
    [...]
    7-5  0.0621598639 -0.1027595 0.22707926 0.9386170
    7-6  0.0256519274 -0.1757408 0.22704465 0.248

I tried a pairwise.t.test (Holm correction), which was also unable to detect differences in 'rta' among the groups of 'cds'.

I have never been confronted with such a situation before: is it just a problem of power of the /a posteriori/ tests used? Am I missing something important in basic stats or in R? How can I highlight the differences among 'cds' groups seen with aov()?

Any help appreciated.

Thanks in advance,

Fred J.
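This pattern is not unusual. The omnibus F test asks whether all eight means are equal, while Tukey's procedure controls the family-wise error rate over all 28 pairwise comparisons, so a borderline F (p = 0.028 here) can coexist with no significant pair. The sketch below shows one way to put the two results side by side. The data are simulated, and the group sizes and values are invented for illustration; only the total of 100 observations in 8 groups follows the degrees of freedom in the table above.

```r
set.seed(1)

# Hypothetical data: 8 groups, 100 observations in total (sizes invented).
sizes <- c(5, 14, 17, 12, 16, 13, 8, 15)
cds <- factor(rep(0:7, times = sizes))
rta <- rnorm(sum(sizes), mean = 0.5, sd = 0.15)

fit <- aov(rta ~ cds)

# Omnibus test: a single p-value for "all group means are equal".
omnibus.p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Tukey HSD: one adjusted p-value for each of the choose(8, 2) = 28 pairs.
tuk <- TukeyHSD(fit, which = "cds")$cds
smallest.pair <- tuk[which.min(tuk[, "p adj"]), , drop = FALSE]

print(omnibus.p)
print(smallest.pair)
```

Comparing omnibus.p with the smallest adjusted pairwise p-value shows directly how much the multiplicity adjustment costs for a given dataset.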