Re: [R] Multivariate kernel density estimation

2008-12-08 Thread Greg Snow
Possible, yes (see fortune('Yoda')), but doing it can be a bit difficult.

Here is one approximation using an independent normal kernel in 2 dimensions:

kde.contour <- function(x,y=NULL, conf=0.95, xdiv=100, ydiv=100, kernel.sd ) {
xy <- xy.coords(x,y)
xr <- range(xy$x)
yr <- range(xy$y)

xr <- xr + c(-1,1)*0.1*diff(xr)
yr <- yr + c(-1,1)*0.1*diff(yr)

if(missing(kernel.sd)) {
kernel.sd <- c( diff(xr)/6, diff(yr)/6 )
} else if (length(kernel.sd)==1) {
kernel.sd <- rep(kernel.sd, 2)
}


xs <- seq(xr[1], xr[2], length.out=xdiv)
ys <- seq(yr[1], yr[2], length.out=ydiv)
mydf <- expand.grid( xx=xs, yy=ys )

tmpfun <- function(xx,yy) {
sum( dnorm(xx, xy$x, kernel.sd[1]) * dnorm(yy, xy$y, 
kernel.sd[2]) )
}

z <- mapply(tmpfun, xx=mydf$xx, yy=mydf$yy)

sz <- sort(z, decreasing=TRUE)
cz <- cumsum(sz)
cz <- cz/cz[length(cz)]

cutoff <- sz[ which( cz > conf )[1] ]

plot(xy, xlab='x', ylab='y', xlim=xr, ylim=yr)
#contour( xs, ys, matrix(z, nrow=xdiv), add=TRUE, col='blue')
contour( xs, ys, matrix(z, nrow=xdiv), add=TRUE, col='red',
levels=cutoff, labels='')

invisible(NULL)
}

# test
kde.contour( rnorm(100), rnorm(100) )

# correlated data
my.xy <- MASS::mvrnorm(100, c(3,10), matrix( c(1,.8,.8,1), 2) )

kde.contour( my.xy, kernel.sd=.5 )

# compare to theoritical
lines(ellipse::ellipse( 0.8, scale=c(1,1), centre=c(3,10)), col='green')


# bimodal

new.xy <- rbind( MASS::mvrnorm(65, c(3,10), matrix( c(1,.6,.6,1),2) ),
MASS::mvrnorm(35, c(6, 7), matrix( c(1,.6,.6,1), 2) ) )

kde.contour( new.xy, kernel.sd=.75 )


For more than 2 dimensions it becomes more difficult (both to visualize and to 
find the region, contour only works in 2 dimensions).  You can also see that 
the approximations are not great compared to the true theory, but possibly a 
better kernel would improve that.

Hope this helps,


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Jeroen Ooms
> Sent: Monday, December 08, 2008 5:54 AM
> To: r-help@r-project.org
> Subject: [R] Multivariate kernel density estimation
>
>
> I would like to estimate a 95% highest density area for a multivariate
> parameter space (In the context of anova). Unfortunately I have only
> experience with univariate kernel density estimation, which is
> remarkebly
> easier :)
>
> Using Gibbs, i have sampled from a posterior distirbution of an Anova
> model
> with k means (mu) and 1 common residual variance (s2). The means are
> independent of eachother, but conditional on the residual variance. So
> now I
> have a data frame of say 10.000 iterations, and k+1 parameters.
>
> I am especially interested in the posterior distribution of the mu
> parameters, because I want to test the support for an inequalty
> constrained
> model (e.g. mu1 > mu2 > mu3). I wish to derive the multivariate 95%
> highest
> density parameter space for the mu parameters. For example, if I had a
> posterior distirbution with 2 means, this should somehow result in the
> circle or elipse that contains the 95% highest density area.
>
> Is something like this possible in R? All tips are welcome.
> --
> View this message in context: http://www.nabble.com/Multivariate-
> kernel-density-estimation-tp20894766p20894766.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multivariate kernel density estimation

2008-12-08 Thread Jeroen Ooms

I would like to estimate a 95% highest density area for a multivariate
parameter space (In the context of anova). Unfortunately I have only
experience with univariate kernel density estimation, which is remarkebly
easier :)

Using Gibbs, i have sampled from a posterior distirbution of an Anova model
with k means (mu) and 1 common residual variance (s2). The means are
independent of eachother, but conditional on the residual variance. So now I
have a data frame of say 10.000 iterations, and k+1 parameters.

I am especially interested in the posterior distribution of the mu
parameters, because I want to test the support for an inequalty constrained
model (e.g. mu1 > mu2 > mu3). I wish to derive the multivariate 95% highest
density parameter space for the mu parameters. For example, if I had a
posterior distirbution with 2 means, this should somehow result in the
circle or elipse that contains the 95% highest density area. 

Is something like this possible in R? All tips are welcome.
-- 
View this message in context: 
http://www.nabble.com/Multivariate-kernel-density-estimation-tp20894766p20894766.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.