On Wed, 07 Jun 2006 19:54:32 +0200, Pedro Ramirez wrote:

PR> >Not a direct answer to your question, but if you use a logspline
PR> >density estimate rather than a kernel density estimate then the
PR> >logspline package will help you and it has built in functions for
PR> >dlogspline, qlogspline, and plogspline that do the integrals for
PR> >you.
PR> >
PR> >If you want to stick with the KDE, then you could find the area
PR> >under each of the kernels for the range you are interested in
PR> >(need to work out the standard deviation used from the bandwidth,
PR> >then use pnorm for the default gaussian kernel), then just sum
PR> >the individual areas.
PR> >
PR> >Hope this helps,
PR>
PR> Thanks a lot for your quick help! I think I will follow your first
PR> suggestion (logspline density estimation) instead of summing over
PR> the kernel areas because at the boundaries of the range truncated
PR> kernel areas can occur, so I think it is easier to do it with
PR> logsplines. Thanks again for your help!!
PR>
PR> Pedro
PR>
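For reference, the kernel-by-kernel route quoted above can be sketched in a few lines. This is only a minimal sketch, assuming the default Gaussian kernel of density(), where the reported bandwidth bw is already the standard deviation of each kernel. The helper name kde_prob is made up for illustration. Each kernel carries weight 1/n, so the areas are averaged rather than simply summed, and because pnorm gives the exact area of each kernel inside the interval, kernels that are cut off at the interval boundaries need no special handling.

```r
# Data and KDE from the original post
x   <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
kde <- density(x, n = 100)          # default kernel is Gaussian

# For density(), bw is the standard deviation of the smoothing kernel
h <- kde$bw

# Hypothetical helper: each observation contributes a N(x_i, h^2)
# component with weight 1/n, so P(a < X < b) under the KDE is the
# average of the per-kernel areas between a and b
kde_prob <- function(a, b, obs, sd) {
  mean(pnorm(b, mean = obs, sd = sd) - pnorm(a, mean = obs, sd = sd))
}

p_exact <- kde_prob(0, 3, x, h)

# Cross-check: integrate a linear interpolation of the gridded estimate
f      <- approxfun(kde$x, kde$y)
p_grid <- integrate(f, 0, 3)$value

print(c(kernel_sum = p_exact, grid_integral = p_grid))
```

The two numbers should agree closely but not exactly, since density() evaluates the estimate on a grid using a binned approximation.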
Besides the computational aspect, there is a statistical one: the
bandwidth that is optimal for estimating the density function is not
optimal (and possibly not even sensible) for estimating the
distribution function, and the stated problem is equivalent to
estimation of the distribution function.

In mathematical terms, the optimal bandwidth for density estimation
decreases at rate n^{-1/5}, while the one for the distribution
function decreases at rate n^{-1/3}, where n is the sample size.
In practical terms, one must choose an appreciably smaller bandwidth
in the second case than in the first one.

best wishes,

Adelchi

PR>
PR> >
PR> >--
PR> >Gregory (Greg) L. Snow Ph.D.
PR> >Statistical Data Center
PR> >Intermountain Healthcare
PR> >[EMAIL PROTECTED]
PR> >(801) 408-8111
PR> >
PR> >
PR> >-----Original Message-----
PR> >From: [EMAIL PROTECTED]
PR> >[mailto:[EMAIL PROTECTED] On Behalf Of Pedro Ramirez
PR> >Sent: Wednesday, June 07, 2006 11:00 AM
PR> >To: r-help@stat.math.ethz.ch
PR> >Subject: [R] Density Estimation
PR> >
PR> >Dear R-list,
PR> >
PR> >I have made a simple kernel density estimation by
PR> >
PR> >x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
PR> >kde <- density(x,n=100)
PR> >
PR> >Now I would like to know the estimated probability that a new
PR> >observation falls into the interval 0<x<3.
PR> >
PR> >How can I integrate over the corresponding interval?
PR> >In several R-packages for kernel density estimation I did not
PR> >find a corresponding function. I could apply Simpson's Rule for
PR> >integrating, but perhaps somebody knows a better solution.
PR> >
PR> >Thanks a lot for help!
PR> >
PR> >Pedro
PR> >
PR> >_________
PR> >
PR> >______________________________________________
PR> >R-help@stat.math.ethz.ch mailing list
PR> >https://stat.ethz.ch/mailman/listinfo/r-help
PR> >PLEASE do read the posting guide!
PR> >http://www.R-project.org/posting-guide.html
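
On the bandwidth point made in the reply above, the effect can be illustrated with the data from the original post. This is a rough sketch under a stated assumption: the default density bandwidth is shrunk by the ratio of the two rates, n^(-1/3) / n^(-1/5) = n^(-2/15). That matches only the rates given in the reply, not the constants of an optimal rule, so the smaller bandwidth here is a guide and not an optimum. The names h_cdf and interval_prob are made up for illustration.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
n <- length(x)

# Default bandwidth used by density(), chosen with the density in mind
h_dens <- bw.nrd0(x)

# Assumption: rescale by the ratio of the two rates, n^(-2/15);
# the constants of an optimal rule are unknown and ignored here
h_cdf <- h_dens * n^(-2/15)

# Hypothetical helper: interval probability under a Gaussian KDE
# with kernel standard deviation sd
interval_prob <- function(a, b, obs, sd) {
  mean(pnorm(b, mean = obs, sd = sd) - pnorm(a, mean = obs, sd = sd))
}

p_dens <- interval_prob(0, 3, x, h_dens)  # density-oriented bandwidth
p_cdf  <- interval_prob(0, 3, x, h_cdf)   # smaller bandwidth
p_emp  <- mean(x > 0 & x < 3)             # raw empirical proportion

print(c(h_dens = h_dens, h_cdf = h_cdf))
print(c(p_dens = p_dens, p_cdf = p_cdf, p_emp = p_emp))
```

Comparing the three probabilities shows how much the answer for 0<x<3 depends on the amount of smoothing.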