[R] CV by rpart/mvpart
Dear R-list,

I am using the rpart/mvpart packages to select a right-sized regression tree by 10-fold cross-validation. My question: is there a way to find out, for every observation, which of the ten folds it falls in? I want to use the same folds to validate another regression method (moving averages) in order to choose the better one.

Thanks a lot,
Pedro

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
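One possible approach to the question above is to build the fold assignment yourself and hand the same vector to both methods. The sketch below is only an illustration: the `cars` data set, the names `folds`, `k` and `cv_mse`, and the use of `lm` as a placeholder for the competing method are all assumptions, not part of the original post. As far as I know, rpart accepts one integer per observation as `xval` in `rpart.control` and then uses those groups for its cross-validation; that part is guarded in case rpart is not installed.

```r
set.seed(1)                      # reproducible fold assignment
n <- nrow(cars)                  # 'cars' stands in for the real data (assumption)
k <- 10
# One fold id (1..k) per observation, in random order
folds <- sample(rep(seq_len(k), length.out = n))

table(folds)                     # fold sizes

# Tree: pass the explicit groups instead of a fold count (hedged, see lead-in)
if (requireNamespace("rpart", quietly = TRUE)) {
  fit <- rpart::rpart(dist ~ speed, data = cars,
                      control = rpart::rpart.control(xval = folds))
  print(fit$cptable)             # cross-validated error per tree size
}

# Competing method on the SAME folds; lm() is only a placeholder here
cv_mse <- sapply(seq_len(k), function(i) {
  test  <- folds == i
  fit_i <- lm(dist ~ speed, data = cars[!test, ])
  mean((cars$dist[test] - predict(fit_i, cars[test, ]))^2)
})
mean(cv_mse)                     # cross-validated mean squared error
```

Because `folds` is an ordinary vector, `folds[i]` tells you directly which fold observation `i` belongs to.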
Re: [R] Density Estimation
In mathematical terms, the optimal bandwidth for density estimation decreases at rate n^{-1/5}, while the one for the distribution function decreases at rate n^{-1/3}, where n is the sample size. In practical terms, one must choose an appreciably smaller bandwidth in the second case than in the first.

Thanks a lot for your remark! I was not aware that the optimal bandwidths for density and distribution do not decrease at the same rate.

Besides the computational aspect, there is a statistical one: the optimal choice of bandwidth for estimating the density function is not optimal (and possibly not even just sensible) for estimating the distribution function, and the stated problem is equivalent to estimation of the distribution function.

The given interval 0 < x < 3 was only an example; in fact I would like to estimate the probability for intervals such as 0 <= x < 1, 1 <= x < 2, 2 <= x < 3, 3 <= x < 4, and compare it with the estimates of a corresponding histogram. In this case the stated problem is no longer equivalent to the estimation of the distribution function. What do you think, can I go ahead in this case with the optimal bandwidth for the density?

Thanks a lot for your help!

Best wishes
Pedro

best wishes,
Adelchi
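For the comparison with a histogram mentioned above, a minimal sketch could look like the following. It assumes a Gaussian kernel and uses `bw.nrd0`, the default bandwidth rule of `density()`, purely as a starting value; given the remark about rates, a smaller `h` may be more appropriate, and the names `h`, `breaks`, `kde_bin` and `hist_bin` are illustrative only.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
h <- bw.nrd0(x)        # default bandwidth of density(); change it to experiment
breaks <- 0:4          # bins [0,1), [1,2), [2,3), [3,4)

# KDE probability of each bin: average Gaussian kernel mass inside the bin
kde_bin <- sapply(seq_len(length(breaks) - 1), function(i) {
  mean(pnorm(breaks[i + 1], mean = x, sd = h) -
       pnorm(breaks[i],     mean = x, sd = h))
})

# Histogram estimate: share of all observations falling in each bin
bins <- cut(x, breaks = breaks, right = FALSE)   # values outside the bins become NA
hist_bin <- as.numeric(table(bins)) / length(x)

data.frame(interval = levels(bins), kde = kde_bin, histogram = hist_bin)
```

Rerunning the sketch with several values of `h` shows how strongly the bin probabilities depend on the bandwidth.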
[R] Density Estimation
Dear R-list,

I have made a simple kernel density estimation by

    x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
    kde <- density(x,n=100)

Now I would like to know the estimated probability that a new observation falls into the interval 0 < x < 3.

How can I integrate over the corresponding interval? In several R packages for kernel density estimation I did not find a corresponding function. I could apply Simpson's Rule for integrating, but perhaps somebody knows a better solution.

Thanks a lot for help!

Pedro
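One way to integrate the estimate numerically in base R, as a minimal sketch: interpolate the grid returned by `density()` with `approxfun` and pass the result to `integrate`. The names `f_hat`, `p_0_3` and `total` are illustrative, and the answer is only as accurate as the 100-point grid allows.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
kde <- density(x, n = 100)

# Turn the (x, y) grid into a function; zero outside the evaluated range
f_hat <- approxfun(kde$x, kde$y, yleft = 0, yright = 0)

# Estimated probability of 0 < x < 3
p_0_3 <- integrate(f_hat, lower = 0, upper = 3)$value
p_0_3

# Sanity check: trapezoidal area under the whole curve should be close to 1
total <- sum(diff(kde$x) * (head(kde$y, -1) + tail(kde$y, -1)) / 2)
total
```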
Re: [R] Density Estimation
Not a direct answer to your question, but if you use a logspline density estimate rather than a kernel density estimate, then the logspline package will help you: it has built-in functions dlogspline, qlogspline, and plogspline that do the integrals for you.

If you want to stick with the KDE, then you could find the area under each of the kernels for the range you are interested in (you need to work out the standard deviation used from the bandwidth, then use pnorm for the default Gaussian kernel), then just sum the individual areas.

Hope this helps,

Thanks a lot for your quick help! I think I will follow your first suggestion (logspline density estimation) instead of summing over the kernel areas, because truncated kernel areas can occur at the boundaries of the range, so I think it is easier to do it with logsplines.

Thanks again for your help!!

Pedro

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
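The kernel-summing suggestion above can be sketched as follows. This assumes the default Gaussian kernel; according to the documentation of `density()`, its `bw` value is already the standard deviation of the kernel, so no conversion is done here. The helper `kde_prob` is a hypothetical name, not a function from any package.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
kde <- density(x)      # default Gaussian kernel
h <- kde$bw            # bandwidth = standard deviation of each kernel

# Probability of a < X < b under the KDE: each observation contributes
# the mass of its own normal kernel inside (a, b); the mean weights each by 1/n
kde_prob <- function(a, b, data, bw) {
  mean(pnorm(b, mean = data, sd = bw) - pnorm(a, mean = data, sd = bw))
}

p_0_3 <- kde_prob(0, 3, x, h)
p_0_3
```

Kernel mass that falls outside the interval is simply not counted, so kernels truncated at the interval boundaries need no special handling in this formulation.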