[R] CV by rpart/mvpart

2006-12-28 Thread Pedro Ramirez
Dear R-list,

I am using the rpart/mvpart package to select a right-sized regression tree 
by 10-fold cross-validation. My question: is there a way to find out which 
of the ten folds each observation falls in? I want to use the same folds to 
validate another regression method (moving averages) so that I can choose 
the better one.

Thanks a lot,
Pedro
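
One possible approach, sketched here under assumptions rather than taken from the thread: create the fold labels yourself and hand them to the cross-validation routine, so the assignment is known in advance. The documentation of xpred.rpart() in rpart says its xval argument may be an explicit vector of integers defining the groups. The built-in cars data and the lm() competitor below are stand-ins chosen only for illustration (lm() is not the moving-average method from the question), and whether mvpart accepts fold vectors in the same way is not checked here.

```r
library(rpart)

set.seed(1)                       # make the fold assignment reproducible
n     <- nrow(cars)               # 'cars' is a built-in data set, used only as an example
k     <- 10
folds <- sample(rep(seq_len(k), length.out = n))  # fold label (1..k) for every observation

fit <- rpart(dist ~ speed, data = cars)

# Cross-validated tree predictions based on our own folds;
# the result has one row per observation and one column per cp value
xp          <- xpred.rpart(fit, xval = folds)
cv_mse_tree <- colMeans((cars$dist - xp)^2)

# Reuse exactly the same folds for a competing method (lm is a stand-in)
pred_lm <- numeric(n)
for (j in seq_len(k)) {
  test          <- folds == j
  m             <- lm(dist ~ speed, data = cars[!test, ])
  pred_lm[test] <- predict(m, newdata = cars[test, ])
}
cv_mse_lm <- mean((cars$dist - pred_lm)^2)
```

Because folds is an ordinary vector, folds[i] gives the fold of observation i, and the two cross-validated errors are computed on identical splits.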



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Density Estimation

2006-06-08 Thread Pedro Ramirez
In mathematical terms, the optimal bandwidth for density estimation
decreases at rate n^{-1/5}, while the one for the distribution function
decreases at rate n^{-1/3}, where n is the sample size. In practical terms,
one must choose an appreciably smaller bandwidth in the second case
than in the first one.

Thanks a lot for your remark! I was not aware of the fact that the
optimal bandwidths for density and distribution do not decrease
at the same rate.
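
To get a feeling for the two rates quoted above, here is a small sketch that tabulates n^{-1/5} and n^{-1/3} for a few sample sizes. The constants in front of the rates are omitted (they depend on the unknown density and on the kernel), so only the relative shrinkage is meaningful; the sample sizes are arbitrary.

```r
n <- c(10, 100, 1000, 10000)

rates <- data.frame(
  n            = n,
  density_rate = n^(-1/5),   # rate for the density-optimal bandwidth
  cdf_rate     = n^(-1/3)    # rate for the distribution-function-optimal bandwidth
)

# Ratio of the two rates; algebraically this is n^(-2/15)
rates$ratio <- rates$cdf_rate / rates$density_rate

print(rates)
```

The ratio column shrinks as n grows, which is the sense in which the second bandwidth has to be appreciably smaller.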

Besides the computational aspect, there is a statistical one:
the optimal choice of bandwidth for estimating the density function
is not optimal (and possibly not even just sensible) for estimating
the distribution function, and the stated problem is equivalent to
estimation of the distribution function.

The given interval 0 < x < 3 was only an example; in fact I would
like to estimate the probability for intervals such as

0 <= x < 1, 1 <= x < 2, 2 <= x < 3, 3 <= x < 4, ...

and compare it with the estimates of a corresponding histogram.
In this case the stated problem is no longer equivalent to the
estimation of the distribution function. What do you think, can
I go ahead in this case with the optimal bandwidth for the
density? Thanks a lot for your help!

Best wishes
Pedro
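
For illustration only, the comparison described above could be sketched as follows, assuming the default Gaussian kernel and the default bandwidth of density() (whose bw component is documented as the standard deviation of the kernel). The unit-width bins from 0 to 13 are an assumption made to cover the example data; the sketch does not settle which bandwidth is appropriate.

```r
x      <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
bw     <- density(x)$bw          # default bandwidth = sd of the Gaussian kernel
breaks <- 0:13                   # assumed bins [0,1), [1,2), ..., [12,13]

# CDF of the kernel estimate: average of the normal CDFs centred at the data
kde_cdf <- function(q) sapply(q, function(qi) mean(pnorm(qi, mean = x, sd = bw)))

p_kde  <- diff(kde_cdf(breaks))  # kernel-based probability of each bin
p_hist <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts / length(x)

comparison <- data.frame(
  lower = head(breaks, -1),
  upper = breaks[-1],
  kde   = p_kde,
  hist  = p_hist
)
print(comparison)
```

The kernel-based probabilities do not sum to one over these bins, because part of the estimated mass lies below 0 and above 13.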




best wishes,

Adelchi




[R] Density Estimation

2006-06-07 Thread Pedro Ramirez
Dear R-list,

I have made a simple kernel density estimation by

x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
kde <- density(x, n=100)

Now I would like to know the estimated probability that a
new observation falls into the interval 0 < x < 3.

How can I integrate over the corresponding interval?
In several R packages for kernel density estimation I did
not find a corresponding function. I could apply
Simpson's Rule for integrating, but perhaps somebody
knows a better solution.
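
A rough sketch of the numerical route mentioned above, using the trapezoidal rule instead of Simpson's Rule for brevity: the from and to arguments of density() restrict the evaluation grid to the interval of interest without changing the bandwidth. The grid size n = 512 is an arbitrary choice.

```r
x <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)

# Evaluate the kernel density estimate on a fine grid covering only [0, 3]
kde <- density(x, n = 512, from = 0, to = 3)

# Trapezoidal rule: sum of (grid step) * (average of neighbouring heights)
step   <- diff(kde$x)
p_trap <- sum(step * (head(kde$y, -1) + tail(kde$y, -1)) / 2)
p_trap
```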

Thanks a lot for help!

Pedro



Re: [R] Density Estimation

2006-06-07 Thread Pedro Ramirez
Not a direct answer to your question, but if you use a logspline density
estimate rather than a kernel density estimate then the logspline
package will help you and it has built-in functions for dlogspline,
qlogspline, and plogspline that do the integrals for you.

If you want to stick with the KDE, then you could find the area under
each of the kernels for the range you are interested in (need to work
out the standard deviation used from the bandwidth, then use pnorm for
the default gaussian kernel), then just sum the individual areas.
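
The second suggestion can be sketched like this, assuming the default Gaussian kernel. For density(), the bw component of the result is documented as the standard deviation of the smoothing kernel, so no conversion is needed in that case. Since the estimate gives each kernel weight 1/n, the individual areas are averaged rather than summed outright.

```r
x   <- c(2, 1, 3, 2, 3, 0, 4, 5, 10, 11, 12, 11, 10)
kde <- density(x, n = 100)
bw  <- kde$bw                    # sd of each Gaussian kernel

# Area of each kernel (centred at a data point) that falls inside (0, 3)
area_per_kernel <- pnorm(3, mean = x, sd = bw) - pnorm(0, mean = x, sd = bw)

# Each kernel carries weight 1/n, so the interval probability is the mean area
p_kernel <- mean(area_per_kernel)
p_kernel
```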

Hope this helps,

Thanks a lot for your quick help! I think I will follow your first
suggestion (logspline density estimation) instead of summing over the
kernel areas because at the boundaries of the range truncated kernel
areas can occur, so I think it is easier to do it with logsplines.
Thanks again for your help!!

Pedro




--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

