I would like to rank a time-series of data, extract the top ten data items from this series, determine the corresponding row numbers for each value in the sample, and take a mean of these *row numbers* (not the data).
I would like to do this in R, rather than pre-process the data on the UNIX command line if possible, as I need to calculate other statistics for the series. I understand that I can use 'sort' to order the data, but I am not aware of a function in R that would allow me to extract a given number of these data and then determine their positions within the original time series.

e.g.

Time series:

1.0 (row 1)
4.5 (row 2)
2.3 (row 3)
1.0 (row 4)
7.3 (row 5)

Sort would give me:

1.0
1.0
2.3
4.5
7.3

I would then like to extract the top two data items:

4.5
7.3

and determine their positions within the original (unsorted) time series:

4.5 = row 2
7.3 = row 5

then take a mean:

2 and 5 = 3.5

Thanks in advance.

James Brown

___________________________________________

James Brown
Cambridge Coastal Research Unit (CCRU)
Department of Geography
University of Cambridge
Downing Place
Cambridge CB2 3EN, UK

Telephone: +44 (0)1223 339776
Mobile: 07929 817546
Fax: +44 (0)1223 355674
E-mail: [EMAIL PROTECTED]
http://www.geog.cam.ac.uk/ccru/CCRU.html
___________________________________________

On Wed, 10 Sep 2003, Jerome Asselin wrote:

> On September 10, 2003 04:03 pm, Kevin S. Van Horn wrote:
> >
> > Your method looks like a naive reimplementation of integration, and
> > won't work so well for distributions that have the great majority of the
> > probability mass concentrated in a small fraction of the sample space.
> > I was hoping for something that would retain the adaptability of
> > integrate().
>
> Yesterday, I've suggested to use approxfun(). Did you consider my
> suggestion? Below is an example.
>
> N <- 500
> x <- rexp(N)
> y <- rank(x)/(N+1)
> empCDF <- approxfun(x,y)
> xvals <- seq(0,4,.01)
> plot(xvals,empCDF(xvals),type="l",
> xlab="Quantile",ylab="Cumulative Distribution Function")
> lines(xvals,pexp(xvals),lty=2)
> legend(2,.4,c("Empirical CDF","Exact CDF"),lty=1:2)
>
>
> It's possible to tune in some parameters in approxfun() to better match
> your personal preferences. Have a look at help(approxfun) for details.
>
> HTH,
> Jerome Asselin
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
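
For the ranking question at the top of this message (top values, their original row numbers, and the mean of those row numbers), one possible approach in base R is order(), which returns positions in the original vector rather than the sorted values themselves. What follows is only a minimal sketch using the five-value example from the question. The names x, n_top, top_rows, top_values and mean_row are illustrative assumptions, and n_top would be 10 for the real series.

```r
# Example series from the question; the row number is the vector position.
x <- c(1.0, 4.5, 2.3, 1.0, 7.3)

# Number of top items to keep (illustrative; 10 for the real series).
n_top <- 2

# order() gives the original positions, largest values first.
# seq_len(min(...)) guards against a series shorter than n_top.
top_rows <- order(x, decreasing = TRUE)[seq_len(min(n_top, length(x)))]

# The values at those positions, and the mean of the row numbers.
top_values <- x[top_rows]
mean_row <- mean(top_rows)

print(top_rows)    # positions in the original series
print(top_values)  # the corresponding data values
print(mean_row)    # mean of the row numbers, not of the data
```

On the example series this selects rows 5 and 2 (values 7.3 and 4.5), giving a mean row number of 3.5, which matches the hand calculation in the question. If several values are tied at the cutoff, which of the tied rows is kept depends on how order() breaks ties, so that is worth checking on real data.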