Here's one way. Suppose your "time series" is in a vector called "x".

top10 <- sort(x, decreasing=TRUE)[1:10]
mean.index <- mean(which(x %in% top10))

HTH,
Andy

> -----Original Message-----
> From: James Brown [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, September 23, 2003 7:51 AM
> To: [EMAIL PROTECTED]
> Subject: [R] Rank and extract data from a series
>
> I would like to rank a time-series of data, extract the top
> ten data items from this series, determine the corresponding
> row numbers for each value in the sample, and take a mean of
> these *row numbers* (not the data).
>
> I would like to do this in R, rather than pre-process the
> data on the UNIX command line if possible, as I need to
> calculate other statistics for the series.
>
> I understand that I can use 'sort' to order the data, but I
> am not aware of a function in R that would allow me to
> extract a given number of these data and then determine their
> positions within the original time series.
>
> e.g.
>
> Time series:
>
> 1.0 (row 1)
> 4.5 (row 2)
> 2.3 (row 3)
> 1.0 (row 4)
> 7.3 (row 5)
>
> Sort would give me:
>
> 1.0
> 1.0
> 2.3
> 4.5
> 7.3
>
> I would then like to extract the top two data items:
>
> 4.5
> 7.3
>
> and determine their positions within the original (unsorted)
> time series:
>
> 4.5 = row 2
> 7.3 = row 5
>
> then take a mean:
>
> 2 and 5 = 3.5
>
> Thanks in advance.
>
> James Brown
>
> ___________________________________________
>
> James Brown
>
> Cambridge Coastal Research Unit (CCRU)
> Department of Geography
> University of Cambridge
> Downing Place
> Cambridge
> CB2 3EN, UK
>
> Telephone: +44 (0)1223 339776
> Mobile: 07929 817546
> Fax: +44 (0)1223 355674
>
> E-mail: [EMAIL PROTECTED]
> E-mail: [EMAIL PROTECTED]
>
> http://www.geog.cam.ac.uk/ccru/CCRU.html
> ___________________________________________
>
> On Wed, 10 Sep 2003, Jerome Asselin wrote:
>
> > On September 10, 2003 04:03 pm, Kevin S. Van Horn wrote:
> > >
> > > Your method looks like a naive reimplementation of integration, and
> > > won't work so well for distributions that have the great majority of
> > > the probability mass concentrated in a small fraction of the sample
> > > space. I was hoping for something that would retain the
> > > adaptability of integrate().
> >
> > Yesterday, I've suggested to use approxfun(). Did you consider my
> > suggestion? Below is an example.
> >
> > N <- 500
> > x <- rexp(N)
> > y <- rank(x)/(N+1)
> > empCDF <- approxfun(x,y)
> > xvals <- seq(0,4,.01)
> > plot(xvals,empCDF(xvals),type="l",
> > xlab="Quantile",ylab="Cumulative Distribution Function")
> > lines(xvals,pexp(xvals),lty=2)
> > legend(2,.4,c("Empirical CDF","Exact CDF"),lty=1:2)
> >
> > It's possible to tune in some parameters in approxfun() to better
> > match your personal preferences. Have a look at help(approxfun) for
> > details.
> >
> > HTH,
> > Jerome Asselin
> >
> > ______________________________________________
> > [EMAIL PROTECTED] mailing list
> > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
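
A follow-up note on the ranking answer at the top of this message: the %in% test returns every position whose value equals one of the top values, so if the series contains ties among the top values, more than ten row numbers can enter the mean. A minimal sketch of an alternative that always yields exactly n positions uses order(), which returns positions in the original vector. The data below are the five example values from the question; the names n, top.idx and top.values are only illustrative.

```r
# Example data: the five values from the question (rows 1 to 5).
x <- c(1.0, 4.5, 2.3, 1.0, 7.3)
n <- 2  # number of top values to keep (10 in the original question)

# order() gives positions in the original vector, largest value first.
# seq_len(min(n, length(x))) guards against series shorter than n.
top.idx <- order(x, decreasing = TRUE)[seq_len(min(n, length(x)))]

top.values <- x[top.idx]     # the top n data values, largest first
mean.index <- mean(top.idx)  # mean of the row numbers, not of the data
```

With this data, top.idx holds rows 5 and 2, so mean.index matches the 3.5 worked out by hand in the question. Which of several tied values is kept depends on how order() breaks ties, so this is only one reasonable choice.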