Hi,
>I would like to rank a time-series of data, extract the top ten data items from this
>series, determine the corresponding row numbers for each value in the sample, and take
>a mean of these *row numbers* (not the data).
>
>I would like to do this in R, rather than pre-process the data on the UNIX command
>line if possible, as I need to calculate other statistics for the series.
>
>I understand that I can use 'sort' to order the data, but I am not aware of a
>function in R that would allow me to extract a given number of these data and then
>determine their positions within the original time series.
>
>e.g.
>
>Time series:
>
>1.0 (row 1)
>4.5 (row 2)
>2.3 (row 3)
>1.0 (row 4)
>7.3 (row 5)
>
>Sort would give me:
>
>1.0
>1.0
>2.3
>4.5
>7.3
>
>I would then like to extract the top two data items:
>
>4.5
>7.3
>
>and determine their positions within the original (unsorted) time series:
>
>4.5 = row 2
>7.3 = row 5
>
>then take a mean:
>
>2 and 5 = 3.5
>
>Thanks in advance.
>
>James Brown

X  <- c(1, 4.5, 2.3, 1, 7.3)
X1 <- sort(X, decreasing=TRUE)[1:2]  # the two largest values: 7.3, 4.5
X2 <- match(X1, X)                   # their rows in the original series: 5, 2
                                     # (match() gives only the first row of a tied value)
mean(X2)                             # 3.5

Hope this helps

Thomas

___________________________________________
James Brown
Cambridge Coastal Research Unit (CCRU)
Department of Geography
University of Cambridge
Downing Place
Cambridge
CB2 3EN, UK

Telephone: +44 (0)1223 339776
Mobile: 07929 817546
Fax: +44 (0)1223 355674
E-mail: [EMAIL PROTECTED]
http://www.geog.cam.ac.uk/ccru/CCRU.html
___________________________________________

On Wed, 10 Sep 2003, Jerome Asselin wrote:

> On September 10, 2003 04:03 pm, Kevin S. Van Horn wrote:
> >
> > Your method looks like a naive reimplementation of integration, and
> > won't work so well for distributions that have the great majority of
> > the probability mass concentrated in a small fraction of the sample
> > space. I was hoping for something that would retain the
> > adaptability of integrate().
>
> Yesterday, I suggested using approxfun(). Did you consider my
> suggestion? Below is an example.
>
> N <- 500
> x <- rexp(N)
> y <- rank(x)/(N+1)
> empCDF <- approxfun(x,y)
> xvals <- seq(0,4,.01)
> plot(xvals,empCDF(xvals),type="l",
>      xlab="Quantile",ylab="Cumulative Distribution Function")
> lines(xvals,pexp(xvals),lty=2)
> legend(2,.4,c("Empirical CDF","Exact CDF"),lty=1:2)
>
> It's possible to tune some of the parameters in approxfun() to better
> match your personal preferences. Have a look at help(approxfun) for
> details.
>
> HTH,
> Jerome Asselin
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
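
As a small illustration of the tuning mentioned above, here is a sketch of one
parameter choice. It assumes the yleft and yright arguments described in
help(approxfun). The names empCDF_default and empCDF_clamped and the seed are
made up for this example and do not come from the thread. By default the
interpolated function returns NA outside the range of the observed x values;
setting yleft and yright makes it behave like a CDF in the tails.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
N <- 500
x <- rexp(N)
y <- rank(x) / (N + 1)

# Default behaviour (rule = 1): NA outside range(x)
empCDF_default <- approxfun(x, y)

# yleft / yright set the value returned below min(x) and above max(x)
empCDF_clamped <- approxfun(x, y, yleft = 0, yright = 1)

below <- min(x) - 1               # a point to the left of every observation
above <- max(x) + 1               # a point to the right of every observation

empCDF_default(below)             # NA
empCDF_clamped(below)             # 0
empCDF_clamped(above)             # 1
```

Whether clamping to 0 and 1 is appropriate depends on the use; for numerical
work that probes the tails, it at least avoids NA values propagating through
later calculations.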
