Re: [R] row selection based on median in data frame

Ed L Cashin Thu, 01 Apr 2004 07:36:28 -0800

Ed L Cashin <[EMAIL PROTECTED]> writes:

...
> Is there a way to tell aggregate to perform median on the runtime column
> only, and to select the whole row?


Some helpful folks have emailed me requesting more info about what I'm
trying to do.  Here's a simple R function to produce a data frame like
the one I am working on.

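demo.frame <- function() {
  n.runs <- 3                       # each combination below occurs once per run
  types <- c("red","black","blue")
  foo <- 1:5
  bar <- seq(50,90,by=10)
  d <- data.frame()

  # runs are the outermost loop, so the rows of one run are contiguous
  # and every run lists the (type, foo, bar) combinations in the same order
  for (i in 1:n.runs) {
    for (t in types) {
      for (f in foo) {
        for (b in bar) {
          # a, b, and c stand in for measured values
          row <- data.frame(type=t,
                            foo=f,
                            bar=b,
                            a=rnorm(1),
                            b=rnorm(1),
                            c=rnorm(1))
          d <- rbind(d,row)
        }
      }
    }
  }
  d
}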

In the resulting data frame, each combination of type, foo, and bar
values appears n.runs times, once per run.  I need to look at the
rows with such a matching set of values as a group, selecting the one
row with the median "c" value, and preserving all of that row's other
values.  So median should not be done on the "a" or "b" columns, just
the "c" column.

There are two ways I see to approach this problem.  One would be:

  for each subset of rows with matching type, foo, and bar values, 
    find the row with the median c value and output it
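
A minimal sketch of this first approach, assuming the column names
produced by demo.frame above.  The helper name median.rows and the
small example data are hypothetical, not part of the original post.
It splits the row indices by the (type, foo, bar) combination and
keeps, for each group, the row whose c is closest to the group
median; for an odd group size that is exactly the median row.

```r
# median.rows is a hypothetical helper name used only for illustration
median.rows <- function(d) {
  # split the row indices by the combination of the grouping columns;
  # drop=TRUE discards combinations that never occur in the data
  groups <- split(seq_len(nrow(d)), list(d$type, d$foo, d$bar), drop=TRUE)

  # within each group, pick the row whose c is closest to the group median
  picked <- sapply(groups, function(idx) {
    cs <- d$c[idx]
    idx[which.min(abs(cs - median(cs)))]
  })

  # return whole rows, in their original order
  d[sort(picked), ]
}

# made-up example: three runs of two (type, foo, bar) combinations
d <- data.frame(type=rep(c("red", "black"), times=3),
                foo=1,
                bar=50,
                a=1:6,
                b=11:16,
                c=c(0.3, 2.0, 0.1, 5.0, 0.2, 3.0))
result <- median.rows(d)
print(result)
```

In the example, the "red" rows have c values 0.3, 0.1, and 0.2, and
the "black" rows have 2.0, 5.0, and 3.0, so rows 5 and 6 are
selected, each with its own a and b values intact.  This version does
not depend on the order of the rows in the data frame.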

The other, which I've been able to do, takes advantage of knowledge
about the sequence of rows in the data frame:

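median.runs <- function(d, n.runs) {
  if (missing(n.runs))
    stop("the n.runs parameter is required")

  # rows per run; assumes every run lists the same combinations
  # of type, foo, and bar in the same order
  len <- nrow(d) / n.runs

  # build an index that will select similar rows:
  # the first row of each run, i.e. 1, len + 1, 2 * len + 1, ...
  i <- seq(1, by=len, length.out=n.runs)

  a <- data.frame()
  for (j in 1:len) {
    cat("i:",i,"\n")                # debugging output
    rows <- d[i,]                   # the n.runs rows sharing type, foo, bar
    md <- median(rows$c)
    cat("md:",md,"\n")              # debugging output
    # with an odd n.runs the median is one of the c values
    matches <- rows[rows$c == md,]
    a <- rbind(a, matches[1,])
    i <- i + 1                      # move on to the next combination
  }
  a
}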
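
One caveat with the rows$c == md match above: when n.runs is even,
median() returns the mean of the two middle values, which usually
equals none of the c values, so matches is empty and matches[1,] is a
row of NAs.  Below is a small sketch of the effect with made-up
values, together with one possible workaround.  Taking the lower of
the two middle rows is an assumption for illustration, not something
from the original function.

```r
# made-up c values for one group of four runs (an even count)
rows <- data.frame(a=1:4, c=c(4, 1, 3, 2))

md <- median(rows$c)            # mean of the two middle values, 2 and 3
n.exact <- sum(rows$c == md)    # number of rows whose c equals the median

# workaround (assumption): take the lower of the two middle rows,
# found by position in the sort order rather than by equality
mid <- order(rows$c)[ceiling(nrow(rows) / 2)]
lower.median.row <- rows[mid, ]
print(lower.median.row)
```

With an odd number of rows the same expression selects the true
median row, so it could replace the equality test in either case.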


-- 
--Ed L Cashin            |   PGP public key:
  [EMAIL PROTECTED]        |   http://noserose.net/e/pgp/

