Sorry for the misunderstanding. Here is a more detailed description of what I would like to achieve.

I have a data set of counts from an RNA-Seq experiment and would like to filter out low counts. I don't want to set everything to 0 automatically. Instead, I would like to set each categorical group (e.g. condition) to 0 if and only if all replicates in the group together have fewer than 100 reads.

In my examples I used X and Y to represent the categories. Usually they have more distinctive names like "control", "knockout1", "dKo" etc.

So what I would really like to do is check whether the sum of all the "control" samples is lower than 100. If so, set all control samples to 0. I would like to check this *for each category* in every row of the data set.

I hope it is clearer now.

Thanks,
Assa

On Fri, Nov 6, 2015 at 2:29 PM, jim holtman <jholt...@gmail.com> wrote:

> Is this what you want:
>
> > x <- read.table(text = "X1 X2 X3 Y1 Y2 Y3
> + 1232 357 23 0 9871 72
> + 0 71 9 811 795 743
> + 43 919 1111 0 76 14", header = TRUE)
> > x
>     X1  X2   X3  Y1   Y2  Y3
> 1 1232 357   23   0 9871  72
> 2    0  71    9 811  795 743
> 3   43 919 1111   0   76  14
> >
> > # create indices of columns that start with the same character
> > indx <- split(seq(ncol(x)), substring(colnames(x), 1, 1))
> > names(indx) <- NULL  # remove names so output not messed up
> >
> > result <- lapply(indx, function(a){
> +     row_sum <- rowSums(x[, a])
> +     x[row_sum < 100, a] <- 0
> +     x[, a]
> + })
> > # combine back together
> > do.call(cbind, result)
>     X1  X2   X3  Y1   Y2  Y3
> 1 1232 357   23   0 9871  72
> 2    0   0    0 811  795 743
> 3   43 919 1111   0    0   0
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Fri, Nov 6, 2015 at 5:40 AM, Assa Yeroslaviz <fry...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a data frame with multiple columns, which belong to several
>> groups, like this:
>>
>>   X1  X2   X3  Y1   Y2  Y3
>> 1232 357   23   0 9871  72
>>    0  71    9 811  795 743
>>   43 919 1111   0   76  14
>>
>> I would like to filter out rows where the sum within one group is lower
>> than a specific value. For example, I would like to set all the values in
>> a group of columns to zero if the sum in that group is less than 100.
>> In my example table I would like to set the values in the second row for
>> the three X-columns to 0, so that the table looks like this:
>>
>>   X1  X2   X3  Y1   Y2  Y3
>> 1232 357   23   0 9871  72
>>    0   0    0 811  795 743
>>   43 919 1111   0    0   0
>>
>> The same applies to the Y-values in the last row.
>> Is there a more efficient way of doing this than going row by row and
>> using the apply function on each of the subgroups I have in the columns?
>>
>> Thanks,
>> Assa
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
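
The per-category rule described at the top of this message can be sketched as below. This is only a rough illustration, not a tested solution for real data: the function name `zero_low_groups`, its `groups` and `threshold` arguments, and the column names `control1` ... `dKo3` are made up for the example. It assumes the category of each column is passed explicitly as one label per column; deriving the labels by stripping trailing digits works for these example names, but would need adjusting for categories such as "knockout1" whose own name ends in a digit.

```r
# Hypothetical helper (not from the thread): for every row, set all columns
# of a group to 0 when the replicates of that group sum to less than
# `threshold`.
zero_low_groups <- function(counts, groups, threshold = 100) {
  stopifnot(length(groups) == ncol(counts))
  for (g in unique(groups)) {
    cols <- which(groups == g)
    # rows in which this group's replicates together fall below the threshold
    low <- rowSums(counts[, cols, drop = FALSE]) < threshold
    counts[low, cols] <- 0
  }
  counts
}

# Same numbers as in the thread, with made-up category names
x <- read.table(text = "control1 control2 control3 dKo1 dKo2 dKo3
1232 357   23   0 9871  72
   0  71    9 811  795 743
  43 919 1111   0   76  14", header = TRUE)

# Assumption: group label = column name without its trailing replicate number
groups <- sub("[0-9]+$", "", colnames(x))

result <- zero_low_groups(x, groups)
print(result)
```

Passing the labels explicitly avoids relying on the first character of the column name, which would not separate longer names that happen to share an initial letter.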