On 7/29/06, jim holtman <[EMAIL PROTECTED]> wrote:
> Is this what you want?
>
> > set.seed(1)
> > x <- matrix(sample(c(1, NA), 100, TRUE), nrow=10)  # create some data
> > x
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    1    1   NA    1   NA    1   NA    1    1     1
>  [2,]    1    1    1   NA   NA   NA    1   NA   NA     1
>  [3,]   NA   NA   NA    1   NA    1    1    1    1    NA
>  [4,]   NA    1    1    1   NA    1    1    1    1    NA
>  [5,]    1   NA    1   NA   NA    1   NA    1   NA    NA
>  [6,]   NA    1    1   NA   NA    1    1   NA    1    NA
>  [7,]   NA   NA    1   NA    1    1    1   NA   NA     1
>  [8,]   NA   NA    1    1    1   NA   NA    1    1     1
>  [9,]   NA    1   NA   NA   NA   NA    1   NA    1    NA
> [10,]    1   NA    1    1   NA    1   NA   NA    1    NA
> > # count number of NAs per row
> > numNAs <- apply(x, 1, function(z) sum(is.na(z)))

It's a minor point, but on a large matrix it would be better to use

    numNAs <- rowSums(is.na(x))

> > numNAs
>  [1] 3 5 5 3 6 5 5 4 7 5
> > # remove rows with more than 5 NAs
> > x[!(numNAs > 5),]
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,]    1    1   NA    1   NA    1   NA    1    1     1
> [2,]    1    1    1   NA   NA   NA    1   NA   NA     1
> [3,]   NA   NA   NA    1   NA    1    1    1    1    NA
> [4,]   NA    1    1    1   NA    1    1    1    1    NA
> [5,]   NA    1    1   NA   NA    1    1   NA    1    NA
> [6,]   NA   NA    1   NA    1    1    1   NA   NA     1
> [7,]   NA   NA    1    1    1   NA   NA    1    1     1
> [8,]    1   NA    1    1   NA    1   NA   NA    1    NA
>
>
> On 7/28/06, John Morrow <[EMAIL PROTECTED]> wrote:
> >
> > Dear R-Helpers,
> >
> > I have a large data matrix (9707 rows, 60 columns), which contains
> > missing data. The matrix looks something like this:
> >
> > 1) X X X X X X NA X X X X X X X X X
> >
> > 2) NA NA NA NA X NA NA NA X NA NA
> >
> > 3) NA NA X NA NA NA NA NA NA NA
> >
> > 5) NA X NA X X X NA X X X X NA X
> >
> > ..
> >
> > 9708) X NA NA X NA NA X X NA NA X
> >
> > ...and so on. Notice that every row has a varying number of entries,
> > all rows have at least one entry, but some rows have too much missing
> > data. My goal is to filter out/remove rows that have ~5 (this number is
> > yet to be determined, but let's say it's 5) missing entries before I
> > run Pearson's to tell me the correlation between all of the rows. The
> > order of the columns does not matter here.
> > I think that I might need to test each row for a "data, at least one
> > NA, data" pattern?
> >
> > Is there some kind of way of doing this? I am at a loss for an easy way
> > to accomplish this. Any suggestions are most appreciated!
> >
> > John Morrow
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [email protected] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
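
For anyone reading this thread later, here is a minimal sketch that puts the two suggestions side by side and carries on to the correlation step John asked about. The small matrix, the `max_na` threshold, and the variable names are made up for illustration and are not from John's data; the `use = "pairwise.complete.obs"` choice in `cor()` is likewise an assumption about how the remaining NAs should be handled.

```r
# Hand-made example matrix (hypothetical values), so the result does not
# depend on the random number generator
x <- matrix(c(1.2,  NA,  NA,  NA,
              0.5, 2.1,  NA, 3.3,
               NA,  NA,  NA,  NA,
              1.0, 2.0, 3.0, 4.5),
            nrow = 4, byrow = TRUE)

max_na <- 2  # hypothetical cutoff: keep rows with at most this many NAs

# Count NAs per row, first with apply(), then with the vectorised rowSums()
num_na_apply <- apply(x, 1, function(z) sum(is.na(z)))
num_na_fast  <- rowSums(is.na(x))

# Both approaches give the same counts
stopifnot(all(num_na_apply == num_na_fast))

# Keep only rows at or under the cutoff; drop = FALSE keeps the result
# a matrix even if a single row survives
x_kept <- x[num_na_fast <= max_na, , drop = FALSE]

# Pearson correlation between the remaining rows: cor() works on columns,
# so transpose first; pairwise deletion is an assumption, not the only option
row_cor <- cor(t(x_kept), use = "pairwise.complete.obs")
```

Both versions count the same thing; `rowSums()` just avoids calling an R function once per row, which is where the saving on a 9707-row matrix would come from.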
