On 2/23/2008 6:09 AM, Chuck Cleland wrote:
> On 2/22/2008 8:01 PM, Robert Walters wrote:
>> R folks,
>> As an R novice, I struggle with the mystery of subsetting. Textbook
>> and online examples of this seem quite straightforward yet I cannot
>> get my mind around it. For practice, I'm using the code in MASS Ch. 6,
>> "whiteside data" to analyze a different data set with similar
>> variables and structure.
>> Here is my data frame:
>>
>> ###subset one of three cases for the variable 'position'
>> >data.b<-data.a[data.a$position=="inrow",]
>> > print(data.b)
>>   position porosity    x    y
>> 1    inrow    macro 1.40 16.5
>> 2    inrow    macro    .    .
>> .        .        .    .
>> .        .        .    .
>> 7    inrow    micro
>> 8    inrow    micro
>>
>> Now I want to do separate lm's for each case of porosity, macro and
>> micro. The code as given in MASS, p.141, slightly modified would be:
>>
>> fit1 <- lm(y ~ x, data=data.b, subset = porosity == "macro")
>> fit2 <- update(fit1, subset = porosity == "micro")
>>
>> ###simplest code with subscripting
>> fit1 <- lm(y ~ x, data.b[porosity=="macro"])
>
> Assuming data.b has two dimensions, you need a comma after
> porosity=="macro" to indicate that you are selecting a subset of rows of
> the data frame:
>
> fit1 <- lm(y ~ x, data.b[porosity=="macro",])

Actually, that should be:

fit1 <- lm(y ~ x, data.b[data.b$porosity=="macro",])

because [.data.frame needs to know where to find porosity. The index
expression is evaluated in the calling environment, so R won't look
inside of data.b unless you direct it to look there.

>> ###following example in ?subset
>> fit1 <- lm(y ~ x, data.b, subset(data.b, porosity, select=macro))
>
> The select argument to subset is meant to select variables (i.e., it
> indicates "columns to select from a data frame") and you are misusing it
> by specifying the level of a factor. If you make your call to subset by
> itself (a good idea when you are learning how a function works), you
> should get an error like this:
>
> > subset(whiteside, Insul, select=Before)
> Error in subset.data.frame(whiteside, Insul, select = Before) :
>   'subset' must evaluate to logical
>
> What I think you intended was this:
>
> subset(data.b, porosity == "macro")
>
> Even with the correct call to subset, you also don't want both data.b
> and the subset piece, because subset returns a data frame. In other
> words, you would be passing lm() two different data frames. So try this
> instead:
>
> fit1 <- lm(y ~ x, subset(data.b, porosity == "macro"))
>
>> None of the above, plus many permutations thereof, works.
>> Can anyone educate me?
>>
>> Thanks,
>>
>> Robert Walters
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
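
P.S. For anyone who wants to try the three approaches from this thread side by side, here is a minimal sketch. The data frame below is only a hypothetical stand-in for data.b (the real values were not posted, so the x and y numbers are invented), and the names fit_a, fit_b, fit_c and fit_micro are just labels for this example. Assuming the data really has this shape, all three calls should fit the same model:

```r
# Hypothetical stand-in for data.b; the x and y values are invented for
# illustration and are NOT the original poster's data.
data.b <- data.frame(
  position = "inrow",
  porosity = rep(c("macro", "micro"), each = 4),
  x = c(1.4, 2.1, 2.9, 3.6, 1.2, 2.0, 3.1, 3.9),
  y = c(16.5, 18.2, 20.1, 22.4, 12.3, 13.1, 14.8, 15.2)
)

# 1. The subset= argument of lm(): the condition is evaluated inside data.b,
#    so the bare column name works here.
fit_a <- lm(y ~ x, data = data.b, subset = porosity == "macro")

# 2. Row indexing with [ , ]: the column must be qualified as data.b$porosity,
#    and the trailing comma selects rows rather than columns.
fit_b <- lm(y ~ x, data = data.b[data.b$porosity == "macro", ])

# 3. subset() returns a data frame, so pass its result as the only data.
fit_c <- lm(y ~ x, data = subset(data.b, porosity == "macro"))

# The three fits should agree on the coefficients.
print(all.equal(coef(fit_a), coef(fit_b)))
print(all.equal(coef(fit_b), coef(fit_c)))

# update() refits the same model on the other level, as in the MASS example.
fit_micro <- update(fit_a, subset = porosity == "micro")
print(coef(fit_micro))
```

Which form to use is mostly taste. The subset= argument keeps the full data frame attached to the call, which is what makes the update() line work without repeating the formula.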