Hi

On 12 Feb 2007 at 10:42, Michael Rennie wrote:

Date sent:      Mon, 12 Feb 2007 10:42:21 -0500
To:             "Petr Pikal" <[EMAIL PROTECTED]>, r-help@stat.math.ethz.ch
From:           Michael Rennie <[EMAIL PROTECTED]>
Subject:        Re: [R] Trying to replicate error message in subset()

>
> Okay
>
> First- I apologise for my poor use of terminology (error vs. warning).
>
> Second- thanks for pointing out what I hadn't noticed before- when I
> pass one case for selection to subset, I get all observations. When I
> pass two cases (as a vector), I get every second case for both cases
> (if both are present, if not, I just get every second case for the one
> that is present). Same happens for three cases, as pointed out by Petr
> below.
>
> So, trying the %in% operator, I get slightly different behaviour, but
> the selection still seems dependent on the length of the vector given
> as a selector:
>
> > b<-c("D", "F", "A")
> > new2.dat<-subset(ex.dat, a%in%x1)
> > new2.dat
>            y1 x1 x2
> 1  2.34870479  A  B
> 3  1.66090055  A  B
> 5 -0.07904798  A  B
> 7  2.07053656  A  B
> 9  2.97980444  A  B
> .....
>
> Now, I just get every second observation, over all cases of x1.
> Probably doesn't get as far as A because F is not present?

Are you completely sure? I get

> table(ex.dat$x1)

 A  B  C  D  E
40 40 40 40 40
> table(new.dat$x1)

 A  B  C  D  E
40  0  0 40  0

so all cases for A and D, with a subset like this

a<-c("D", "F", "A")
new.dat<-subset(ex.dat, x1 %in% a)

and ex.dat constructed according to your example.

If I get something I do not expect:

1. I check that my data are what they should be.
2. I check that the search path and working directory do not contain
   objects with conflicting names.
3. If my functions are complicated, I look at how their parts really
   work.

If everything seems OK and the unexpected behaviour still occurs, I go
through the documentation and the help archives, and finally I seek
advice from the help list. I must say that this is a bit time consuming,
but I usually learn a lot from the mistakes I am able to resolve myself.

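The difference between the calls in this thread can be checked directly. What follows is a minimal sketch, not code from the original posts. It assumes ex.dat is rebuilt with 200 rows as in the example (here y1 and x1 are written out at full length instead of being recycled by cbind(), and set.seed() is added for reproducibility); the names keep_in, keep_rev and keep_eq are made up for illustration.

```r
# Rebuild the example data (assumption: same shape as ex.dat in the thread).
set.seed(1)  # added for reproducibility; not in the original post
ex.dat <- data.frame(
  y1 = rnorm(200, 2),
  x1 = factor(rep(1:5, each = 20, times = 2),
              labels = c("A", "B", "C", "D", "E")),
  x2 = factor(rep(1:2, each = 10, times = 10), labels = c("B", "D"))
)
a <- c("D", "F", "A")

# x1 %in% a: one TRUE/FALSE per row, TRUE wherever x1 is any value in a.
keep_in <- ex.dat$x1 %in% a
length(keep_in)                       # 200, one entry per row

new.dat <- subset(ex.dat, x1 %in% a)
table(new.dat$x1)                     # A and D: 40 rows each; B, C, E: 0

# Operands reversed: one TRUE/FALSE per element of a, so only length 3.
keep_rev <- a %in% ex.dat$x1
keep_rev                              # TRUE FALSE TRUE

# x1 == a compares element by element and recycles a along x1, so row i
# is only tested against one element of a. 200 is not a multiple of 3,
# hence the "longer object length" warning (suppressed here).
keep_eq <- suppressWarnings(ex.dat$x1 == a)
sum(keep_eq) < sum(keep_in)           # TRUE: == misses most A and D rows
```

With the operands reversed, the logical vector has only as many elements as a, and it is then recycled along the rows. A two-element TRUE, FALSE pattern would keep every second row, which is consistent with the output quoted above if a still held c("D", "F") from the first example (b was assigned, but a was used in the call). That same two-element a divides 200 evenly, which would also explain why x1 == a gave no warning in the original example.
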
HTH
Petr

>
> According to the documentation on subset(), the function works on rows
> of dataframes. I'm guessing this explains the behaviour I'm seeing-
> somehow, the length of the vector being passed as the subset argument
> is dictating which rows to evaluate. So, can anyone offer advice on
> how to select EVERY instance for multiple cases in a dataframe (i.e.,
> all cases of both A and D from ex.dat), or will subset always be tied
> to the length of the 'subset' argument when a vector is passed to it?
>
> Cheers,
>
> Mike
>
>
> At 02:46 AM 12/02/2007, Petr Pikal wrote:
> >Hi
> >
> >it is not an error, it is just a warning (a beeping tea kettle with
> >boiling water is also not an error :-) and it tells you pretty
> >explicitly what is wrong: see the length of your objects
> >
> > > a<-c("D", "F", "A")
> > > new.dat<-subset(ex.dat, x1 == a)
> >Warning messages:
> >1: longer object length
> >        is not a multiple of shorter object length in: is.na(e1) |
> >is.na(e2)
> >2: longer object length
> >        is not a multiple of shorter object length in:
> >`==.default`(x1, a)
> > > new.dat
> >            y1 x1 x2
> >3    0.5977786  A  B
> >6    2.5470739  A  B
> >9    0.9128595  A  B
> >12   1.0953531  A  D
> >15   2.4984470  A  D
> >18   1.7289529  A  D
> >61  -0.4848938  D  B
> >6
> >
> >you can do better with the %in% operator.
> >
> >HTH
> >Petr
> >
> >
> >
> >On 12 Feb 2007 at 1:51, Michael Rennie wrote:
> >
> >Date sent:      Mon, 12 Feb 2007 01:51:54 -0500
> >To:             r-help@stat.math.ethz.ch
> >From:           Michael Rennie <[EMAIL PROTECTED]>
> >Subject:        [R] Trying to replicate error message in
> >subset()
> >
> > >
> > > Hi, there
> > >
> > > I am trying to replicate an error message in subset() to see what
> > > it is that I'm doing wrong with the datasets I am trying to work
> > > with.
> > >
> > > Essentially, I am trying to pass a string vector to subset() in
> > > order to select a specific collection of cases (i.e., I have data
> > > for these cases in one table, and want to select data from another
> > > table that match up with the cases in the first table).
> > >
> > > The error I get is as follows:
> > >
> > > Warning messages:
> > > 1: longer object length
> > >         is not a multiple of shorter object length in: is.na(e1)
> > > | is.na(e2)
> > > 2: longer object length
> > >         is not a multiple of shorter object length in:
> > > `==.default`(LAKE, g)
> > >
> > > Here is an example case I've been working with (which works) that
> > > I've been trying to "break" such that I can get this error message
> > > to figure out what I am doing wrong in my case.
> > >
> > > y1<-rnorm(100, 2)
> > > x1<-rep(1:5, each=20)
> > > x2<-rep(1:2, each=10, times=10)
> > >
> > > ex.dat<-data.frame(cbind(y1,x1,x2))
> > >
> > >
> > > ex.dat$x1<-factor(ex.dat$x1, labels=c("A", "B", "C", "D", "E"))
> > > ex.dat$x2<-factor(ex.dat$x2, labels=c("B", "D"))
> > >
> > > a<-c("D", "F")
> > > a
> > >
> > > new.dat<-subset(ex.dat, x1 == a)
> > > new.dat
> > >
> > > I thought maybe I was getting errors because I had cases in my
> > > selection vector ('a' in this case) that weren't in my ex.dat
> > > list, but subset handles this fine and just gives me what it can
> > > find in the larger list.
> > >
> > > Any thoughts on how I can replicate the error? As far as I can
> > > tell, the only difference between the case where I am getting
> > > errors and the example above is that the levels of x1 in my case
> > > are words (i.e., "Smelly", "Howdy"), but strings are strings,
> > > aren't they?
> > >
> > > Mike
> > >
> > >
> > > Michael Rennie
> > > Ph.D. Candidate, University of Toronto at Mississauga
> > > 3359 Mississauga Rd. N.
> > > Mississauga, ON  L5L 1C6
> > > Ph: 905-828-5452  Fax: 905-828-3792
> > > www.utm.utoronto.ca/~w3rennie
> > >
> > > ______________________________________________
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html and provide commented,
> > > minimal, self-contained, reproducible code.
> >
> >Petr Pikal
> >[EMAIL PROTECTED]
>

Petr Pikal
[EMAIL PROTECTED]