>> From: "Stefan Th. Gries" <[EMAIL PROTECTED]> writes:

I have a problem with splitting up a data frame called ReVerb: I would like to extract all cases where SYNTAX=="Ditrans" from ReVerb, store that in a file, and then generate ReVerb again without these cases and factor levels. My problem is probably obvious from the following lines of code:

> ditrans<-which(SYNTAX=="Ditrans")
> ReVerb1<-ReVerb[-c(ditrans),]; dim(ReVerb1)
[1] 91532    16
# ok, so the 92713-91532=1181 cases where SYNTAX=="Ditrans" have been removed, but ...
> ReVerb1<-subset(ReVerb, SYNTAX!="Ditrans"); dim(ReVerb1)
[1] 91528    16
# ... so why don't I get 91532 again as the number of rows?
# Any ideas??

> From: Peter Dalgaard <[EMAIL PROTECTED]>
> The SYNTAX variable is not necessarily the same. Could you retry the first
> case with
> ditrans <- which(ReVerb$SYNTAX=="Ditrans")
> ?

The results were the same as with 'ditrans<-which(SYNTAX=="Ditrans")'.

> Otherwise, try doing a setdiff() on the rownames of the two discrepant
> results and see which are the four cases that differ.

This solved the issue: Using setdiff, I found that the cases that the second way with subset fails to include are NA's ... - I was not aware of how subset treats NA, sorry.

Thanks a lot,
STG
--
Stefan Th. Gries
----------------------------------------
Max Planck Inst. for Evol. Anthropology
http://people.freenet.de/Stefan_Th_Gries
----------------------------------------
______________________________________________
R-help mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
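
For anyone who runs into the same discrepancy: it can be reproduced on a small made-up data frame. The sketch below is only an illustration. The data frame df, its id column, and its six rows are invented here and stand in for ReVerb, which is not shown in this thread. It assumes that subset() keeps only rows where the condition is TRUE (so a row whose SYNTAX is NA is dropped), whereas which() simply skips NA, so negative indexing leaves those rows in.

```r
# Hypothetical stand-in for ReVerb: two "Ditrans" rows and two NA rows
df <- data.frame(
  SYNTAX = c("Ditrans", "Trans", NA, "Intrans", "Ditrans", NA),
  id     = 1:6,
  stringsAsFactors = TRUE
)

# Way 1: which() returns only the positions where the test is TRUE,
# so removing them by negative index keeps the NA rows
ditrans    <- which(df$SYNTAX == "Ditrans")
kept_index <- df[-ditrans, ]

# Way 2: subset() treats an NA condition like FALSE, so the NA rows vanish
kept_subset <- subset(df, SYNTAX != "Ditrans")

n_index  <- nrow(kept_index)
n_subset <- nrow(kept_subset)

# Row names present after way 1 but absent after way 2 (the NA cases)
missing_rows <- setdiff(rownames(kept_index), rownames(kept_subset))

# Keeping the NA rows explicitly makes subset() agree with way 1
kept_both <- subset(df, is.na(SYNTAX) | SYNTAX != "Ditrans")

# Drop the now-unused "Ditrans" factor level
kept_both <- droplevels(kept_both)
remaining_levels <- levels(kept_both$SYNTAX)
```

One caveat with way 1: if which() finds no match, ditrans is integer(0) and df[-ditrans, ] returns zero rows rather than the whole data frame, so a logical index or the explicit is.na() form with subset() may be the safer choice.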
