It means you have taken the response variable from one data frame (unmarried.male) and a predictor from another data frame (fieder.male), and the two data frames have different numbers of rows.
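
To see the mismatch for yourself, compare the row counts of the two data frames. Here is a minimal sketch on a small made-up data frame. The column names (never.married, educ, male) are hypothetical stand-ins for your columns 6, 7 and 8, since I am only guessing at what those columns hold.

```r
# Toy stand-in for the poster's data; the column names are hypothetical
toy <- data.frame(
  never.married = c(1, 0, 1, 0, 0, 1, 0, 1),  # plays the role of column 6
  educ          = c(1, 2, 3, 1, 2, 3, 1, 2),  # plays the role of column 7
  male          = c(1, 1, 1, 1, 1, 0, 0, 0)   # plays the role of column 8
)

# Two separate subsets, built the same way as in the original post
all.male       <- subset(toy, male == 1)
unmarried.male <- subset(toy, male == 1 & never.married == 1)

# The subsets have different numbers of rows
nrow(all.male)
nrow(unmarried.male)

# Mixing columns from both in one formula stops in model.frame();
# try() catches the error so that the script keeps running
res <- try(
  glm(factor(unmarried.male$never.married) ~ factor(all.male$educ),
      family = binomial),
  silent = TRUE
)
inherits(res, "try-error")
cat(res)
```

Any formula whose variables come from data frames with different row counts will stop in the same place, before any fitting is attempted.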
You might be better off if you used the names in the data frame rather than selecting columns in a form such as 'some.data.frame[, 3]'. This just confuses the issue and makes it very easy to make mistakes, as indeed you have done.

Also, to fit models on subsets of the data, you do not have to create separate data frames. See the 'subset' argument of glm, which is standard for most fitting functions. This is also a way to avoid problems and would have helped you here as well.

Bill Venables.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of gked
Sent: Monday, 14 March 2011 4:33 AM
To: r-help@r-project.org
Subject: [R] troubles with logistic regression

Hello everyone,

I am working on the dataset for my project in class and got stuck trying to run a logistic regression. Here is my code:

    data <- read.csv(file="C:/Users/fieder.data.2000.csv")

    # creating subset of men
    fieder.male <- subset(data, data[,8]==1)
    unmarried.male <- subset(data, data[,8]==1 & data[,6]==1)

    # glm fit
    agesq.male <- (unmarried.male[,5])^2
    male.sqrtincome <- sqrt(unmarried.male[,9])
    fieder.male.mar.glm <- glm(as.factor(unmarried.male[,6]) ~
        factor(fieder.male[,7]) + fieder.male[,5] + agesq.male +
        male.sqrtincome, binomial(link="logit"))

    par(mfrow=c(1,1))
    plot(c(0,300), c(0,1), pch=" ",
         xlab="sqrt income, truncated at 90000",
         ylab="modeled probability of being never-married")
    junk <- lowess(male.sqrtincome,
                   log(fieder.male.mar.glm$fitted.values/
                       (1-fieder.male.mar.glm$fitted.values)))
    lines(junk$x, exp(junk$y)/(1+exp(junk$y)))
    title(main="probability of never marrying\n males, by sqrt(income)")
    points(male.sqrtincome[unmarried.male==0],
           fieder.male.mar.glm$fitted.values[unmarried.male==0], pch=16)
    points(male.sqrtincome[unmarried.male==1],
           fieder.male.mar.glm$fitted.values[unmarried.male==1], pch=1)

The error says:

    Error in model.frame.default(formula = as.factor(unmarried.male[, 6]) ~ :
      variable lengths differ (found for 'factor(fieder.male[, 7])')

What does it
mean? Where am I making a mistake?

Thank you.

P.S. I am also attaching the data file in .csv format:
http://r.789695.n4.nabble.com/file/n3352356/fieder.data.2000.csv

--
View this message in context: http://r.789695.n4.nabble.com/troubles-with-logistic-regression-tp3352356p3352356.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
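
Returning to the advice at the top of this reply, here is a minimal sketch of the fit done with column names and the 'subset' argument of glm. The data are simulated and the column names (age, educ, male, income, never.married) are hypothetical, because I have not checked what the columns of fieder.data.2000.csv are actually called. Treat it as an outline under those assumptions rather than as the model you should fit.

```r
# Simulated stand-in for fieder.data.2000.csv; all column names are hypothetical
set.seed(42)
n <- 400
toy <- data.frame(
  age    = sample(18:70, n, replace = TRUE),
  educ   = factor(sample(1:3, n, replace = TRUE)),
  male   = rbinom(n, 1, 0.5),
  income = round(runif(n, 0, 90000))
)
# Simulated 0/1 response: the chance of never having married falls with age
toy$never.married <- rbinom(n, 1, plogis(2 - 0.05 * toy$age))

# One data frame, variables referred to by name, rows chosen with 'subset'.
# I(age^2) and sqrt(income) are computed inside the formula, so they are
# taken from exactly the same rows as the response.
fit <- glm(never.married ~ educ + age + I(age^2) + sqrt(income),
           family = binomial(link = "logit"),
           data = toy, subset = male == 1)

# The model frame has one row per male, so no lengths can differ
nrow(model.frame(fit))
sum(toy$male == 1)
summary(fit)
```

Because every term is looked up in the one data frame given as 'data', and 'subset' is applied to all of them together, there is no separate subset whose length could drift out of step with the others.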