It means you have selected a response variable from one data frame 
(unmarried.male) and a predictor from another data frame (fieder.male) and they 
have different lengths.  
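
A minimal sketch of how this error arises, using made-up data (the names 'd', 'small', 'y' and 'x' are purely illustrative and are not from your file):

```r
# Toy data; these column names are invented for illustration only.
set.seed(1)
d <- data.frame(y = rbinom(20, 1, 0.5), x = rnorm(20))
small <- d[1:8, ]   # a subset with fewer rows than 'd'

# The response comes from 'small' (8 values) and the predictor from 'd'
# (20 values), so building the model frame fails before any fitting is done.
res <- try(glm(small$y ~ d$x, family = binomial), silent = TRUE)

c(response = length(small$y), predictor = length(d$x))
inherits(res, "try-error")
```

Every variable in the formula has to supply one value per observation, which it cannot do when the variables are pulled from data frames with different numbers of rows.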

You might be better off if you used the names in the data frame rather than 
selecting columns in a form such as 'some.data.frame[, 3]'.  This just confuses 
the issue and makes it very easy to make mistakes - as indeed you have done.

Also, to fit models on subsets of the data, you do not have to create separate 
data frames.  See the 'subset' argument of glm, which is standard for most 
fitting functions.  This is also a way to avoid problems and would have helped 
you here as well.
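
As a rough sketch only, something along these lines is what I have in mind. The data are simulated, and the column names ('sex', 'unmarried', 'educ', 'age', 'income') and their codings are assumptions, since I do not know what the columns of your file are actually called:

```r
# Simulated stand-in for the real file; all column names are hypothetical.
set.seed(42)
n <- 200
dat <- data.frame(
  sex       = rbinom(n, 1, 0.5),                 # 1 = male (assumed coding)
  unmarried = rbinom(n, 1, 0.4),                 # 1 = never married (assumed)
  educ      = factor(sample(1:3, n, replace = TRUE)),
  age       = sample(18:70, n, replace = TRUE),
  income    = runif(n, 0, 90000)
)

# One data frame, variables referred to by name, men selected via 'subset'.
fit <- glm(unmarried ~ educ + age + I(age^2) + sqrt(income),
           family = binomial(link = "logit"),
           data = dat, subset = sex == 1)

# The number of observations used equals the number of men in the data.
nobs(fit) == sum(dat$sex == 1)
```

Because every term is looked up in the same data frame and the same subset, the variables cannot end up with different lengths.  Note that the sketch fits the model to all men: the response needs both outcomes present, which would not be the case within a subset chosen on the response itself.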

Bill Venables.
 

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of gked
Sent: Monday, 14 March 2011 4:33 AM
To: r-help@r-project.org
Subject: [R] troubles with logistic regression

Hello everyone,
I am working on the dataset for my class project and got stuck trying to
run a logistic regression. Here is my code:
data <- read.csv(file="C:/Users/fieder.data.2000.csv")

# creating subset of men 
fieder.male<-subset(data,data[,8]==1)
unmarried.male<-subset(data,data[,8]==1&data[,6]==1)

# glm fit
agesq.male<-(unmarried.male[,5])^2
male.sqrtincome<-sqrt(unmarried.male[,9])

fieder.male.mar.glm<-glm(as.factor(unmarried.male[,6])~
 factor(fieder.male[,7])+fieder.male[,5]+agesq.male+
  male.sqrtincome,binomial(link="logit") )
par(mfrow=c(1,1))
plot(c(0,300),c(0,1),pch=" ",
   xlab="sqrt income, truncated at 90000",
   ylab="modeled probability of being never-married")
junk<- lowess(male.sqrtincome,
  log(fieder.male.mar.glm$fitted.values/
  (1-fieder.male.mar.glm$fitted.values)))
  lines(junk$x,exp(junk$y)/(1+exp(junk$y)))
title(main="probability of never marrying\n males, by sqrt(income)")
points(male.sqrtincome[unmarried.male==0],
  fieder.male.mar.glm$fitted.values[unmarried.male==0],pch=16)
points(male.sqrtincome[unmarried.male==1],
  fieder.male.mar.glm$fitted.values[unmarried.male==1],pch=1)

The error says: 
Error in model.frame.default(formula = as.factor(unmarried.male[, 6]) ~  : 
  variable lengths differ (found for 'factor(fieder.male[, 7])')
 
What does it mean? Where am I making a mistake?
Thank you
P.S. I am also attaching the data file in .csv format
http://r.789695.n4.nabble.com/file/n3352356/fieder.data.2000.csv
fieder.data.2000.csv 

--
View this message in context: 
http://r.789695.n4.nabble.com/troubles-with-logistic-regression-tp3352356p3352356.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
