Hi,
I have a data set of 2211 rows with 31 variables each, and I manually split it into 10 folds for cross-validation. I build a logistic regression model as follows:

> model <- glm(qual ~ AgGr + FaHx + PrHx + PrSr + PaLp + SvD + IndExam + Rad +
      BrDn + BRDS + PrinFin + SkRtr + NpRtr + SkThck + TrThkc + SkLes +
      AxAdnp + ArcDst + MaDen + CaDt + MaMG + MaMrp + MaSh + SCTub + SCFoc +
      MaSz, family = binomial(link = logit))

where the variables are taken from trainSet, which is 1989 x 31. The test set, testSet, is 222 x 31.

My question: when I try to predict on the test set, I get the following error:

> predict.glm(model, testSet, type = "response")
Error in drop(X[, piv, drop = FALSE] %*% beta[piv]) : subscript out of bounds

Prediction works fine on trainSet, so the problem is something about testSet. I also noticed that some independent variables, for example MaSz, take 3 distinct values in trainSet but 4 in testSet. I am not sure whether this is the cause. If it is, what would be the remedy?

Since I can retrieve the coefficients of the logistic regression, I could manually calculate the response for each row in testSet. That would solve my problem, but it is burdensome, and I don't know an easy way of doing it because my model has 80+ coefficients.

Could somebody advise?

Thanks,
M
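
A minimal sketch of how the suspected level mismatch could be checked and worked around. It runs on made-up toy data, not the real data: the two-column trainSet and testSet below and the names unseen, ok, testOK and p are illustrative assumptions. It also assumes that MaSz is stored as a factor and that the model is fitted with data = trainSet, so that predict() can rebuild the design matrix from newdata.

```r
# Toy stand-ins for trainSet / testSet (hypothetical data, one predictor only).
set.seed(1)
trainSet <- data.frame(
  qual = rbinom(200, 1, 0.5),
  MaSz = factor(rep(c("a", "b", "c"), length.out = 200))       # 3 levels
)
testSet <- data.frame(
  qual = rbinom(20, 1, 0.5),
  MaSz = factor(rep(c("a", "b", "c", "d"), length.out = 20))   # 4 levels
)

# Levels that occur in the test set but were never seen in training.
unseen <- setdiff(levels(testSet$MaSz), levels(trainSet$MaSz))
print(unseen)

# Fit with data = trainSet instead of relying on variables found in the workspace.
model <- glm(qual ~ MaSz, family = binomial(link = "logit"), data = trainSet)

# The model has no coefficient for an unseen level, so those rows cannot be
# scored. Keep the remaining rows and re-code the factor with training levels.
ok <- testSet$MaSz %in% levels(trainSet$MaSz)
testOK <- testSet[ok, , drop = FALSE]
testOK$MaSz <- factor(testOK$MaSz, levels = levels(trainSet$MaSz))

# Predicted probabilities for the rows that can be scored.
p <- predict(model, newdata = testOK, type = "response")
```

Dropping rows is only one option. If the unseen level matters, the folds could instead be built so that every level appears in each training set, or rare levels could be merged before fitting.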