I would like to predict a new response from a fitted linear model where the new data is a single case with a missing value. My reading of the help on predict() is inconclusive on whether this is possible.
Leaving out the missing value or setting it to NA both fail, but differently; see the example code below.

> y <- runif(50)
> x1 <- rnorm(50)
> x2 <- rnorm(50)
> dat <- data.frame(y, x1, x2)
> mod <- lm(y~.,data=dat)
> summary(mod)

Call:
lm(formula = y ~ ., data = dat)

Residuals:
     Min       1Q   Median       3Q      Max
-0.50467 -0.28997  0.01457  0.27970  0.47791

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.50098    0.04577  10.945  1.6e-14 ***
x1          -0.01762    0.04172  -0.422    0.675
x2          -0.02753    0.04920  -0.560    0.578
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3177 on 47 degrees of freedom
Multiple R-squared:  0.009301,  Adjusted R-squared:  -0.03286
F-statistic: 0.2206 on 2 and 47 DF,  p-value: 0.8028

> predict(mod, newdata=data.frame(x1=0.1, x2=0.3)) #OK as expected
        1
0.4909624
> predict(mod, newdata=data.frame(x1=0.1)) # x2 missing
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  variable lengths differ (found for 'x2')
In addition: Warning message:
'newdata' had 1 row but variables found have 50 rows
> predict(mod, newdata=data.frame(x1=0.1, x2=NA)) #x2=NA
Error: variable 'x2' was fitted with type "numeric" but type "logical" was supplied

Thanks,
Chris
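
One possible reading of the two errors, offered as a sketch rather than a confirmed answer: a bare NA in R has type logical, which would explain the type complaint, and when x2 is absent from newdata, model.frame appears to pick up the 50-value x2 from the workspace instead. Assuming that is right, passing NA_real_ should get past the type check, though the prediction itself then comes back as NA. Getting a number without x2 seems to need a model that does not use x2. The names new_case, pred_na, mod_x1 and pred_x1 below are hypothetical, introduced only for this illustration, and set.seed(1) is added so the sketch is reproducible.

```r
# Rebuild the example from the post with a fixed seed (the seed is an
# assumption added here; the original post did not set one).
set.seed(1)
y  <- runif(50)
x1 <- rnorm(50)
x2 <- rnorm(50)
dat <- data.frame(y, x1, x2)
mod <- lm(y ~ ., data = dat)

# NA_real_ is a numeric NA, so the column type matches the fitted model.
new_case <- data.frame(x1 = 0.1, x2 = NA_real_)

# predict.lm defaults to na.action = na.pass, so the row is kept and the
# prediction is NA rather than an error.
pred_na <- predict(mod, newdata = new_case)
print(pred_na)

# To get a numeric prediction without x2, fit a model that omits x2.
# mod_x1 is a hypothetical reduced model, not part of the original post.
mod_x1  <- lm(y ~ x1, data = dat)
pred_x1 <- predict(mod_x1, newdata = new_case)
print(pred_x1)
```

Whether the reduced model is the right substitute depends on why x2 is missing; it simply ignores x2 rather than imputing it.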