Dear r-devel list members, I've recently encountered the following problem using predict() with a model that has raw-polynomial terms. (Actually, I encountered the problem using model.frame(), but the source of the error is the same.) The problem is technical and concerns the design of poly(), which is why I'm sending this message to r-devel rather than r-help.
To illustrate: ------------ snip ------------- > mod.pres <- lm(prestige ~ log(income, 10) + poly(education, 3) + poly(women, 2), + data=Prestige) # Prestige data from car package > predict(mod.pres, newdata=data.frame(education=10, income=6000, women=30)) # works 1 39.66414 > model.frame(delete.response(terms(mod.pres)), data.frame(education=10, income=6000, women=30)) log(income, 10) poly(education, 3).1 poly(education, 3).2 poly(education, 3).3 poly(women, 2).1 poly(women, 2).2 1 3.778151 -0.02691558 -0.08720086 0.07199804 0.003202256 -0.138777837 > mod.pres.raw <- lm(prestige ~ log(income, 10) + poly(education, 3, raw=TRUE) + poly(women, 2, raw=TRUE), + data=Prestige) > predict(mod.pres.raw, newdata=data.frame(education=10, income=6000, women=30)) # doesn't work Error in poly(education, 3, raw = TRUE) : 'degree' must be less than number of unique points > model.frame(delete.response(terms(mod.pres.raw)), data.frame(education=10, income=6000, women=30)) Error in poly(education, 3, raw = TRUE) : 'degree' must be less than number of unique points ------------ snip ------------- I understand the source of the error, but what's unclear to me is why it's necessary for poly() to check the degree of the polynomial against the number of unique supplied points when raw=TRUE. For example, if I simply remove this check from poly(), then ------------ snip ------------- > predict(mod.pres.raw, newdata=data.frame(education=10, income=6000, women=30)) 1 39.66414 > model.frame(delete.response(terms(mod.pres.raw)), data.frame(education=10, income=6000, women=30)) log(income, 10) poly(education, 3, raw = TRUE).1 poly(education, 3, raw = TRUE).2 poly(education, 3, raw = TRUE).3 poly(women, 2, raw = TRUE).1 poly(women, 2, raw = TRUE).2 1 3.778151 10 100 1000 30 900 ------------ snip ------------- Of course, if one then used the modified poly() in a regression with fewer unique Xs than the degree of the polynomial, the model matrix would be singular; but why not just let the error appear at that stage? I could solve my problem by maintaining a local version of poly(), but why should it not be possible to get predictions under these circumstances? Best, John -------------------------------- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel