On Thu, 27 Mar 2003 [EMAIL PROTECTED] wrote:

> <Bravington wrote:>
>
> #> `predict' complains about new factor levels, even if the "new" levels are
> #> merely levels in the original that didn't occur in the original fit and were
> #> sensibly dropped, and that don't occur in the prediction data either.
>
> <Ripley replied:>
>
> #This is intentional. The coding for factors is based on the full set of
> #levels, and should be comparable for different prediction sets.
> #
> #If you are using factors with fictitious levels the fix is obvious:
> #improve the design.
>
> There is still an inconsistency bug between `lm' and `predict.lm', though.
> `lm' intentionally overlooks inactive levels of a factor, but `predict.lm'

Only if an argument is set, and originally lm did not do so.

> doesn't, even when it legitimately could. In particular, it is a bit odd to
> have no problem predicting without a `newdata' argument even when the
> original data had inactive factor levels, but then to get an error if
> `newdata=<<original data>>' is supplied explicitly! (See example.)

Read again. predict.lm is consistent across its inputs: unlike lm it can
take variable `newdata'. As I said the intention is to be consistent
across *prediction sets*. Omitting newdata is not giving a prediction set.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel