Dear Brian,

I like the idea of providing support for raw polynomials in poly() and polym(), if only for pedagogical reasons.
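
To illustrate the distinction discussed in this thread, here is a minimal sketch. The simulated data and the object names terms_int, fit_raw and fit_orth are made up for illustration; the variable names x, z and y follow Kjetil's example. It shows that (x + z)^2 in an R formula expands only to main effects and the interaction, that raw squares have to be requested with I(), and that the orthogonal basis from polym() should give the same fitted surface as the raw quadratic terms.

```r
# Simulated data, purely for illustration (not from the thread).
set.seed(1)
dat <- data.frame(x = rnorm(50), z = rnorm(50))
dat$y <- 1 + dat$x + dat$z + dat$x^2 + rnorm(50)

# In an R formula, (x + z)^2 expands to main effects plus the
# two-way interaction; no squared terms are generated.
terms_int <- attr(terms(y ~ (x + z)^2), "term.labels")
print(terms_int)

# Raw squared terms must be protected with I() to get arithmetic powers.
fit_raw <- lm(y ~ x + z + I(x^2) + I(z^2) + x:z, data = dat)

# polym() builds an orthogonal basis for the full second-degree surface.
fit_orth <- lm(y ~ polym(x, z, degree = 2), data = dat)

# The two bases span the same column space, so the fitted values
# should agree up to numerical tolerance.
print(all.equal(fitted(fit_raw), fitted(fit_orth)))
```

The coefficients of the two fits differ, since they refer to different bases; only the fitted surface is the same, which is why raw polynomials may be easier to explain to students.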

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
> Sent: Monday, November 07, 2005 11:14 AM
> To: John Fox
> Cc: [email protected]; 'Kjetil Brinchmann halvorsen'
> Subject: RE: [R] OLS variables
>
> On Mon, 7 Nov 2005, John Fox wrote:
>
> > Dear Brian,
> >
> > I don't have a strong opinion, but R's interpretation seems more
> > consistent to me, and as Kjetil points out, one can use polym() to
> > specify a full-polynomial model. It occurs to me that ^ and ** could
> > be differentiated in model formulae to provide both.
>
> However, poly[m] only provide orthogonal polynomials, and I have from
> time to time considered extending them to provide raw polynomials too.
> Is that a better-supported idea?
>
> > Regards,
> >  John
> >
> >> -----Original Message-----
> >> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
> >> Sent: Monday, November 07, 2005 4:05 AM
> >> To: Kjetil Brinchmann halvorsen
> >> Cc: John Fox; [email protected]
> >> Subject: Re: [R] OLS variables
> >>
> >> On Sun, 6 Nov 2005, Kjetil Brinchmann halvorsen wrote:
> >>
> >>> John Fox wrote:
> >>>>
> >>>> I assume that you're using lm() to fit the model, and that you
> >>>> don't really want *all* of the interactions among 20 predictors:
> >>>> You'd need quite a lot of data to fit a model with 2^20 terms in
> >>>> it, and might have trouble interpreting the results.
> >>>>
> >>>> If you know which interactions you're looking for, then why not
> >>>> specify them directly, as in lm(y ~ x1*x2 + x3*x4*x5 + etc.)? On
> >>>> the other hand, if you want to include all interactions, say, up
> >>>> to three-way, and you've put the variables in a data frame, then
> >>>> lm(y ~ .^3, data=DataFrame) will do it.
> >>>
> >>> This is nice with factors, but with continuous variables, and need
> >>> of a response-surface type of model, will not do. For instance,
> >>> with variables x, y, z in data frame dat
> >>>    lm( y ~ (x+z)^2, data=dat )
> >>> gives a model with the terms x, z and x*z, not the square terms.
> >>> There is a need for a semi-automatic way to get these, for
> >>> instance, use poly() or polym() as in:
> >>>
> >>>    lm(y ~ polym(x,z,degree=2), data=dat)
> >>
> >> This is an R-S difference (FAQ 3.3.2). R's formula parser always
> >> takes x^2 = x whereas the S one does so only for factors. This
> >> makes sense if you interpret `interaction' strictly as in John's
> >> description - S chose to see an interaction of any two continuous
> >> variables as multiplication (something which puzzled me when I
> >> first encountered it, as it was not well documented back in 1991).
> >>
> >> I have often wondered if this difference was thought to be an
> >> improvement, or if it is just a different implementation of the
> >> Rogers-Wilkinson syntax.
> >> Should we consider changing it?
>
> --
> Brian D. Ripley,                  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
