Re: [R] Translating lm.object to SQL, C, etc function

John Fox Fri, 14 Feb 2003 04:54:31 -0800

Dear Jeremy,

I've written replacements for the standard R contrast functions that produce the kind of more easily parsed (and more readable) contrast names that I think you have in mind. I intend to include these in the next release of the car package for R but haven't done so yet. Since the code isn't very long, I've appended it (and the .Rd documentation file to this note). Note that R does separate terms in an interaction with a colon.

I hope that this does what you need.
John

---------------------------- Contrasts.R -----------------------------

# last modified 2 Dec 2002 by J. Fox
# all of these functions are adapted from functions in the R base package

contr.Treatment <- function (n, base = 1, contrasts = TRUE) {
if (is.numeric(n) && length(n) == 1)
levs <- 1:n
else {
levs <- n
n <- length(n)
}
lev.opt <- getOption("decorate.contrasts")
pre <- if (is.null(lev.opt)) "[" else lev.opt[1]
suf <- if (is.null(lev.opt)) "]" else lev.opt[2]
dec <- getOption("decorate.contr.Treatment")
dec <- if (!contrasts) ""
else if (is.null(dec)) "T."
else dec
contr.names <- paste(pre, dec, levs, suf, sep="")
contr <- array(0, c(n, n), list(levs, contr.names))
diag(contr) <- 1
if (contrasts) {
if (n < 2)
stop(paste("Contrasts not defined for", n - 1, "degrees of freedom"))
if (base < 1 | base > n)
stop("Baseline group number out of range")
contr <- contr[, -base, drop = FALSE]
}
contr
}

contr.Sum <- function (n, contrasts = TRUE)
{
if (length(n) <= 1) {
if (is.numeric(n) && length(n) == 1 && n > 1)
levels <- 1:n
else stop("Not enough degrees of freedom to define contrasts")
}
else levels <- n
lenglev <- length(levels)
lev.opt <- getOption("decorate.contrasts")
pre <- if (is.null(lev.opt)) "[" else lev.opt[1]
suf <- if (is.null(lev.opt)) "]" else lev.opt[2]
dec <- getOption("decorate.contr.Sum")
dec <- if (!contrasts) ""
else if (is.null(dec)) "S."
else dec
show.lev <- getOption("contr.Sum.show.levels")
contr.names <- if ((is.null(show.lev)) || show.lev) paste(pre, dec, levels, suf, sep="")
if (contrasts) {
cont <- array(0, c(lenglev, lenglev - 1), list(levels,
contr.names[-lenglev]))
cont[col(cont) == row(cont)] <- 1
cont[lenglev, ] <- -1
}
else {
cont <- array(0, c(lenglev, lenglev), list(levels,
contr.names))
cont[col(cont) == row(cont)] <- 1
}
cont
}

contr.Helmert <- function (n, contrasts = TRUE)
{
if (length(n) <= 1) {
if (is.numeric(n) && length(n) == 1 && n > 1)
levels <- 1:n
else stop("contrasts are not defined for 0 degrees of freedom")
}
else levels <- n
lenglev <- length(levels)
lev.opt <- getOption("decorate.contrasts")
pre <- if (is.null(lev.opt)) "[" else lev.opt[1]
suf <- if (is.null(lev.opt)) "]" else lev.opt[2]
dec <- getOption("decorate.contr.Helmert")
dec <- if (!contrasts) ""
else if (is.null(dec)) "H."
else dec
nms <- if (contrasts) 1:lenglev else levels
contr.names <- paste(pre, dec, nms, suf, sep="")
if (contrasts) {
cont <- array(-1, c(lenglev, lenglev - 1), list(levels,
contr.names[-lenglev]))
cont[col(cont) <= row(cont) - 2] <- 0
cont[col(cont) == row(cont) - 1] <- 1:(lenglev - 1)
}
else {
cont <- array(0, c(lenglev, lenglev), list(levels, contr.names))
cont[col(cont) == row(cont)] <- 1
}
cont
}

------------------------------- Contrasts.Rd ------------------------------------------

\name{Contrasts}
\alias{Contrasts}
\alias{contr.Treatment}
\alias{contr.Sum}
\alias{contr.Helmert}

\title{Functions to Construct Contrasts}
\description{
These are substitutes for similarly named functions in the base package
(note the uppercase letter starting the second word in each function name).
The only difference is that the contrast functions from the car package
produce easier-to-read names for the contrasts when they are used in statistical models.

The functions and this documentation are adapted from the base package.
}

\usage{
contr.Treatment(n, base = 1, contrasts = TRUE)

contr.Sum(n, contrasts = TRUE)

contr.Helmert(n, contrasts = TRUE)
}

\arguments{
\item{n}{a vector of levels for a factor, or the number of levels.}
\item{base}{an integer specifying which level is considered the baseline level.
Ignored if \code{contrasts} is \code{FALSE}.}
\item{contrasts}{a logical indicating whether contrasts should be computed.}
}

\details{
These functions are used for creating contrast matrices for use in fitting analysis of variance and regression models.
The columns of the resulting matrices contain contrasts which can be used for coding a factor with \code{n} levels.
The returned value contains the computed contrasts. If the argument \code{contrasts} is \code{FALSE} then a square matrix is returned.

Several aspects of these contrast functions are controlled by options set via the \code{options} command:
\describe{
\item{\code{decorate.contrasts}}{This option should be set to a 2-element character vector containing the prefix and suffix
characters to surround contrast names. If the option is not set, then \code{c("[", "]")} is used. For example, setting
\code{options(decorate.contrasts=c(".", ""))} produces contrast names that are separated from factor names by a period.
Setting \code{options(decorate.contrasts=c("", ""))} reproduces the behaviour of the R base contrast functions.}
\item{\code{decorate.contr.Treatment}}{A character string to be appended to contrast names to signify treatment contrasts;
if the option is unset, then \code{"T."} is used.}
\item{\code{decorate.contr.Sum}}{Similar to the above, with default \code{"S."}.}
\item{\code{decorate.contr.Helmert}}{Similar to the above, with default \code{"H."}.}
\item{\code{contr.Sum.show.levels}}{Logical value: if \code{TRUE} (the default if unset),
then level names are used for contrasts; if \code{FALSE}, then numbers are used, as in \code{contr.sum}
in the \code{base} package.}
}

Note that there is no replacement for \code{contr.poly} in the \code{base} package (which produces
orthogonal-polynomial contrasts) since this function already constructs easy-to-read contrast names.
}

\value{
A matrix with \code{n} rows and \code{k} columns, with \code{k = n - 1} if \code{contrasts} is \code{TRUE}
and \code{k = n} if \code{contrasts} is \code{FALSE}.
}

\author{John Fox \email{[EMAIL PROTECTED]}}

\seealso{\code{\link[base]{contr.treatment}}, \code{\link[base]{contr.sum}},
\code{\link[base]{contr.helmert}}, \code{\link[base]{contr.poly}} }

\examples{
# contr.Treatment vs. contr.treatment in the base package:

data(Prestige)
lm(prestige ~ (income + education)*type, data=Prestige,
contrasts=list(type="contr.Treatment"))

## Call:
## lm(formula = prestige ~ (income + education) * type, data = Prestige,
## contrasts = list(type = "contr.Treatment"))
##
## Coefficients:
## (Intercept) income education
## 2.275753 0.003522 1.713275
## type[T.prof] type[T.wc] income:type[T.prof]
## 15.351896 -33.536652 -0.002903
## income:type[T.wc] education:type[T.prof] education:type[T.wc]
## -0.002072 1.387809 4.290875

lm(prestige ~ (income + education)*type, data=Prestige,
contrasts=list(type="contr.treatment"))

## Call:
## lm(formula = prestige ~ (income + education) * type, data = Prestige,
## contrasts = list(type = "contr.treatment"))
##
## Coefficients:
## (Intercept) income education
## 2.275753 0.003522 1.713275
## typeprof typewc income:typeprof
## 15.351896 -33.536652 -0.002903
## income:typewc education:typeprof education:typewc
## -0.002072 1.387809 4.290875
}

\keyword{models}
\keyword{regression}

-------------------------------------------------------------------------------------------------------------------------------------------------

At 04:37 PM 2/14/2003 +1100, [EMAIL PROTECTED] wrote:

This is my first post to this list so I suppose a quick intro is in
order. I've been using SPLUS 2000 and R1.6.2 for just a couple of days,
and love S already. I'm reading MASS and also John Fox's book - both have
been very useful. My background in stat software was mainly SPSS (which
I've never much liked - thanks heavens I've found S!), and Perl is my
tool of choice for general-purpose programming (I chaired the
perl6-language-data working group, responsible for improving the data
analysis capabilities in Perl).


I have just completed my first S project, and I now have 8 lm.objects.
The models are all reasonably complex with multiple numeric and factor
variables and some 2-way and 3-way interactions. I now need to use these
models in other environments, such as C code, SQL functions (using CASE)
and in Perl - I can not work out how to do this.

The difficulty I am having is that the output of coef() is not really
parsable, since there is no marker in the name of an coefficient of
separate out the components. For instance, in SPSS the name of a
coefficient might be:

  var1=[a]*var2=[b]*var3

...which is easy to write a little script to pull that apart and turn it
into a line of SQL, C, or whatever. In S however the name looks like:

  var1avar2bvar3

...which provides no way to pull the bits apart.

So my question is, how do I export an lm.object in some form that I can
then apply to prediction in C, SQL, or some other language? All I'm
looking for is some well-structured textual or data frame output that I
can then manipulate with appropriate tools, whether it be S itself, or
something like Perl.

Thanks in advance for any suggestions (and apologies in advance if this
is well documented somewhere!),

______________________________________________
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help

Re: [R] Translating lm.object to SQL, C, etc function

Reply via email to