Re: [R] obtaining p-values for lm.ridge() coefficients (package 'MASS')

2011-08-24 Thread Liviu Andronic
Dear Michael
Thank you for your pointers.


On Tue, Aug 23, 2011 at 4:05 PM, Michael Friendly frien...@yorku.ca wrote:
> First, you should be using rms::ols, as Design is old.

Good to know. I've always wondered why Design and rms, in many cases,
were providing similar functions and (upon cursory inspection)
identical documentation.


> Second, penalty in ols() is not the same as the ridge constant in lm.ridge,
> but rather a penalty on the log likelihood.  The documentation
> for ols refers to ?lrm for the description.

I see. At the same time ?rms::ols contains: "The penalized maximum
likelihood estimate (penalized least squares or ridge estimate) of
beta is (X'X + P)^{-1} X'Y", while ?rms::lrm defines P as penalty
\times diag(pf) \times penalty.matrix \times diag(pf), where pf is the
vector of square roots of the penalty factors computed from penalty.
This is suspiciously similar to the classical Ridge Regression estimate
of beta, (X'X + kI)^{-1} X'Y, where k is the lambda constant and I the
identity matrix.

Hence I was wondering whether rms::ols could be forced to use 'P = kI',
and thus compute the classical Ridge estimates. I tried to hack rms::ols
to ensure that it passes 'kI' to lm.pfit(..., penalty.matrix = ...),
but I get strange results, so I must be missing something. (See
attached.)
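
(For reference, below is a quick sketch of what I mean by the classical
estimate; the variable names are just mine. It should reproduce
coef(lm.ridge()) once the predictors are centered and scaled the way
lm.ridge does it, i.e. with the n divisor.)

require(MASS)
data(longley)
k  <- 0.03
X  <- as.matrix(longley[, setdiff(names(longley), "Employed")])
Y  <- longley$Employed
n  <- nrow(X)
Xm <- colMeans(X); Ym <- mean(Y)
Xc <- sweep(X, 2, Xm)                  # center the predictors
sc <- sqrt(colSums(Xc^2) / n)          # lm.ridge-style scaling (divisor n)
Xs <- sweep(Xc, 2, sc, "/")
b  <- solve(crossprod(Xs) + k * diag(ncol(Xs)), crossprod(Xs, Y - Ym))
b  <- drop(b) / sc                     # back to the original units
c(Intercept = Ym - sum(b * Xm), b)
## should agree with coef(lm.ridge(Employed ~ ., longley, lambda = 0.03))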


> Third, other ridge estimators (e.g., ElemStatLearn::simple.ridge) also
> give slightly different results for the same data due to differences in the
> way scaling is handled.

And ?ElemStatLearn::prostate points to several other 'ridge'-related functions.

However, similar to MASS::lm.ridge, simple.ridge() roughly stops at
estimating the betas, hence my question: is the purpose of Ridge
Regression to stabilize the coefficients (when the design matrix is
collinear) and use the results to infer the relative importance of the
predictors for the response, while keeping the unpenalized model for
prediction purposes? Or am I missing something?
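
(As a concrete example of the stabilizing I have in mind, here is a
quick sketch, nothing authoritative, comparing the unpenalized and the
ridge coefficients on the collinear longley data:)

require(MASS)
data(longley)
ols.coef   <- coef(lm(Employed ~ ., data = longley))
ridge.coef <- coef(lm.ridge(Employed ~ ., data = longley, lambda = 0.03))
round(cbind(OLS = ols.coef, ridge = ridge.coef), 4)
kappa(model.matrix(Employed ~ ., longley))  # large condition number, i.e. severe collinearity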

Thank you
Liviu
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] obtaining p-values for lm.ridge() coefficients (package 'MASS')

2011-08-24 Thread Liviu Andronic
The attachment seems to have been dropped, so I'm pasting the code
below. Sorry for that.
Liviu


On Wed, Aug 24, 2011 at 1:44 PM, Liviu Andronic landronim...@gmail.com wrote:
>> Second, penalty in ols() is not the same as the ridge constant in lm.ridge,
>> but rather a penalty on the log likelihood.  The documentation
>> for ols refers to ?lrm for the description.

> I see. At the same time ?rms::ols contains: "The penalized maximum
> likelihood estimate (penalized least squares or ridge estimate) of
> beta is (X'X + P)^{-1} X'Y", while ?rms::lrm defines P as penalty
> \times diag(pf) \times penalty.matrix \times diag(pf), where pf is the
> vector of square roots of the penalty factors computed from penalty.
> This is suspiciously similar to the classical Ridge Regression estimate
> of beta, (X'X + kI)^{-1} X'Y, where k is the lambda constant and I the
> identity matrix.
>
> Hence I was wondering whether rms::ols could be forced to use 'P = kI',
> and thus compute the classical Ridge estimates. I tried to hack rms::ols
> to ensure that it passes 'kI' to lm.pfit(..., penalty.matrix = ...),
> but I get strange results, so I must be missing something. (See
> attached.)


ols.ridge <-
function (formula, data, weights, subset, na.action = na.delete,
    method = "qr", model = FALSE, x = FALSE, y = FALSE, se.fit = FALSE,
    linear.predictors = TRUE, penalty = 0, penalty.matrix, tol = 1e-07,
    sigma = NULL, var.penalty = c("simple", "sandwich"), ridge = TRUE, ...)
{
    call <- match.call()
    var.penalty <- match.arg(var.penalty)
    m <- match.call(expand = FALSE)
    mc <- match(c("formula", "data", "subset", "weights", "na.action"),
        names(m), 0)
    m <- m[c(1, mc)]
    m$na.action <- na.action
    m$drop.unused.levels <- TRUE
    m[[1]] <- as.name("model.frame")
    if (ridge) lambda <- penalty  # the hack: keep 'penalty' as the ridge constant k
    if (length(attr(terms(formula), "term.labels"))) {
        dul <- .Options$drop.unused.levels
        if (!length(dul) || dul) {
            on.exit(options(drop.unused.levels = dul))
            options(drop.unused.levels = FALSE)
        }
        X <- Design(eval.parent(m))
        options(drop.unused.levels = dul)
        atrx <- attributes(X)
        atr <- atrx$Design
        nact <- atrx$na.action
        Terms <- atrx$terms
        assig <- DesignAssign(atr, 1, Terms)
        penpres <- FALSE
        if (!missing(penalty) && any(unlist(penalty) != 0))
            penpres <- TRUE
        if (!missing(penalty.matrix) && any(penalty.matrix != 0))
            penpres <- TRUE
        if (penpres && missing(var.penalty))
            warning("default for var.penalty has changed to \"simple\"")
        if (method == "model.frame")
            return(X)
        scale <- as.character(formula[2])
        attr(Terms, "formula") <- formula
        weights <- model.extract(X, weights)
        if (length(weights) && penpres)
            stop("may not specify penalty with weights")
        Y <- model.extract(X, response)
        n <- length(Y)
        if (model)
            m <- X
        X <- model.matrix(Terms, X)
        if (length(atr$colnames))
            dimnames(X)[[2]] <- c("Intercept", atr$colnames)
        else dimnames(X)[[2]] <- c("Intercept", dimnames(X)[[2]][-1])
        if (method == "model.matrix")
            return(X)
    }
    else {
        if (length(weights))
            stop("weights not implemented when no covariables are present")
        assig <- NULL
        yy <- attr(terms(formula), "variables")[1]
        Y <- eval(yy, sys.parent(2))
        nmiss <- sum(is.na(Y))
        if (nmiss == 0)
            nmiss <- NULL
        else names(nmiss) <- as.character(yy)
        Y <- Y[!is.na(Y)]
        yest <- mean(Y)
        coef <- yest
        n <- length(Y)
        if (!length(sigma))
            sigma <- sqrt(sum((Y - yest)^2)/(n - 1))
        cov <- matrix(sigma * sigma/n, nrow = 1, ncol = 1,
            dimnames = list("Intercept", "Intercept"))
        fit <- list(coefficients = coef, var = cov, non.slopes = 1,
            fail = FALSE, residuals = Y - yest, df.residual = n - 1,
            intercept = TRUE)
        if (linear.predictors) {
            fit$linear.predictors <- rep(yest, n)
            names(fit$linear.predictors) <- names(Y)
        }
        if (model)
            fit$model <- m
        if (x)
            fit$x <- matrix(1, ncol = 1, nrow = n,
                dimnames = list(NULL, "Intercept"))
        if (y)
            fit$y <- Y
        fit$fitFunction <- c("ols", "lm")
        oldClass(fit) <- c("ols", "rms", "lm")
        return(fit)
    }
    if (!penpres) {
        fit <- if (length(weights))
            lm.wfit(X, Y, weights, method = method, ...)
        else lm.fit(X, Y, method = method, ...)
        cov.unscaled <- chol2inv(fit$qr$qr)
        r <- fit$residuals
        yhat <- Y - r
        if (length(weights)) {
            sse <- sum(weights * r^2)
            m <- sum(weights * yhat/sum(weights))
            ssr <- sum(weights * (yhat - m)^2)
            r2 <- ssr/(ssr + sse)
            if (!length(sigma))
                sigma <- sqrt(sse/fit$df.residual)

[R] obtaining p-values for lm.ridge() coefficients (package 'MASS')

2011-08-23 Thread Liviu Andronic
Dear all
I'm familiarising myself with Ridge Regressions in R and the following
is bugging me: How does one get p-values for the coefficients obtained
from MASS::lm.ridge() output (for a given lambda)? Consider the
example below (adapted from PRA [1]):

> require(MASS)
> data(longley)
> gr <- lm.ridge(Employed ~ ., longley, lambda = seq(0, 0.1, 0.001))
> plot(gr)
> select(gr)
modified HKB estimator is 0.004275
modified L-W estimator is 0.0323
smallest value of GCV  at 0.003
> ## let's choose 0.03 for lambda
> coef(gr)[gr$lam == 0.03, ]
             GNP.deflator          GNP   Unemployed Armed.Forces   Population         Year
-1620.429355     0.021060     0.007994    -0.013146    -0.007752    -0.101880     0.869116


But how does one obtain the customary 'lm' summary information for the
model above? I tried supplying the chosen lambda to Design::ols()
using its 'penalty' argument, but strangely the results differ. See
below.

> require(Design)
> Design::ols(Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
+     Population + Year, data = longley, penalty = 0.03, tol = 1e-12)

Linear Regression Model

Design::ols(formula = Employed ~ GNP.deflator + GNP + Unemployed +
Armed.Forces + Population + Year, data = longley, penalty = 0.03,
tol = 1e-12)

 n Model L.R.   d.f. R2  Sigma
16  78.22   7.05 0.9929 0.3438

Residuals:
 Min   1Q   Median   3Q  Max
-0.42480 -0.17774 -0.02169  0.16834  0.77203

Coefficients:
                    Value  Std. Error       t Pr(>|t|)
Intercept    -1580.022688  518.501881 -3.0473 0.016006
GNP.deflator     0.023161    0.068862  0.3363 0.745322
GNP              0.008351    0.015160  0.5509 0.596839
Unemployed      -0.013057    0.002651 -4.9255 0.001178
Armed.Forces    -0.007691    0.002097 -3.6671 0.006406
Population      -0.096757    0.144240 -0.6708 0.521355
Year             0.847932    0.269614  3.1450 0.013812

Adjusted R-Squared: 0.9866

> ## the above seems more similar to a lambda of 0.032 than the one required (0.03)
> coef(gr)[gr$lam %in% c(0.03, 0.032), ]  ## lm.ridge output
            GNP.deflator      GNP Unemployed Armed.Forces Population   Year
0.030 -1620      0.02106 0.007994   -0.01315    -0.007752   -0.10188 0.8691
0.032 -1580      0.02316 0.008351   -0.01306    -0.007691   -0.09676 0.8479
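
(A crude way to quantify that impression, reusing the gr object from the
session above and the Design::ols coefficients printed earlier:)

ols.b <- c(-1580.022688, 0.023161, 0.008351, -0.013057, -0.007691, -0.096757, 0.847932)
d     <- apply(coef(gr), 1, function(b) sqrt(sum((b - ols.b)^2)))
gr$lambda[which.min(d)]   ## lands on 0.032 rather than the requested 0.03, per the table above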


The difference between the coefficients of MASS::lm.ridge and
Design::ols is small, but they're different nonetheless and I'm
wondering if I'm doing something wrong. Perhaps Design::ols uses a
penalty different from the one specified? Alternatively, is there some
other way to perform diagnostics on Ridge Regression models?
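For instance, would the textbook first-order approximation
Var(b_k) = sigma^2 (X'X + kI)^{-1} X'X (X'X + kI)^{-1} be a sensible
basis for standard errors? A rough sketch of what I have in mind, on
centered and scaled data (I'm not at all sure this is an accepted
approach, and note that scale() uses the n-1 divisor, unlike lm.ridge):

data(longley)
k  <- 0.03
X  <- scale(as.matrix(longley[, setdiff(names(longley), "Employed")]))
Y  <- longley$Employed - mean(longley$Employed)
W  <- solve(crossprod(X) + k * diag(ncol(X)))
b  <- W %*% crossprod(X, Y)
H  <- X %*% W %*% t(X)                                   # ridge "hat" matrix
s2 <- sum((Y - X %*% b)^2) / (nrow(X) - sum(diag(H)))    # effective df in the denominator
se <- sqrt(diag(s2 * W %*% crossprod(X) %*% W))
round(cbind(beta = drop(b), se = se, t = drop(b) / se), 4)  # t-ratios, on the standardized scale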

Thank you
Liviu

[1] http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf



Re: [R] obtaining p-values for lm.ridge() coefficients (package 'MASS')

2011-08-23 Thread Michael Friendly

On 8/23/2011 3:35 AM, Liviu Andronic wrote:
[snip]


> But how does one obtain the customary 'lm' summary information for the
> model above? I tried supplying the chosen lambda to Design::ols()
> using its 'penalty' argument, but strangely the results differ. See
> below.
>
> require(Design)
> Design::ols(Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, penalty = 0.03, tol = 1e-12)




First, you should be using rms::ols, as Design is old.

Second, penalty in ols() is not the same as the ridge constant in
lm.ridge, but rather a penalty on the log likelihood.  The documentation
for ols refers to ?lrm for the description.

Third, other ridge estimators (e.g., ElemStatLearn::simple.ridge) also
give slightly different results for the same data due to differences in
the way scaling is handled.
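
For instance (a rough base-R sketch, not simple.ridge itself): the same
lambda applied to predictors standardized with the n divisor, as
lm.ridge does, versus the n-1 divisor used by sd()/scale() already
gives slightly different coefficients.

data(longley)
X  <- as.matrix(longley[, setdiff(names(longley), "Employed")])
Yc <- longley$Employed - mean(longley$Employed)
Xc <- sweep(X, 2, colMeans(X))
ridge <- function(Xs, k) drop(solve(crossprod(Xs) + k * diag(ncol(Xs)), crossprod(Xs, Yc)))
s.n  <- sqrt(colSums(Xc^2) / nrow(X))   # divisor n   (lm.ridge convention)
s.n1 <- apply(X, 2, sd)                 # divisor n-1 (sd()/scale() convention)
round(cbind(div.n  = ridge(sweep(Xc, 2, s.n,  "/"), 0.03),
            div.n1 = ridge(sweep(Xc, 2, s.n1, "/"), 0.03)), 4)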


hth,
-Michael

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
