Re: [R] Constrained Regression

2011-01-20 Thread Samuel Le
Hello,

Your problem is y = Xb + epsilon, so the squared residuals are
epsilon^2 = (y - Xb)^2.

A standard (unconstrained) regression minimizes the sum of these squared
residuals, i.e. sum((y - Xb)^2).

In your case you need to minimize the same quantity subject to your
constraints on b = (b1,...,b5). solve.QP should do exactly that for you.
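Set up concretely for Jackie's problem, this would look roughly like the
sketch below. It is only a sketch with simulated X and y (the real data are
not shown in the thread) and arbitrary coefficient values; the point is the
constraint setup: one equality column for sum(b) = 1, then identity columns
for b_i >= -3 and negated identity columns for b_i <= 3.

```r
library(quadprog)

## simulated stand-ins for the real data (assumption: an n x 5 design matrix)
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 5), n, 5)
y <- X %*% c(0.4, 0.3, 0.2, 0.05, 0.05) + rnorm(n, sd = 0.1)

p    <- ncol(X)
Dmat <- crossprod(X, X)
dvec <- crossprod(X, y)
## columns of Amat are read as t(Amat) %*% b >= bvec:
##   column 1:        sum(b) == 1  (an equality, because meq = 1)
##   next p columns:  b_i >= -3
##   last p columns: -b_i >= -3, i.e. b_i <= 3
Amat <- cbind(rep(1, p), diag(p), -diag(p))
bvec <- c(1, rep(-3, p), rep(-3, p))
res  <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
round(res$solution, 3)
```

The returned solution will sum to exactly 1 and respect the box bounds.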

HTH,

Samuel

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Jackie Chen
Sent: 20 January 2011 16:31
To: R-help@r-project.org
Subject: [R] Constrained Regression

Hi everyone,
I'm trying to perform a linear regression y = b1x1 + b2x2 + b3x3 + b4x4 +
b5x5 while constraining the coefficients such that -3 <= bi <= 3, and the
sum of bi = 1.  I've searched R-help and have found solutions for
constrained regression using quadratic programming (solve.QP) where the
coefficients are between 0 and 1 and sum to 1, but unfortunately do not
understand it well enough to adapt to my problem.   Is there a way to do
this using the lm function or do I absolutely need to use solve.QP?  And
if I need to use solve.QP, how would I modify the Boston data example to
my problem?
Thanks so much.
Jackie


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] Constrained Regression

2010-10-31 Thread Jim Silverton
Hello everyone,
I have 3 variables Y, X1 and X2. Each variable lies between 0 and 1. I want
to do a constrained regression such that a > 0 and (1-a) > 0

for the model:

Y = a*X1 + (1-a)*X2

I tried the help on the constrained regression in R but I concede that it
was not helpful.

Any help is greatly appreciated
-- 
Thanks,
Jim.



Re: [R] Constrained Regression

2010-10-31 Thread Ravi Varadhan
What is the stochastic mechanism that generates the data? In other words, what 
distribution is Y, conditioned on X1 and X2, supposed to be?  

You can also get `a' if you do not want to specify the probability mechanism 
by just doing a least-squares fit, but then making inferences can be a bit 
tricky.

Ravi.



Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: Jim Silverton jim.silver...@gmail.com
Date: Sunday, October 31, 2010 2:45 am
Subject: Re: [R] Constrained Regression
To: r-help@r-project.org





Re: [R] Constrained Regression

2010-10-31 Thread David Winsemius


On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote:


Hello everyone,
I have 3 variables Y, X1 and X2. Each variable lies between 0 and 1. I want

to do a constrained regression such that a > 0 and (1-a) > 0

for the model:

Y = a*X1 + (1-a)*X2



It would not accomplish the constraint that a > 0, but you could
accomplish the other constraint within an lm fit:


X3 <- X1 - X2
lm(Y ~ X3 + offset(X2))

Since beta1 is the coefficient in the model Y ~ 1 + beta1*(X1 - X2) + 1*X2,
i.e. Y ~ intercept + beta1*X1 + (1 - beta1)*X2,

... so beta1 is a.
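This reparameterization is easy to check on simulated data. In the sketch
below, the true a = 0.7 and the noise level are arbitrary choices, and the
-1 drops the intercept (a simplification of the fit shown above):

```r
## check of the offset trick on simulated data (a = 0.7 is arbitrary)
set.seed(42)
X1 <- runif(200)
X2 <- runif(200)
a  <- 0.7
Y  <- a * X1 + (1 - a) * X2 + rnorm(200, sd = 0.05)

X3  <- X1 - X2
fit <- lm(Y ~ X3 + offset(X2) - 1)  # -1 drops the intercept
coef(fit)                           # the X3 coefficient estimates a
```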

In the case beta1 < 0, I suppose a would be assigned 0. This might  
be accomplished within an iterative calculation framework by a large  
penalization for negative values. In a reply (1) to a question by  
Carlos Alzola in 2008 on R-help, Berwin Turlach offered a solution to a  
similar problem (sum(coef) == 1 AND coef non-negative). Modifying his  
code to incorporate the above strategy (and choosing two variables for  
which parameter values might be inside the constraint boundaries) we  
get:


library(MASS)   ## to access the Boston data
designmat <- model.matrix(medv ~ I(age - lstat) + offset(lstat), data = Boston)
Dmat <- crossprod(designmat, designmat)
dvec <- crossprod(designmat, Boston$medv)
Amat <- cbind(1, diag(NROW(Dmat)))
bvec <- c(1, rep(0, NROW(Dmat)))
meq  <- 1
library(quadprog)
res <- solve.QP(Dmat, dvec, Amat, bvec, meq)

> zapsmall(res$solution)
[1] 0.686547 0.313453

Turlach specifically advised against any interpretation of this  
particular result, which was only constructed to demonstrate the  
mathematical mechanics.




I tried the help on the constrained regression in R but I concede  
that it

was not helpful.


I must not have that package installed because I got nothing that  
appeared to be useful with ??"constrained regression".



David Winsemius, MD
West Hartford, CT

1) http://finzi.psych.upenn.edu/Rhelp10/2008-March/155990.html



Re: [R] Constrained Regression

2010-10-31 Thread Spencer Graves

Have you tried the 'sos' package?


install.packages('sos') # if not already installed
library(sos)
cr <- ???'constrained regression' # found 149 matches
summary(cr) # in 69 packages
cr # opens a table in a browser listing all 149 matches with links to 
the help pages



      However, I agree with Ravi Varadhan:  I'd want to understand the 
physical mechanism generating the data.  If each variable is, for example, a 
proportion, then I'd want to use logistic regression, possibly after 
some approximate logistic transformation of X1 and X2 that prevents 
logit(X) from going to +/-Inf.  This is a different model, but it 
achieves the need to avoid predictions of Y going outside the range (0, 1).



  Spencer
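Spencer's logit-scale suggestion might look like the sketch below; the
clamping constant eps and the data-generating model are illustrative
assumptions, not from the thread:

```r
## fit on the logit scale so predictions of Y stay inside (0, 1)
logit <- function(p, eps = 1e-6) {
  p <- pmin(pmax(p, eps), 1 - eps)  # clamp so log(p/(1-p)) stays finite
  log(p / (1 - p))
}

set.seed(1)
X1 <- runif(100)
X2 <- runif(100)
## assumed data-generating model, linear on the logit scale
Y <- plogis(0.6 * qlogis(X1) + 0.4 * qlogis(X2) + rnorm(100, sd = 0.1))

fit <- lm(logit(Y) ~ logit(X1) + logit(X2))
coef(fit)
## back-transform predictions with plogis() to return to the (0, 1) scale
```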





--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567



Re: [R] Constrained Regression

2010-10-31 Thread David Winsemius


On Oct 31, 2010, at 12:54 PM, Spencer Graves wrote:


Have you tried the 'sos' package?


I have, and I am taking this opportunity to load it with my .Rprofile  
to make it more accessible. It works very well. Very clean display. I  
have also constructed a variant of RSiteSearch that I find more useful  
than the current defaults.


rhelpSearch <- function(string,
                        restrict = c("Rhelp10", "Rhelp08", "Rhelp02", "functions"),
                        matchesPerPage = 100, ...) {
    RSiteSearch(string = string,
                restrict = restrict,
                matchesPerPage = matchesPerPage, ...)
}





No argument. I defer to the greater experience of both of you in such  
problems and your interest in educating us less knowledgeable users. I  
also need to amend my suggested strategy for situations where a linear  
model _might_ be appropriate, since I think the inclusion of the  
surrogate variable in the solve.QP setup is very probably creating  
confusion. After reconsideration, I think one should keep the two  
approaches separate. These are two approaches to the no-intercept  
version of the model that yield the same estimate (but only because  
the constraints do not get invoked):


> lm(medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston)

Call:
lm(formula = medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston)

Coefficients:
I(age - lstat)
        0.1163

> library(MASS)   ## to access the Boston data
> designmat <- model.matrix(medv ~ age + lstat - 1, data = Boston)
> Dmat <- crossprod(designmat, designmat); dvec <- crossprod(designmat,
+   Boston$medv)
> Amat <- cbind(1, diag(NROW(Dmat))); bvec <- c(1, rep(0, NROW(Dmat))); meq <- 1
> library(quadprog)
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
> zapsmall(res$solution)  # zapsmall not really needed in this instance
[1] 0.1163065 0.8836935

--
David.






Re: [R] Constrained Regression

2010-10-31 Thread Jim Silverton
I thought I would 'add' some meat to the problem I sent.  This is all I know:
(1) f = a*X1 + (1-a)*X2
(2) I know n values of f and X1, which happen to be probabilities
(3) I know nothing about X2 except that it also lies in (0,1)
(4) X1 is the probability under the null (Fisher's exact test) and X2 is the
alternative. But I have no idea what the alternative to Fisher's exact test
can be.
(5) I need to estimate a (which is supposed to be a proportion).
(6) I was thinking about imputing values from (0,1) drawn from a beta
distribution as the values of X2.

Any help is greatly appreciated.

Jim...


Re: [R] Constrained Regression

2010-10-31 Thread Spencer Graves


Replies in line below.


On 10/31/2010 6:26 PM, Jim Silverton wrote:

I thought I would 'add' some meat to the problem I sent.  This is all I know
(1) f = a*X1 + (1-a)*X2


How do you know f = a*X1 + (1-a)*X2?


Why does this relationship make more sense than, e.g., log(f/(1-f)) = 
a*X1 + (1-a)*X2?



(2) I know n values of f and X1 which happens to be probabilities


What do you mean that f and X1 are probabilities?  Where do they come from?


Are they the results of binomial experiments, like tosses of a biased coin?



(3) I know nothing about X2 except that it also lies in (0,1)
(4) X1 is the probability under the null (fisher's exact test) and X2 is the
alternative. But I have no idea what the alternative to fisher's exact test
can be..


How do you compute X1?  Do you compute it from data?  If yes, then it's 
more like an estimate of a probability rather than a probability 
itself.  Ditto for X2.



The preferred method for estimating almost anything whenever feasible is 
to write a likelihood function, i.e., the probability of data as a 
function of unknown parameters, then estimate those parameters to 
maximize the likelihood.  Even most nonparametric procedures can be 
expressed this way for an appropriately chosen probability model.



In most situations, I prefer to eliminate constraints by 
transformations: replacing f by phi = log(f/(1-f)), X1 by xi1 = 
log(X1/(1-X1)), X2 by xi2 = log(X2/(1-X2)), and a by alpha = 
log(a/(1-a)), then writing a relationship between these unconstrained 
variables and trying to estimate alpha by maximum likelihood.



Hope this helps.
Spencer
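The transform-then-maximize recipe can be sketched with optim. Gaussian
errors are assumed here purely for illustration, which is exactly the kind
of assumption that would need to be justified for Jim's data:

```r
## estimate a on the unconstrained scale alpha = log(a/(1-a))
set.seed(7)
X1 <- runif(150)
X2 <- runif(150)
a_true <- 0.3                      # arbitrary illustrative value
f <- a_true * X1 + (1 - a_true) * X2 + rnorm(150, sd = 0.02)

negloglik <- function(par) {
  a     <- plogis(par[1])          # maps alpha back into (0, 1)
  sigma <- exp(par[2])             # keeps sigma positive
  -sum(dnorm(f, mean = a * X1 + (1 - a) * X2, sd = sigma, log = TRUE))
}
opt <- optim(c(0, 0), negloglik)
plogis(opt$par[1])                 # maximum-likelihood estimate of a
```

Because alpha and log(sigma) are unconstrained, no constrained optimizer is
needed, and standard likelihood-based inference applies to alpha.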



Re: [R] Constrained regression

2008-03-03 Thread Mike Cheung
Dear Carlos,

One approach is to use structural equation modeling (SEM). Some SEM
packages, such as LISREL, Mplus and Mx, allow inequality and nonlinear
constraints. Phantom variables (Rindskopf, 1984) may be used to impose
inequality constraints. Your model is basically:
y = b0 + b1*b1*x1 + b2*b2*x2 +...+ bp*bp*xp + e
1 = b1*b1 + b2*b2 +...+ bp*bp

Alternatively, you can set bounds on the parameter estimates directly;
then you only have to impose the second (sum-to-one) constraint.

Rindskopf, D. (1984). Using phantom and imaginary latent variables to
parameterize constraints in linear structural models. Psychometrika,
49, 37-47.
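Outside SEM software, the squared-coefficient idea can be imitated with
plain optim. The sketch below uses simulated data and handles the
sum-to-one constraint by normalizing, a simplification of the
phantom-variable machinery Mike describes:

```r
## b_i = c_i^2 / sum(c^2): non-negative by construction and summing to 1
set.seed(3)
X <- matrix(runif(300), 100, 3)
y <- X %*% c(0.5, 0.3, 0.2) + rnorm(100, sd = 0.05)

rss <- function(cc) {
  b <- cc^2 / sum(cc^2)
  sum((y - X %*% b)^2)
}
opt <- optim(rep(1, 3), rss)
opt$par^2 / sum(opt$par^2)   # estimated constrained coefficients
```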

Regards,
Mike
-- 
-
 Mike W.L. Cheung   Phone: (65) 6516-3702
 Department of Psychology   Fax:   (65) 6773-1843
 National University of Singapore
 http://courses.nus.edu.sg/course/psycwlm/internet/
-

On Mon, Mar 3, 2008 at 11:52 AM, Carlos Alzola [EMAIL PROTECTED] wrote:
 Dear list members,

  I am trying to get information on how to fit a linear regression with
  constrained parameters. Specifically, I have 8 predictors; their
  coefficients should all be non-negative and add up to 1. I understand it is
  a quadratic programming problem, but I have no experience in the subject. I
  searched the archives but the results were inconclusive.

  Could someone provide suggestions and references to the literature, please?

  Thank you very much.

  Carlos

  Carlos Alzola
  [EMAIL PROTECTED]
  (703) 242-6747






Re: [R] Constrained regression

2008-03-03 Thread Berwin A Turlach
G'day Carlos,

On Mon, Mar 3, 2008 at 11:52 AM 
Carlos Alzola [EMAIL PROTECTED] wrote:


A suggestion:

 library(MASS)   ## to access the Boston data
 designmat <- model.matrix(medv ~ ., data = Boston)
 Dmat <- crossprod(designmat, designmat)
 dvec <- crossprod(designmat, Boston$medv)
 Amat <- cbind(1, diag(NROW(Dmat)))
 bvec <- c(1, rep(0, NROW(Dmat)))
 meq <- 1
 library(quadprog)
 res <- solve.QP(Dmat, dvec, Amat, bvec, meq)

The solution seems to contain values that are, for all practical
purposes, actually zero:

> res$solution
 [1]  4.535581e-16  2.661931e-18  1.016929e-01 -1.850699e-17
 [5]  1.458219e-16 -3.892418e-15  8.544939e-01  0.000000e+00
 [9]  2.410742e-16  2.905722e-17 -5.700600e-20 -4.227261e-17
[13]  4.381328e-02 -3.723065e-18

So perhaps better:

> zapsmall(res$solution)
 [1] 0.0000000 0.0000000 0.1016929 0.0000000 0.0000000 0.0000000
 [7] 0.8544939 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[13] 0.0438133 0.0000000

So the estimates seem to follow the constraints.

And the unconstrained solution is:

> res$unconstrained.solution
 [1]  3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02
 [5]  2.686734e+00 -1.776661e+01  3.809865e+00  6.922246e-04
 [9] -1.475567e+00  3.060495e-01 -1.233459e-02 -9.527472e-01
[13]  9.311683e-03 -5.247584e-01

which seems to coincide with what lm() thinks it should be:

> coef(lm(medv ~ ., Boston))
  (Intercept)          crim            zn         indus          chas 
 3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02  2.686734e+00 
          nox            rm           age           dis           rad 
-1.776661e+01  3.809865e+00  6.922246e-04 -1.475567e+00  3.060495e-01 
          tax       ptratio         black         lstat 
-1.233459e-02 -9.527472e-01  9.311683e-03 -5.247584e-01 

So there seem to be no numerical problems.  Otherwise we could have done
something else (e.g. calculate the QR factorization of the design
matrix, say X, and give the R factor to solve.QP, instead of
calculating X'X and giving that one to solve.QP).

If the intercept is not supposed to be included in the set of
constrained estimates, then something like the following can be done:

> Amat[1,] <- 0
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
> zapsmall(res$solution)
 [1] 6.073972 0.000000 0.109124 0.000000 0.000000 0.000000 0.863421
 [8] 0.000000 0.000000 0.000000 0.000000 0.000000 0.027455 0.000000

Of course, since after the first command in that last block the second
column of Amat contains only zeros
> Amat[,2]
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
we might as well have removed it (and the corresponding entry in bvec)
> Amat <- Amat[, -2]
> bvec <- bvec[-2]
before calling solve.QP().

Note: the Boston data set was only used to illustrate how to fit such 
models; I do not want to imply that these models are sensible for these
data. :-)

Hope this helps.

Cheers,

Berwin

=== Full address =
Berwin A Turlach                        Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability    +65 6516 6650 (self)
Faculty of Science                      FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7     e-mail: [EMAIL PROTECTED]
Singapore 117546                        http://www.stat.nus.edu.sg/~statba
