----- Original Message -----
From: Joe Ward
Sent: Monday, February 14, 2000 4:19 AM
Subject: Re: Linear Regression with known intercept (Long Message)

 
 
Hi, Mark --
 


Glad you sent this Email.  It is a nice and simple example of the use
of Prediction/Regression/Linear Models -- which should be one of the
important objectives of a FIRST NON-CALCULUS-BASED STATISTICS COURSE.

Consider, first, the Simple Regression Model:

Y = a1*U + a2*X + E1        (Model 1)

where
Y  = a vector containing observations on a dependent or response variable.
U  = a predictor (vector) containing all 1's.
 (THE MOST NEGLECTED AND MISUNDERSTOOD PREDICTOR OF ALL)
X  = another predictor with any elements -- could be BINARY (0,1).

E1 = the Error or Residual vector.

a1 = least-squares regression coefficient of U
        (this is frequently referred to as the "Y-intercept").
a2 = least-squares regression coefficient of X
        (this is frequently referred to as the "Slope").
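
For concreteness, here is a minimal sketch of fitting Model 1 in Python
with NumPy.  The variable names and data values are hypothetical,
invented only for illustration:

    import numpy as np

    # hypothetical data, for illustration only
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    u = np.ones_like(x)                  # the unit predictor U (all 1's)

    # Model 1:  Y = a1*U + a2*X + E1
    X1 = np.column_stack([u, x])         # design matrix [U X]
    (a1, a2), *_ = np.linalg.lstsq(X1, y, rcond=None)
    e1 = y - (a1 * u + a2 * x)           # residual vector E1
    ssqe1 = e1 @ e1                      # error sum of squares for Model 1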

A powerful capability to give students who are comfortable with
Algebra is to be able to IMPOSE ANY DESIRED
LINEAR RESTRICTIONS
ON A LINEAR MODEL OF THE FORM:

Y = a1*X1 + a2*X2 + ... + ap*Xp + E

This capability is useful in many applications BESIDES STATISTICS.
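
To illustrate the idea: in the three-predictor model

Y = a1*X1 + a2*X2 + a3*X3 + E

imposing the restriction a2 = a3 (call the common value c) gives the
restricted model

Y = a1*X1 + c*(X2 + X3) + E

which has the new predictor (X2 + X3) and can be fit by ordinary
least squares like any other linear model.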

Now, to your neat example:

"If I want to find the least squares estimator of the slope of a simple
linear regression model where my intercept is known, ...  "

You wish to impose the restriction that
a1 = k (a known value)

Imposing that restriction on Model 1 above gives:

Y = k*U +  a2*X + E2

The only unknown regression coefficient is a2, which I will rename:

Let b2 = a2

to remind us that the numerical value of the coefficient of X in Model 2
is most likely different from its value in Model 1.

Then,
Y = k*U + b2*X + E2        (Model 2)

Since k*U is known, the least-squares value for b2 is obtained from:

Y-k*U = b2*X + E2

or letting Y-k*U be designated by a single symbol, W

W = b2*X + E2

and the least-squares value of b2 for Model 2 (and for any ONE-PREDICTOR model) is:

    b2 = (W'X)/(X'X)  =  Sum(wi*xi)/Sum (xi*xi)
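
Continuing the NumPy sketch above, with an assumed known intercept k
(the value 1.5 is only an illustration):

    k = 1.5                              # the known intercept (assumed value)
    w = y - k * u                        # W = Y - k*U
    b2 = (w @ x) / (x @ x)               # b2 = Sum(wi*xi)/Sum(xi*xi)
    e2 = y - (k * u + b2 * x)            # residual vector E2 for Model 2
    ssqe2 = e2 @ e2                      # error sum of squares for Model 2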
         
b2 is the slope of the line which is "forced" by the restriction
a1 = k.

Most software now allows one to find the value of b2 by selecting an
option that omits the vector U as a predictor (that is, regressing W
on X with no intercept).  If you have good software available, it will
produce the standard errors of a1 and a2 when fitting Model 1 and the
standard error of b2 when fitting Model 2.
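
As a check, the no-intercept fit can be reproduced in the NumPy sketch
by regressing W on X alone (a one-column design matrix); it should
agree with the closed-form b2 above:

    b2_check, *_ = np.linalg.lstsq(x.reshape(-1, 1), w, rcond=None)
    print(b2, b2_check[0])               # the two values should agree up to rounding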

---------------
Now, if it is "interesting" to TEST AN HYPOTHESIS THAT --

a1 = k

Then a statistics student may want to compute:

F = (SSQE2 - SSQE1)/(2-1)
    ---------------------
       (SSQE1)/(n-2)

  = (SSQE2 - SSQE1)/1
    ---------------------
       (SSQE1)/(n-2)

where SSQE1 and SSQE2 are the error (residual) sums of squares for
Model 1 and Model 2, and n is the number of observations.

and since F(1,df2) = t^2(df2),

t(df2) = sqrt(F(1,df2))

This IS a "t-test".
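
In the NumPy sketch these quantities can be computed directly
(2 parameters in Model 1, 1 in Model 2):

    n = len(y)
    F = ((ssqe2 - ssqe1) / (2 - 1)) / (ssqe1 / (n - 2))   # F with (1, n-2) df
    t = np.sqrt(F)                       # t(n-2), since F(1,df2) = t^2(df2)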

And, perhaps, from this value of "t" another statistics student
might want to compute the Standard Error
of a1, and then compute
a Confidence Interval.

The astute student can compute the Standard Error from:

      t = Statistic/Standard Error

but since the numerical values of t and the "Statistic" are known,
we have:

Standard Error = Statistic/t
 
In this particular case the "Statistic" is the difference between the
estimated a1 (from Model 1) and the hypothesized value k, so

Standard Error = (a1 - k)/t
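
Continuing the sketch (SciPy is used here only for the t critical value):

    from scipy import stats

    se_a1 = abs(a1 - k) / t              # Standard Error of a1 recovered from t
    t_crit = stats.t.ppf(0.975, n - 2)   # two-sided 95% critical value on n-2 df
    ci_a1 = (a1 - t_crit * se_a1, a1 + t_crit * se_a1)   # 95% confidence interval for a1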

This procedure allows for easy computation of the "Standard Error" of
any of the weights (intercept or slope) in a regression model and, in
the more general case, of any linear combination of the weights in a
multiple linear regression model.
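
For instance, to test the hypothesis a1 = a2 in Model 1, one could
impose that restriction (call the common value c) to get the restricted
model

Y = c*(U + X) + E3

compute its error sum of squares SSQE3, form
F = ((SSQE3 - SSQE1)/1) / ((SSQE1)/(n-2)), take t = sqrt(F), and
recover the Standard Error of the estimated difference (a1 - a2) as
(a1 - a2)/t, just as above.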
 
Sorry for the length of this message, but I couldn't resist promoting the
use of Prediction/Regression/Linear Models for ALL STUDENTS.

--- Joe
 
-------------------------------------------------------------------------------------
Joe's equation for b2, with no intercept specified, results in the sum of the residuals not being zero. Forcing no intercept and using the matrix solution for the slope results in a bias; the solution is not a true maximum likelihood solution.
 
DAH
 
 
 
 
 
 
 

 
