Joe Ward
Hi, Mark --

Glad you sent this email. It is a nice and simple example of the use of Prediction/Regression/Linear Models -- which should be one of the important objectives of a FIRST NON-CALCULUS-BASED STATISTICS COURSE.
Consider, first, the Simple Regression Model (Model 1):

    Y = a1*U + a2*X + E1

where

    Y  = a vector containing observations on a dependent or response variable.
    U  = a predictor (vector) containing all 1's.
         (THE MOST NEGLECTED AND MISUNDERSTOOD PREDICTOR OF ALL)
    X  = another predictor with any elements -- could be BINARY (0,1).
    E1 = the Error or Residual vector.
    a1 = least-squares regression coefficient of U
         (frequently referred to as the "Y-intercept").
    a2 = least-squares regression coefficient of X
         (frequently referred to as the "Slope").
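As a concrete sketch of Model 1 in Python/NumPy (the data here are made up for illustration; they do not come from the original exchange):

    import numpy as np

    # Made-up illustrative data
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])

    # Model 1:  Y = a1*U + a2*X + E1, with U a vector of all 1's
    U = np.ones_like(X)
    M = np.column_stack([U, X])              # design matrix [U, X]

    # Least-squares coefficients a1 (Y-intercept) and a2 (Slope)
    (a1, a2), *_ = np.linalg.lstsq(M, Y, rcond=None)
    E1 = Y - (a1*U + a2*X)                   # Error or Residual vector

    print(f"a1 = {a1:.4f}, a2 = {a2:.4f}")
    print(f"sum of residuals = {E1.sum():.2e}")  # ~0 because U is included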
A powerful capability to give students who are comfortable with Algebra is to be able to IMPOSE ANY DESIRED LINEAR RESTRICTIONS ON A LINEAR MODEL OF THE FORM:

    Y = a1*X1 + a2*X2 + ... + ap*Xp + E

This capability is useful in many applications BESIDES STATISTICS.
Now, to your neat example:

    "If I want to find the least squares estimator of the slope of a simple
    linear regression model where my intercept is known, ..."

You wish to impose the restriction that

    a1 = k (a known value)

Imposing that restriction on Model 1 above gives (call it Model 2):

    Y = k*U + a2*X + E2
The only unknown regression coefficient is a2, which I will rename: let b2 = a2, to remind us that the numerical value of the coefficient of X in Model 1 is most likely different from the value in Model 2. Then

    Y = k*U + b2*X + E2

Since k*U is known, the least-squares value for b2 is obtained from

    Y - k*U = b2*X + E2

or, letting Y - k*U be designated by a single symbol W,

    W = b2*X + E2
and the least-squares value of b2 for Model 2 (and for any ONE-PREDICTOR model) is:

    b2 = (W'X)/(X'X) = Sum(wi*xi)/Sum(xi*xi)

b2 is the slope of the line which is "forced by the restriction" a1 = k.
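A minimal NumPy sketch of this one-predictor formula (same made-up data as above, with a hypothetical known intercept k):

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])
    k = 0.5                                  # hypothetical known intercept

    # Model 2: W = b2*X + E2, where W = Y - k*U
    W = Y - k                                # subtract k*U elementwise
    b2 = (W @ X) / (X @ X)                   # (W'X)/(X'X) = Sum(wi*xi)/Sum(xi*xi)
    print(f"b2 = {b2:.4f}")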
Most software now allows one to find the value of b2 by choosing an option that omits the vector U as a predictor. If you have good software available, it will produce the standard errors of a1 and a2 when fitting Model 1, and the standard error of b2 when fitting Model 2.
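For instance, with the Python statsmodels package (one plausible choice of such software -- my assumption, since the message names no package), omitting the constant column is exactly the "omit U" option:

    import numpy as np
    import statsmodels.api as sm

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])
    k = 0.5                                  # hypothetical known intercept

    # Model 1: U included via add_constant; .bse gives standard errors
    model1 = sm.OLS(Y, sm.add_constant(X)).fit()
    print("a1, a2:", model1.params, "SEs:", model1.bse)

    # Model 2: regress W = Y - k*U on X alone (U omitted, i.e. no intercept)
    model2 = sm.OLS(Y - k, X).fit()
    print("b2:", model2.params, "SE:", model2.bse)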
---------------
Now, if it is "interesting" to TEST AN HYPOTHESIS THAT

    a1 = k

then a statistics student may want to compute (here SSQE1 and SSQE2 denote the error sums of squares from Model 1 and Model 2, and n is the number of observations):

    F = (SSQE2 - SSQE1)/(2-1)
        ---------------------
            (SSQE1)/(n-2)

      = (SSQE2 - SSQE1)/1
        -----------------
          (SSQE1)/(n-2)

and since F(1,df2) = t^2(df2),

    t(df2) = sqrt(F(1,df2))

This IS a "t-test".
And, perhaps, from this value of "t" another statistics student might want to compute the Standard Error of a1, and then compute a Confidence Interval. The astute student can compute the Standard Error from:

    t = Statistic/Standard Error

but since the numerical values of t and the "Statistic" are known, we have:

    Standard Error = Statistic/t

In this particular case,

    Standard Error = a1/t

This procedure allows for easy computation of the "Standard Error" of any of the 'weights' (intercept or slope) in a regression model and, in the more general case, of any linear combination of the weights in a multiple linear regression model.
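Continuing the sketch (SciPy supplies the t critical value; note that the "Statistic" being tested is strictly (a1 - k), which reduces to a1 itself when k = 0):

    import numpy as np
    from scipy import stats

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])
    k, n = 0.5, 5

    # Refit both models and recompute t, as in the F-test sketch above
    (a1, a2), *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), Y,
                                   rcond=None)
    SSQE1 = np.sum((Y - a1 - a2*X)**2)
    b2 = ((Y - k) @ X) / (X @ X)
    SSQE2 = np.sum((Y - k - b2*X)**2)
    t = np.sqrt((SSQE2 - SSQE1) / (SSQE1 / (n - 2)))

    # Standard Error = Statistic/t; the statistic here is (a1 - k)
    SE_a1 = (a1 - k) / t
    t_crit = stats.t.ppf(0.975, df=n - 2)    # 95% two-sided critical value
    print(f"SE(a1) = {SE_a1:.4f}")
    print(f"95% CI for a1: ({a1 - t_crit*SE_a1:.4f}, "
          f"{a1 + t_crit*SE_a1:.4f})")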
Sorry for the length of this message, but I couldn't resist promoting the use of Prediction/Regression/Linear Models for ALL STUDENTS.

--- Joe
-------------------------------------------------------------------------------------
Joe's equation for b2 and specifying no intercept,
results in the sum of residuals not being zero. The forcing of no intercept
and the use of the matrix solution for the slope, results in a
bias. The solution is not a true maximum likelihood solution.
DAH
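A quick numerical check of the residual-sum point (same made-up data as above): when U is omitted, the normal equations enforce X'E = 0 but not U'E = 0, so nothing forces the residuals to sum to zero.

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])

    # With the unit predictor U included, residuals sum to zero by construction
    M = np.column_stack([np.ones_like(X), X])
    coef, *_ = np.linalg.lstsq(M, Y, rcond=None)
    print(f"with U:    sum of residuals = {np.sum(Y - M @ coef):.2e}")

    # Without U (line forced through the origin), the sum is not zero
    b = (Y @ X) / (X @ X)
    print(f"without U: sum of residuals = {np.sum(Y - b*X):.4f}")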