Re: equation for constrained linear regression

Joe Ward Sun, 23 Jun 2002 12:52:18 -0700

----- Original Message -----
From: "Robert J. MacG. Dawson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, June 20, 2002 8:48 AM
Subject: Re: equation for constrained linear regression

> Presumably if one were stranded on a desert island with only Microsoft
> Office, one could get the through-the-origin regression line correctly
> by putting (-x,-y) into the data set for each (x,y). Some regression
> diagnostics would be all fouled up but it is my understanding that you
> don't get good regression diagnostics from Excel anyway.
>
> However, I do wonder why this would be done; the through-the-origin
> constraint seems in many cases to imply that data with near-0 x
> coordinates ought to have not only small y values but also small
> variance. In most cases (not all) one would probably do better to
> log-transform and fit a slope-constrained-to-1 OLS model; this is
> equivalent to taking the geometric mean of all the ratios (or, indeed,
> the ratio of the geometric means).
>
> -Robert Dawson
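
Both suggestions in the quoted message can be checked numerically. What follows is only a minimal sketch in Python with NumPy, on made-up data; the arrays `x` and `y` and the helper names are illustrative assumptions, not anything from the thread. Adding the mirrored point (-x, -y) for each (x, y) forces the intercept of an ordinary with-intercept fit to zero, and the slope then equals sum(x*y)/sum(x^2), the through-the-origin least-squares slope. On the log scale, fixing the slope at 1 leaves only an intercept, mean(log y - log x), whose exponential is the geometric mean of the ratios y/x.

```python
import numpy as np

# Made-up illustrative data (not from the thread); all values positive
# so that the log transform below is defined.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def origin_slope(x, y):
    """Least-squares slope for y = b*x (no intercept)."""
    return np.sum(x * y) / np.sum(x * x)

def mirrored_fit(x, y):
    """Ordinary with-intercept fit after adding (-x, -y) for each (x, y)."""
    xm = np.concatenate([x, -x])
    ym = np.concatenate([y, -y])
    slope, intercept = np.polyfit(xm, ym, 1)  # highest degree first
    return slope, intercept

def log_scale_ratio(x, y):
    """Slope fixed at 1 on the log scale: log y = a + log x.
    The least-squares a is mean(log y - log x), and exp(a) is the
    geometric mean of the ratios y/x."""
    a = np.mean(np.log(y) - np.log(x))
    return np.exp(a)

b0 = origin_slope(x, y)
slope, intercept = mirrored_fit(x, y)
ratio = log_scale_ratio(x, y)
# Ratio of the geometric means of y and x, for comparison.
gm_ratio = np.exp(np.mean(np.log(y))) / np.exp(np.mean(np.log(x)))

print("no-intercept slope:     ", b0)
print("mirrored-data slope:    ", slope)
print("mirrored-data intercept:", intercept)
print("geometric mean of y/x:  ", ratio)
print("ratio of geometric means:", gm_ratio)
```

The two slopes agree up to floating-point rounding, and the mirrored-data intercept is zero up to rounding. The residual degrees of freedom of the mirrored fit are wrong, which is one reason the diagnostics come out fouled up.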

Hi, Robert --

As you appropriately ask:
"However, I do wonder why this would be done"

MY PRIMARY USE OF THE "THROUGH-THE-ORIGIN" (OR NO-INTERCEPT) OPTION
IS AS AN INSTRUCTIONAL STRATEGY FOR "UNDERSTANDING" REGRESSION/LINEAR MODELS.

It is easy to teach students about "cell means" using the "no-int" option (available in MOST REGRESSION COMPUTER PROGRAMS),
since the "no-int" option produces "averages" from the least-squares solution when you have a set of mutually exclusive
groups coded as 'dummy' variables. Averages are numerical values that students can understand for prediction in
situations that involve "mutually exclusive categories".
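
The cell-means point can be illustrated with a short sketch in Python with NumPy. The group labels and scores are made up for illustration, and `np.linalg.lstsq` stands in for whatever regression program is at hand. With one 0/1 dummy column per group and no column of ones, the least-squares coefficients are the group averages.

```python
import numpy as np

# Made-up scores for three mutually exclusive groups (illustrative only).
groups = np.array(["a", "a", "b", "b", "b", "c"])
y = np.array([4.0, 6.0, 10.0, 11.0, 12.0, 20.0])

labels = np.unique(groups)

# One 0/1 dummy column per group and NO column of ones (the "no-int" model).
X = (groups[:, None] == labels[None, :]).astype(float)

# Least-squares solution of y = X b.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Plain group averages, computed directly for comparison.
cell_means = np.array([y[groups == g].mean() for g in labels])

for g, b, m in zip(labels, coef, cell_means):
    print(g, "coefficient:", b, "average:", m)
```

Each coefficient matches the average of its group (5, 11 and 20 for these data), up to floating-point rounding.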

Later, students can learn how to use the "default" (with-intercept) option when appropriate. But for "understanding" the
difference between the "default" and the "no-int" option, I feel that it is best to START WITH THE NO-INT OPTION
FROM AN INSTRUCTIONAL POINT-OF-VIEW.

The lack of understanding of the difference between the two situations is probably why the WRONG TOTAL
SUM OF SQUARES was used in the Excel "no-int" situation.
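
To show why the choice of total sum of squares matters, here is a minimal sketch in Python with NumPy on made-up data. It does not reproduce what Excel computes; it only contrasts the two candidate totals. For a no-intercept fit, the uncorrected total sum(y^2) splits exactly into fitted and residual parts, while the mean-corrected total sum((y - ybar)^2) does not, so an "R squared" built from it can fall outside the range 0 to 1.

```python
import numpy as np

# Made-up data whose natural intercept is far from zero (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.0, 9.0, 11.0, 10.0, 10.0])

# Through-the-origin least-squares slope and fitted values.
b = np.sum(x * y) / np.sum(x * x)
fitted = b * x

sse = np.sum((y - fitted) ** 2)                # residual sum of squares
ss_fit = np.sum(fitted ** 2)                   # fitted (model) sum of squares
sst_uncorrected = np.sum(y ** 2)               # total about zero
sst_corrected = np.sum((y - y.mean()) ** 2)    # total about the mean

r2_uncorrected = 1.0 - sse / sst_uncorrected
r2_corrected = 1.0 - sse / sst_corrected

print("uncorrected total:", sst_uncorrected, "= fit + residual:", ss_fit + sse)
print("R^2 with uncorrected total:", r2_uncorrected)
print("R^2 with corrected total:  ", r2_corrected)
```

With these data the corrected-total version is negative, while the uncorrected version stays between 0 and 1.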

Many comments on various lists show that many folks don't understand the difference between the "default" and
"through-the-origin (no-int)" models. If they had been first taught about regression via the "no-int" model, they might
better understand what the "default" is doing.
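
One way to see what the "default" is doing for grouped data is to fit both models to the same made-up scores. This is a sketch in Python with NumPy. It assumes the "default" model drops the dummy for the first group and adds a column of ones, which is a common coding but not one specified in this thread. Under that assumption the intercept is the first group's average, the other coefficients are differences from it, and both models give the same predictions.

```python
import numpy as np

# Made-up scores for three mutually exclusive groups (illustrative only).
groups = np.array(["a", "a", "b", "b", "b", "c"])
y = np.array([4.0, 6.0, 10.0, 11.0, 12.0, 20.0])

labels = np.unique(groups)
D = (groups[:, None] == labels[None, :]).astype(float)  # all dummy columns

# "no-int" model: every dummy, no constant column.
b_noint, *_ = np.linalg.lstsq(D, y, rcond=None)

# "default" model (assumed coding): constant column plus the dummies
# for every group except the first, which becomes the reference group.
X_default = np.column_stack([np.ones(len(y)), D[:, 1:]])
b_default, *_ = np.linalg.lstsq(X_default, y, rcond=None)

fitted_noint = D @ b_noint
fitted_default = X_default @ b_default

print("no-int coefficients (group averages):", b_noint)
print("default intercept (reference average):", b_default[0])
print("default slopes (differences from reference):", b_default[1:])
```

The predictions are identical; only the meaning of the coefficients changes between the two parameterizations.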

-- Joe