Rich Ulrich did you a good turn, and a complete one.  Forcing the
regression through 0 is a no-no, unless you are fitting to a model that
says the line must pass through the origin.  You haven't reached that
point yet, IMHO.
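
To see why the summary numbers from the two fits are hard to compare,
here is a minimal sketch in Python.  The data are made up for
illustration (they are not DaveM's observations) and the variable names
are my own.  Many packages compute R sq. for a no-constant fit about
zero rather than about the mean of y, which inflates it.  That may be
what is happening in the table quoted below, but I have not checked the
package in question.

```python
# Made-up data for illustration only (not DaveM's observations).
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [30.0, 28.0, 45.0, 40.0, 52.0]
n = len(x)

# Fit with a constant: ordinary least squares, y = a + b*x.
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x

# Fit forced through the origin: y = b0*x.
b0 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Residual sums of squares for both fits.
rss_const = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
rss_origin = sum((yi - b0 * xi) ** 2 for xi, yi in zip(x, y))

# R sq. about the mean (usual definition) versus about zero
# (a common convention for no-constant fits; assumed here).
ss_about_mean = sum((yi - mean_y) ** 2 for yi in y)
ss_about_zero = sum(yi ** 2 for yi in y)
r2_const = 1.0 - rss_const / ss_about_mean
r2_origin = 1.0 - rss_origin / ss_about_zero

# Residual mean squares: n-2 and n-1 degrees of freedom respectively.
rms_const = rss_const / (n - 2)
rms_origin = rss_origin / (n - 1)

print(f"fit constant: R sq. = {r2_const:.3f}, resid. MS = {rms_const:.1f}")
print(f"thru origin:  R sq. = {r2_origin:.3f}, resid. MS = {rms_origin:.1f}")
```

With these made-up numbers the through-origin fit reports the larger
R sq. and also the larger residual mean square, so the two R sq. values
are not measuring the same thing.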

Subtracting the 'outer' from the 'whole' before doing the regression is a
good idea.

Since you intend in the end to use the outer to predict the inner (whole
- outer), you are working with the outer as an independent variable.
Presumably there is minimal measurement error in the outer value, when
you are doing it in the field.
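
A minimal sketch of that setup, in Python: regress inner (whole - outer)
on outer, then add the field reading back to get a predicted whole.  The
numbers are invented for illustration, and the names least_squares and
predict_whole are hypothetical helpers, not anything from a stat
package.

```python
# Invented percent-infested values, for illustration only.
whole = [6.0, 12.0, 20.0, 31.0, 45.0, 58.0]
outer = [1.0, 4.0, 9.0, 15.0, 24.0, 33.0]

# Inner infestation is what the field observation cannot see.
inner = [w - o for w, o in zip(whole, outer)]

def least_squares(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Outer is the independent variable; inner is the response.
intercept, slope = least_squares(outer, inner)

def predict_whole(outer_obs):
    """Predicted whole = observed outer + predicted inner."""
    return outer_obs + intercept + slope * outer_obs

print(f"inner = {intercept:.2f} + {slope:.3f} * outer")
print(f"outer of 10% -> predicted whole of {predict_whole(10.0):.1f}%")
```

The prediction reads directly from outer to whole, so there is no need
to run across a graph and drop down to the other axis.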

I am not as clear as Rich on how to handle the zero outer case.  In a
separate analysis, you might want to consider the observed probability
of 0 outer with > 0 inner.  If you don't include the 0 outer cases in
your regression data, then the regression would predict the degree of
inner infestation only when there is observed infestation on the outer.

This may be what you want, anyway.
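
As a rough sketch of that two-part treatment, in Python, with invented
(outer, inner) pairs that stand in for real observations: first the
observed proportion of zero-outer cases that still have inner
infestation, then the subset of rows that would go into the regression.

```python
# Invented (outer, inner) pairs, for illustration only.
records = [
    (0.0, 0.0), (0.0, 2.0), (0.0, 4.0), (0.0, 0.0),
    (1.0, 5.0), (3.0, 8.0), (6.0, 12.0),
]

# Separate analysis: among zero-outer cases, how often is inner > 0?
zero_outer_inner = [inner for outer, inner in records if outer == 0]
hidden_count = sum(1 for inner in zero_outer_inner if inner > 0)
p_hidden = hidden_count / len(zero_outer_inner)

# Regression data: only cases with observed outer infestation.
regression_rows = [(outer, inner) for outer, inner in records if outer > 0]

print(f"observed P(inner > 0 | outer = 0) = {p_hidden:.2f}")
print(f"rows kept for the regression: {len(regression_rows)}")
```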

Besides, it is not 0 observed.  It is "less than minimum detectable using
the procedure defined" instead of zero.  I once had a case where a spring
maker wanted to evaluate the wire at 10X, because that was what they had
and everyone in the industry (he said) did it that way.  We needed 40X to
reliably see the wire flaw that caused our grief.  There was no question
about the causal relationship between the flaw and the grief, but not
every spring maker has a scanning microscope.

You have a case where there may be a minimum size of insect damage.  In
this case, zero would mean zero if your detection system could easily
see, say, 1/2 the minimum.  But we're getting back to the technology
here.  That's your bag.

Cheers,
Jay

DaveM wrote:

> I am a novice at stats. I am working with a biological system. An
> insect infests a certain plant part, the scales. To determine the
> level of infestation I examine the WHOLE scale, inside and outside,
> dissecting it under a microscope. Some portion of the infestation
> occurs on the OUTSIDE of the scale. The outside can be quickly
> examined in the field w/ the naked eye. I have a series of 245
> observations; each includes the examination of both the WHOLE and
> OUTSIDE of 100 scales (5 scales from each of 20 plants, all equivalent
> aged on plant).
>
> I want to correlate the OUTSIDE (easily observed) to the WHOLE (time
> consuming). I want to develop a regression line (equation) where I
> could in the field quickly observe the OUTSIDE and then express the
> level of infestation as percent infested of the WHOLE.
>
> The WHOLE variable is what I would call the true infestation level.
> The OUTSIDE is some lesser part of that. Here are some questions:
>
> Q1. Is it correct to call the WHOLE the independent variable, and
> assign it as x on the regression graph? OUTSIDE would be the dependent
> variable?
>
> In my stat package, with linear regression I can choose the option
> "fit constant" which I take is equivalent to "with constant" and "not
> forced thru origin". Toggling this option changes various statistical
> values.
>
>                         Thru origin             Fit constant
> Obs                     245                     245
> Residuals               244                     243
> Pearson correlation     0.9398                  0.8702
> R sq.                   0.8832                  0.7573
> Adj. R sq.              0.8828                  0.7563
> Resid. Mean Square      147.636                 82.9497
> S.D.                    12.4506                 9.10767
> Stand. Error(OUTSIDE)   0.06011                 0.06812
>
> Q2. Do I force thru origin. Biologically is that appropriate? With
> this plant/insect system if WHOLE is zero, then OUTSIDE is zero. If
> Whole >zero OUTSIDE can be zero. In fact, at the lower infestation
> levels WHOLE needs to get to values of >= 5% before OUTSIDE
> infestation starts to consistently show. The fit of the line looks
> better to me w/o going thru zero.
>
> Q3. In deciding to go thru origin or not is there a value I look to
> minimize or maximize, such as correlation or Resid. Mean Square? Which
> is more important?
>
> Q4. The end use of this is to go out to the field, observe the OUTSIDE
> and use that to predict WHOLE via a regression equation or graph. I
> make my field observations (say count 100 scales), find my OUTSIDE
> value on the y-axis, run across to the regression line and drop down
> to the x-axis WHOLE value. Any problem with that setup?
>
> Thanks,
>
> DaveM
> .
> .
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> .                  http://jse.stat.ncsu.edu/                    .
> =================================================================

--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?



