"Ronny Richardson" <[EMAIL PROTECTED]> wrote in message
[EMAIL PROTECTED]">news:[EMAIL PROTECTED]...
> Several people wrote to tell me that my results from Excel were correct
and
> the Bluman textbook had the wrong answer. I'm continuing to experiment
with
> ANOVA and I'm hoping that the list can help me answer two additional
> questions I have.
>
> First, in experimenting around with a 3x3 ANOVA without replication, I
> filled the cells with =RAND() to generate random numbers and I happened to
> get the following:
>
>                    SS    df                MS              F
> Rows     -2.105562167     2      -1.052781083    -1.336268078
> Columns  -0.226230909     2      -0.113115455    -0.143574551
> Error     3.151406819     4       0.787851705
> Total     0.819613743     8
>
>
> Unfortunately, I did not convert the random number functions to numbers so
> the specific data I used to generate this table was lost when the
worksheet
> recalculated.
>
> I've repeated this experiment several times and sometimes the sums of
> squares are positive and sometimes one or more of them are negative. Is
> this a bug in Excel? I don't see how a sum of *square* can be negative
even
> though I am using random numbers so all of the SS should be in error.
>
> Second, Excel offers a two-factor ANOVA with and without replication. None
> of the examples of two-way ANOVA in any of the textbooks that I own shows
a
> two-way ANOVA without replication. It seems counter intuitive to me to
> perform ANOVA with a sample size of one in each cell. Am I missing
> something here?
>
> The output from with and without replication is significantly different as
> well. Using made up data, without replication, I get the following:
>
>                   SS     df      MS     F     P-value       F crit
> Rows           56.25      1   56.25    25      0.1257     161.4462
> Columns       110.25      1  110.25    49      0.0903     161.4462
> Error           2.25      1    2.25
> Total         168.75      3
>
>
> Using made up data, with replication, I get the following
>
>                   SS     df      MS         F    P-value     F crit
> Sample        234.38      1  234.38  130.8140     0.0000     4.3513
> Columns       693.38      1  693.38  387.0000     0.0000     4.3513
> Interaction     0.38      1    0.38    0.2093     0.6522     4.3513
> Within         35.83     20    1.79
> Total         963.96     23
>
> It makes no sense to me to call them "rows and columns" without
replication
> and "sample and columns" with replication. It sort of makes sense not to
> get interaction without replication. I am assuming that with a sample size
> of one in each cell that there is not enough information to compute
> interactions.
>
>
> Ronny Richardson
> --------------------------------------------------------------------------
---------
The basic problem here relates to computational algorithms.

If you would first stabilize the random number set, by doing a copy (in
EXCEL) of the random number set and doing a Paste Special (values) to a new
worksheet.

Then centralize the data set by subtracting a number (value) close to the
overall average of the data set. It does not have to be the exact mean. The
structure should be the same.

Then do the ANOVA on the centralized data, and the negative SS problem will
disappear.

In the NIST ANOVA test suite, for all data sets, after centralizing, the
EXCEL routines will give accurate SS and MS values, more accurate than what
McCullough reports for STATA.

Unfortunately, I cannot provide the worksheets, tables and charts to show
this in the ASCII formats of this newsgroup.

David Heiser


.
.
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================

Reply via email to