"Ronny Richardson" <[EMAIL PROTECTED]> wrote in message [EMAIL PROTECTED]">news:[EMAIL PROTECTED]... > Several people wrote to tell me that my results from Excel were correct and > the Bluman textbook had the wrong answer. I'm continuing to experiment with > ANOVA and I'm hoping that the list can help me answer two additional > questions I have. > > First, in experimenting around with a 3x3 ANOVA without replication, I > filled the cells with =RAND() to generate random numbers and I happened to > get the following: > > SS df MS F > Rows -2.105562167 2 -1.052781083 -1.336268078 > Columns -0.226230909 2 -0.113115455 -0.143574551 > Error 3.151406819 4 0.787851705 > Total 0.819613743 8 > > > Unfortunately, I did not convert the random number functions to numbers so > the specific data I used to generate this table was lost when the worksheet > recalculated. > > I've repeated this experiment several times and sometimes the sums of > squares are positive and sometimes one or more of them are negative. Is > this a bug in Excel? I don't see how a sum of *square* can be negative even > though I am using random numbers so all of the SS should be in error. > > Second, Excel offers a two-factor ANOVA with and without replication. None > of the examples of two-way ANOVA in any of the textbooks that I own shows a > two-way ANOVA without replication. It seems counter intuitive to me to > perform ANOVA with a sample size of one in each cell. Am I missing > something here? > > The output from with and without replication is significantly different as > well. Using made up data, without replication, I get the following: > > SS df MS F P-value F crit > Rows 56.25 1 56.25 25 0.1257 161.4462 > Columns 110.25 1 110.25 49 0.0903 161.4462 > Error 2.25 1 2.25 > Total 168.75 3 > > > Using made up data, with replication, I get the following > > SS df MS F P-value F crit > Sample 234.38 1 234.38 130.8140 0.0000 4.3513 > Columns 693.38 1 693.38 387.0000 0.0000 4.3513 > Interaction 0.38 1 0.38 0.2093 0.6522 4.3513 > Within 35.83 20 1.79 > Total 963.96 23 > > It makes no sense to me to call them "rows and columns" without replication > and "sample and columns" with replication. It sort of makes sense not to > get interaction without replication. I am assuming that with a sample size > of one in each cell that there is not enough information to compute > interactions. > > > Ronny Richardson > -------------------------------------------------------------------------- --------- The basic problem here relates to computational algorithms.
If you would first stabilize the random number set, by doing a copy (in EXCEL) of the random number set and doing a Paste Special (values) to a new worksheet. Then centralize the data set by subtracting a number (value) close to the overall average of the data set. It does not have to be the exact mean. The structure should be the same. Then do the ANOVA on the centralized data, and the negative SS problem will disappear. In the NIST ANOVA test suite, for all data sets, after centralizing, the EXCEL routines will give accurate SS and MS values, more accurate than what McCullough reports for STATA. Unfortunately, I cannot provide the worksheets, tables and charts to show this in the ASCII formats of this newsgroup. David Heiser . . ================================================================= Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at: . http://jse.stat.ncsu.edu/ . =================================================================
