Hi,

I'm wrestling with a dataset that I previously analyzed in SYSTAT and
am now re-analyzing in R. The same model yields different results
(different sums of squares) in the two programs. I first thought this
might be because the two programs calculate the sums of squares
differently, but the problem persisted even after I specified type III
sums of squares in R. Can anyone help me by clarifying why there is
this discrepancy?

The data table is:

host  size2  maladapt  increase
A     yes          35        21
A     yes          30        13
A     no           73        -6
A     yes          22         3
C     yes          19        -1
A     no           53         1
C     no           48       -27
A     yes          32        26
A     yes          14         1
A     no           83        42
A     yes          19        -3
A     no           66        -7
C     no           69       -14
A     yes          30        30
C     no           69       -22
A     yes          10         6
C     no           65       -15
A     yes          11         4
A     yes          15        15
A     no           77        30
C     yes          11        11
A     no           48        -4
C     yes          29        -4
A     yes           0         0
C     no           69        -2
A     yes          10       -40
C     yes           8        -6
C     no           91        -2
C     no           65        13
A     yes          12         0
C     yes          16       -26
C     yes          38       -12
A     no           43        20
C     no           81        -7
A     yes           9         9
C     no          100        25
A     yes          18        12
C     yes          27        -6
A     yes          11        -3

The dialogue in R is as follows:
> library(car)
> read.table(file="/Users/lukeharmon/Desktop/glmnosil.txt", header=T)->nn
> attach(nn)
> ls(2)
[1] "host"     "increase" "maladapt" "size2"    "size4"
> lm(maladapt~host*increase*size2)

Call:
lm(formula = maladapt ~ host * increase * size2)

Coefficients:
            (Intercept)                    hostC                 increase
               59.54144                 17.13828                  0.34487
               size2yes           hostC:increase           hostC:size2yes
              -44.41381                  0.30449                -12.50558
      increase:size2yes  hostC:increase:size2yes
                0.03766                 -0.90697

> lm(maladapt~host*increase*size2)->fm
> Anova(fm, type="III")
Anova Table (Type III tests)

Response: maladapt
                     Sum Sq Df  F value    Pr(>F)
(Intercept)         18348.5  1 152.9683 1.595e-13 ***
host                  920.9  1   7.6774  0.009366 **
increase              278.4  1   2.3210  0.137773
size2                7447.0  1  62.0841 6.806e-09 ***
host:increase         105.1  1   0.8758  0.356584
host:size2            266.9  1   2.2252  0.145880
increase:size2          2.0  1   0.0171  0.896902
host:increase:size2   332.3  1   2.7703  0.106108
Residuals            3718.4 31
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Contrast this with the results from SYSTAT:

Source                  Sum-of-Squares   df   Mean-Square   F-ratio       P
HOST$                          808.949    1       808.949     6.744   0.014
SIZE2$                       17525.418    1     17525.418   146.106   0.000
INCREASE                       540.579    1       540.579     4.507   0.042
SIZE2$*HOST$                   266.915    1       266.915     2.225   0.146
SIZE2$*INCREASE                279.389    1       279.389     2.329   0.137
HOST$*INCREASE                  35.869    1        35.869     0.299   0.588
SIZE2$*HOST$*INCREASE          332.293    1       332.293     2.770   0.106
Error                         3718.441   31       119.950


I've been trying to find anything in the documentation for anova()
that would indicate a default different from the one in SYSTAT. Part
of the problem is that SYSTAT is somewhat opaque as to its
calculations, so it is hard to contrast the two. I would really
welcome feedback as to what may cause this discrepancy.
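
One check that might narrow this down, sketched under assumptions I
have not verified: Type III sums of squares in R depend on how the
factors are coded, and R's default for unordered factors is treatment
coding (contr.treatment). The sketch below refits the same model with
treatment coding and with sum-to-zero coding (contr.sum) and prints
the marginal sums of squares for each. That SYSTAT uses an
effects-style coding is only my guess. The helper fit_with() is a name
made up for this example, and drop1() with scope = . ~ . stands in for
Anova(fm, type="III") so that the code runs without the car package.

```r
# Data copied from the table above, so the sketch does not depend on
# the file path used in my session.
nn <- read.table(header = TRUE, stringsAsFactors = TRUE, text = "
host size2 maladapt increase
A yes 35 21
A yes 30 13
A no 73 -6
A yes 22 3
C yes 19 -1
A no 53 1
C no 48 -27
A yes 32 26
A yes 14 1
A no 83 42
A yes 19 -3
A no 66 -7
C no 69 -14
A yes 30 30
C no 69 -22
A yes 10 6
C no 65 -15
A yes 11 4
A yes 15 15
A no 77 30
C yes 11 11
A no 48 -4
C yes 29 -4
A yes 0 0
C no 69 -2
A yes 10 -40
C yes 8 -6
C no 91 -2
C no 65 13
A yes 12 0
C yes 16 -26
C yes 38 -12
A no 43 20
C no 81 -7
A yes 9 9
C no 100 25
A yes 18 12
C yes 27 -6
A yes 11 -3
")

# Hypothetical helper: fit the model from the post under a given
# coding for both factors.
fit_with <- function(coding) {
  lm(maladapt ~ host * increase * size2, data = nn,
     contrasts = list(host = coding, size2 = coding))
}

fm_trt <- fit_with("contr.treatment")  # R's default coding
fm_sum <- fit_with("contr.sum")        # sum-to-zero coding (assumed
                                       # closer to SYSTAT; unverified)

# drop1() with scope . ~ . removes each term in turn while keeping all
# the others, which gives marginal (Type III style) sums of squares.
ss_trt <- drop1(fm_trt, scope = . ~ ., test = "F")
ss_sum <- drop1(fm_sum, scope = . ~ ., test = "F")

print(ss_trt)
print(ss_sum)
```

If the contr.sum table lines up with the SYSTAT output, the
discrepancy would be a matter of factor coding rather than of the
sums-of-squares type.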

Thanks very much for your help,

Dan Bolnick
Section of Integrative Biology
University of Texas at Austin 