Hi Jari, I've only just seen your post and as you would now know there has been a generous reply from Philip Dixon in the interim - the full details of which will take me some time to digest. However I am writing to say thanks for the response and yes you have guessed right that we had an ever declining pool of fish to run the choice tests- larval fish are not particularly robust unfortunately.
I am thinking that Philips suggestion for a series of multiple tests along with the appropriate correction for doing so is the most palatable for me at this point in time. Andy On 24 October 2014 16:54, Jari Oksanen <[email protected]> wrote: > Andrew, > > If the 24 rows are the data you are analysing, I cannot replicate any of > your significant results within that glm framework *if* I take into account > the overdispersion. The full model with age*test interaction is saturated > and cannot be analysed at all, but the main effects model age+test can be > analysed (or either term separately). However, the results are > overdispersed, and you should use family=quasibinomial instead of > family=binomial, and then use test="F" instead of test="Chi": > > > anova(glm(cbind(prefer,avoid) ~ age+test, data=datafromthemail, > family=quasibinomial), test="F") > Analysis of Deviance Table > > Model: quasibinomial, link: logit > > Response: cbind(prefer, avoid) > > Terms added sequentially (first to last) > > > Df Deviance Resid. Df Resid. Dev F Pr(>F) > NULL 23 54.908 > age 5 11.2352 18 43.673 0.9844 0.4591 > test 3 1.5934 15 42.079 0.2327 0.8722 > > As you see, the Resid. Dev is much larger than Resid. Df for both terms in > this sequential model, and therefore you must use quasibinomial models and > F-tests -- and these give similar results as other tests. > > I could not get any results for the saturated models, and one of your > examples (below in this thread) seemed to use only one level of test and > only *six* observations: it was also saturated as you had no replicates for > your six age levels. You need replicates. > > However, I'm not sure I understood your data correctly. It looks like you > have a certain number of animals, but their number is reduced with age so > that you have a kind of censored data (animals not available in all cases). > Perhaps somebody can propose a better analysis for such a censored data, if > it is like that. > > Cheers, Jari Oksanen > > On 24/10/2014, at 10:41 AM, Andrew Halford wrote: > > > Dear Gavin, > > > > Firstly let me say that I take offence at your "bogus" comment. Just > > because I, like many others who interact on this list, often struggle > > conceptually with the overwhelming analysis choices that are required in > > our line of work doesn't give you the right to drop snide remarks as you > > see fit. > > > > My line of query is ALWAYS genuine from my perspective and I don't expect > > you or anyone else to belittle people on the public list! > > > > As it turns out my issues are not resolved. > > > > To recap.. > > > > I have run a bunch of choice chamber experiments with larval fish. > Graphing > > up the ratio of 0/1 choices produces a plot which shows to my eye, > evidence > > of a result for some of the tests, with fish appearing to make defined > > choices in the later age groups for 2 of the tests. > > > > What appears to be happening is that because there are some empty cells > in > > the later age x test interactions (the fish only took one option to the > > exclusion of the other) the errors are way out and hence preclude any > > chance of getting a significant result. If I add a single result to any > of > > the zero cells to remove the "blank" the analysis actually works more as > I > > hoped. However I doubt this is acceptable so I am hoping to get some help > > with producing an effective analysis without having to manipulate the > blank > > cells. > > > > Andrew > > > > > > > > > > On 24 October 2014 04:08, Gavin Simpson <[email protected]> wrote: > > > >> This all looks bogus to me; you've fit the data perfectly by fitting a > >> saturated model - there are no residual degrees of freedom and > >> (effectively) zero residual deviance. Things are clearly amiss because > you > >> have huge standard errors. You have 24 data points and fit a model with > 23 > >> coefficient plus the intercept; you just replaced your data with 24 new > >> data points (the values in the Estimate column of the summary() output) > >> > >> I really wouldn't bother interpreting it any further. > >> > >> HTH > >> > >> G > >> > >> On 21 October 2014 18:21, Andrew Halford <[email protected]> > wrote: > >> > >>> Hi Thierry, > >>> > >>> The multiple comparisons ran just fine but there was a ridiculous > amount > >>> of > >>> interaction combinations all of which were non-significant even though > >>> there was a highly significant interaction term. I decided to remove > test > >>> as a variable to simplify the analysis and run separate single > explanatory > >>> variable logistic regressions. I have included a result below which is > >>> still producing an outcome I cant explain. Namely, why am I getting > such a > >>> significant result for the ANOVA but when I do the tukey tests nothing > is > >>> significant? > >>> > >>>> sg_habitat > >>> Age Prefer Avoid > >>> 1 1 17 14 > >>> 2 2 20 10 > >>> 3 3 14 9 > >>> 4 4 13 12 > >>> 5 5 0 18 > >>> 6 6 0 5 > >>> > >>>> model_sg <- glm(cbind(Prefer,Avoid) ~ Age, data=sg_habitat, > >>> family=binomial) > >>> > >>>> anova(model_sg, test="Chisq") > >>> > >>> Analysis of Deviance Table > >>> > >>> Model: binomial, link: logit > >>> > >>> Response: cbind(Prefer, Avoid) > >>> > >>> Terms added sequentially (first to last) > >>> > >>> > >>> Df Deviance Resid. Df Resid. Dev Pr(>Chi) > >>> NULL 5 36.588 > >>> Age 5 36.588 0 0.000 7.243e-07 *** > >>> > >>> > >>>> mc_sg <- glht(model_sg, mcp(Age = "Tukey")) > >>> > >>>> summary(mc_sg) > >>> > >>> Simultaneous Tests for General Linear Hypotheses > >>> > >>> Multiple Comparisons of Means: Tukey Contrasts > >>> > >>> > >>> Fit: glm(formula = cbind(Prefer, Avoid) ~ Age, family = binomial, > >>> data = sg_habitat) > >>> > >>> Linear Hypotheses: > >>> Estimate Std. Error z value Pr(>|z|) > >>> 2 - 1 == 0 0.4990 0.5294 0.943 0.912 > >>> 3 - 1 == 0 0.2477 0.5593 0.443 0.997 > >>> 4 - 1 == 0 -0.1141 0.5390 -0.212 1.000 > >>> 5 - 1 == 0 -25.8473 53178.5362 0.000 1.000 > >>> 6 - 1 == 0 -24.7307 57729.9299 0.000 1.000 > >>> 3 - 2 == 0 -0.2513 0.5767 -0.436 0.997 > >>> 4 - 2 == 0 -0.6131 0.5570 -1.101 0.844 > >>> 5 - 2 == 0 -26.3463 53178.5362 0.000 1.000 > >>> 6 - 2 == 0 -25.2296 57729.9299 0.000 1.000 > >>> 4 - 3 == 0 -0.3618 0.5855 -0.618 0.985 > >>> 5 - 3 == 0 -26.0950 53178.5362 0.000 1.000 > >>> 6 - 3 == 0 -24.9783 57729.9299 0.000 1.000 > >>> 5 - 4 == 0 -25.7332 53178.5362 0.000 1.000 > >>> 6 - 4 == 0 -24.6165 57729.9299 0.000 1.000 > >>> 6 - 5 == 0 1.1167 78490.1364 0.000 1.000 > >>> (Adjusted p values reported -- single-step method) > >>> > >>> > >>> On 21 October 2014 22:53, ONKELINX, Thierry <[email protected]> > >>> wrote: > >>> > >>>> Hi Andrew, > >>>> > >>>> Please keep the mailing list in cc. > >>>> > >>>> The estimates in mc are the differences of the parameter estimates > >>> (betas) > >>>> between both levels. E.g. 5.LR -1.LR = -1.168 or 5.LR = 1.LR - 1.168 > >>>> > >>>> summary(mc) should give you the significance of those differences. > That > >>>> should work. If it doesn't, please provide more info: at least your > code > >>>> and the error message. A small reproducible example is better. > >>>> confint(mc) gives the confidence intervals of the differences. > >>>> > >>>> Best regards, > >>>> > >>>> Thierry > >>>> > >>>> ir. Thierry Onkelinx > >>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature > >>> and > >>>> Forest > >>>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance > >>>> Kliniekstraat 25 > >>>> 1070 Anderlecht > >>>> Belgium > >>>> + 32 2 525 02 51 > >>>> + 32 54 43 61 85 > >>>> [email protected] > >>>> www.inbo.be > >>>> > >>>> To call in the statistician after the experiment is done may be no > more > >>>> than asking him to perform a post-mortem examination: he may be able > to > >>> say > >>>> what the experiment died of. > >>>> ~ Sir Ronald Aylmer Fisher > >>>> > >>>> The plural of anecdote is not data. > >>>> ~ Roger Brinner > >>>> > >>>> The combination of some data and an aching desire for an answer does > not > >>>> ensure that a reasonable answer can be extracted from a given body of > >>> data. > >>>> ~ John Tukey > >>>> > >>>> Van: Andrew Halford [mailto:[email protected]] > >>>> Verzonden: dinsdag 21 oktober 2014 16:19 > >>>> Aan: ONKELINX, Thierry > >>>> Onderwerp: Re: [R-sig-eco] Logistic regression with 2 categorical > >>>> predictors > >>>> > >>>> Hi Thierry, > >>>> Thanks for the response. I have run your code but it seems you cant > call > >>>> the summary function, you just have to call the object on its own i.e. > >>> mc. > >>>> The results are I get are below but I am not sure how to interpret > >>> these, > >>>> exactly what does the estimate represent? How do I measure > significance > >>>> here? > >>>> > >>>> Estimate > >>>> 2.LR - 1.LR == 0 1.252e-01 > >>>> 3.LR - 1.LR == 0 -5.390e-01 > >>>> 4.LR - 1.LR == 0 1.802e-02 > >>>> 5.LR - 1.LR == 0 -1.168e+00 > >>>> 6.LR - 1.LR == 0 -2.575e+01 > >>>> 1.SD - 1.LR == 0 7.411e-02 > >>>> 2.SD - 1.LR == 0 -2.408e-01 > >>>> 3.SD - 1.LR == 0 2.675e-01 > >>>> etc etc > >>>> > >>>> Andy > >>>> > >>>> On 20 October 2014 23:04, ONKELINX, Thierry <[email protected] > > > >>>> wrote: > >>>> Dear Andrew, > >>>> > >>>> anova() and summary() test different hypotheses. anova() tests is at > >>> least > >>>> one level is different from the others. summary() tests if the > >>> coefficient > >>>> is different from zero. > >>>> > >>>> Multiple comparison of different interaction levels is probably the > most > >>>> relevant in this case. The easiest way is to make a new variable. > >>>> > >>>> snapper2$inter <- with(snapper2, interaction(age, test)) > >>>> model <- glm(cbind(prefer,avoid) ~ 0 + inter, data=snapper2, > >>>> family=binomial) > >>>> library(multcomp) > >>>> mc <- glht(model, mcp(inter = "Tukey)) > >>>> summary(mc) > >>>> > >>>> Best regards, > >>>> > >>>> ir. Thierry Onkelinx > >>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature > >>> and > >>>> Forest > >>>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance > >>>> Kliniekstraat 25 > >>>> 1070 Anderlecht > >>>> Belgium > >>>> + 32 2 525 02 51 > >>>> + 32 54 43 61 85 > >>>> [email protected] > >>>> www.inbo.be > >>>> > >>>> To call in the statistician after the experiment is done may be no > more > >>>> than asking him to perform a post-mortem examination: he may be able > to > >>> say > >>>> what the experiment died of. > >>>> ~ Sir Ronald Aylmer Fisher > >>>> > >>>> The plural of anecdote is not data. > >>>> ~ Roger Brinner > >>>> > >>>> The combination of some data and an aching desire for an answer does > not > >>>> ensure that a reasonable answer can be extracted from a given body of > >>> data. > >>>> ~ John Tukey > >>>> > >>>> > >>>> -----Oorspronkelijk bericht----- > >>>> Van: [email protected] [mailto: > >>>> [email protected]] Namens Andrew Halford > >>>> Verzonden: maandag 20 oktober 2014 16:06 > >>>> Aan: [email protected] > >>>> Onderwerp: [R-sig-eco] Logistic regression with 2 categorical > predictors > >>>> > >>>> Hi Listers, > >>>> > >>>> I am trying to run a logistic regression to look at the effects of > >>>> experiment type and age on the behavior of fish in a choice chamber > >>>> experiment. > >>>> > >>>> I am using the glm approach and would like some advice on how or > whether > >>>> to perform contrasts to work out what levels of Factor1 (Age) and > >>> Factor 2 > >>>> (Test) are significantly different from each other. I have not been > able > >>>> to clarify from my reading what is the appropriate approach to take > when > >>>> dealing with a significant interaction term. I am also not sure as to > >>> how > >>>> one interprets a model when all the coefficients are non-significant > but > >>>> the chi-square ANOVA shows a highly significant interaction term. > >>>> > >>>> I have graphed up the data as dot plots and there is definitely > evidence > >>>> of changes in proportions in later ages. > >>>> > >>>> I want to provide evidence for when and for which tests there was a > >>>> 'significant' change in behavior. > >>>> > >>>>> snapper2 > >>>> age test prefer avoid > >>>> 1 1 LR 15 14 > >>>> 2 1 SD 15 13 > >>>> 3 1 SG 17 14 > >>>> 4 1 SW 14 14 > >>>> 5 2 LR 17 14 > >>>> 6 2 SD 16 19 > >>>> 7 2 SG 20 10 > >>>> 8 2 SW 15 21 > >>>> 9 3 LR 10 16 > >>>> 10 3 SD 14 10 > >>>> 11 3 SG 14 9 > >>>> 12 3 SW 13 15 > >>>> 13 4 LR 12 11 > >>>> 14 4 SD 14 11 > >>>> 15 4 SG 13 12 > >>>> 16 4 SW 11 14 > >>>> 17 5 LR 4 12 > >>>> 18 5 SD 8 8 > >>>> 19 5 SG 0 18 > >>>> 20 5 SW 10 6 > >>>> 21 6 LR 0 6 > >>>> 22 6 SD 3 4 > >>>> 23 6 SG 0 5 > >>>> 24 6 SW 5 3 > >>>> > >>>>> > >>>> > >>>> > >>> > dotplot(age~prefer/avoid,data=snapper2,group=snapper2$test,cex=1.5,pch=19,ylab="age",auto.key=list(space="right",title="Tests")) > >>>> > >>>> > >>>>> out2 <- glm(cbind(prefer,avoid) ~ age*test, data=snapper2, > >>>> family=binomial) > >>>> > >>>>> summary(out2) > >>>> > >>>> Call: > >>>> glm(formula = cbind(prefer, avoid) ~ age * test, family = binomial, > >>>> data = snapper2) > >>>> > >>>> Deviance Residuals: > >>>> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > >>> 0 > >>>> 0 > >>>> > >>>> Coefficients: > >>>> Estimate Std. Error z value Pr(>|z|) > >>>> (Intercept) 6.899e-02 3.716e-01 0.186 0.8527 > >>>> age2 1.252e-01 5.180e-01 0.242 0.8091 > >>>> age3 -5.390e-01 5.483e-01 -0.983 0.3256 > >>>> age4 1.802e-02 5.589e-01 0.032 0.9743 > >>>> age5 -1.168e+00 6.866e-01 -1.701 0.0890 . > >>>> age6 -2.575e+01 9.348e+04 0.000 0.9998 > >>>> testSD 7.411e-02 5.307e-01 0.140 0.8890 > >>>> testSG 1.252e-01 5.180e-01 0.242 0.8091 > >>>> testSW -6.899e-02 5.301e-01 -0.130 0.8964 > >>>> age2:testSD -4.401e-01 7.260e-01 -0.606 0.5444 > >>>> age3:testSD 7.324e-01 7.846e-01 0.933 0.3506 > >>>> age4:testSD 8.004e-02 7.863e-01 0.102 0.9189 > >>>> age5:testSD 1.024e+00 9.301e-01 1.102 0.2707 > >>>> age6:testSD 2.532e+01 9.348e+04 0.000 0.9998 > >>>> age2:testSG 3.738e-01 7.407e-01 0.505 0.6138 > >>>> age3:testSG 7.867e-01 7.832e-01 1.004 0.3152 > >>>> age4:testSG -1.321e-01 7.764e-01 -0.170 0.8649 > >>>> age5:testSG -2.568e+01 8.768e+04 0.000 0.9998 > >>>> age6:testSG 2.121e-02 1.334e+05 0.000 1.0000 > >>>> age2:testSW -4.616e-01 7.249e-01 -0.637 0.5242 > >>>> age3:testSW 3.959e-01 7.662e-01 0.517 0.6054 > >>>> age4:testSW -2.592e-01 7.858e-01 -0.330 0.7415 > >>>> age5:testSW 1.678e+00 9.386e-01 1.788 0.0737 . > >>>> age6:testSW 2.626e+01 9.348e+04 0.000 0.9998 > >>>> --- > >>>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > >>>> > >>>> (Dispersion parameter for binomial family taken to be 1) > >>>> > >>>> Null deviance: 5.4908e+01 on 23 degrees of freedom Residual > >>>> deviance: 2.6113e-10 on 0 degrees of freedom > >>>> AIC: 122.73 > >>>> > >>>> Number of Fisher Scoring iterations: 23 > >>>> > >>>> > >>>>> anova(out2, test="Chisq") > >>>> > >>>> Analysis of Deviance Table > >>>> > >>>> Model: binomial, link: logit > >>>> > >>>> Response: cbind(prefer, avoid) > >>>> > >>>> Terms added sequentially (first to last) > >>>> > >>>> > >>>> Df Deviance Resid. Df Resid. Dev Pr(>Chi) > >>>> NULL 23 54.908 > >>>> age 5 11.235 18 43.673 0.0469115 * > >>>> test 3 1.593 15 42.079 0.6608887 > >>>> age:test 15 42.079 0 0.000 0.0002185 *** > >>>> --- > >>>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > >>>> > >>>> cheers > >>>> > >>>> Andy > >>>> [[alternative HTML version deleted]] > >>>> > >>>> _______________________________________________ > >>>> R-sig-ecology mailing list > >>>> [email protected] > >>>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology > >>>> * * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * > * > >>>> Dit bericht en eventuele bijlagen geven enkel de visie van de > schrijver > >>>> weer en binden het INBO onder geen enkel beding, zolang dit bericht > niet > >>>> bevestigd is door een geldig ondertekend document. > >>>> The views expressed in this message and any annex are purely those of > >>> the > >>>> writer and may not be regarded as stating an official position of > INBO, > >>> as > >>>> long as the message is not confirmed by a duly signed document. > >>>> > >>>> > >>>> > >>>> -- > >>>> Andrew Halford Ph.D > >>>> Research Scientist (Kimberley Marine Parks)| Adjunct Research > Scientist > >>>> (Curtin University) > >>>> Dept. Parks and Wildlife > >>>> Western Australia > >>>> > >>>> Ph: +61 8 9219 9795 > >>>> Mobile: +61 (0) 468 419 473 > >>>> * * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * > * > >>>> Dit bericht en eventuele bijlagen geven enkel de visie van de > schrijver > >>>> weer en binden het INBO onder geen enkel beding, zolang dit bericht > niet > >>>> bevestigd is door een geldig ondertekend document. > >>>> The views expressed in this message and any annex are purely those of > >>> the > >>>> writer and may not be regarded as stating an official position of > INBO, > >>> as > >>>> long as the message is not confirmed by a duly signed document. > >>>> > >>> > >>> > >>> > >>> -- > >>> Andrew Halford Ph.D > >>> Research Scientist (Kimberley Marine Parks)| Adjunct Research > Scientist > >>> (Curtin University) > >>> Dept. Parks and Wildlife > >>> Western Australia > >>> > >>> Ph: +61 8 9219 9795 > >>> Mobile: +61 (0) 468 419 473 > >>> > >>> [[alternative HTML version deleted]] > >>> > >>> _______________________________________________ > >>> R-sig-ecology mailing list > >>> [email protected] > >>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology > >>> > >> > >> > >> > >> -- > >> Gavin Simpson, PhD > >> > > > > > > > > -- > > Andrew Halford Ph.D > > Research Scientist (Kimberley Marine Parks)| Adjunct Research Scientist > > (Curtin University) > > Dept. Parks and Wildlife > > Western Australia > > > > Ph: +61 8 9219 9795 > > Mobile: +61 (0) 468 419 473 > > > > [[alternative HTML version deleted]] > > > > _______________________________________________ > > R-sig-ecology mailing list > > [email protected] > > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology > > -- Andrew Halford Ph.D Research Scientist (Kimberley Marine Parks)| Adjunct Research Scientist (Curtin University) Dept. Parks and Wildlife Western Australia Ph: +61 8 9219 9795 Mobile: +61 (0) 468 419 473 [[alternative HTML version deleted]] _______________________________________________ R-sig-ecology mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
