Re: [R] dreaded p-val for d^2 of a glm / gam
Hi Spencer and David, Thanks for your answers first - yes it is deviance but just before i just spoke and explain that it is the equivalent of r square from the normal regression. I hope i can do the comparison and show that the model is significant and hopefully i am off the hook. Sincerely i try to avoid all this business with p-values but certainly some are quite found of it. The problem is that you get almost by default a p-value from an F test if you use lm for example, so . quite few times i was asked to provide a similar thing for quite different models. Thanks again, Monica Date: Thu, 27 Mar 2008 16:38:12 -0700 From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] CC: r-help@r-project.org Subject: Re: [R] dreaded p-val for d^2 of a glm / gam I assume you mean 'deviance', not 'squared deviance'; if the latter, then I have no idea. If the former, then a short and fairly quick answer to your question is that 2*log(likelihood ratio) for nested hypotheses is approximately chi-square with numbers of degrees of freedom = the number of parameters in the larger model fixed to get the smaller model, under standard regularity conditions, the most important of which is that the maximum likelihood is not at a boundary. For specificity, consider the following modification of the first example in the 'glm' help page: counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) glm.D93 - glm(counts ~ outcome + treatment, family=poisson()) glm.D93t - glm(counts ~ treatment, family=poisson()) anova(glm.D93t, glm.D93, test=Chisq) The p-value is not printed by default, because some people would rather NOT give an answer than give an answer that might not be very accurate in the cases where this chi-square approximation is not very good. To check that, you could do a Monte Carlo, refit the model with, say, 1000 random permutations of your response variable, collect anova(glm.D93t, glm.D93)[2, Deviance] in a vector, and then find out how extreme the deviance you actually got is relative to this permutation distribution. Hope this helps. Spencer Graves p.s. Regarding your 'dread', please see fortune(children) Monica Pisica wrote: OK, I really dread to ask that much more that I know some discussion about p-values and if they are relevant for regressions were already on the list. I know to get p-val of regression coefficients - this is not a problem. But unfortunately one editor of a journal where i would like to publish some results insists in giving p-values for the squared deviance i get out from different glm and gam models. I came up with this solution, but sincerely i would like to get yours'all opinion on the matter. p1.glm - glm(count ~be+ch+crr+home, family = 'poisson') # count - is count of species (vegetation) # be, ch, crr, home - different lidar metrics # calculating d^2 d2.p1 - round((p1.glm[[12]]-p1.glm[[10]])/p1.glm[[12]],4) d2.p1 0.6705 # calculating f statistics with N = 148 and n=4; f = (N-n-1)/(N-1)(1-d^2) f - (148-4-1)/(147*(1-0.6705)) f [1] 2.952319 #calculating p-value pval.glm - 1-pf(f, 147,143) pval.glm [1] 1.135693e-10 So, what do you think? Is this acceptable if i really have to give a p-value for the deviance squared? If it is i think i will transform everything in a fuction Thanks, Monica _ Windows Live Hotmail is giving away Zunes. M_Mobile_Zune_V3 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. _ Watch Cause Effect, a show about real people making a real difference. Learn more. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dreaded p-val for d^2 of a glm / gam
Do you think p values are bad? That's not my understanding. P values may not be reported by some software, because the algorithm developers didn't know how to efficiently compute a reasonably accurate p value. And if you do a thousand or a million statistical tests and report only the results with the smallest p value, that's ultimately fraudulent. (http://en.wikipedia.org/wiki/Multiple_comparisons). However, the concept of a significance probability or p value is quite valuable (http://en.wikipedia.org/wiki/P-value), though like any tool or concept, it can be misused. Hope this helps. Spencer Graves Monica Pisica wrote: Hi Spencer and David, Thanks for your answers first - yes it is deviance but just before i just spoke and explain that it is the equivalent of r square from the normal regression. I hope i can do the comparison and show that the model is significant and hopefully i am off the hook. Sincerely i try to avoid all this business with p-values but certainly some are quite found of it. The problem is that you get almost by default a p-value from an F test if you use lm for example, so . quite few times i was asked to provide a similar thing for quite different models. Thanks again, Monica Date: Thu, 27 Mar 2008 16:38:12 -0700 From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] CC: r-help@r-project.org Subject: Re: [R] dreaded p-val for d^2 of a glm / gam I assume you mean 'deviance', not 'squared deviance'; if the latter, then I have no idea. If the former, then a short and fairly quick answer to your question is that 2*log(likelihood ratio) for nested hypotheses is approximately chi-square with numbers of degrees of freedom = the number of parameters in the larger model fixed to get the smaller model, under standard regularity conditions, the most important of which is that the maximum likelihood is not at a boundary. For specificity, consider the following modification of the first example in the 'glm' help page: counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) glm.D93 - glm(counts ~ outcome + treatment, family=poisson()) glm.D93t - glm(counts ~ treatment, family=poisson()) anova(glm.D93t, glm.D93, test=Chisq) The p-value is not printed by default, because some people would rather NOT give an answer than give an answer that might not be very accurate in the cases where this chi-square approximation is not very good. To check that, you could do a Monte Carlo, refit the model with, say, 1000 random permutations of your response variable, collect anova(glm.D93t, glm.D93)[2, Deviance] in a vector, and then find out how extreme the deviance you actually got is relative to this permutation distribution. Hope this helps. Spencer Graves p.s. Regarding your 'dread', please see fortune(children) Monica Pisica wrote: OK, I really dread to ask that much more that I know some discussion about p-values and if they are relevant for regressions were already on the list. I know to get p-val of regression coefficients - this is not a problem. But unfortunately one editor of a journal where i would like to publish some results insists in giving p-values for the squared deviance i get out from different glm and gam models. I came up with this solution, but sincerely i would like to get yours'all opinion on the matter. p1.glm - glm(count ~be+ch+crr+home, family = 'poisson') # count - is count of species (vegetation) # be, ch, crr, home - different lidar metrics # calculating d^2 d2.p1 - round((p1.glm[[12]]-p1.glm[[10]])/p1.glm[[12]],4) d2.p1 0.6705 # calculating f statistics with N = 148 and n=4; f = (N-n-1)/(N-1)(1-d^2) f - (148-4-1)/(147*(1-0.6705)) f [1] 2.952319 #calculating p-value pval.glm - 1-pf(f, 147,143) pval.glm [1] 1.135693e-10 So, what do you think? Is this acceptable if i really have to give a p-value for the deviance squared? If it is i think i will transform everything in a fuction Thanks, Monica _ Windows Live Hotmail is giving away Zunes. M_Mobile_Zune_V3 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Watch “Cause Effect,” a show about real people making a real difference. Learn more. http://im.live.com/Messenger/IM/MTV/?source=text_watchcause __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
Re: [R] dreaded p-val for d^2 of a glm / gam
I assume you mean 'deviance', not 'squared deviance'; if the latter, then I have no idea. If the former, then a short and fairly quick answer to your question is that 2*log(likelihood ratio) for nested hypotheses is approximately chi-square with numbers of degrees of freedom = the number of parameters in the larger model fixed to get the smaller model, under standard regularity conditions, the most important of which is that the maximum likelihood is not at a boundary. For specificity, consider the following modification of the first example in the 'glm' help page: counts - c(18,17,15,20,10,20,25,13,12) outcome - gl(3,1,9) treatment - gl(3,3) glm.D93 - glm(counts ~ outcome + treatment, family=poisson()) glm.D93t - glm(counts ~ treatment, family=poisson()) anova(glm.D93t, glm.D93, test=Chisq) The p-value is not printed by default, because some people would rather NOT give an answer than give an answer that might not be very accurate in the cases where this chi-square approximation is not very good. To check that, you could do a Monte Carlo, refit the model with, say, 1000 random permutations of your response variable, collect anova(glm.D93t, glm.D93)[2, Deviance] in a vector, and then find out how extreme the deviance you actually got is relative to this permutation distribution. Hope this helps. Spencer Graves p.s. Regarding your 'dread', please see fortune(children) Monica Pisica wrote: OK, I really dread to ask that much more that I know some discussion about p-values and if they are relevant for regressions were already on the list. I know to get p-val of regression coefficients - this is not a problem. But unfortunately one editor of a journal where i would like to publish some results insists in giving p-values for the squared deviance i get out from different glm and gam models. I came up with this solution, but sincerely i would like to get yours'all opinion on the matter. p1.glm - glm(count ~be+ch+crr+home, family = 'poisson') # count - is count of species (vegetation) # be, ch, crr, home - different lidar metrics # calculating d^2 d2.p1 - round((p1.glm[[12]]-p1.glm[[10]])/p1.glm[[12]],4) d2.p1 0.6705 # calculating f statistics with N = 148 and n=4; f = (N-n-1)/(N-1)(1-d^2) f - (148-4-1)/(147*(1-0.6705)) f [1] 2.952319 #calculating p-value pval.glm - 1-pf(f, 147,143) pval.glm [1] 1.135693e-10 So, what do you think? Is this acceptable if i really have to give a p-value for the deviance squared? If it is i think i will transform everything in a fuction Thanks, Monica _ Windows Live Hotmail is giving away Zunes. M_Mobile_Zune_V3 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.