Re: [R] dreaded p-val for d^2 of a glm / gam

2008-03-28 Thread Monica Pisica

Hi Spencer and David,
 
Thanks for your answers. First: yes, it is the deviance; I had just been 
describing it as the equivalent of R-squared from ordinary regression.
 
I hope I can do the comparison, show that the model is significant, and 
get off the hook. Honestly, I try to avoid all this business with p-values, 
but some people are certainly quite fond of them. The problem is that you get 
a p-value from an F test almost by default if you use lm, for example, so 
quite a few times I have been asked to provide something similar for quite 
different models.
 
Thanks again,
 
Monica

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dreaded p-val for d^2 of a glm / gam

2008-03-28 Thread Spencer Graves
  Do you think p values are bad?  That's not my understanding.  P 
values may not be reported by some software, because the algorithm 
developers didn't know how to efficiently compute a reasonably accurate 
p value.  And if you do a thousand or a million statistical tests and 
report only the results with the smallest p value, that's ultimately 
fraudulent.  (http://en.wikipedia.org/wiki/Multiple_comparisons). 

  However, the concept of a significance probability or p value is 
quite valuable (http://en.wikipedia.org/wiki/P-value), though like any 
tool or concept, it can be misused. 

  Hope this helps. 
  Spencer Graves



Re: [R] dreaded p-val for d^2 of a glm / gam

2008-03-27 Thread Spencer Graves
  I assume you mean 'deviance', not 'squared deviance';  if the 
latter, then I have no idea. 

  If the former, then a short and fairly quick answer to your 
question is that 2*log(likelihood ratio) for nested hypotheses is 
approximately chi-square, with degrees of freedom equal to the number 
of parameters in the larger model that are fixed to get the smaller model. 
This holds under standard regularity conditions, the most important of 
which is that the maximum likelihood is not at a boundary. 

  For specificity, consider the following modification of the first 
example in the 'glm' help page: 

 counts <- c(18,17,15,20,10,20,25,13,12)
 outcome <- gl(3,1,9)
 treatment <- gl(3,3)
 glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
 glm.D93t <- glm(counts ~ treatment, family=poisson())
 anova(glm.D93t, glm.D93, test="Chisq")
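
As a cross-check on the anova() call above, here is a minimal sketch of the same likelihood-ratio test done by hand. The names lr.stat, lr.df, p.by.hand and p.anova are only illustrative, and the sketch assumes a Poisson family with the dispersion fixed at 1, so that the drop in deviance can be referred directly to a chi-square distribution.

```r
# Data and models from the example above (first example on the 'glm' help page).
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

glm.D93  <- glm(counts ~ outcome + treatment, family = poisson())
glm.D93t <- glm(counts ~ treatment, family = poisson())

# Likelihood-ratio statistic: drop in residual deviance from the smaller
# model to the larger one.
lr.stat <- deviance(glm.D93t) - deviance(glm.D93)

# Degrees of freedom: number of parameters fixed to get the smaller model.
lr.df <- df.residual(glm.D93t) - df.residual(glm.D93)

# Upper-tail chi-square probability.
p.by.hand <- pchisq(lr.stat, df = lr.df, lower.tail = FALSE)

# The same test via anova(); the p-value is taken from the last column
# of the table so that the code does not depend on the column label.
tab     <- anova(glm.D93t, glm.D93, test = "Chisq")
p.anova <- tab[2, ncol(tab)]
```

The two p-values should agree up to rounding, which makes it easier to see what anova() is reporting.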

  The p-value is not printed by default, because some people would 
rather NOT give an answer than give an answer that might not be very 
accurate in the cases where this chi-square approximation is not very 
good.  To check that, you could run a Monte Carlo check: refit the models 
with, say, 1000 random permutations of your response variable, collect 
anova(glm.D93t, glm.D93)[2, "Deviance"] in a vector, and then find out 
how extreme the deviance you actually got is relative to this 
permutation distribution. 
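
The permutation check described above could be sketched roughly as follows. This is only one way to set it up: the helper dev.drop, the seed, and the "+1" correction in the p-value are my assumptions, not part of the original suggestion.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

# Hypothetical helper: drop in deviance from adding 'outcome' to the
# treatment-only model, for a given response vector y.
dev.drop <- function(y) {
  small <- glm(y ~ treatment, family = poisson())
  large <- glm(y ~ outcome + treatment, family = poisson())
  anova(small, large)[2, "Deviance"]
}

# Deviance drop for the data as observed.
observed <- dev.drop(counts)

# Deviance drops for random permutations of the response.
n.perm   <- 1000
permuted <- replicate(n.perm, dev.drop(sample(counts)))

# Share of permutations at least as extreme as the observed value;
# the +1 in numerator and denominator counts the observed arrangement itself.
p.perm <- (sum(permuted >= observed) + 1) / (n.perm + 1)
```

If p.perm is close to the chi-square p-value, the approximation is probably adequate for these data; a large gap suggests reporting the permutation result instead.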

  Hope this helps. 
  Spencer Graves
p.s.  Regarding your 'dread', please see fortune("children")

Monica Pisica wrote:
 OK,

 I really dread asking this, all the more because I know that discussions 
 about p-values, and whether they are relevant for regressions, have already 
 been on the list. I know how to get p-values for regression coefficients; 
 that is not a problem. Unfortunately, an editor of a journal where I would 
 like to publish some results insists on p-values for the squared deviance I 
 get from different glm and gam models. I came up with this solution, but I 
 would sincerely like to get your opinions on the matter.

 p1.glm <- glm(count ~ be + ch + crr + home, family = 'poisson')

 # count - is count of species (vegetation)
 # be, ch, crr, home - different lidar metrics

 # calculating d^2
 d2.p1 <- round((p1.glm[[12]]-p1.glm[[10]])/p1.glm[[12]],4)
 d2.p1
 0.6705

 # calculating f statistics with N = 148 and n=4; f = (N-n-1)/(N-1)(1-d^2)
 f <- (148-4-1)/(147*(1-0.6705))
 f
 [1] 2.952319

 #calculating p-value
 pval.glm <- 1-pf(f, 147,143)
 pval.glm
 [1] 1.135693e-10

 So, what do you think? Is this acceptable if I really have to give a p-value 
 for the deviance squared? If it is, I think I will turn everything into a 
 function.
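
 For comparison, one hedged alternative for the d^2 question is to take the deviances by name and test the fitted model against the intercept-only model with a chi-square likelihood-ratio test. The sketch below uses the counts data from the 'glm' help page, not the lidar data, which are not available here. It assumes that p1.glm[[12]] and p1.glm[[10]] refer to the null.deviance and deviance components, which is their position in the R versions I am aware of, and the names fit, d2 and p.overall are only illustrative.

```r
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit <- glm(counts ~ outcome + treatment, family = poisson())

# d^2: share of the null deviance explained by the model, using named
# components rather than positions in the fitted object.
d2 <- (fit$null.deviance - fit$deviance) / fit$null.deviance

# Overall test of the fitted model against the intercept-only model:
# the drop in deviance is referred to a chi-square distribution whose
# degrees of freedom equal the number of parameters added.
lr.stat   <- fit$null.deviance - fit$deviance
lr.df     <- fit$df.null - fit$df.residual
p.overall <- pchisq(lr.stat, df = lr.df, lower.tail = FALSE)
```

 Because d^2 is just the deviance drop rescaled by the null deviance, this p-value is a test of the same quantity, without going through an F statistic.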

 Thanks,

 Monica