Re: [R] opposite estimates from zeroinfl() and hurdle()

2009-10-24 Thread Michael Dewey

At 12:36 23/10/2009, Tord Snäll wrote:

> Dear all,
> A question related to the following has been asked on R-help before, but
> I could not find any answer to it. Input will be much appreciated.

The vignette explains this, and much more. I found it extremely
instructive both for its intended purpose, count data, and for general
tips about manipulating several models and comparing them.


> I got an unexpected sign of the "slope" parameter associated with a
> covariate (diam) using zeroinfl(). It led me to compare the estimates
> given by zeroinfl() and hurdle():
>
> The (significant) negative estimate here is surprising, given the
> biology of the species:
>
> > summary(zeroinfl(bnl ~ 1| diam, dist = "poisson", data = valdaekar,
> EM = TRUE))
>
> Count model coefficients (poisson with log link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  3.74604    0.02635   142.2   <2e-16 ***
>
> Zero-inflation model coefficients (binomial with logit link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  21.7510     7.6525   2.842  0.00448 **
> diam         -1.1437     0.3941  -2.902  0.00371 **
>
> Number of iterations in BFGS optimization: 1
> Log-likelihood: -582.8 on 3 Df
>
> The hurdle model gives the same estimates, but with opposite (and
> expected) signs of the parameters:
>
> summary(hurdle(bnl ~ 1| diam, dist = "poisson", data = valdaekar))
> Count model coefficients (truncated poisson with log link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  3.74604    0.02635   142.2   <2e-16 ***
> Zero hurdle model coefficients (binomial with logit link):
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept) -21.7510     7.6525  -2.842  0.00448 **
> diam          1.1437     0.3941   2.902  0.00371 **
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Number of iterations in BFGS optimization: 8
> Log-likelihood: -582.8 on 3 Df
>
> Why is this so?
>
> thanks,
> Tord
> Windows NT, R 2.8.1, pscl 1.03
>
> --
>
> Tord Snäll
> Department of Ecology / Swedish Species Information Centre
> Swedish University of Agricultural Sciences (SLU)
> P.O. 7044, SE-750 07 Uppsala, Sweden
> Office/Mobile/Fax
> +46-18-672612/+46-76-7662612/+46-18-673537
> www.ekol.slu.se/staff_tordsnall
> www.artdata.slu.se/personal/fototsn.asp




Michael Dewey
http://www.aghmed.fsnet.co.uk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] opposite estimates from zeroinfl() and hurdle()

2009-10-23 Thread Brian S Cade
Tord:   The logistic zero-inflation portion of the zeroinfl() 
implementation of ZIP or ZINB predicts the probability of 0 rather than the 
probability of 1 (>0 counts), so the signs of the coefficients are often 
reversed from what you would expect if you had just performed a 
logistic regression.  I'm guessing that the hurdle model, as a two-stage 
model, uses a logistic regression predicting the probability of 1, 
hence the reversed signs of the estimates in the logistic regression 
portion of the model.
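
The sign reversal described above follows from the identity logit(1 - p) = -logit(p). A minimal sketch in base R, using simulated data rather than the data from the original post (the names `diam`, `present`, `fit_pos` and `fit_zero` are made up for illustration): fitting the same logistic regression once to "count is positive" and once to "count is zero" should give coefficients of equal magnitude and opposite sign.

```r
# Sketch with simulated data; all variable names are illustrative only.
set.seed(1)
n    <- 200
diam <- runif(n, 10, 30)
# Probability of a positive count increases with diam
present <- rbinom(n, 1, plogis(-6 + 0.3 * diam))

fit_pos  <- glm(present ~ diam, family = binomial)         # models P(count > 0)
fit_zero <- glm(I(1 - present) ~ diam, family = binomial)  # models P(count == 0)

# Same fit, mirrored coefficients
print(cbind(positive = coef(fit_pos), zero = coef(fit_zero)))

# The identity behind the flip: plogis(-x) equals 1 - plogis(x)
print(all.equal(plogis(-1.5), 1 - plogis(1.5)))
```

Which of the two outcomes a function treats as the "success" is a convention, so the sign of a coefficient can only be read once that convention is known.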

Brian


Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Tord Snäll 
To:
r-help@r-project.org
Date:
10/23/2009 07:40 AM
Subject:
[R] opposite estimates from zeroinfl() and hurdle()
Sent by:
r-help-boun...@r-project.org



Dear all,
A question related to the following has been asked on R-help before, but 
I could not find any answer to it. Input will be much appreciated.

I got an unexpected sign of the "slope" parameter associated with a 
covariate (diam) using zeroinfl(). It led me to compare the estimates 
given by zeroinfl() and hurdle():

The (significant) negative estimate here is surprising, given the 
biology of the species:

 > summary(zeroinfl(bnl ~ 1| diam, dist = "poisson", data = valdaekar, 
EM = TRUE))
Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  21.7510     7.6525   2.842  0.00448 **
diam         -1.1437     0.3941  -2.902  0.00371 **

Number of iterations in BFGS optimization: 1
Log-likelihood: -582.8 on 3 Df


The hurdle model gives the same estimates, but with opposite (and 
expected) signs of the parameters:

summary(hurdle(bnl ~ 1| diam, dist = "poisson", data = valdaekar))
Count model coefficients (truncated poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***
Zero hurdle model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -21.7510     7.6525  -2.842  0.00448 **
diam          1.1437     0.3941   2.902  0.00371 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations in BFGS optimization: 8
Log-likelihood: -582.8 on 3 Df

Why is this so?

thanks,
Tord
Windows NT, R 2.8.1, pscl 1.03



Re: [R] opposite estimates from zeroinfl() and hurdle()

2009-10-23 Thread Ben Bolker



Tord Snäll-4 wrote:
> 
> Dear all,
> A question related to the following has been asked on R-help before, but 
> I could not find any answer to it. Input will be much appreciated.
> 
> I got an unexpected sign of the "slope" parameter associated with a 
> covariate (diam) using zeroinfl(). It led me to compare the estimates 
> given by zeroinfl() and hurdle():
> [snip]
> 

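The right thing to do in this case is to poke through the code of hurdle()
and zeroinfl(), but a simple (?) demonstration shows that hurdle()
and zeroinfl() are indeed reporting opposite values:

hurdle() reports -log(p/(1-p)) = -qlogis(p), where p is the probability
of a zero count:

library(pscl)                   ## provides hurdle() and zeroinfl()
z = rpois(500,lambda=3)
z = (z[z>0])[1:90]              ## 90 strictly positive counts
z = c(z,rep(0,10))              ## plus exactly 10 zeros, so p = 0.1
hurdle(z~1)                     ## zero hurdle intercept; compare with:
-qlogis(0.1)
## zero coefficient always == -qlogis(0.1)

zeroinfl() reports log(p/(1-p)) = qlogis(p), where p is the
zero-inflation probability:

z = rpois(90,lambda=3)          ## may already contain a few Poisson zeros
z = c(z,rep(0,10))              ## add 10 extra zeros
zeroinfl(z~1)                   ## zero-inflation intercept; compare with:
qlogis(0.1)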

tmpf = function() {
  z = rpois(90,lambda=3)
  z = c(z,rep(0,10))
  coef(zeroinfl(z~1))[2]
}

rr = replicate(1000,tmpf())

hist(rr,breaks=1000)
summary(rr)
qlogis(0.1)

  Perhaps it would be worth sending an e-mail to the
package maintainers to request a note to this effect in
the documentation, particularly if this is a FAQ ...
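
A sketch of why the two fits in the original post mirror each other exactly, and not just in sign. The notation (\pi_i, p_i, \mu, \gamma) is introduced here for illustration and follows the usual textbook forms of the two models, not anything quoted from the package source. With a Poisson count part of mean \mu, the probability of a zero under each model is:

```latex
\begin{align*}
\text{zero-inflated:}\quad & P(Y_i = 0) = \pi_i + (1 - \pi_i)\,e^{-\mu},
  & \operatorname{logit}(\pi_i) &= \gamma^{ZI}_0 + \gamma^{ZI}_1\,\mathrm{diam}_i \\
\text{hurdle:}\quad & P(Y_i = 0) = 1 - p_i,
  & \operatorname{logit}(p_i) &= \gamma^{H}_0 + \gamma^{H}_1\,\mathrm{diam}_i
\end{align*}
% In the original post \mu = \exp(3.746) \approx 42, so e^{-\mu} is negligible and
\begin{align*}
\pi_i \approx 1 - p_i
\;\Longrightarrow\;
\operatorname{logit}(\pi_i) \approx \operatorname{logit}(1 - p_i) = -\operatorname{logit}(p_i)
\;\Longrightarrow\;
\gamma^{ZI} \approx -\gamma^{H}.
\end{align*}
```

With e^{-\mu} this small, the zero truncation in the hurdle count part is also negligible, which is consistent with both fits reporting the same count intercept and the same log-likelihood (-582.8). When the count mean is small, as in the lambda = 3 simulations above, the two zero components no longer coincide in magnitude.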


[R] opposite estimates from zeroinfl() and hurdle()

2009-10-23 Thread Tord Snäll

Dear all,
A question related to the following has been asked on R-help before, but 
I could not find any answer to it. Input will be much appreciated.


I got an unexpected sign of the "slope" parameter associated with a 
covariate (diam) using zeroinfl(). It led me to compare the estimates 
given by zeroinfl() and hurdle():


The (significant) negative estimate here is surprising, given the 
biology of the species:


> summary(zeroinfl(bnl ~ 1| diam, dist = "poisson", data = valdaekar, 
EM = TRUE))

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  21.7510     7.6525   2.842  0.00448 **
diam         -1.1437     0.3941  -2.902  0.00371 **

Number of iterations in BFGS optimization: 1
Log-likelihood: -582.8 on 3 Df


The hurdle model gives the same estimates, but with opposite (and 
expected) signs of the parameters:


summary(hurdle(bnl ~ 1| diam, dist = "poisson", data = valdaekar))
Count model coefficients (truncated poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero hurdle model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -21.7510     7.6525  -2.842  0.00448 **
diam          1.1437     0.3941   2.902  0.00371 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations in BFGS optimization: 8
Log-likelihood: -582.8 on 3 Df

Why is this so?

thanks,
Tord
Windows NT, R 2.8.1, pscl 1.03


