[R] offset and poisson regression

Renaud Scheifler Mon, 27 Jul 2009 07:30:13 -0700

Not sure that the list is the best place for this question, but we are going mad with this... We are trying to fit a Poisson regression to count data, e.g. the number of fledged young of blue tits (NPe) as a function of the clutch size (GPc) and other environmental variables. Here are the original data (dumped); we have omitted the environmental variables to simplify:


tab<-
structure(list(NPe = c(3L, 5L, 2L, 6L, NA, 4L, 4L, 4L, 3L, NA,
NA, 4L, 5L, 2L, 0L, 5L, NA, 1L, NA, 2L, 5L, 4L, 0L, 4L, NA, NA,
6L, 4L, 0L, 4L, 4L, 0L, 6L, 5L, 6L, 3L, NA, 6L, 5L, 3L, 6L, 7L,
NA, 7L, 6L, 4L, NA, 1L, NA, NA, 7L, 6L, NA, 5L, NA, NA, NA, 0L,
0L, NA, NA, 5L, NA, 3L, NA, NA, NA, 5L, NA, NA, 6L, NA, NA, NA,
0L, 6L, NA, NA, NA, NA, 5L, 5L, 4L, NA, 4L, 0L, 4L, 5L, 5L, 4L,
0L, 0L, 5L, 6L, 5L, 1L, NA, 0L, 7L, 0L, 0L, 3L, 3L, 7L, NA, 0L,
6L, 4L, 4L, 5L, 0L, 5L, 4L, 7L, 4L, 7L, 5L, 5L, 0L, NA, 5L, 7L,
NA, 8L, 7L, 5L, 0L), GPc = c(5L, 6L, 6L, 7L, NA, 5L, 6L, 5L,
6L, 6L, 4L, 5L, 5L, 6L, 6L, 6L, 4L, 4L, 4L, 3L, 5L, 6L, 3L, 5L,
5L, 7L, 6L, 5L, 5L, 5L, 4L, 5L, 6L, 5L, 6L, 5L, 5L, 7L, 6L, 4L,
7L, 8L, 9L, 7L, 7L, 7L, 4L, 5L, 5L, 4L, 7L, 6L, 5L, 5L, 6L, 2L,
7L, 6L, 8L, NA, NA, 7L, 6L, 6L, NA, 6L, 6L, 5L, 5L, 5L, 7L, 7L,
6L, 6L, 6L, 6L, 7L, 5L, 5L, 7L, 7L, 6L, 6L, 8L, 6L, 7L, 5L, 5L,
8L, 8L, 7L, 7L, 6L, 7L, 6L, 5L, 6L, 7L, 8L, 6L, 7L, 7L, 5L, 7L,
6L, 5L, 9L, 5L, 4L, 7L, 6L, 6L, 5L, 8L, 5L, 7L, 6L, 7L, 7L, 7L,
6L, 7L, 5L, 8L, 7L, 7L, 6L)), .Names = c("NPe", "GPc"), class = "data.frame",
row.names = c(NA, -127L))

It seems logical to insert "clutch size" as an offset term, since we are actually interested in the ratio fledged young/clutch size. However, the final results are quite surprising:


modsr0<-glm(NPe~offset(GPc),family="poisson",data=tab)

If we compute the predictions, we get numbers that look like gross overestimates of reality (e.g. 14.6, 39.7, etc.), which would even imply that one can have more fledged young than eggs:

 [1]  0.7  2.0  2.0  5.4  0.7  2.0  0.7  2.0  0.7  0.7  2.0  2.0  2.0  0.3  0.1  0.7  2.0
[18]  0.1  0.7  2.0  0.7  0.7  0.7  0.3  0.7  2.0  0.7  2.0  0.7  5.4  2.0  0.3  5.4 14.6
[35]  5.4  5.4  5.4  0.7  5.4  2.0  0.7  2.0 14.6  5.4  2.0  0.7  5.4  2.0  2.0  5.4  2.0
[52]  2.0  2.0  5.4  0.7  0.7 14.6 14.6  5.4  5.4  2.0  5.4  2.0  0.7  5.4 14.6  2.0  5.4
[69]  5.4  0.7  5.4  0.7 39.7  0.7  0.3  5.4  2.0  2.0  0.7 14.6  0.7  5.4  2.0  5.4  5.4
[86]  2.0  5.4 14.6  5.4  5.4  2.0
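
A possible explanation, offered as a sketch rather than a definitive diagnosis: with the log link that glm() uses by default for family = poisson, an offset is added to the linear predictor on the log scale, so offset(GPc) multiplies the expected count by exp(GPc). This matches the predictions above, which grow by a factor of about 2.7 for each additional egg. Modelling the ratio fledged young/clutch size would instead call for offset(log(GPc)). The example below uses only the first eight complete rows of the data above (the name `toy` is ours), so its numbers are purely illustrative.

```r
# First eight complete rows of the posted data, for illustration only.
toy <- data.frame(
  NPe = c(3, 5, 2, 6, 4, 4, 4, 3),
  GPc = c(5, 6, 6, 7, 5, 6, 5, 6)
)

# Offset on the raw scale: E[NPe] = exp(b0 + GPc) = exp(b0) * exp(GPc).
m_raw <- glm(NPe ~ offset(GPc), family = poisson, data = toy)

# Offset on the log scale: E[NPe] = exp(b0) * GPc, i.e. a rate per egg.
m_log <- glm(NPe ~ offset(log(GPc)), family = poisson, data = toy)

# Fitted means divided by exp(GPc) (raw offset) or by GPc (log offset)
# are constant within each model and equal to exp(intercept).
ratio_raw <- unname(fitted(m_raw) / exp(toy$GPc))
ratio_log <- unname(fitted(m_log) / toy$GPc)

# For the intercept-only model with a log offset, the estimated rate is
# the total number of fledged young divided by the total number of eggs.
rate_per_egg <- unname(exp(coef(m_log)))
print(round(c(rate_per_egg = rate_per_egg,
              pooled_ratio = sum(toy$NPe) / sum(toy$GPc)), 3))
```

With the log offset, each fitted value is the clutch size times a single rate per egg, so a fitted value can exceed the clutch size only if the estimated rate itself exceeds 1.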

Otherwise, if clutch size is inserted as a variable (and not as an offset), predictions are much more realistic, with no extreme values:


modsr0<-glm(NPe~GPc,family="poisson",data=tab)
round(exp(predict(modsr0)),1)

 [1] 3.2 3.7 3.7 4.4 3.2 3.7 3.2 3.7 3.2 3.2 3.7 3.7 3.7 2.7 2.2 3.2 3.7 2.2 3.2 3.7 3.2 3.2
[23] 3.2 2.7 3.2 3.7 3.2 3.7 3.2 4.4 3.7 2.7 4.4 5.3 4.4 4.4 4.4 3.2 4.4 3.7 3.2 3.7 5.3 4.4
[45] 3.7 3.2 4.4 3.7 3.7 4.4 3.7 3.7 3.7 4.4 3.2 3.2 5.3 5.3 4.4 4.4 3.7 4.4 3.7 3.2 4.4 5.3
[67] 3.7 4.4 4.4 3.2 4.4 3.2 6.2 3.2 2.7 4.4 3.7 3.7 3.2 5.3 3.2 4.4 3.7 4.4 4.4 3.7 4.4 5.3
[89] 4.4 4.4 3.7
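
One way to read the difference between the two fits, again only as a sketch: an offset acts like a covariate whose coefficient is fixed at 1, whereas a regular term has its coefficient estimated from the data. Using log(GPc) as the covariate (our choice for this illustration; the model above uses GPc itself) makes the log-offset model a special case of the covariate model, so the two can be compared directly. The data frame `toy` (our name) holds only the first eight complete rows of the data above, so the output is illustrative and not a result for the full data set.

```r
# First eight complete rows of the posted data, for illustration only.
toy <- data.frame(
  NPe = c(3, 5, 2, 6, 4, 4, 4, 3),
  GPc = c(5, 6, 6, 7, 5, 6, 5, 6)
)

# Offset: the coefficient of log(GPc) is fixed at 1, only the intercept is estimated.
m_off <- glm(NPe ~ offset(log(GPc)), family = poisson, data = toy)

# Covariate: the coefficient of log(GPc) is estimated along with the intercept.
m_cov <- glm(NPe ~ log(GPc), family = poisson, data = toy)

# Estimated intercept and slope; a slope near 1 would support the offset model.
print(coef(m_cov))

# Likelihood-ratio comparison of the nested models (slope fixed at 1 vs. free).
print(anova(m_off, m_cov, test = "Chisq"))
```

If the estimated slope is far from 1, the number of fledged young is not simply proportional to clutch size, and the free-coefficient model may describe the data better than the offset.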

Can any sound statistician provide a hint about what to do or how to interpret this?


Thanks in advance,

Renaud and Patrick

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
