On Thu, Jan 31, 2013 at 2:13 PM, Wim Kreinen <wkrei...@gmail.com> wrote:

> Hello,
>
> I have a question about modelling via glm.


I think you are way off track. Either the data, glm, or both, are not what
you think they are.


> I have a dataset

skn300.tab <- structure(list(n = 1:97, freq = c(0L, 0L, 0L, 0L, 1L, 7L, 40L,
100L, 276L, 543L, 952L, 1414L, 1853L, 2199L, 2435L, 2270L, 2042L,
1679L, 1386L, 1108L, 922L, 792L, 642L, 597L, 453L, 424L, 370L,
297L, 278L, 218L, 208L, 172L, 174L, 149L, 124L, 98L, 98L, 67L,
78L, 67L, 46L, 34L, 31L, 42L, 34L, 21L, 28L, 18L, 18L, 18L, 10L,
19L, 6L, 9L, 10L, 6L, 6L, 5L, 3L, 9L, 4L, 3L, 4L, 5L, 2L, 6L,
4L, 2L, 2L, 3L, 3L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 0L, 0L, 0L, 0L,
1L, 0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 1L),
    kum = c(0L, 0L, 0L, 0L, 1L, 8L, 48L, 148L, 424L, 967L, 1919L,
    3333L, 5186L, 7385L, 9820L, 12090L, 14132L, 15811L, 17197L,
    18305L, 19227L, 20019L, 20661L, 21258L, 21711L, 22135L, 22505L,
    22802L, 23080L, 23298L, 23506L, 23678L, 23852L, 24001L, 24125L,
    24223L, 24321L, 24388L, 24466L, 24533L, 24579L, 24613L, 24644L,
    24686L, 24720L, 24741L, 24769L, 24787L, 24805L, 24823L, 24833L,
    24852L, 24858L, 24867L, 24877L, 24883L, 24889L, 24894L, 24897L,
    24906L, 24910L, 24913L, 24917L, 24922L, 24924L, 24930L, 24934L,
    24936L, 24938L, 24941L, 24944L, 24944L, 24944L, 24944L, 24944L,
    24946L, 24947L, 24947L, 24947L, 24947L, 24947L, 24947L, 24948L,
    24948L, 24948L, 24949L, 24951L, 24952L, 24952L, 24952L, 24952L,
    24952L, 24954L, 24954L, 24954L, 24954L, 24955L)), .Names = c("n",
"freq", "kum"), row.names = c(NA, -97L), class = "data.frame")


> that looks as if it were Poisson distributed (actually, I would
> appreciate that), but it isn't


plot(skn300.tab)

My guess is that we are looking at a pdf and cdf (maybe even of a Poisson
process), not at "data" that lends itself to a (generalized) linear
model. Consult a statistician, post on stackexchange, read about
regression, or better define your actual R problem here, demonstrating this
is not homework - see the posting guide.
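
One way to check the pdf/cdf guess is to test whether kum is simply the
running total of freq. The sketch below uses only the first 12 rows of the
posted data to stay short; the variable names is_running_total, emp_pmf and
emp_cdf are made up for illustration and are not from the original post.

```r
# First 12 rows of the posted data, copied from the dput() output.
freq <- c(0L, 0L, 0L, 0L, 1L, 7L, 40L, 100L, 276L, 543L, 952L, 1414L)
kum  <- c(0L, 0L, 0L, 0L, 1L, 8L, 48L, 148L, 424L, 967L, 1919L, 3333L)

# If kum is a cumulative count, it must equal the running sum of freq.
is_running_total <- all(cumsum(freq) == kum)
print(is_running_total)

# Dividing by the total count turns the table into an empirical pmf and cdf.
# Here the total is that of the excerpt only; for the full table one would
# divide by the overall total instead.
emp_pmf <- freq / sum(freq)
emp_cdf <- kum / sum(freq)
print(round(cbind(emp_pmf, emp_cdf), 4))
```

If the check holds for the full table as well, each row is better read as
"value n occurred freq times" than as one independent observation for a
regression of freq on n.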

Cheers




> because the mean does not equal the variance.
>
>
> > mean (x)
> [1] 901.7827
> > var (x)
> [1] 132439.3
>
>
> Anyway, I tried to model it via poisson and quasipoisson, really just to
> get an impression of how glm works. But I don't know how to interpret the
> output. Of course, this is because my knowledge of logistic
> regression is rather limited. Hoping somebody will have mercy, I would
> like to understand which parameters are important, e.g. which parameter
> might give me a hint that a Poisson model is a bad idea. I would also
> appreciate pointers to tutorials on reading glm output.
>
> Thanks
> Wim
>
>
> > skn300.glmp <- glm (freq~n, data=skn300.tab, family=poisson)
> > summary (skn300.glmp)
>
> Call:
> glm(formula = freq ~ n, family = poisson, data = skn300.tab)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -51.332   -9.383   -6.599   -3.959   55.111
>
> Coefficients:
>               Estimate Std. Error z value Pr(>|z|)
> (Intercept)  7.2374375  0.0093285   775.8   <2e-16 ***
> n           -0.0539424  0.0003699  -145.8   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for poisson family taken to be 1)
>
>     Null deviance: 71731  on 96  degrees of freedom
> Residual deviance: 37383  on 95  degrees of freedom
> AIC: 37800
>
> Number of Fisher Scoring iterations: 6
>
> >
> > skn300.glmq <- glm (freq~n, data=skn300.tab, family=quasipoisson)
> > summary (skn300.glmq)
>
> Call:
> glm(formula = freq ~ n, family = quasipoisson, data = skn300.tab)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -51.332   -9.383   -6.599   -3.959   55.111
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)  7.237438   0.186381  38.831  < 2e-16 ***
> n           -0.053942   0.007391  -7.298  8.8e-11 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for quasipoisson family taken to be 399.1874)
>
>     Null deviance: 71731  on 96  degrees of freedom
> Residual deviance: 37383  on 95  degrees of freedom
> AIC: NA
>
> Number of Fisher Scoring iterations: 6
>
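
On the question of which number hints that a Poisson model is a bad idea:
in the two summaries above, the residual deviance (37383 on 95 degrees of
freedom, a ratio of roughly 393) and the quasipoisson dispersion estimate
(399.1874) are both far above 1, which is what a Poisson fit would need.
The coefficients are identical in both fits; only the standard errors grow
by about sqrt(399.1874), e.g. 0.0093285 becomes 0.186381. A minimal sketch
of how these quantities relate, assuming the freq counts from the dput()
output; the names dat, fit_p, fit_q, phi and dev_ratio are hypothetical.

```r
# Counts copied from the dput() output in this thread; n is the row index.
freq <- c(0L, 0L, 0L, 0L, 1L, 7L, 40L,
100L, 276L, 543L, 952L, 1414L, 1853L, 2199L, 2435L, 2270L, 2042L,
1679L, 1386L, 1108L, 922L, 792L, 642L, 597L, 453L, 424L, 370L,
297L, 278L, 218L, 208L, 172L, 174L, 149L, 124L, 98L, 98L, 67L,
78L, 67L, 46L, 34L, 31L, 42L, 34L, 21L, 28L, 18L, 18L, 18L, 10L,
19L, 6L, 9L, 10L, 6L, 6L, 5L, 3L, 9L, 4L, 3L, 4L, 5L, 2L, 6L,
4L, 2L, 2L, 3L, 3L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 0L, 0L, 0L, 0L,
1L, 0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 1L)
dat <- data.frame(n = seq_along(freq), freq = freq)

fit_p <- glm(freq ~ n, data = dat, family = poisson)
fit_q <- glm(freq ~ n, data = dat, family = quasipoisson)

# Pearson estimate of the dispersion:
# sum of squared Pearson residuals divided by the residual degrees of freedom.
phi <- sum(residuals(fit_p, type = "pearson")^2) / df.residual(fit_p)

# Rule of thumb: residual deviance per degree of freedom,
# which should be near 1 if the Poisson assumption is reasonable.
dev_ratio <- deviance(fit_p) / df.residual(fit_p)

# Standard errors from both fits, plus the Poisson ones rescaled by sqrt(phi).
se_p <- coef(summary(fit_p))[, "Std. Error"]
se_q <- coef(summary(fit_q))[, "Std. Error"]

print(c(phi = phi, dev_ratio = dev_ratio))
print(cbind(poisson = se_p, quasipoisson = se_q, rescaled = se_p * sqrt(phi)))
```

A dispersion this large says more about the model freq ~ n being the wrong
shape for these counts than about the choice between poisson and
quasipoisson; the quasi family only widens the standard errors.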
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
