Dear R-users,
I would like to show you a simple example that gives an overview of one
of my current issue.
Although my working setting implies a different parametric model (which
cannot be framed in the glm), I guess that what I'll get from the
following example it would help for the next steps.
Anyway here it is.
Firstly I simulated from a series of exposures, a series of deaths
(given a model, Gompertz, and a probability distribution, Poisson). Then
a multiply both deaths and exposures by a constant. Finally I fitted
with glm the simulated data sets and I compared the deviance residuals.
They're different by a certain constant, but I could not get a
meaningful relationship between the used constants (a scatter plot of
them is given at the end).
The following example tried to be as completed as possible and I hope
that it will be clearer than my previous words.
Thanks,
Carlo Giovanni Camarda
#### simulation procedure
age <- 50:100 # time range
# parameters
alpha.sim <- 0.00005
beta.sim <- 0.1
# simulated hazard from a Gompertz
hazard.sim <- alpha.sim*(exp(beta.sim*age))
# first dataset
exposures <- c(4748, 4461, 4473, 4326, 4259, 4219, 4049, 3965, 3801,
3818, 3670, 3521, 3537,
3482, 3347, 3180, 3100, 2890, 2755, 2622, 2530, 2502,
2293, 2216, 1986, 1875,
1693, 1550, 1395, 1295, 1104, 952, 792, 755, 726, 608,
523, 419, 338,
246, 205, 148, 112, 75, 52, 32, 23, 15, 7, 6, 3) # just
as example
deaths.sim <- rpois(length(age), exposures*hazard.sim) # simulating
deaths from a poisson distribution (Brillinger, 1986)
my.offset <- log(exposures) # offset for the poisson regression
# new dataset: decupleing the sample size
deaths.sim1 <- deaths.sim * 10
exposures1 <- exposures * 10
my.offset1 <- log(exposures1)
# fitting the first dataset
fit <- glm(formula = deaths.sim ~ age + offset(my.offset),
family = poisson(link = "log"))
res <- residuals(fit, type="deviance")
# fitting the new dataset
fit1 <- glm(formula = deaths.sim1 ~ age + offset(my.offset1),
family = poisson(link = "log"))
res1 <- residuals(fit1, type="deviance")
# estimating hazard
hazard.act <- deaths.sim/exposures # == deaths.sim1/exposures1
hazard.fit <- predict(fit, type="response") / exposures
hazard.fit1 <- predict(fit1, type="response") / exposures1
# plotting log-hazard
plot(age, log(hazard.sim), cex=1.2)
lines(age, log(hazard.act), col="red", lwd=2)
lines(age, log(hazard.fit), col="blue", lwd=2, lty=2)
lines(age, log(hazard.fit1), col="green", lwd=2, lty=3)
all(round(hazard.fit,5)==round(hazard.fit1,5)) ## TRUE
# plotting residuals
plot(age, res, cex=1.2, lwd=2, ylim=range(res, res1))
points(age, res1, col="red", cex=1.2, lwd=2, pch=2)
all(round(res,5)==round(res1,5)) ## FALSE
# looking at the ratio
ratio <- (res/res1)[1]
ratio
all(round(res,10)==round(res1*ratio,10))
# here the scatterplot resid-ratios vs multiplier-constant
const.a <- c(1:16) # constants which I tried to multiply by
const.b <-
c(1,0.7071068,0.5773503,0.5,0.4472136,0.4082483,0.3779645,0.3535534, #
ratio between the deviance residuals
0.3333333,0.3162278,0.3015113,0.2886751,0.2773501,0.2672612,0.2581989,0.
25)
plot(const.a, const.b, pch=16)
+++++
This mail has been sent through the MPI for Demographic Rese...{{dropped}}
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html