On Aug 27, 2010, at 4:32 PM, Teresa Iglesias wrote:

Christopher David Desjardins <desja004 <at> umn.edu> writes:


Hi,
I am running a Cox Mixed Effects Hazard model using the library coxme. I
am trying to model time to onset (age_sym1) of thought problems (e.g.
hearing voices) (sym1).  As I have siblings in my dataset,  I have
decided to account for this by including a random effect for family
(famid). My covariate of interest is Mother's diagnosis where a 0 is
bipolar, 1 is control, and 2 is major depression.  I am trying to fit
the following model.

thorp1 <- coxme(Surv(age_sym1, sym1) ~ lifedxm + (1 | famid), data =
bip.surv)

Which provides the following output:

-------------------------------------------------
thorp1
Cox mixed-effects model fit by maximum likelihood
  Data: bip.surv
  events, n = 99, 189 (3 observations deleted due to missingness)
  Iterations= 10 54
                    NULL Integrated Penalized
Log-likelihood -479.0372 -467.3549 -435.2096
                  Chisq    df          p   AIC     BIC
Integrated loglik 23.36 3.00 3.3897e-05 17.36     9.58
 Penalized loglik 87.66 47.27 3.2374e-04 -6.88 -129.54
Model: Surv(age_sym1, sym1) ~ lifedxm + (1 | famid)
Fixed coefficients
                     coef exp(coef) se(coef)      z      p
lifedxmCONTROL -1.0576257 0.3472794 0.3872527 -2.73 0.0063
lifedxmMAJOR   -0.6329957 0.5309987 0.3460847 -1.83 0.0670
Random effects
 Group Variable Std Dev    Variance
 famid Intercept 0.9877770 0.9757033

--------------------------------------------------

So it appears that there is a difference in hazard rates associated with Mother's diagnosis between BIPOLAR and CONTROL. Just to be safe, I fit a model without this covariate present and decided to compare the models with AIC and see if fit decreased with this covariate not in the model.

----------------------------------------------------------
thorp1.n <- coxme(Surv(age_sym1, sym1) ~ (1 | famid), data = bip.surv)
thorp1.n
Cox mixed-effects model fit by maximum likelihood
  Data: bip.surv
  events, n = 99, 189 (3 observations deleted due to missingness)
  Iterations= 10 54
                    NULL Integrated Penalized
Log-likelihood -479.0372 -471.3333 -436.0478
                  Chisq    df          p    AIC     BIC
Integrated loglik 15.41 1.00 0.00008663 13.41     10.81
 Penalized loglik 85.98 49.48 0.00099915 -12.97 -141.37
Model:  Surv(age_sym1, sym1) ~ (1 | famid)
Random effects
 Group Variable Std Dev Variance
 famid Intercept 1.050949 1.104495
------------------------------------------------------------

The problem is that the AIC for the model w/o lifedxm is 13.41 and the model w/ lifedxm is 17.36. So fit actually improved without lifedxm in
the model but really the fit is indistinguishable if I use Burnham &
Anderson, 2002's criteria.

So my conundrum is this - The AIC and the p-values appear to disagree. The p-value would suggest that lifedxm should be included in the model and the AIC says it doesn't improve fit. In general, I tend to favor the AIC (I usually work with large sample sizes and multilevel models) but I
am just beginning with survival models and I am curious if a survival
model guru might suggest whether lifedxm needs to be in the model or not
based on my results here? Would people generally use the AIC in this
situation?  Also, I can't run anova() on this models because of the
random effect.

I am happy to provide the data if necessary.

Please cc me on a reply.

Thanks,
Chris

Hi Chris,
Did you get an answer to why the p-value and AIC score disagreed?
I am getting the same results with my own data. They're not small
disagreements either. The AIC score is telling me the opposite of
what the p-value and the parameter coef are saying.
The p-value and the coef for the predictor variable are in agreement.

I've also noticed in some published papers with tables containing
p-values and AIC scores that the chisq p-value and AIC score
are in opposition too. This makes me think I'm missing something
and that this all actually makes sense.

However given that AIC = − 2 (log L) + 2K (where K is the number of parameters) and seeing as how the log-likelihood scores below are in the hundreds, shouldn't
the AIC score be in the hundreds as well??

That is different than my understanding of AIC. I thought that the AIC and BIC both took as input the difference in -2LL and then adjusted those differences for the differences in number of degrees of freedom. That perspective would inform one that the absolute value of the deviance ( = -2LL) is immaterial and only differences are subject to interpretation. The p-values are calculated as Wald tests, which are basically first order approximations and have lower credence than model level comparisons. The OP also seemed to have ignored the fact that the penalized estimates of AIC and BIC went in the opposite direction from what he might have hoped.

--------------------------------------

model0 (intercept only model)
                              NULL Integrated Penalized
Log-likelihood -119.8470  -117.9749 -113.1215

                        Chisq   df        p           AIC    BIC
Integrated loglik  3.74 1.00 0.052989  1.74   0.08
Penalized loglik 13.45 7.06 0.063794 -0.68 -12.43


Random effects
Group   Variable  Std Dev   Variance
Site    Intercept    0.6399200 0.4094977


_____
model1 (before vs after)
                            NULL Integrated Penalized
Log-likelihood -119.8470  -112.1598 -108.1663

                              Chisq   df          p        AIC   BIC
Integrated loglik 15.37 2.00 0.00045863 11.37  8.05
Penalized loglik 23.36 7.06 0.00153710  9.25 -2.49

Fixed coefficients
               Coef       exp(coef)  se(coef)         z       p
Time  -1.329997 0.2644779 0.4009229 -3.32 0.00091

Random effects
Group   Variable  Std Dev   Variance
Site    Intercept 0.5625206 0.3164294

______
model2 (stim1 vs stim2)
                              NULL Integrated Penalized
Log-likelihood -119.8470  -117.8677  -113.167

                             Chisq   df        p      AIC    BIC
Integrated loglik  3.96 2.00 0.138160 -0.04  -3.37
Penalized loglik 13.36 7.83 0.093314 -2.31 -15.34

Fixed coefficients
            coef          exp(coef)  se(coef)       z       p
stim 0.1621367  1.176021 0.3534788 0.46 0.65

Random effects
Group  Variable  Std Dev   Variance
Site   Intercept   0.6262600 0.3922016
----------------------------------------------

If you didn't get an answer, hopefully, someone that understands all this will
reply for both of us.

Yes. it would be good to get more authoritative opinion. Perhaps my standing as a non-statistician will prompt a real statistician to weigh in and correct any misapprehensions I have been laboring under.

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to