Re: [R] Question regarding significance of a covariate in a coxme survival model

David Winsemius Sat, 28 Aug 2010 18:40:23 -0700


On Aug 27, 2010, at 4:32 PM, Teresa Iglesias wrote:

Christopher David Desjardins <desja004 <at> umn.edu> writes:

Hi,

I am running a Cox Mixed Effects Hazard model using the librarycoxme. I

am trying to model time to onset (age_sym1) of thought problems (e.g.
hearing voices) (sym1).  As I have siblings in my dataset,  I have
decided to account for this by including a random effect for family
(famid). My covariate of interest is Mother's diagnosis where a 0 is
bipolar, 1 is control, and 2 is major depression.  I am trying to fit
the following model.

thorp1 <- coxme(Surv(age_sym1, sym1) ~ lifedxm + (1 | famid), data =
bip.surv)

Which provides the following output:

-------------------------------------------------

thorp1

Cox mixed-effects model fit by maximum likelihood
  Data: bip.surv
  events, n = 99, 189 (3 observations deleted due to missingness)
  Iterations= 10 54
                    NULL Integrated Penalized
Log-likelihood -479.0372 -467.3549 -435.2096
                  Chisq    df          p   AIC     BIC
Integrated loglik 23.36 3.00 3.3897e-05 17.36     9.58
 Penalized loglik 87.66 47.27 3.2374e-04 -6.88 -129.54
Model: Surv(age_sym1, sym1) ~ lifedxm + (1 | famid)
Fixed coefficients
                     coef exp(coef) se(coef)      z      p
lifedxmCONTROL -1.0576257 0.3472794 0.3872527 -2.73 0.0063
lifedxmMAJOR   -0.6329957 0.5309987 0.3460847 -1.83 0.0670
Random effects
 Group Variable Std Dev    Variance
 famid Intercept 0.9877770 0.9757033

--------------------------------------------------

So it appears that there is a difference in hazard rates associatedwithMother's diagnosis between BIPOLAR and CONTROL. Just to be safe, Ifit amodel without this covariate present and decided to compare themodelswith AIC and see if fit decreased with this covariate not in themodel.


----------------------------------------------------------

thorp1.n <- coxme(Surv(age_sym1, sym1) ~ (1 | famid), data =bip.surv)

thorp1.n

Cox mixed-effects model fit by maximum likelihood
  Data: bip.surv
  events, n = 99, 189 (3 observations deleted due to missingness)
  Iterations= 10 54
                    NULL Integrated Penalized
Log-likelihood -479.0372 -471.3333 -436.0478
                  Chisq    df          p    AIC     BIC
Integrated loglik 15.41 1.00 0.00008663 13.41     10.81
 Penalized loglik 85.98 49.48 0.00099915 -12.97 -141.37
Model:  Surv(age_sym1, sym1) ~ (1 | famid)
Random effects
 Group Variable Std Dev Variance
 famid Intercept 1.050949 1.104495
------------------------------------------------------------

The problem is that the AIC for the model w/o lifedxm is 13.41 andthemodel w/ lifedxm is 17.36. So fit actually improved without lifedxmin

the model but really the fit is indistinguishable if I use Burnham &
Anderson, 2002's criteria.

So my conundrum is this - The AIC and the p-values appear todisagree.The p-value would suggest that lifedxm should be included in themodeland the AIC says it doesn't improve fit. In general, I tend tofavor theAIC (I usually work with large sample sizes and multilevel models)but I

am just beginning with survival models and I am curious if a survival

model guru might suggest whether lifedxm needs to be in the modelor not

based on my results here? Would people generally use the AIC in this
situation?  Also, I can't run anova() on this models because of the
random effect.

I am happy to provide the data if necessary.

Please cc me on a reply.

Thanks,
Chris


Hi Chris,
Did you get an answer to why the p-value and AIC score disagreed?
I am getting the same results with my own data. They're not small
disagreements either. The AIC score is telling me the opposite of
what the p-value and the parameter coef are saying.
The p-value and the coef for the predictor variable are in agreement.

I've also noticed in some published papers with tables containing
p-values and AIC scores that the chisq p-value and AIC score
are in opposition too. This makes me think I'm missing something
and that this all actually makes sense.

However given that AIC = − 2 (log L) + 2K (where K is the number ofparameters)and seeing as how the log-likelihood scores below are in thehundreds, shouldn't

the AIC score be in the hundreds as well??

That is different than my understanding of AIC. I thought that the AICand BIC both took as input the difference in -2LL and then adjustedthose differences for the differences in number of degrees of freedom.That perspective would inform one that the absolute value of thedeviance ( = -2LL) is immaterial and only differences are subject tointerpretation. The p-values are calculated as Wald tests, which arebasically first order approximations and have lower credence thanmodel level comparisons. The OP also seemed to have ignored the factthat the penalized estimates of AIC and BIC went in the oppositedirection from what he might have hoped.

--------------------------------------

model0 (intercept only model)
                              NULL Integrated Penalized
Log-likelihood -119.8470  -117.9749 -113.1215

                        Chisq   df        p           AIC    BIC
Integrated loglik  3.74 1.00 0.052989  1.74   0.08
Penalized loglik 13.45 7.06 0.063794 -0.68 -12.43


Random effects
Group   Variable  Std Dev   Variance
Site    Intercept    0.6399200 0.4094977


_____
model1 (before vs after)
                            NULL Integrated Penalized
Log-likelihood -119.8470  -112.1598 -108.1663

                              Chisq   df          p        AIC   BIC
Integrated loglik 15.37 2.00 0.00045863 11.37  8.05
Penalized loglik 23.36 7.06 0.00153710  9.25 -2.49

Fixed coefficients
               Coef       exp(coef)  se(coef)         z       p
Time  -1.329997 0.2644779 0.4009229 -3.32 0.00091

Random effects
Group   Variable  Std Dev   Variance
Site    Intercept 0.5625206 0.3164294

______
model2 (stim1 vs stim2)
                              NULL Integrated Penalized
Log-likelihood -119.8470  -117.8677  -113.167

                             Chisq   df        p      AIC    BIC
Integrated loglik  3.96 2.00 0.138160 -0.04  -3.37
Penalized loglik 13.36 7.83 0.093314 -2.31 -15.34

Fixed coefficients
            coef          exp(coef)  se(coef)       z       p
stim 0.1621367  1.176021 0.3534788 0.46 0.65

Random effects
Group  Variable  Std Dev   Variance
Site   Intercept   0.6262600 0.3922016
----------------------------------------------

If you didn't get an answer, hopefully, someone that understands allthis will

reply for both of us.

Yes. it would be good to get more authoritative opinion. Perhaps mystanding as a non-statistician will prompt a real statistician toweigh in and correct any misapprehensions I have been laboring under.


--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Question regarding significance of a covariate in a coxme survival model

Reply via email to