Hey Rod,

Thanks for your response (and to Steven, also).  I have generated quite a few 
hits for the topic on the multi-level list serve.  

Even though my variable clearly appears to be mixed, I am reluctant to analyze 
it as such because of concerns about causal interpretation.  It seems to me 
that two-stage models run into more problems with causal interpretation than 
linear models do.  

I have pasted below the note that I sent out on the other list serve.  
Apologies to those who subscribe to both.

--Dave

Allow me, though, to add some information.  I am analyzing a randomized trial.  
Because of this, I don't want to use any sort of mixture model, even though 
that seems to be the obvious best choice for a model.  The problem, as I see 
it, is that it is hard to give a causal interpretation to the conditional 
model.  Perhaps it can be done with principal strata (in the terminology of 
Frangakis and Rubin), but I didn't want to get involved with that sort of 
work.  So I was really interested in robustness studies for linear models.  How 
well do they work when a zero-inflated model seems to be the more appropriate 
model but cannot be used because it cannot be given a causal interpretation 
without additional strong assumptions?

To see the problem with a zero-inflated model, consider a randomized trial of 
alternative smoking cessation interventions with an outcome of daily cigarette 
consumption after some interval of time.  Suppose that never quitters under 
treatment smoke 25 a day, while never quitters under control smoke 40 a day, 
and compliers (those who quit under treatment but not under control) smoke 15 
a day under control.  The true effect of treatment on never quitters is to 
reduce consumption by 15 a day, but depending on the relative frequency of 
never quitters and compliers, the average consumption among those who did not 
quit on the control side might be lower than the average consumption among 
those who did not quit on the treatment side.  

A simple solution is to ignore the extra information about consumption among 
persistent smokers and just analyze the binary outcome of quitting or not 
quitting. That seems to be the common approach.  But I thought that the reading 
intervention I am studying might both lift nonreaders to be readers and help 
current readers to become better readers.  The logit model and linear model 
both have a single location parameter.  

Maybe I need to use a multi-level nonparametric test.  Of course, then I lose a 
lot of power.  

-----Original Message-----
From: Rod Little [mailto:[email protected]] 
Sent: Monday, September 07, 2009 11:53 AM
To: David Judkins
Cc: [email protected]
Subject: Re: [Impute] RE: Impute Digest, Vol 47, Issue 1

David, the sequential regression program IVEware allows for "mixed" 
variable types like the one you describe. I think it multiply imputes 
using a two stage model, for presence/absence and then amount given 
presence. Rod

On Mon, 7 Sep 2009, Gregorich, Steven wrote:

> Hi, David.
>
> It sounds like you are talking about a zero-inflated model, e.g., a 
> zero-inflated Poisson (ZIP) or zero-inflated negative
> binomial (ZINB). You could also fit a zero-inflated normal model (ZIN; Joe 
> Schafer and a colleague wrote a paper
> about that model in the late 1990's; they called it a 'two-part model'). I 
> never studied the ZIN model closely, but
> presumably the normal part of the model would allow negative predicted 
> values, which I would not be happy with
> in your application.
>
> You can fit a 2-level multilevel ZIP, ZIN, ZINB, etc in PROC NLMIXED and you 
> can probably find related SAS
> code in the SAS-L archives; especially posts by Dale McLerran (sp?).
>
> HTH
>
> Steve
> ________________________________________
>
> Message: 1
> Date: Fri, 4 Sep 2009 14:25:33 -0400
> From: David Judkins <[email protected]>
> Subject: [Impute] Robustness of Multi-Level Modeling Software
> To: "[email protected]"
>        <[email protected]>
> Message-ID:
>        <[email protected]>
> Content-Type: text/plain; charset="us-ascii"
>
> This is not an imputation question, but I don't know of a list serve for 
> complex modeling questions.  Maybe one of you will be able to help.
>
> Consider a mixed binary-normal distribution that results in a large point 
> mass on the edge of an otherwise more-or-less normal distribution.  An 
> example is number of alcoholic drinks per day.  Cigarettes per day is another 
> example. Or the number of reading questions answered correctly on a 
> sample that contains a large number of children who can't read at all.  The 
> child reading example is my real concern because the children come grouped by 
> school.
>
> Anyone know of robustness studies of MLwiN, HLM, Mixed, Mplus, et cetera to 
> this radical departure from normality?  I have heard it asserted that 
> school-level departures from normality are more of a concern than 
> student-level departures, but is this too much of a departure?
>
>
> David Judkins
> Senior Statistician
> Westat
> 1650 Research Boulevard
> Rockville, MD 20850
> (301) 315-5970
> [email protected]
> _______________________________________________
> Impute mailing list
> [email protected]
> http://lists.utsouthwestern.edu/mailman/listinfo/impute
>
>
>

___________________________________________________________________________________
Roderick Little
Professor and Chair, Department of Biostatistics
U-M School of Public Health                 Tel (734) 936 1003
M4208 SPH II                                Fax (734) 763 2215
1420 Washington Hgts                        email [email protected]
Ann Arbor, MI 48109-2029             http://www.sph.umich.edu/~rlittle/
