You need to tell us more about the objectives of the analysis and the target 
population of the study, and how the high CRP levels relate to those.  You 
might, for example, be interested only in the population of people who were 
uninfected at the time the CRP measure was taken, in which case you would 
remove each infected person's entire record from the data.  Or perhaps you are 
interested in each person's "normal" uninfected CRP level, which is missing for 
some people because they were temporarily infected when the measure was taken; 
then you might discard the uninformative infected-state values and impute the 
missing "normal" values (just as you would if the scale had been broken when 
some people went in to be weighed).  Of course, this is only appropriate if the 
other variables are unaffected by the infection.
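
For the second scenario (keep everyone, treat the infected-state CRP as 
missing, then impute), here is a rough sketch of the mechanics in plain Python. 
Everything in it is an assumption made for illustration: the 10 mg/L cutoff, 
the field names crp_prev and crp30, the made-up records, and the choice of a 
bootstrap-plus-noise regression as the imputation model. It is not a 
recommendation of a specific model; a real analysis would more likely use an 
established multiple imputation package, probably on a log scale since this 
simple model can produce negative CRP values.

```python
import math
import random
import statistics

CRP_CUTOFF = 10.0  # mg/L; assumed cutoff, for illustration only

# Made-up records. crp_prev is a hypothetical earlier-wave CRP value.
participants = [
    {"id": 1, "crp_prev": 0.8, "crp30": 1.0},
    {"id": 2, "crp_prev": 1.5, "crp30": 1.9},
    {"id": 3, "crp_prev": 2.2, "crp30": 2.0},
    {"id": 4, "crp_prev": 3.1, "crp30": 3.6},
    {"id": 5, "crp_prev": 0.5, "crp30": 0.7},
    {"id": 6, "crp_prev": 2.8, "crp30": 24.0},  # above cutoff
    {"id": 7, "crp_prev": 1.1, "crp30": 1.4},
    {"id": 8, "crp_prev": 4.0, "crp30": 3.8},
    {"id": 9, "crp_prev": 1.9, "crp30": 15.3},  # above cutoff
    {"id": 10, "crp_prev": 2.5, "crp30": 2.9},
]


def mask_high_crp(rows, cutoff=CRP_CUTOFF):
    """Keep every participant; set only the suspect CRP value to None."""
    masked = []
    for r in rows:
        value = r["crp30"]
        if value is not None and value > cutoff:
            value = None
        masked.append({**r, "crp30": value})
    return masked


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, plus residual SD."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, math.sqrt(rss / (len(xs) - 2))


def impute_once(rows, rng):
    """One completed data set: bootstrap observed cases, refit, add noise."""
    observed = [r for r in rows if r["crp30"] is not None]
    if len({r["crp_prev"] for r in observed}) < 3:
        raise ValueError("need at least 3 distinct predictor values")
    while True:  # redraw until the resample can support a regression
        boot = [rng.choice(observed) for _ in observed]
        if len({r["crp_prev"] for r in boot}) >= 3:
            break
    a, b, sigma = fit_line([r["crp_prev"] for r in boot],
                           [r["crp30"] for r in boot])
    return [
        r["crp30"] if r["crp30"] is not None
        else a + b * r["crp_prev"] + rng.gauss(0.0, sigma)
        for r in rows
    ]


def pooled_mean(rows, m=20, seed=0):
    """Mean CRP over m imputations, combined with Rubin's rules."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(rows, rng)
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    return statistics.fmean(estimates), within + (1 + 1 / m) * between


if __name__ == "__main__":
    masked = mask_high_crp(participants)
    estimate, total_var = pooled_mean(masked)
    print(f"pooled mean CRP: {estimate:.2f} (SE {math.sqrt(total_var):.2f})")
```

The point of the sketch is only that the masked participants stay in the data 
set and contribute their other variables, while the uncertainty about their 
"normal" CRP shows up in the between-imputation part of the pooled variance.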

________________________________
From: Impute -- Imputations in Data Analysis 
[[email protected]] On Behalf Of Jonathan Mohr [[email protected]]
Sent: Thursday, March 27, 2014 4:34 PM
To: [email protected]
Subject: Impute invalid data?

I'm writing with a question about a small sample longitudinal study where the 
main outcome variable is level of C-reactive protein (CRP) measured at age 30 
(which is the most recent time point for which data were collected). Typically, 
scores above a certain level are thrown out as invalid because the high level 
often indicates that the person has an infection.

The folks who have been doing the main data analysis simply dropped cases with 
unacceptably high CRP levels. However, my sense is that a better strategy might 
be to simply score such participants' CRP scores as missing, and then conduct 
analyses with multiple imputation. I suggested this approach, and the main 
analyst stated that it "seems odd to me to impute CRP values for people in the 
first place, but to impute values for participant who actually had values that 
we then discarded and are now imputing seems even weirder. Maybe statistically 
there's no issue with doing that, but conceptually it just seems odd."

I'm writing to see what folks on this list think. I'm certainly open to 
arguments for listwise deletion, but I'm not currently seeing a reason to do so 
given that all other data from the "high CRP" participants at earlier time 
points appear to be valid. Thanks in advance for your thoughts!
Jon

--
***Please note change of email to [email protected]<mailto:[email protected]>***

Jonathan Mohr
Assistant Professor
Department of Psychology
Biology-Psychology Building
University of Maryland
College Park, MD 20742-4411

Office phone: 301-405-5907
Fax: 301-314-5966
Email: [email protected]<mailto:[email protected]>
