Hello all,

I have a question regarding the use of a second round of imputation.  If a
set of data has already been imputed (say, with single-point imputation),
and more cases are then added to the database (perhaps because the initial
imputation was conducted for a midpoint analysis), what should be done for
the additional analyses?  For example, should the originally imputed values
remain, with missing data estimated only for the new cases, or should all
missing values be estimated (including a re-estimation for the original
cases)?

 

Many thanks,

Jason

**************************************************************
Jason C. Cole, PhD
Senior Statistician
Department of Psychiatry and Biobehavioral Sciences
Cousins Center for Psychoneuroimmunology
300 UCLA Medical Plaza, Room 3148
Los Angeles, CA 90095-7057
Tel: 310 267 4390
FAX: 310 794 9247
E-mail: [email protected]
http://www.cousinspni.org
**************************************************************

 

From TRobert.Harris <@t> UTSouthwestern.edu  Thu Apr  8 16:36:36 2004
From: TRobert.Harris <@t> UTSouthwestern.edu (TRobert Harris)
Date: Sun Jun 26 08:25:01 2005
Subject: [Impute] Fwd: calculate risk score before or after imputation?
Message-ID: <[email protected]>

Submitted on behalf of Bill Howells:

>>> "Howells, William" <[email protected]> 4/8/2004 4:16:54 PM
>>>
We have a regression model that uses a risk score calculated as a weighted
sum of 10 prognostic variables.  The weights were obtained from a regression
analysis on another sample.  In my sample of n=800, one or more of the 10
individual variables is missing in 40% of the patients, so the risk score is
missing for 40% of them.  I calculated the risk score on the observed data
and then imputed the missing risk scores using the 10 variables, in addition
to the other variables in the analysis.  Later, I found out that
statisticians at our data center at another institution had imputed the 10
variables and then calculated the risk score on the imputed data.  The
combined inferences from the two approaches are not wildly different, e.g.,
a hazard ratio of 2.5 (p=0.005) vs. 2.7 (p=0.003).  Is there any theory or
simulation work to help decide which approach is better?
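
To make the two orderings concrete, here is a minimal sketch on toy data.
Everything in it is an assumption for illustration: three prognostic
variables instead of 10, made-up weights, and plain mean imputation standing
in for whatever multiple-imputation model was actually used at either site.
It only shows where the two pipelines can diverge, not which one is better.

```python
# Hypothetical toy data: 3 prognostic variables; None marks a missing value.
WEIGHTS = [0.5, 1.0, 2.0]  # assumed weights, stand-ins for the external-sample weights

patients = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 1.0],
    [3.0, 0.0, 2.0],
    [2.0, None, 4.0],  # one component missing, so the risk score is undefined
]

def risk_score(row):
    """Weighted sum of the prognostic variables; None if any component is missing."""
    if any(v is None for v in row):
        return None
    return sum(w * v for w, v in zip(WEIGHTS, row))

def column_means(rows):
    """Mean of the observed values in each column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return means

def impute_then_score(rows):
    """Fill each missing component first, then compute the score."""
    means = column_means(rows)
    completed = [[means[j] if v is None else v for j, v in enumerate(r)]
                 for r in rows]
    return [risk_score(r) for r in completed]

def score_then_impute(rows):
    """Compute the score where possible, then fill the missing scores directly."""
    scores = [risk_score(r) for r in rows]
    observed = [s for s in scores if s is not None]
    mean_score = sum(observed) / len(observed)
    return [mean_score if s is None else s for s in scores]

scores_a = impute_then_score(patients)
scores_b = score_then_impute(patients)
print(scores_a)
print(scores_b)
```

In this toy case the two orderings agree on the complete cases and differ
only for the incomplete patient: imputing the components keeps that patient's
observed values in the score, while the crude mean fill of the score ignores
them.  A real imputation model that conditions on the observed components
would narrow that gap.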

Bill Howells, MS
Behavioral Medicine Center
Washington University School of Medicine
St Louis, MO

From arlinen <@t> amgen.com  Thu Apr  8 21:54:40 2004
From: arlinen <@t> amgen.com (Nakanishi, Arline)
Date: Sun Jun 26 08:25:01 2005
Subject: [Impute] Adding cases after imputation
Message-ID: <[email protected]>

Seems to me that your answer would depend on what method was used to derive
the imputed value.  In other words, would your imputed values change with
the additional cases?  For example, if you were using a bootstrapping
routine to pick imputed values, the additional cases would change your pool
of possible values.  If so, then I would opt to reimpute for the original
missing values under the presumption that the imputation would be improved.
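
A small sketch of that point, using made-up numbers and simple mean
imputation as an assumed stand-in for the actual method: once cases are
added, both the donor pool and the fill value change, so the original
imputations no longer match what the full data would produce.

```python
def observed_values(values):
    """Observed (non-missing) values; this is also the hot-deck donor pool."""
    return [v for v in values if v is not None]

def mean_fill(values, donors):
    """Replace each missing value with the mean of the donor pool."""
    fill = sum(donors) / len(donors)
    return [fill if v is None else v for v in values]

# Hypothetical data: None marks a missing value.
midpoint_cases = [10.0, 12.0, None, 14.0]
added_cases = [20.0, None, 24.0]

pool_midpoint = observed_values(midpoint_cases)
pool_final = observed_values(midpoint_cases + added_cases)

# Imputation as it stood at the midpoint analysis.
midpoint_imputed = mean_fill(midpoint_cases, pool_midpoint)

# Option 1: keep the original imputations, impute only the new cases.
keep_original = midpoint_imputed + mean_fill(added_cases, pool_final)

# Option 2: re-impute every missing value from the enlarged pool.
reimpute_all = mean_fill(midpoint_cases + added_cases, pool_final)

print(keep_original)
print(reimpute_all)
```

Under option 1 the original missing case keeps a value based on the midpoint
pool only, while the new missing case is filled from the full pool, so the
two imputations rest on different information.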
 
Arline Nakanishi
Amgen Inc.

-----Original Message-----
From: [email protected]
[mailto:[email protected]]on Behalf Of Cole, Jason
Ph.D.
Sent: Wednesday, April 07, 2004 9:32 AM
To: Imputation Listserve ([email protected])
Subject: [Impute] Adding cases after imputation



From finch <@t> epi.umn.edu  Fri Apr  9 17:03:51 2004
From: finch <@t> epi.umn.edu (Emily Finch)
Date: Sun Jun 26 08:25:01 2005
Subject: [Impute] Imputing interactions with time using Proc Mixed
Message-ID: <000001c41e7e$8ab4bdf0$09255...@epifinch>

I would like to impute some missing data on a set of repeated measures
variables for the purpose of conducting a Proc Mixed analysis in SAS that
includes an interaction with the "time" variable.  The "time" variable is
created when the data are structured so that there is one record per
participant-time point.  I understand that because imputation is based on
linear regression, imputed values do not reflect interactions; it is
therefore best to either (1) divide the data and conduct the imputation
separately within each category of one of the variables in the interaction,
OR (2) form the product of the two variables in the interaction and include
this product in the imputation model.  However, the "time" variable does not
exist until I
structure the data so that there is one record per participant-time point,
yet I can't properly impute the variables when they are structured for the
mixed analysis.  How do I get around this problem?
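
For readers unfamiliar with the two layouts, the sketch below makes them
concrete.  The column names (id, group, y1..y3) and the values are
hypothetical.  It does not settle the question; it only shows that "time",
and any product with it, comes into existence at the reshaping step, so
values present in the one-record-per-participant layout carry over unchanged.

```python
# Hypothetical wide layout: one record per participant, one column per occasion.
wide = [
    {"id": 1, "group": 0, "y1": 5.0, "y2": 6.0, "y3": 7.5},
    {"id": 2, "group": 1, "y1": 4.0, "y2": None, "y3": 9.0},  # y2 is missing
]
OCCASIONS = [1, 2, 3]

def wide_to_long(rows):
    """One record per participant-time point; 'time' is created here."""
    long_rows = []
    for r in rows:
        for t in OCCASIONS:
            long_rows.append({
                "id": r["id"],
                "group": r["group"],
                "time": t,
                "y": r[f"y{t}"],
                # Product term for a group-by-time interaction.
                "group_x_time": r["group"] * t,
            })
    return long_rows

long_format = wide_to_long(wide)
for row in long_format:
    print(row)
```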

 

Thank you,

Emily Finch

*********************************************************************
Emily A. Finch, M.A.
Research Coordinator
Division of Epidemiology
University of Minnesota
1100 Washington Ave. S. #201
Minneapolis, MN 55415
(612)625-5895
[email protected]
**********************************************************************

 

From william.dupont <@t> Vanderbilt.Edu  Fri Apr  9 17:23:04 2004
From: william.dupont <@t> Vanderbilt.Edu (Dupont, William)
Date: Sun Jun 26 08:25:01 2005
Subject: [Impute] Strata size for hotdeck multiple imputation
Message-ID: <291ef2d70eb7cb43b642323745ed5111d22...@mailbe09>

Dear Impute Listers,

I am currently doing an analysis with multiply imputed missing values
using a hotdeck approach.  We are using Rubin and Schenker's (1986)
approximate Bayesian bootstrap to do the multiple imputations.  In this
approach, the data are divided into strata such that patients in the same
stratum are similar with respect to the likely value of some covariate of
interest.  Imputed values of this covariate are then drawn from the observed
values of this covariate for patients in the same stratum as the case to be
imputed.
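
As a rough sketch of the two-stage draw within a stratum: the approximate
Bayesian bootstrap first resamples the observed values with replacement, then
draws the imputations with replacement from that resampled pool.  The stratum
names, values, and function names below are hypothetical and not taken from
our analysis.

```python
import random

def abb_impute(observed, n_missing, rng):
    """One approximate Bayesian bootstrap draw for a single stratum."""
    # Stage 1: resample the observed values with replacement (same size).
    donor_pool = rng.choices(observed, k=len(observed))
    # Stage 2: draw the imputed values with replacement from the resampled pool.
    return rng.choices(donor_pool, k=n_missing)

def multiply_impute(strata, m, seed=0):
    """Repeat the ABB draw m times; strata maps name -> (observed, n_missing)."""
    rng = random.Random(seed)
    return [
        {name: abb_impute(obs, n_mis, rng) for name, (obs, n_mis) in strata.items()}
        for _ in range(m)
    ]

# Hypothetical strata: observed covariate values and number of cases to impute.
strata = {
    "low_risk": ([1.2, 1.5, 1.1, 1.8], 2),
    "high_risk": ([3.4, 2.9, 3.8], 1),
}

imputations = multiply_impute(strata, m=5)
for draw in imputations:
    print(draw)
```

With very small strata, stage 1 often yields a pool with few distinct values,
which is one way the stratum-size trade-off shows up in practice.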

I have not been able to find any guidance in the literature as to how
large the strata should be.  Larger strata will produce a better
distribution for the draws, while smaller strata will produce a group of
patients who are more similar to the patient of interest.  Do you have
any recommendation as to how large the strata should be?

With many thanks for any advice you can give me.

Bill

William D. Dupont          phone: 615-322-2001
URL: http://www.mc.vanderbilt.edu/prevmed/facstaff/dupont.htm
