Pat:
I am not even sure that you SHOULD be imputing the "occupational prestige"
variable for someone who is unemployed. Is this truly a case of "missing"
data, or is this variable "undefined" for unemployed persons?
If the latter, you should assign a 0 to occupational prestige for all
unemployed persons. With no missing data, you can then fit the model

g(mu) = b0 + b1*EMPLOY + b2*OCCPREST

with

EMPLOY = 0 (unemployed) or 1 (employed)

OCCPREST = 0, if EMPLOY = 0
or
OCCPREST = values from your standard continuous scale, if EMPLOY = 1

The coefficient on OCCPREST (b2) is then interpreted as the effect of
occupational prestige, given that a person is employed.
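
A minimal sketch of this coding scheme, assuming an identity link and
made-up toy values (the function name design_matrix and all numbers are
hypothetical, not taken from Pat's data). It builds the columns
[1, EMPLOY, OCCPREST] with OCCPREST forced to 0 for the unemployed and
recovers the coefficients by least squares:

```python
import numpy as np

def design_matrix(employ, occprest):
    """Columns [1, EMPLOY, OCCPREST], with OCCPREST set to 0 when EMPLOY == 0."""
    employ = np.asarray(employ, dtype=float)
    occprest = np.asarray(occprest, dtype=float)
    # Prestige is undefined for the unemployed, so it is coded as 0 there.
    occ = np.where(employ == 1, occprest, 0.0)
    return np.column_stack([np.ones_like(employ), employ, occ])

# Hypothetical toy data: NaN marks prestige that is undefined (unemployed).
employ = np.array([0, 0, 1, 1, 1, 1])
occprest = np.array([np.nan, np.nan, 30.0, 40.0, 50.0, 60.0])
X = design_matrix(employ, occprest)

# Outcome generated without noise from b0 = 1, b1 = 2, b2 = 0.1.
y = X @ np.array([1.0, 2.0, 0.1])

# Least-squares fit of g(mu) = b0 + b1*EMPLOY + b2*OCCPREST (identity link).
b, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Under this coding the fitted mean is b0 for an unemployed person and
b0 + b1 + b2*OCCPREST for an employed person, so b1 on its own compares an
unemployed person with an employed person extrapolated to prestige 0.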
With missing data, I would first impute EMPLOY (from other vars but not
OCCPREST), and then use the observed and imputed data on EMPLOY (and the
other vars) to impute OCCPREST:
-- when EMPLOY = 0, impute OCCPREST = 0, with probability 1 (i.e., set to 0,
irrespective of anything else);
-- when EMPLOY = 1, impute OCCPREST according to standard rules.
Here are the more specific steps that I suggest:
1) For all cases where EMPLOY is observed and equal to 0 (unemployed), set
OCCPREST = 0
2) For all cases where EMPLOY = MISS, impute EMPLOY (using all other
variables but not OCCPREST)
3) For cases w/ imputed value 0 for EMPLOY (unemployed), set OCCPREST = 0
4) For cases w/ imputed value 1 for EMPLOY (employed), impute OCCPREST
(using all other variables, but not EMPLOY). Do this second imputation step
only from the subset of data where EMPLOY = 1 (whether observed or imputed).
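
The four steps above can be sketched for a single wave as follows. This is
a rough illustration under stated assumptions, not a substitute for PROC MI:
the function name impute_once and the toy data are hypothetical, step 2 uses
a Bernoulli draw from the observed employment rate within levels of one
binary covariate as a stand-in for a proper logistic model, and step 4 uses
point estimates from a linear regression plus a random residual (a proper
multiple imputation would also draw the regression parameters).

```python
import numpy as np

def impute_once(employ, occprest, covars, rng):
    """Return one imputed (EMPLOY, OCCPREST) pair of arrays. NaN marks missing.

    covars is an (n, p) array of fully observed covariates whose first
    column is a 0/1 variable (an assumption of this sketch).
    """
    employ = np.array(employ, dtype=float)   # copies, inputs stay untouched
    occ = np.array(occprest, dtype=float)
    covars = np.asarray(covars, dtype=float)

    # Step 1: observed unemployed -> OCCPREST = 0.
    occ[employ == 0] = 0.0

    # Step 2: impute missing EMPLOY without using OCCPREST.
    miss_e = np.isnan(employ)
    obs_e = ~miss_e
    for g in (0.0, 1.0):
        in_g = covars[:, 0] == g
        p = employ[obs_e & in_g].mean()      # observed employment rate
        idx = miss_e & in_g
        employ[idx] = rng.random(idx.sum()) < p

    # Step 3: imputed unemployed -> OCCPREST = 0.
    occ[employ == 0] = 0.0

    # Step 4: impute OCCPREST only within the EMPLOY = 1 subset.
    emp = employ == 1
    fit = emp & ~np.isnan(occ)               # employed, prestige observed
    need = emp & np.isnan(occ)               # employed, prestige missing
    X = np.column_stack([np.ones(len(occ)), covars])
    beta, *_ = np.linalg.lstsq(X[fit], occ[fit], rcond=None)
    resid = occ[fit] - X[fit] @ beta
    sigma = resid.std(ddof=X.shape[1]) if fit.sum() > X.shape[1] else 0.0
    occ[need] = X[need] @ beta + rng.normal(0.0, sigma, need.sum())
    return employ, occ

# Hypothetical toy data for one wave.
rng = np.random.default_rng(0)
n = 200
sex = rng.integers(0, 2, n).astype(float)
age = rng.uniform(20, 60, n)
employ = (rng.random(n) < 0.8).astype(float)
occ = np.where(employ == 1, 20 + 0.5 * age + rng.normal(0, 5, n), np.nan)
employ[rng.random(n) < 0.15] = np.nan        # EMPLOY missing for some cases
occ[np.isnan(employ)] = np.nan               # ... and OCCPREST with it

covars = np.column_stack([sex, age])
imputations = [impute_once(employ, occ, covars, rng) for _ in range(5)]
```

Each element of imputations is one completed data set in which OCCPREST is
exactly 0 wherever EMPLOY (observed or imputed) is 0.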
Hopefully, if I am way off, someone will correct me.
Best,
Constantine
At 03:22 PM 10/5/2006, Patrick Malone wrote:
>Good afternoon.
>
>I'm running what I thought was a straightforward imputation problem in SAS
>PROC MI. The dataset includes age, sex, and dichotomous race (none
>missing), 6 predictors with some missing (4 dichotomies, not grossly
>unbalanced, two reasonably distributed continua), unemployment at 6 waves
>(dichotomies, some missing), and occupational prestige at 6 waves
>(continuous, well-distributed, missing wherever unemployment is a yes).
>
>Everything looks fine except the plots for the occupational prestige
>variables -- all the plots for the dichotomies look good. This is with a
>large number of burn-in iterations (10000) and iterations between
>imputations (5000).
>
>I'm sure most people here know more about what I'm doing than I do, so I'd
>appreciate any advice.
>
>Thanks,
>Pat Malone
>
>--
>Patrick S. Malone, Ph.D., Research Scientist
>Duke University Center for Child and Family Policy
>http://fds.duke.edu/db/aas/PublicPolicy/faculty/malone
>http://childandfamilypolicy.duke.edu/
>Yahoo Messenger: patricksmalone AOL Instant Messenger: pat2048
The documents accompanying this transmission may contain confidential
health or business information. This information is intended for the use of
the individual or entity named above. If you have received this information
in error, please notify the sender immediately and arrange for the return
or destruction of these documents.
________________________________________________________________
Constantine Daskalakis, ScD
Assistant Professor,
Thomas Jefferson University, Division of Biostatistics
***** NEW ADDRESS *****
1015 Chestnut St., Suite M100, Philadelphia, PA 19107
Tel: 215-955-5695
Fax: 215-503-3804
Email: [email protected]
From malone.ps <@t> gmail.com Wed Oct 11 10:13:44 2006
From: malone.ps <@t> gmail.com (Patrick Malone)
Date: Wed Oct 11 10:14:11 2006
Subject: [Impute] Bad ACF and time series plots
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
Message-ID: <[email protected]>
Thanks, all. I apologize for the delay in responding.
I'll try removing the unemployment variable. However, there are cases that
are missing occupational prestige and unemployment, so they *might* have a
job. I'd welcome suggestions there.
I am planning to analyze with a two-part model, thanks.
Pat
--
Patrick S. Malone, Ph.D., Research Scientist
Duke University Center for Child and Family Policy
http://fds.duke.edu/db/aas/PublicPolicy/faculty/malone
http://childandfamilypolicy.duke.edu/
Yahoo Messenger: patricksmalone AOL Instant Messenger: pat2048
From pmiller <@t> ohtn.on.ca Thu Oct 12 09:02:20 2006
From: pmiller <@t> ohtn.on.ca (Paul Miller)
Date: Thu Oct 12 09:06:11 2006
Subject: [Impute] Lack of between imputation variance in imputing HIV drug
start and stop dates (and drug durations)
Message-ID: <[email protected]>
Hello Everyone,
A while ago, I wrote for advice about how to go about imputing start and
stop dates for HIV drugs. I got several helpful replies, and have since
gotten IVEWARE to impute the dates within known minimum and maximum
values. I've also tried imputing the start date or stop date along with
the duration, but have not always been able to get the values within
specified bounds. (I need to subsequently code for combination
therapies. So whether I impute starts and stops or duration plus start
and then calculate the stop based on the two, I ultimately need to come
up with estimates of when the patient started and stopped taking each
drug).
When I initially wrote about this, Frank Harrell suggested that there
might be problems with imputed values tending to fall at the boundary
constraints for the start date and the stop date. Craig Newgard then
indicated that IVEWARE actually typically imputes values that are toward
the middle of boundary constraints. In some ways, it seems they were
both right. When the boundaries encompass a narrow range (e.g. less than
or equal to 1 month) the values show a strong tendency to bunch up
around the boundary constraints. In fact, there are many cases where all
of my imputed values are either at the lower or upper boundary yielding
zero between-imputation variance for a particular drug. When the
boundaries encompass a relatively wide range, values tend to bunch up a
little toward the middle of the constraints but it doesn't look too bad.
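
To make the zero-variance symptom concrete, here is a small sketch. It is
purely illustrative and is not a description of how IVEWARE enforces bounds
internally. It assumes a normal predictive distribution whose mean lies
outside a narrow window (all numbers are hypothetical, in days), and
compares clamping out-of-range draws to the bounds with drawing from the
normal truncated to the window by inverting the CDF. The between-imputation
variance of one imputed value is taken here as the sample variance across
the m draws.

```python
import random
from statistics import NormalDist, variance

def clamped_draw(mu, sigma, lo, hi, rng):
    """Draw from N(mu, sigma) and clamp to [lo, hi]; outside mass lands on the bounds."""
    return min(max(rng.gauss(mu, sigma), lo), hi)

def truncated_draw(mu, sigma, lo, hi, rng):
    """Draw from N(mu, sigma) truncated to [lo, hi] via the inverse CDF."""
    dist = NormalDist(mu, sigma)
    u = rng.uniform(dist.cdf(lo), dist.cdf(hi))
    # Guard against rounding pushing the result a hair outside the bounds.
    return min(max(dist.inv_cdf(u), lo), hi)

rng = random.Random(1)
mu, sigma = 100.0, 60.0   # hypothetical predictive mean and SD (days)
lo, hi = 10.0, 40.0       # hypothetical one-month window below the mean
m = 5                     # number of imputations

clamped = [clamped_draw(mu, sigma, lo, hi, rng) for _ in range(m)]
truncated = [truncated_draw(mu, sigma, lo, hi, rng) for _ in range(m)]

# Sample variance across the m imputations of this one value.
b_clamped = variance(clamped)
b_truncated = variance(truncated)
```

With these numbers only about 9% of the untruncated normal mass falls
inside the window, so clamped draws land on a bound most of the time and
can easily all coincide, whereas truncated draws stay spread inside it.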
So I was hoping that someone might know how to fix this situation.
Because we are a SAS shop, I've been using IVEWARE. But at this point,
I'd be happy to find a solution that works in IVEWARE or any other
program. I've been working on this for a few months, but imputing
appropriate start and stop dates for HIV medications is a problem that
has plagued my organization for ten years. We've simply never had a
good way of doing it, and that has hampered our research efforts. So,
being as close as I am now, I'd like to find some way to see this
through.
I've been thinking of an alternative way of doing this that involves
coding combination therapies based on a combination of complete and
partial start and stop information for their constituent drugs. If I
were to take this approach, I could presumably create known start and
stop dates for regimens where all of the constituent drugs' start and
stop dates are known. I could also create upper and lower bounds for
regimens where some or all of the constituent drugs' start and stop dates
are partially observed. But the data manipulation involved seems like it
would be very complex and possibly prone to error, so I'd like to
avoid that if I can.
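
A rough sketch of the bound bookkeeping for this alternative, under one
loudly stated assumption: a regimen is taken to be the period during which
all of its constituent drugs are being taken, so it starts at the latest
constituent start and stops at the earliest constituent stop. The function
name regimen_bounds, the data layout, and the dates are hypothetical. A
fully known date is written as a pair with equal lower and upper bounds.

```python
from datetime import date

def regimen_bounds(drugs):
    """Return ((start_lo, start_hi), (stop_lo, stop_hi)) for a regimen.

    drugs is a list of dicts with keys 'start' and 'stop', each holding a
    (lower, upper) pair of dates for that drug.
    """
    # The latest start lies between the max of the lower bounds and the
    # max of the upper bounds.
    start_lo = max(d["start"][0] for d in drugs)
    start_hi = max(d["start"][1] for d in drugs)
    # The earliest stop lies between the min of the lower bounds and the
    # min of the upper bounds.
    stop_lo = min(d["stop"][0] for d in drugs)
    stop_hi = min(d["stop"][1] for d in drugs)
    return (start_lo, start_hi), (stop_lo, stop_hi)

# Hypothetical two-drug regimen with partially observed dates.
drug_a = {
    "start": (date(2001, 3, 15), date(2001, 3, 15)),  # start fully known
    "stop": (date(2002, 1, 1), date(2002, 12, 31)),   # only the year known
}
drug_b = {
    "start": (date(2001, 3, 1), date(2001, 3, 31)),   # only the month known
    "stop": (date(2002, 6, 30), date(2002, 6, 30)),   # stop fully known
}
start_bounds, stop_bounds = regimen_bounds([drug_a, drug_b])
```

When every constituent date is known, the lower and upper bounds coincide
and the regimen dates are known as well; otherwise the resulting pairs could
serve as the minimum and maximum values for a bounded imputation.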
Thanks,
Paul
Paul J. Miller, Ph.D.
Research Scientist and Statistician
Ontario HIV Treatment Network
1300 Yonge St., Suite 308
Toronto, Ontario M4T 1X3
Phone: (416) 642-6486 ext 232
Fax: (416) 640-4245
From malone.ps <@t> gmail.com Mon Oct 16 14:34:01 2006
From: malone.ps <@t> gmail.com (Patrick Malone)
Date: Mon Oct 16 14:34:06 2006
Subject: [Impute] Bad ACF and time series plots
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
Message-ID: <[email protected]>
I ended up going with a different strategy -- instead of controlling for
early prestige, when there was a lot of unemployment, I just controlled for
early unemployment, which had nearly complete data.
Thanks for the suggestions, everyone.
Pat
--
Patrick S. Malone, Ph.D., Research Scientist
Duke University Center for Child and Family Policy
http://fds.duke.edu/db/aas/PublicPolicy/faculty/malone
http://childandfamilypolicy.duke.edu/
Yahoo Messenger: patricksmalone AOL Instant Messenger: pat2048