My thanks to all who responded to my query.



On Feb 27, 2005, at 8:58 AM, Alan Zaslavsky wrote:

There are a couple of ways of looking at this, and it pays to think a bit beyond the definitions to the analyses that would be required. I agree that your first scenario is MCAR. In the second scenario, missingness is clustered by month. An analysis that ignored the clustering (by ignoring the dependence of missingness on the month labels) would likely be wrong, especially if you were interested in inference beyond the sampled months. However, we sometimes make a distinction between design variables (those known before data are collected) and data; with that distinction, the month labels are design variables, but the missingness is independent of both observed and unobserved data values.

In any case the bottom line is that an analysis that didn't take into account the fact that data are only collected in some months would most likely be incorrect in some way.
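The point about ignoring the month clustering can be illustrated with a small simulation. This is only a sketch under invented assumptions: the data are hypothetical (a normally distributed month effect plus individual noise, both with standard deviation 1), and the names `month_effect`, `cluster_sample_mean`, `empirical_sd` and `avg_naive_se` are made up for the illustration. It repeatedly draws 7 of the 36 months and compares the actual spread of the sample mean with the standard error one would compute by treating the 700 observations as independent.

```python
import random
import statistics

rng = random.Random(1)  # arbitrary fixed seed for reproducibility
N_PEOPLE, N_MONTHS, K = 100, 36, 7

# Hypothetical data: each month has its own effect, plus individual noise.
month_effect = [rng.gauss(0, 1) for _ in range(N_MONTHS)]
y = [[month_effect[m] + rng.gauss(0, 1) for _ in range(N_PEOPLE)]
     for m in range(N_MONTHS)]

def cluster_sample_mean():
    """Sample K whole months; return the sample mean and a naive SE."""
    months = rng.sample(range(N_MONTHS), k=K)
    values = [v for m in months for v in y[m]]
    # Naive SE: treats all 700 values as independent draws.
    naive_se = statistics.stdev(values) / len(values) ** 0.5
    return statistics.fmean(values), naive_se

draws = [cluster_sample_mean() for _ in range(500)]
empirical_sd = statistics.stdev([mean for mean, _ in draws])
avg_naive_se = statistics.fmean([se for _, se in draws])

print(f"spread of the mean across repeated month samples: {empirical_sd:.3f}")
print(f"average naive standard error:                     {avg_naive_se:.3f}")
```

Under these assumptions the naive standard error comes out well below the actual sampling variability of the mean, because the effective sample size is closer to the 7 sampled months than to the 700 observations. How large the gap is in real data depends on how much the events actually vary from month to month.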

Date: Fri, 25 Feb 2005 15:39:12 -0700
From: Melissa Roberts <[EMAIL PROTECTED]>
Subject: [Impute] MCAR & MAR assessment
I have monthly event data that covers many individuals over several years. For discussion purposes say I have 100 people and 36 months of data for them, so I have 3600 observations. Events are not consistent from month to month, but there is some consistency in events across months at an individual level.
If I randomly sample from those 3600 observations - for example take 20% using a uniform random number generator - then I am confident in saying the unsampled data can be characterized as MCAR - missing completely at random. The mechanism for being unsampled has nothing to do with variables in the data.
NOW, another sampling method is to randomly sample the MONTHS in the dataset. I take a 20% sample of the months (producing 7 months), and take all the people represented in those months (100 each month), for a total of 700 observations.
Can I assert that this second sample is also MCAR? The mechanism for not being sampled is based solely on a random number generator.
Is the fact that some months are not represented a problem? Would it be just MAR because the nature of the events in the unsampled months could be different from that in the sampled months? Would characterizing it as MAR be a problem also?
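The two sampling schemes described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the seed, the variable names (`scheme1`, `scheme2`, `sampled_months`) and the helper `per_month_counts` are assumptions made for the example. It only builds the sets of sampled (person, month) cells so that the pattern of unsampled data can be inspected.

```python
import random

N_PEOPLE, N_MONTHS = 100, 36
rng = random.Random(0)  # arbitrary fixed seed for reproducibility

# Every (person, month) cell in the 100 x 36 panel: 3600 observations.
cells = [(p, m) for p in range(N_PEOPLE) for m in range(N_MONTHS)]

# Scheme 1: sample 20% of the individual observations uniformly at random.
scheme1 = set(rng.sample(cells, k=len(cells) // 5))

# Scheme 2: sample 7 whole months, then keep every person in those months.
sampled_months = set(rng.sample(range(N_MONTHS), k=7))
scheme2 = {(p, m) for (p, m) in cells if m in sampled_months}

def per_month_counts(sample):
    """Count the sampled observations falling in each month."""
    counts = [0] * N_MONTHS
    for _, m in sample:
        counts[m] += 1
    return counts

print("scheme 1:", len(scheme1), "observations; per-month counts:",
      per_month_counts(scheme1))
print("scheme 2:", len(scheme2), "observations; per-month counts:",
      per_month_counts(scheme2))
```

In the second scheme every month is either fully sampled (100 people) or entirely absent, so whether an observation is missing is completely determined by its month label, even though the months themselves were chosen by a random number generator.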



_______________________________________________ Impute mailing list Impute@lists.utsouthwestern.edu http://lists.utsouthwestern.edu/mailman/listinfo/impute

_________________________________

Melissa H. Roberts
Energy, Economic and Environmental Consultants
E3c, Inc.
5600 Wyoming Blvd. NE, Suite 225
Albuquerque, NM  87109
(505) 822-9760 (voice)
(505) 822-9762 (fax)
[EMAIL PROTECTED]
__________________________________


