Clem McDonald wrote:

>What Ed has described is the proper way to bet on the number that is not there
>when you have to for statistical purposes.
>
>You have to assume some complete date for the purpose of the statistics.
>Taking the mean date for the month or the mean month for the year is the best
>you can do. (We have used the first of the month and the first of the year, and
>that DOES produce a bias.)
>
Right, so we have followed the first approach (the mean date) in the model:

day missing from date => synthesize 15 / MM / yyyy
day and month missing from date => synthesize 30 / JUN / yyyy
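
As a rough sketch of the two rules above (Python is used purely for
illustration; synthesize_date and its parameters are hypothetical names,
not part of the model):

```python
from datetime import date
from typing import Optional


def synthesize_date(year: int,
                    month: Optional[int] = None,
                    day: Optional[int] = None) -> date:
    """Return a complete date for a possibly partial one.

    Hypothetical helper: a missing month yields 30 June, a missing day
    yields the 15th of the known month, as in the rules above.
    """
    if month is None:
        # Day and month missing (a day without a month is treated the same).
        return date(year, 6, 30)
    if day is None:
        # Only the day is missing.
        return date(year, month, 15)
    # Nothing is missing, so the date is returned unchanged.
    return date(year, month, day)
```

So synthesize_date(1976) gives 30 June 1976 and synthesize_date(1976, 7)
gives 15 July 1976.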

We also added the function

possible_dates:INTERVAL<DATE>

For a missing day => synthesize {1 / MM / yyyy - days_in_month(MM) / MM / yyyy}
For a missing month => synthesize {1 / JAN / yyyy - 31 / DEC / yyyy}
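
A minimal sketch of what possible_dates might look like, assuming the
interval is represented as a simple (earliest, latest) pair rather than
the model's INTERVAL<DATE> type; the signature is my assumption, and
Python's calendar.monthrange stands in for days_in_month:

```python
import calendar
from datetime import date
from typing import Optional, Tuple


def possible_dates(year: int,
                   month: Optional[int] = None,
                   day: Optional[int] = None) -> Tuple[date, date]:
    """Return the (earliest, latest) dates a partial date could denote."""
    if month is None:
        # Month unknown: the whole year is possible.
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        # Day unknown: the whole month is possible.
        # monthrange returns (weekday of the 1st, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    # Fully known date: the interval collapses to a single day.
    exact = date(year, month, day)
    return exact, exact
```

For example, possible_dates(1976, 2) runs from 1 February to 29 February
1976, since 1976 was a leap year.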

These functions can be used to catch the fuzzy dates when querying. I 
guess there will be a skew when multiple successive queries are run on 
the same data, because a fuzzy date's interval can overlap more than one 
query range and so be counted more than once. There must be a better way 
to do this. Methods I can think of include:

- assigning a random date to each fuzzy date. This is hard, because as 
more dates are added to the system, you have to keep monitoring them to 
be sure that you are still setting the fuzzy ones to a truly random value.

- using the interval method above, but when doing a series of queries, 
remembering when you already have matched an item, to prevent double 
inclusion.
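
The second method could be sketched roughly as follows. This is only an
illustration under my own assumptions: records are (id, (earliest,
latest)) pairs, each query is a (start, end) date window, and
run_query_series and overlaps are hypothetical names.

```python
from datetime import date


def overlaps(interval, window):
    """True if two closed date ranges share at least one day."""
    return interval[0] <= window[1] and window[0] <= interval[1]


def run_query_series(records, windows):
    """Run the windows in order, matching each record at most once.

    records: iterable of (record_id, (earliest, latest)) pairs
    windows: list of (start, end) pairs, one per query
    Returns one list of matched record ids per window.
    """
    seen = set()  # ids already matched by an earlier query in the series
    results = []
    for window in windows:
        hits = []
        for rec_id, interval in records:
            if rec_id not in seen and overlaps(interval, window):
                seen.add(rec_id)
                hits.append(rec_id)
        results.append(hits)
    return results
```

Note that this always gives a fuzzy date to the first query it overlaps, 
so it removes the double inclusion but still leaves a bias towards 
whichever query runs first.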

Do Regenstrief or Duke have any method for making fuzzy dates match queries?

- thomas beale

>
>
>As it turns out, the Social Security Administration and the tumor registry
>systems do the same thing (that Ed recommends) when the specifics are not
>known.
>
>Thomas Beale wrote:
>
>>William E Hammond wrote:
>>
>>>Time to weigh in on fuzzy dates.  We have been using fuzzy dates at Duke
>>>and in TMR since the early 70s for just the reason Sam states.  Often
>>>patients will know only the year, more frequently the month and year only
>>>but no date.  We discover that partial data is much more useful than no
>>>data.
>>>
>>>So we used fuzzy dates.  The fuzzy dates are displayed with ?? for the
>>>unknown parts.  Whenever we sort, a fuzzy day sorts to the 15th of the
>>>month, and a fuzzy year sorts to July.
>>>
>>Ed, presumably you meant "a fuzzy month". This is the design we have
>>used, so that's encouraging (when can we install it at Duke?-).
>>
>>>Statisticians are generally unhappy
>>>with fuzzy dates and want to throw them out.
>>>
>>I am not convinced that the statistical arguments are so great - I can
>>see that there would be a skew towards things that happen more often on
>>the 15th of the month, due to the day-less dates in the system, but I
>>can't think of any clinical research that would be looking at that. Are
>>there any studies on the dangers of fuzzy dates in statistical analysis?
>>
>>>But everyone seems happy
>>>when someone records the date of onset for hypertension as July 4, 1976.
>>>Where are the hour, minutes and seconds?  I argue that fuzzy dates are
>>>acceptable and valid data points and should be used in statistical
>>>analysis.
>>>
>>>In a datetime stamp, unknowns are stored as 00.  Thank goodness, we use
>>>another symbol for a totally unknown date.
>>>
>>>Ed Hammond
>>>
>>- thomas beale
>>
>>-
>>If you have any questions about using this list,
>>please send a message to d.lloyd at openehr.org
>>
>
>--
>Director, Regenstrief Institute
>Regenstrief Professor of Medical Informatics
>Distinguished Professor of Medicine
>Indiana University School of Medicine
>1050 Wishard Blvd  RG5
>Indianapolis  IN  46202-2872
>Phone:  317/630-7070
>Fax:  317/630-6962
>URL:  www.regenstrief.org
>
>
>
>

-- 
..............................................................
Deep Thought Informatics Pty Ltd  

mailto:thomas at deepthought.com.au
open EHR - http://www.openEHR.org
Archetype Methodology - http://www.deepthought.com.au/it/archetypes.html
Community Informatics - http://www.deepthought.com.au/ci/rii/Output/mainTOC.html
..............................................................



