[original post from Prof John Roddick, Flinders University South 
Australia, which failed to get through]

> At 3:43 PM +1000 17/6/02, Thomas Beale wrote:
>
>> One thing to be clear on - we must differentiate between "not 
>> recorded" and "not there". Not recording someone's weight does not 
>> make them "weightless" (don't worry I understood the joke, but this 
>> is a serious point as well). A better example would be - not 
>> recording smoking status doesn't make the patient a non-smoker.
>
> I've been following this discussion with some interest.  Apart from 
> Sam's valuable contribution to this, you might want to refer to Simon 
> Parsons' paper:

> Parsons, S., 1996. Current approaches to handling imperfect 
> information in data and knowledge bases. IEEE Transactions on 
> Knowledge and Data Engineering 8 (3): 353-372.

> in which he identifies five types of imperfection in data.  Namely:

> 1.  Incomplete.  (eg. test results not known or qualified as in 
> "interim results only")

> 2.  Imprecise.  (eg. age "between 25 and 30" etc.).  This arises from 
> a lack of granularity.

> 3.  Vague.  (eg. blood pressure "high", smokes "a lot", pain "acute", 
> etc.)   This arises from the use of fuzzy terms.

> 4.  Uncertain.  (eg. a 95% chance of accuracy).  Arises from a lack of 
> knowledge or subjective assessment.

> 5.  Inconsistent.  (ie. contradictory information).

> To that you can add a sixth:

> 6.  Out-of-date.  (ie. correct when stored but unlikely to be true now).

> These can, of course, be combined!
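[Editorial sketch, not part of the original post: one way to represent the six categories above so that they can be combined on a single value. The class and member names are hypothetical and chosen only for illustration.]

```python
from enum import Flag, auto


class Imperfection(Flag):
    """The five kinds of imperfection from Parsons (1996) plus out-of-date.

    The names here are illustrative assumptions, not taken from GEHR.
    """
    NONE = 0
    INCOMPLETE = auto()
    IMPRECISE = auto()
    VAGUE = auto()
    UNCERTAIN = auto()
    INCONSISTENT = auto()
    OUT_OF_DATE = auto()


# A Flag lets categories be combined with "|", e.g. an interim test
# result that is also old enough to be doubtful.
flags = Imperfection.INCOMPLETE | Imperfection.OUT_OF_DATE

# Membership tests check whether one category is part of the combination.
has_incomplete = Imperfection.INCOMPLETE in flags
has_vague = Imperfection.VAGUE in flags
```

A flag type is used here rather than a plain enumeration because the post notes that the categories are not mutually exclusive.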

> Incompleteness has traditionally been handled in databases with the 
> null value. In my opinion this has been totally inadequate but that 
> doesn't stop it being the only option available in most systems.  
> Imprecision and uncertainty are often handled through coercion to the 
> nearest value, with all the problems that might cause, and vagueness and 
> inconsistency are often not handled at all.  Out-of-date-ness is 
> handled by assuming it doesn't happen.

> For the purposes of GEHR, I would suggest that No. 5, inconsistent 
> data, is a fact of life, and since it is somewhat different (it 
> requires two pieces of information, for example) we should leave 
> this category to constraint handling and expert interpretation.  
> However, I would suggest we need to find a way of handling the other 
> five.  It's not initially clear how, though.  Perhaps a qualifying field 
> for each critical value?
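[Editorial sketch, not part of the original post: a minimal illustration of what a "qualifying field" attached to a value might look like. Every name below (QualifiedValue, status, value_range, term, confidence, recorded_on) is a hypothetical assumption, not a GEHR or openEHR structure. Inconsistency is left out, since the post assigns it to constraint handling.]

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class QualifiedValue:
    """A value plus qualifiers for five of the six imperfection types."""
    value: Optional[float] = None        # exact value, if one was recorded
    status: str = "not_recorded"         # incomplete: "recorded", "interim" or "not_recorded"
    value_range: Optional[Tuple[float, float]] = None  # imprecise: e.g. (25, 30)
    term: Optional[str] = None           # vague: e.g. "high", "a lot"
    confidence: Optional[float] = None   # uncertain: e.g. 0.95
    recorded_on: Optional[date] = None   # out-of-date: when the value was stored

    def is_known(self) -> bool:
        # "Not recorded" is an explicit state, distinct from any real value.
        return self.status != "not_recorded"

    def is_stale(self, today: date, max_age_days: int) -> bool:
        # Assumption: an undated value is conservatively treated as stale.
        if self.recorded_on is None:
            return True
        return (today - self.recorded_on).days > max_age_days


# Smoking status never recorded: this does not make the patient a non-smoker.
smoking = QualifiedValue()

# Age known only as "between 25 and 30", stored on the date of this thread.
age = QualifiedValue(status="recorded", value_range=(25.0, 30.0),
                     recorded_on=date(2002, 6, 17))
```

Here smoking.is_known() is False, while age.is_known() is True. This keeps "not recorded" separate from "not there", which a bare null cannot do.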

> If this is seen as important by others, I'll put my mind to thinking 
> it through.

> My two cents worth...

> John.

> --

> Professor John Roddick

> Knowledge Discovery and Management Laboratory

> Flinders University * Adelaide * South Australia

> ----

> Ph: +61 8 8201 5611  Fax: +61 8 8201 3626  Mobile: 0414 190 073

> URL: http://kdm.first.flinders.edu.au/

> Email: roddick at cs.flinders.edu.au



