Flavour of null

Koray Atalag Sun, 10 Apr 2005 10:56:27 +0300

Dear All,

I have been quite busy with an ambitious EU FP6 project proposal related to
my thesis work, as most of you are aware: CEREBRUS, which we were not able to
finish by the March 22 deadline. We have decided to move on, wait for the next
calls and look for other funding/support opportunities. Even the proposal
activity is Open Source and hosted at SF.NET:
http://cerebrus-fp6.sourceforge.net

All these messages about "Flavours of Null" are, I think, very useful solution
proposals, and they point to the fact that "Null" is not such an "easy" entity
to handle... Apart from the examples in the clinical lab and quality measures
(I think in the US CLIA 88 regulations), there are many other contexts where
this paradigm is a real problem, such as the Bethesda System 2001 in cervical
smear reporting. I have been working on this since 2001 and have in fact
developed many working information systems (freely available as open source
from SF.NET: http://sourceforge.net/projects/pathos-web/ )

I think we should not add too many
meanings/information/pre-information/contextual information to a single
entity, and we had better separate the levels of:
A) data
B) information
C) knowledge

in approaching the problem of "NULL"...

With "sensible", really "implementable" and "user-friendly" systems in mind,
the approach to the problem has to be restated as below, possibly formulated
as a methodology, and solved as in my proposal:

Problem statement and initial analysis from a historical perspective:
1) From a computational point of view: we should evaluate whether it makes any
"sense" to a computational/information system to carry all the additional
"contextual/information/knowledge" level attributes, i.e. whether they are of
any use or change the "result/response" expected from it. The current
"digital", or I should say "binary", computers do not understand null at the
lowest level: a bit is either "present/positive/one" or
"absent/negative/zero". That is why all early programming and database
systems model a "boolean/bit" data type, and it is very useful for many
systems in many domains.

2) As informatics evolved, real systems had to incorporate the "empty/null"
aspects of data... Two bits are then used to carry this "more than 1 bit" of
information, yet in many cases it is not needed and 1 bit is lost for nothing:
more memory, more storage, etc. Even the Y2K problem originated from such an
approach. This is strongly related to "Fuzzy Systems", on which there is a
vast amount of research and solutions as far as I know. I think in the end we
will probably need to change the very architecture of our computer systems
and storage technologies to handle this: 3-state and continuous/near-analogue
processors and storage schemes.

3) In complex domains such as clinical medicine, we are now also adding the
"Flavours of Null" and considering incorporating them into the very heart of
our models: the Data Types... If this happens, we will be spending a whole
lot of the memory/storage, and eventually the processing performance, of our
"next-generation" systems, as the Americans did with their cars back in the
60s...

Methodology:
1) Analyse the data values expected to be encountered: if they are mostly 0/1,
Yes/No, True/False, Positive/Negative, then do not bother with flavours of
Null; use the "good old" technique and spend 1 bit.
2) If there is a need to attach more information to the value
(information/context), then start with the "essential" flavours of null: the
truly natural features, not the ones we humans created to make this complex
world even more complicated! These are in fact the "context and domain" free
aspects of being "Null" or "Empty", as Grahame Grieve pointed out in his nice
and useful message:
>     * *no information*: No information provided; nothing can be inferred
>       as to the reason why, including whether there might be a possible
>       applicable value or not
>     * *not available* (unknown): A possible value exists but is not
>       provided (ask user)
>     * *masked*: The value has not been provided due to privacy settings
>       (settable by extract / message serialiser)
>     * *not applicable*: No valid value exists for this data item in this
>       context (should be knowable by application)
This can be further discussed and improved in a more "ICT philosophical" way
I believe...

3) Put all the "non-essential" and "contextual" components into a "contextual
archetype-component", as in the solution for "protocols", and employ the
"knowledge" aspects, meaning which information is appropriate in that
particular situation/context. So this is a Knowledge Enabled Contextual
Plug-In or Add-On approach, as I propose on this beautiful Sunday morning! But
they should all map to the "essential" information entities given in
Methodology 2.
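
To make steps 1 and 2 concrete, here is a minimal sketch in Python. The names
NullFlavour and Element and the example fields are my own assumptions for
illustration; they are not taken from the openEHR or HL7 specifications.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class NullFlavour(Enum):
    """The four context-free flavours quoted in step 2."""
    NO_INFORMATION = "no information"
    NOT_AVAILABLE = "not available"
    MASKED = "masked"
    NOT_APPLICABLE = "not applicable"


@dataclass(frozen=True)
class Element(Generic[T]):
    """Hypothetical holder: carries either a value or a null flavour."""
    value: Optional[T] = None
    null_flavour: Optional[NullFlavour] = None

    def __post_init__(self) -> None:
        # Exactly one of the two fields must be filled in.
        if (self.value is None) == (self.null_flavour is None):
            raise ValueError("set exactly one of value or null_flavour")


# Step 1: a field that is always answered stays a plain bool (the 1-bit case).
smoker: bool = False

# Step 2: a field that may be missing uses the essential flavours instead.
potassium = Element(null_flavour=NullFlavour.NOT_AVAILABLE)
sodium = Element(value=140.0)  # made-up reading, for illustration only
```

The point of the check in __post_init__ is that a reader of the record never
has to guess: there is either a value or one of the four reasons, never both
and never neither.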

I would recommend that you all examine carefully how openSDE approaches this
problem in its newer version, developed by the Erasmus team, mainly by Astrid
van Ginneken and Marcel de Wilde... There are many years of practice and
feedback from real clinical use and users behind this work. It is also Open
Source and freely available from SF.NET:
http://sourceforge.net/projects/opensde/

At the end of the day, the information/computational system makes use of the
true/unknown/false aspects of the information provided by the user to make
inferences or just to produce reports, etc.
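
One common way to reason over true/unknown/false is Kleene's three-valued
logic. The sketch below is only an illustration of that idea, assuming Python
and using None to stand for "unknown"; the function names are hypothetical.

```python
from typing import Optional

Tri = Optional[bool]  # True, False, or None meaning "unknown"


def tri_not(a: Tri) -> Tri:
    # Negating an unknown answer is still unknown.
    return None if a is None else (not a)


def tri_and(a: Tri, b: Tri) -> Tri:
    # A single False decides the result, even if the other side is unknown.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def tri_or(a: Tri, b: Tri) -> Tri:
    # A single True decides the result, even if the other side is unknown.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False
```

So an inference such as "finding A and finding B" can still be answered with a
definite False when B is known to be False, whatever flavour of null A carries.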

I also want to point out that the real "Null" concept is a problem of infinity
in math: you can in fact model and represent any value in a continuous number
space between -1, 0 and +1... There is not much debate on the "Flavours of
True/Positive or False/Negative", but these also exist, and in clinical
medicine they should be presented with at least sensitivity/specificity
measures. The current debate on "Flavours of Null" is more complicated, as it
is an essential paradigm that is present in the universe in many interesting
domains. So I think all these concepts have to be approached a little
"philosophically" but solved in a "sensible and practical" way.
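
As a reminder of what those two measures are, here is a minimal sketch in
Python using the standard definitions; the function names are mine and any
counts passed in would come from a comparison against a reference test.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    # Share of truly positive cases that the test reports as positive.
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    # Share of truly negative cases that the test reports as negative.
    return true_neg / (true_neg + false_pos)
```

A reported "Positive" from a test with low specificity is a weaker flavour of
"True" than the same word from a highly specific one, which is the sense in
which True and False have flavours too.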

So my proposal in short is:
1) Examine the possible data values to be expected in a particular field: if
it can be handled with a simple True/False, then assign a bit.
2) If not, then employ the "Essential Flavours of Null", which I believe
should appear as a separate Data Type.
3) If extra contextual information is needed at design time, or will probably
be needed in future (this requires a careful study taking all viewpoints into
consideration: legal, epidemiological, etc.), then assign "Knowledge-Enabled
Contextual Archetype Plug-Ins" and provide mappings to the essential ones.
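
A rough sketch of what the mapping in step 3 might look like, assuming Python.
The class ContextualNullPlugin, the local codes and the fallback rule are all
hypothetical, invented here for illustration; a real plug-in would take its
codes from the relevant archetype.

```python
from enum import Enum
from typing import Dict


class EssentialNull(Enum):
    NO_INFORMATION = "no information"
    NOT_AVAILABLE = "not available"
    MASKED = "masked"
    NOT_APPLICABLE = "not applicable"


class ContextualNullPlugin:
    """Hypothetical plug-in: context-specific codes mapped to essential ones."""

    def __init__(self, context: str, mapping: Dict[str, EssentialNull]) -> None:
        self.context = context
        self._mapping = dict(mapping)

    def to_essential(self, local_code: str) -> EssentialNull:
        # Assumption: an unmapped local code degrades to "no information".
        return self._mapping.get(local_code, EssentialNull.NO_INFORMATION)


# Invented local codes for a cervical smear report, for illustration only.
smear_plugin = ContextualNullPlugin(
    "cervical smear report",
    {
        "specimen unsatisfactory": EssentialNull.NOT_AVAILABLE,
        "withheld on patient request": EssentialNull.MASKED,
    },
)
```

Generic software then only ever needs to understand the four essential values,
while the contextual detail stays available to those who know the plug-in.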

That is my "simple" solution proposal, which I have been thinking deeply
about for some years!

Best regards,

Dr. Koray Atalag
METU Informatics Institute
Ph.D. Candidate on Information Systems

-----Original Message-----
From: [email protected]
[mailto:owner-openehr-technical at openehr.org] On Behalf Of Elkin, Peter L.,
M.D.
Sent: Sunday, April 10, 2005 4:57 AM
To: 'openehr-technical at openehr.org'
Subject: RE: Flavour of null

Sam,

By way of a friendly amendment, I would say that the Information Model of
Null should include both the type of Null and separately the reason for it
being Null as separate attributes (employing an Ontology of Null).  I agree
that Null should be part of the Information Model explicitly rather than a
datatype.  For Example:

Null
        Unknown
        Not available
        Not evaluated
        Insufficient Information
        Result out of Valid Range
        Testing yielded no value

Reason_for_Null
        Lost the Sample
        Specimen destroyed
        Sample Hemolyzed
        Sample Lipemic
        Sample too long in transport
        Specimen Clotted
        Machine Error
        Human Error
        Etc.
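
As a rough illustration of keeping the two attributes separate, here is a
minimal sketch in Python. The class and field names are assumptions made for
this example, and the enumerations simply copy the lists above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class NullType(Enum):
    UNKNOWN = "Unknown"
    NOT_AVAILABLE = "Not available"
    NOT_EVALUATED = "Not evaluated"
    INSUFFICIENT_INFORMATION = "Insufficient Information"
    RESULT_OUT_OF_VALID_RANGE = "Result out of Valid Range"
    TESTING_YIELDED_NO_VALUE = "Testing yielded no value"


class ReasonForNull(Enum):
    LOST_THE_SAMPLE = "Lost the Sample"
    SPECIMEN_DESTROYED = "Specimen destroyed"
    SAMPLE_HEMOLYZED = "Sample Hemolyzed"
    SAMPLE_LIPEMIC = "Sample Lipemic"
    SAMPLE_TOO_LONG_IN_TRANSPORT = "Sample too long in transport"
    SPECIMEN_CLOTTED = "Specimen Clotted"
    MACHINE_ERROR = "Machine Error"
    HUMAN_ERROR = "Human Error"


@dataclass(frozen=True)
class NullInfo:
    """Hypothetical record: the kind of null and, separately, why it is null."""
    null_type: NullType
    reason: Optional[ReasonForNull] = None  # the reason may itself be unstated


no_potassium = NullInfo(NullType.TESTING_YIELDED_NO_VALUE,
                        ReasonForNull.SAMPLE_HEMOLYZED)
```

A query can then filter on null_type alone without knowing anything about the
laboratory-specific reasons.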

Warm regards,

Peter

Peter L. Elkin, MD
Professor of Medicine
Director, Laboratory of Biomedical Informatics
Department of Internal Medicine
Mayo Clinic, College of Medicine
Mayo Clinic, Rochester
(507) 284-1551
Fax: (507) 284-5370

-----Original Message-----
From: [email protected]
[mailto:owner-openehr-technical at openehr.org] On Behalf Of Sam Heard
Sent: Saturday, April 09, 2005 5:56 PM
To: openehr-technical at openehr.org
Subject: Re: Flavour of null

Dear All

OK, I was just checking to see where the detailed reason for a result being
NULL should be, and am happy to build this into the laboratory archetype in
the way Grahame suggests, and to leave the generic Flavour of NULL as Tom
proposes.
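
To show how the two could sit side by side, here is a minimal sketch in
Python. The codes and their hierarchy are copied from the vocabulary Grahame
advances in the quoted message below; the LabResult class, the is_a helper and
the readings are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Child -> parent links from the quoted vocabulary (None marks a top level).
PARENT: Dict[str, Optional[str]] = {
    "NS": None, "HM": "NS", "LP": "NS", "WP": "NS", "INS": "NS",
    "HM1": "HM", "HM2": "HM", "HM3": "HM", "HM4": "HM",
    "LP1": "LP", "LP2": "LP", "LP3": "LP", "LP4": "LP",
    "ERR": None, "AGE": "ERR", "LACC": "ERR", "FAIL": "ERR",
}


def is_a(code: str, ancestor: str) -> bool:
    # Walk up the hierarchy until the ancestor or a top-level code is reached.
    current: Optional[str] = code
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False


@dataclass(frozen=True)
class LabResult:
    analyte: str
    value: Optional[float]
    null_flavour: Optional[str]       # generic flavour, e.g. "not available"
    quality_indicator: Optional[str]  # code from PARENT, separate from value


# Made-up results: one without a value, one reported despite the indicator.
potassium = LabResult("potassium", None, "not available", "HM3")
sodium = LabResult("sodium", 131.0, None, "LP2")

not_suitable = [r.analyte for r in (potassium, sodium)
                if r.quality_indicator and is_a(r.quality_indicator, "NS")]
```

Because the quality indicator is its own element, it can accompany a real
value as well as a null one, and generic queries on the flavour of null keep
working unchanged.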

Cheers, Sam

> Grahame Grieve wrote:
> 
>> Hi Sam
>>
>> I've discussed this particular case at HL7 before, but I don't 
>> remember whether any answer was agreed. But to me, this case needs to 
>> be coded - there's a fairly small set of reasons why the laboratory 
>> would report that an answer was not available, and the reasons 
>> themselves may have meaning
>>
>> I advance this small hierarchical vocab:
>>
>> + NS - not suitable
>>   + HM - haemolysed
>>     + HM1
>>     + HM2  { rating for how haemolysed
>>     + HM3  { ? maybe a separate element
>>     + HM4
>>   + LP - lipaemic
>>     + LP1
>>     + LP2  { rating for how lipaemic
>>     + LP3  { ? maybe a separate element
>>     + LP4
>>   + WP - wrong preservative
>>   + INS - insufficient sample
>> + ERR - handling error
>>   + AGE - too long to deliver to lab or other delivery problem
>>   + LACC - laboratory accident
>>   + FAIL - specimen could not be analysed for
>>            technical reasons that were not accidental
>>
>> I may have missed some haem and micro specific reasons - I worked in 
>> the core lab.
>>
>> Some Australian laboratories are reporting meaningless numbers and 
>> then reporting the error as a comment, rather than reporting a null 
>> value - so they can be paid. In spite of my strong clinical objection 
>> to this practice, this suggests that this isn't a null-flavour issue, 
>> and indeed, for lipaemic samples, except for a few analytes, I used 
>> to report the numbers and just note that the numbers were lower 
>> because of the volume effects.
>>
>> So I think that this is a "laboratory quality indicator"
>> that is a separate element to the actual value, since there are 
>> various cases where you'd want to report both - and I think this is 
>> worth modelling in the base pathology result archetype.
> 
> I agree 100% - I don't see this as a flavour of null problem, because 
> flavour of null is/should be about:
> - inability to provide a value to the computer system at runtime. A 
> possible value set I have proposed in the past:
> 
>     * *no information*: No information provided; nothing can be inferred
>       as to the reason why, including whether there might be a possible
>       applicable value or not
>     * *not available* (unknown): A possible value exists but is not
>       provided (ask user)
>     * *masked*: The value has not been provided due to privacy settings
>       (settable by extract / message serialiser)
>     * *not applicable*: No valid value exists for this data item in this
>       context (should be knowable by application)
> 
> This value set works for all contexts, is independent of setting, and 
> (I believe) should be settable by software. I have my doubts as to 
> whether there is any mileage in having the first two distinct.
> 
> In any case, this idea of null value is only partly the same as the 
> use case here. In the lab situation, some information items are not 
> available, so you could set the null flavour to "not available", but 
> the actual reasons for this are specific to the setting and the test.
> Clearly we cannot have a single vocabulary for flavour-of-null which 
> rolls in the value sets of all the possible flavours-of-null for all 
> settings, tests etc, such as the one Grahame has used above.
> 
> One solution initially appears to be to allow the flavour of null 
> vocabulary itself to be settable (i.e. that in archetypes you could 
> set a different flavour of null vocabulary depending on the field), 
> but this is flawed, since we still want a generic flavour of null 
> value (e.g. one of the 4 above), so that querying can work properly. 
> So we either need two flavour of null values for each value field - 
> one generic, one specific to setting & context - which seems somewhat 
> excessive, or we need to regard the specific "flavour of null" as 
> something else, probably something like a lab quality indicator as 
> Grahame suggests. I agree that pathology archetypes should probably 
> include such indicators explicitly in their model of data.
> 
> - thomas beale
> 
>>
>> Grahame
>>
>> Sam Heard wrote:
>>
>>> Dear All
>>>
>>> A reminder on why flavour of null is at the ELEMENT level: it allows 
>>> a composition with mandatory data to be saved even if the data is 
>>> not available, or allows a reason to be stated for data that is missing.
>>> It also allows us to deal with the HL7 flavour of null on the data 
>>> types.
>>>
>>> I am concerned that the flavour of null is set to DV_CODED_TEXT and 
>>> not DV_TEXT (ie. it has to be coded from a terminology). I agree 
>>> that some systems will want things coded for safety in some 
>>> situations, but I believe that this should be handled through 
>>> archetypes and templates.
>>>
>>> Laboratories will want to use this for all sorts of reasons, one 
>>> clear example is when an electrolyte sample has haemolysed - and 
>>> they cannot give a potassium reading (they do not want to omit it!)
>>>
>>> So I want to propose that the flavour of null is set to DV_TEXT.
>>>
>>> Cheers
>>> Sam Heard
>>> -
>>> If you have any questions about using this list, please send a 
>>> message to d.lloyd at openehr.org
>>
> 
> 
> --
> Research Fellow, University College London (http://www.chime.ucl.ac.uk)
> Chair Architectural Review Board, openEHR (http://www.openEHR.org)
> CTO Ocean Informatics (http://www.OceanInformatics.biz)
> 

