On 11/03/2010 11:59, Stef Verlinden wrote:
> For those of you interested in the 'problems' within Snomed as an ontology,
> here (http://precedings.nature.com/documents/3465/version/1) you can find a
> good and recent article describing them. This doesn't mean we shouldn't use
> Snomed, but knowing where the problems are is helpful to find solutions as
> Thomas already stated.
>
>
This is one of the best short papers I have seen on Snomed - I recommend
everyone read it. I have never had the time to investigate this
properly, but I made some comments in the IHTSDO Tech Committee last
year, viz:
~~~~~~~~~~~~~~~~~~~~~~ TB post on IHTSDO Nov 2009 ~~~~~~~~~~~~~~~~~~~~~~
Context in Information and terminology models
From: Thomas Beale
Date: Wed, 18 Nov 2009 at 6:25pm
Category: Hot Topic
<https://thecap.basecamphq.com/projects/1384601/cat/13604244/posts>
I have been reading Kent Spackman's "SNOMED expressions and context
patterns" slides from August, which I saw for the first time after
Bethesda. I think the pattern-based approach is a welcome advance.
Coming from the EHR/information model point of view, I was starting to
develop some ideas on mapping to structural models. However, the more I
looked at the context model details against a real example, the more
problems (my misunderstandings?) I have.
I made some initial analysis at http://www.openehr.org/wiki/x/1YJb . See
from about halfway down. I am struggling with the utility of embedding
'temporal context' inside coded expressions, and I also have questions
about 'finding/procedure context' and 'subject relationship context'.
I also read Hanfei Bao's recent document 'A Speculation On Context
Problems of SNOMED CT', where a definition of 'context' is given.
Following that there are examples like
"FH: Myocardial infarction" is-not-a "Myocardial infarction"
My question here would be what does the latter term actually mean, if FH
of MI is-not-a MI? That means that "Myocardial infarction" is assumed to
be some specific kind of MI that doesn't include MIs of family members.
If this is true then the meanings of 'naked' terms like "Myocardial
infarction" are not what we expect (in this case: the phenomenon of a
kind of heart attack, regardless of context). The mere fact that "FH:
Myocardial infarction" includes the term "Myocardial infarction"
indicates that in normal ontological terms it is in fact a kind of MI,
since otherwise we would be talking about the family history of some
other phenomenon.
This is an entirely different kind of consideration to "without skull
fracture" is-not-a "skull fracture", which is a negation.
From my point of view, aspects of context that we need to address include:
* the IHTSDO definition; given the above, I am not sure it is clear yet;
* consider developing a context model as a small ontology rather than
in SNOMED, and on this, base both information models and any SNOMED
representation of context
* issues to do with how complex post-coordinated codes are going to be
routinely and safely created in real EHR systems
* issues of performance in real systems, particularly querying
- thomas beale
~~~~~~~~~~~~~~~~~~~~~~~~~ Jeremy Rogers reply ~~~~~~~~~~~~~~~~~~~~~~~
Thomas -- FWIW here are my thoughts and ramblings in response to reading
most (but not yet all) of the WIKI page. Hopefully more as and when, but
this much at least should stimulate some debate.
*On representation of numerics and ordinals*
It's true that neither SNOMED nor AFAIK any other DL-based ontology
supports reasoning over ordinals, and for that reason there would be
little advantage in encoding them within 100% ontology expressions --
except for the fact that The Sins of The Past mean that the ontology
already contains a modest number of legacy content concepts that include
ordinal value expressions, for example the descendants of
417597005|Urine dipstick test finding|.
If we were to kick ordinals firmly into the information model, then
there's an issue of how you would describe the relationship between
these legacy 100% terminology expressions and some external construct
that's a combination of a terminology expunged of ordinals and one or
more information models. On the whole, I could agree with your analysis
if we were in a green field, but the reality is that earlier work has
killed off all green lifeforms to the horizon and beyond.
Meanwhile, the boundary with regard to where values should be
represented (ontology vs information model) is perhaps now somewhat
blurred because OWL at least /does/ support reasoning over real numbers.
It might be worth asking the DL folk what the use case was for including
this reasoning support within the ontology if it was already readily
available in information models.
*On representing any value set substructure (e.g. 2 knee reflex recordings)*
The current MRCM release says the associated_finding attribute has 1:1
cardinality. This means that if you want to record a left and a right
knee reflex finding, then it has to be two separate coded elements
within one observation (or, two separate observations of one element each).
*On the significance of the order in which values and other attributes
are captured/stored*
I'm puzzled by what you mean; some examples of what information is to be
encoded in the ordering would be useful. But in general I'd have thought
it a profoundly bad idea to encode any important information only
implicitly through token order.
*On temporal context*
I agree that where any temporal info and associated temporal inference
is to be managed is a significant problem, especially since so many
clinical queries hinge on the temporal relationship between events
and/or states.
One of the reasons for needing temporal 'classifiers' somewhere in the
overall system is that some clinically significant information
inevitably has to be entered retrospectively and may therefore have a
very imprecise time stamp. So you can't rely on time-stamp reasoning
entirely for all temporal reasoning jobs.
A realistic clinical scenario example would be:
Q. Have you had any major illness or operations?
A. Yes - I lost sight in one eye for a month
Q. When?
A. About 20 years ago; I don't remember exactly when. They did lots of tests.
One solution (that I actually have to follow today in clinical
information systems that only support date stamping) is to encode the
above with a fictitious system date stamp of e.g. 1.1.1989. But this is
clearly a false level of accuracy and can also cause problems if the
original but correctly time-stamped record later becomes available and
is then merged into mine.
Having said all that, I'd personally agree that SNOMED may have built
out from this difficult edge case and constructed something that has at
least the appearance of a more comprehensive solution for temporal
relationships, and then conflated this into models for epistemology,
probability and state transition. But if some data in the EPR *will* be
accurately time-stamped, then how that interoperates with the resulting
complex set of SNOMED temporal classifier values remains an open
question. And then, of course, we also have the 'follows' and 'after'
semantic links in SNOMED.
All this does of course create a problem for OpenEHR if it has to
maintain more than one 'context' solution for different
terminologies/classifications, depending on whether they individually
offer within themselves solutions for context (temporal or subject of
record). But SNOMED already has the exact mirror image of the same
problem: much of its content was developed for and is used today with
existing somewhat impoverished information models, most of which have no
model for context.
*On "FH: Myocardial infarction" is-not-a "Myocardial infarction"*
This statement unpacks into the (hopefully) unarguable statement that a
patient whose mother has had an MI should not be returned in response to
a query looking for patients who have themselves had an MI.
/If/ both entities 'FH of MI' and 'PPH of MI' (PPH=previous personal
history) are encoded in the terminology, and /if/ all subtype querying
functionality is encoded in /and only in/ an 'is-a' hierarchy, /then/
the only way you can get the clinically correct answer is if 'FH of MI'
is-not-a 'PPH of MI'.
Buried in the above are a number of assumptions and conditions, notably
that sticking 'MI' into the record actually means 'PPH of MI'. But
ontologically speaking we'd still say that 'H of MI' is-not-a 'MI'.
There *is* a semantic relationship between the two, but it is
emphatically not an is-a relationship.
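A minimal sketch of the argument above, assuming subtype querying really is done only over an is-a hierarchy. The concept labels, the parent "History of disorder" and the patient records are hypothetical stand-ins, not real SNOMED CT content:

```python
# Hypothetical toy is-a hierarchy (child -> parents); labels are illustrative only.
IS_A = {
    "PPH of MI": ["History of disorder"],
    "FH of MI": ["History of disorder"],
    "PPH of anterior MI": ["PPH of MI"],
}


def ancestors_or_self(concept):
    """Return the concept plus everything reachable by following is-a links."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def is_a(concept, target):
    """True if concept is target or one of its (transitive) subtypes."""
    return target in ancestors_or_self(concept)


# Hypothetical records: patient id -> concepts found in that patient's record.
RECORDS = {
    "patient-1": ["PPH of MI"],
    "patient-2": ["FH of MI"],
    "patient-3": ["PPH of anterior MI"],
}


def patients_with(target):
    """Subsumption query: patients with any recorded concept that is-a target."""
    return sorted(
        patient
        for patient, concepts in RECORDS.items()
        if any(is_a(c, target) for c in concepts)
    )


print(patients_with("PPH of MI"))  # ['patient-1', 'patient-3']
```

If 'FH of MI' were instead placed under 'PPH of MI' in the toy hierarchy, patient-2 would wrongly be returned by the same query.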
~~~~~~~~~~~~~~~~~~~~~ TB reply ~~~~~~~~~~~~~~~~~~~
Thanks Jeremy,
*Ordinals:*
In principle I think the data type should only be in the information
model, however, the terms used in the data type would preferably be in
the terminology (but with freedom for people to create ordinals using
local/specialist terms where necessary). The key here is that the
implementation guidance on how to represent ordinal values in a health
information system should be to use an ordinal data type (e.g.
DvOrdinal in openEHR), not just a naked term, which gives no
computability (the < relations can't be determined).
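To make the computability point concrete, here is a rough sketch of an ordinal data type that pairs a rank with a term. It is only loosely inspired by the idea behind DvOrdinal and is not the openEHR class definition; the class name, fields and the dipstick scale values are assumptions made for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ordinal:
    """Hypothetical ordinal value: a rank plus the term shown to the user."""
    scale: str   # which ordinal scale this value belongs to
    value: int   # position in the scale; this is what makes '<' computable
    symbol: str  # the term, ideally drawn from a terminology

    def __lt__(self, other):
        if not isinstance(other, Ordinal):
            return NotImplemented
        if self.scale != other.scale:
            raise ValueError("cannot compare ordinals from different scales")
        return self.value < other.value


# Illustrative urine protein dipstick scale.
negative = Ordinal("urine-protein", 0, "negative")
trace = Ordinal("urine-protein", 1, "trace")
two_plus = Ordinal("urine-protein", 3, "2+")

print(trace < two_plus)  # True
print([o.symbol for o in sorted([two_plus, negative, trace])])
# ['negative', 'trace', '2+']
```

A naked term such as "2+" on its own carries no rank, so a query like "protein worse than trace" cannot be answered without the `value` field or something equivalent.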
*Order in information*
It is not so much that order encodes 'hard information' that would be
lost if the ordering were lost, but in clinical information
examples/requests we get from workers in the field, order is seen as
very important to comprehensibility in many places. Order in many kinds
of notes essentially corresponds to things like a) a chronological train
of analytical thought of the physician, b) a structural model of
something, e.g. disease course described as a group of dates like date
of last occurrence, date initially recognised etc, c) a 'typical' or
customary way of presenting information, e.g. endoscope findings.
Throwing out order will make a lot of health professionals really mad.
*Temporal context*
A lot of clinical information is entered after the fact (a majority in
some places), but in general this doesn't change the accuracy of the
timing information. I would suggest that the information whose timing
really is imprecise is mostly findings/diagnoses recounted by patients, e.g.
telling the GP when they were diagnosed with diabetes, or when their
parent died of a heart attack. But even in these cases, they are
providing a partial date, e.g. 1990-xx-xx or even '1930 plus or minus
5y', both of which in health computing can be used like any other date.
I still don't see how it helps to classify such information as 'in the
past' -- this is simply a term that has to be additionally recorded, and
I can't see how it helps computability. It also adds the risk that
software creates the wrong term; then some other part of the system
might forget to look at the date, if it saw the (wrong) term 'in the
future'. In general we just need proper models of date/time, which
openEHR and HL7 have had for many years. The work in NHS CUI also
assumes partial dates/times.
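As a rough illustration of how a partial date can be used like any other date, one option is to treat it as an interval of possible dates. This is only a sketch; the class and helper names are made up for the example and are not the openEHR or HL7 date/time models:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateInterval:
    """Hypothetical partial date: the event happened somewhere in this range."""
    earliest: date
    latest: date

    def definitely_before(self, other):
        """True only if every possible date here precedes every date in other."""
        return self.latest < other.earliest

    def overlaps(self, other):
        """True if the two ranges share at least one possible date."""
        return self.earliest <= other.latest and other.earliest <= self.latest


def exact(d):
    return DateInterval(d, d)


def year_only(year):
    # e.g. 1990-xx-xx
    return DateInterval(date(year, 1, 1), date(year, 12, 31))


def year_plus_minus(year, years):
    # e.g. '1930 plus or minus 5y'
    return DateInterval(date(year - years, 1, 1), date(year + years, 12, 31))


diagnosed = year_only(1990)
parent_died = year_plus_minus(1930, 5)
recorded_on = exact(date(2009, 11, 18))

print(diagnosed.definitely_before(recorded_on))   # True
print(parent_died.definitely_before(diagnosed))   # True
print(diagnosed.overlaps(exact(date(1990, 6, 1))))  # True
```

With a representation like this, 'in the past' is something that can be computed from the date and the time of recording rather than a separate term that has to be recorded and kept consistent.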
On the more strategic question of whether to use SNOMED for representing
context in 'more impoverished information models' (of which many
proprietary ones qualify), I would first be interested to know which
such models actually use terminology at all, beyond ICDx, ICPC etc. In
my experience so far, it is almost none -- very few in the US (Mayo
being a post-coding exception). Clem McDonald said 2 years ago that he
had never seen a SNOMED code in data. I think any perceived 'need' for
SCT to supply a context model to older private models of health data
that don't supply their own should be backed up by some evidence.
Secondly, if there is evidence (and I am not saying there isn't), then
the correct approach in my view is to have a common underlying ontology
of context. The representation of context in openEHR is in fact based on
ontological principles to achieve this. People like Barry Smith would
say: do it properly, make a self-standing ontological model of context
in clinical recording, and that is probably what we should do. Trying to
model context in any comprehensive way in a terminology won't help the
'impoverished' systems, since SCT can't do any better in terms of
quantities, dates, times or other non-text values anyway.
*On "FH: Myocardial infarction" is-not-a "Myocardial infarction"*
If the assumption of SNOMED means that the term 'Myocardial Infarction'
really means 'Previous personal history of Myocardial Infarction', then
what SNOMED term do we use to represent just 'Myocardial Infarction'?
For example if I wanted to list the things that patient X might be at
risk of? Can a patient be at risk of 'Previous Personal History of
Myocardial Infarction'? How does this compare to the ICD term for the
same thing? How are we to use SNOMED in a reporting or clinical study
application if we can't just use the term 'disease X' to mean 'disease
X' instead of 'personal history of disease X'?
I hope people do not mind me pushing in a few places. I think that if
SCT is going to be widely used, its foundations must be absolutely rock
solid, and at the moment, I need some convincing.
thanks for listening
- thomas beale