On 11/03/2010 11:59, Stef Verlinden wrote:
> For those of you interested in the 'problems' within Snomed as an ontology,
> here (http://precedings.nature.com/documents/3465/version/1) you can find a
> good and recent article describing them. This doesn't mean we shouldn't use
> Snomed, but knowing where the problems are is helpful to find solutions as
> Thomas already stated.
>
This is one of the best short papers I have seen on Snomed - I recommend everyone read this. I have never had the time to investigate this properly, but I made some comments in the IHTSDO Tech Committee last year, viz:

~~~~~~~~~~~~~~~~~~~~~~ TB post on IHTSDO Nov 2009 ~~~~~~~~~~~~~~~~~~~~~~

Context in Information and terminology models
From: Thomas Beale
Date: Wed, 18 Nov 2009 at 6:25pm
Category: Hot Topic <https://thecap.basecamphq.com/projects/1384601/cat/13604244/posts>

I have been reading Kent Spackman's "SNOMED expressions and context patterns" slides from August, which I saw for the first time after Bethesda. I think the pattern-based approach is a welcome advance. Coming from the EHR/information model point of view, I was starting to develop some ideas on mapping to structural models. However, the more I looked at the context model details against a real example, the more problems (my misunderstandings?) I found. I made some initial analysis at http://www.openehr.org/wiki/x/1YJb . See from about halfway down. I am struggling with the utility of embedding 'temporal context' inside coded expressions, and I also have questions about 'finding/procedure context' and 'subject relationship context'.

I also read Hanfei Bao's recent document 'A Speculation On Context Problems of SNOMED CT', where a definition of 'context' is given. Following that there are examples like

    "FH: Myocardial infarction" is-not-a "Myocardial infarction"

My question here would be: what does the latter term actually mean, if FH of MI is-not-a MI? That means that "Myocardial infarction" is assumed to be some specific kind of MI that doesn't include MIs of family members. If this is true then the meanings of 'naked' terms like "Myocardial infarction" are not what we expect (in this case: the phenomenon of a kind of heart attack, regardless of context).
The mere fact that "FH: Myocardial infarction" includes the term "Myocardial infarction" indicates that in normal ontological terms it is in fact a kind of MI, since otherwise we would be talking about the family history of some other phenomenon. This is an entirely different kind of consideration to "without skull fracture" is-not-a "skull fracture", which is a negation.

From my point of view, aspects of context that we need to address include:

* the IHTSDO definition; given the above, I am not sure it is clear yet;
* consider developing a context model as a small ontology rather than in SNOMED, and on this, base both information models and any SNOMED representation of context
* issues to do with how complex post-coordinated codes are going to be routinely and safely created in real EHR systems
* issues of performance in real systems, particularly querying

- thomas beale

~~~~~~~~~~~~~~~~~~~~~~~~~ Jeremy Rogers reply ~~~~~~~~~~~~~~~~~~~~~~~

Thomas --

FWIW here are my thoughts and ramblings in response to reading most (but not yet all) of the WIKI page. Hopefully more as and when, but this much at least should stimulate some debate.

*On representation of numerics and ordinals*

It's true that neither SNOMED nor AFAIK any other DL-based ontology supports reasoning over ordinals, and for that reason there would be little advantage in encoding them within 100% ontology expressions -- except for the fact that The Sins of The Past mean that the ontology already contains a modest number of legacy content concepts that include ordinal value expressions, for example the descendants of 417597005|Urine dipstick test finding|. If we were to kick ordinals firmly into the information model, then there's an issue of how you would describe the relationship between these legacy 100% terminology expressions and some external construct that's a combination of a terminology expunged of ordinals and one or more information models.
On the whole, I could agree with your analysis if we were in a green field, but the reality is that earlier work has killed off all green lifeforms to the horizon and beyond. Meanwhile, the boundary with regard to where values should be represented (ontology vs information model) is perhaps now somewhat blurred because OWL at least /does/ support reasoning over real numbers. It might be worth asking the DL folk what the use case was for including this reasoning support within the ontology if it was already readily available in information models.

*On representing any value set substructure (e.g. 2 knee reflex recordings)*

The current MRCM release says the associated_finding attribute has 1:1 cardinality. This means that if you want to record a left and a right knee reflex finding, then it has to be two separate coded elements within one observation (or, two separate observations of one element each).

*On the significance of the order in which values and other attributes are captured/stored*

I'm puzzled by what you mean; some examples of what information is to be encoded in the ordering would be useful. But in general I'd have thought it a profoundly bad idea to encode any important information only implicitly through token order.

*On temporal context*

I agree that where any temporal info and associated temporal inference is to be managed is a significant problem, especially since so many clinical queries hinge on the temporal relationship between events and/or states. One of the reasons for needing temporal 'classifiers' somewhere in the overall system is that some clinically significant information inevitably has to be entered retrospectively and may therefore have a very imprecise time stamp. So you can't rely on time-stamp reasoning entirely for all temporal reasoning jobs. A realistic clinical scenario example would be:

Q. Have you had any major illness or operations?
A. Yes - I lost sight in one eye for a month
Q. When?
A. About 20 years ago; I don't remember exactly when. They did lots of tests.

One solution (that I actually have to follow today in clinical information systems that only support date stamping) is to encode the above with a fictitious system date stamp of e.g. 1.1.1989. But this is clearly a false level of accuracy and can also cause problems if the original but correctly time-stamped record later becomes available and is then merged into mine.

Having said all that, I'd personally agree that SNOMED may have built out from this difficult edge case and constructed something that has at least the appearance of a more comprehensive solution for temporal relationships, and then conflated this into models for epistemology, probability and state transition. But if some data in the EPR *will* be accurately time-stamped, then how that interoperates with the resulting complex set of SNOMED temporal classifier values remains an open question. And then, of course, we also have the 'follows' and 'after' semantic links in SNOMED.

All this does of course create a problem for OpenEHR if it has to maintain more than one 'context' solution for different terminologies/classifications, depending on whether they individually offer within themselves solutions for context (temporal or subject of record). But SNOMED already has the exact mirror image of the same problem: much of its content was developed for and is used today with existing somewhat impoverished information models, most of which have no model for context.

*On "FH: Myocardial infarction" is-not-a "Myocardial infarction"*

This statement unpacks into the (hopefully) unarguable statement that a patient whose mother has had an MI should not be returned in response to a query looking for patients who have themselves had an MI.
/If/ both entities 'FH of MI' and 'PPH of MI' (PPH=previous personal history) are encoded in the terminology, and /if/ all subtype querying functionality is encoded in /and only in/ an 'is-a' hierarchy, /then/ the only way you can get the clinically correct answer is if 'FH of MI' is-not-a 'PPH of MI'. Buried in the above are a number of assumptions and conditions, notably that sticking 'MI' into the record actually means 'PPH of MI'. But ontologically speaking we'd still say that 'H of MI' is-not-a 'MI'. There *is* a semantic relationship between the two, but it is emphatically not an is-a relationship.

~~~~~~~~~~~~~~~~~~~~~ TB reply ~~~~~~~~~~~~~~~~~~~

Thanks Jeremy,

*Ordinals:*

In principle I think the data type should only be in the information model; however, the terms used in the data type would preferably be in the terminology (but with freedom for people to create ordinals using local/specialist terms where necessary). The key here is that the implementation guidance on how to represent ordinal values in a health information system should be to use an ordinal data type (e.g. DvOrdinal in openEHR), not just a naked term, which gives no computability (you can't determine the < relations).

*Order in information*

It is not so much that order encodes 'hard information' that would be lost if the ordering were lost, but in the clinical information examples/requests we get from workers in the field, order is seen as very important to comprehensibility in many places. Order in many kinds of notes essentially corresponds to things like a) a chronological train of analytical thought of the physician, b) a structural model of something, e.g. disease course described as a group of dates like date of last occurrence, date initially recognised etc, c) a 'typical' or customary way of presenting information, e.g. endoscope findings. Throwing out order will make a lot of health professionals really mad.
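To make the ordinals point above concrete, here is a minimal sketch in Python of what 'computability' buys you over a naked term. It is only an illustration: the class name Ordinal, its fields and the 'urine-protein' scale are hypothetical and are not the openEHR DvOrdinal API, and the ranks are made up for the example.

```python
from dataclasses import dataclass
from functools import total_ordering


@total_ordering
@dataclass(frozen=True)
class Ordinal:
    """Hypothetical ordinal value: a rank plus a term (not the openEHR API)."""
    value: int   # position in the scale; this is what makes '<' computable
    symbol: str  # display term, ideally drawn from a terminology
    scale: str   # identifier of the scale this value belongs to

    def __lt__(self, other):
        if not isinstance(other, Ordinal):
            return NotImplemented
        # Ranks are only meaningful within a single scale.
        if self.scale != other.scale:
            raise ValueError("cannot compare ordinals from different scales")
        return self.value < other.value


# Illustrative scale only; the ranks are assumptions for this example.
TRACE = Ordinal(1, "trace", "urine-protein")
ONE_PLUS = Ordinal(2, "+", "urine-protein")
TWO_PLUS = Ordinal(3, "++", "urine-protein")

readings = [TWO_PLUS, TRACE, ONE_PLUS]
worst = max(readings)     # possible only because the rank is recorded
print(worst.symbol)       # prints: ++
print(TRACE < TWO_PLUS)   # prints: True
```

With only the bare terms "trace", "+" and "++" in the record, neither the max() nor the comparison could be computed without external knowledge of the scale.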
*Temporal context*

A lot of clinical information is entered after the fact (a majority in some places), but in general this doesn't change the accuracy of the timing information. I would suggest that the kind of information where accuracy does suffer is mostly findings/diagnoses recounted by patients, e.g. telling the GP when they were diagnosed with diabetes, or when their parent died of a heart attack. But even in these cases, they are providing a partial date, e.g. 1990-xx-xx or even '1930 plus or minus 5y', both of which in health computing can be used like any other date. I still don't see how it helps to classify such information as 'in the past' -- this is simply a term that has to be additionally recorded, and I can't see how it helps computability. It also adds the risk that software creates the wrong term; then some other part of the system might forget to look at the date, if it saw the (wrong) term 'in the future'. In general we just need proper models of date/time, which openEHR and HL7 have had for many years. The work in NHS CUI also assumes partial dates/times.

On the more strategic question of whether to use SNOMED for representing context in 'more impoverished information models' (of which many proprietary ones qualify), I would first be interested to know which such models actually use terminology at all, beyond ICDx, ICPC etc. In my experience so far, it is almost none -- very few in the US (Mayo being a post-coding exception). Clem McDonald said 2 years ago that he had never seen a SNOMED code in data. I think any perceived 'need' for SCT to supply a context model to older private models of health data that don't supply their own should be backed up by some evidence. Secondly, if there is evidence (and I am not saying there isn't), then the correct approach in my view is to have a common underlying ontology of context. The representation of context in openEHR is in fact based on ontological principles to achieve this.
People like Barry Smith would say: do it properly, make a self-standing ontological model of context in clinical recording, and that is probably what we should do. Trying to model context in any comprehensive way in a terminology won't help the 'impoverished' systems, since SCT can't do any better in terms of quantities, dates, times or other non-text values anyway.

*On "FH: Myocardial infarction" is-not-a "Myocardial infarction"*

If the assumption of SNOMED means that the term 'Myocardial Infarction' really means 'Previous personal history of Myocardial Infarction', then what SNOMED term do we use to represent just 'Myocardial Infarction'? For example if I wanted to list the things that patient X might be at risk of? Can a patient be at risk of 'Previous Personal History of Myocardial Infarction'? How does this compare to the ICD term for the same thing? How are we to use SNOMED in a reporting or clinical study application if we can't just use the term 'disease X' to mean 'disease X' instead of 'personal history of disease X'?

I hope people do not mind me pushing in a few places. I think that if SCT is going to be widely used, its foundations must be absolutely rock solid, and at the moment, I need some convincing.

thanks for listening

- thomas beale