Stef et al,
In response to Stef's plea for others' opinions, I'd like to add my voice to
Tom's concerns.
I certainly believe that the whole ISO process with respect to health
informatics standards is deeply flawed. As Grahame implies with the datatypes
standard, the process is politically driven and compromises in modelling,
engineering, safety, implementability inevitably occur. The question is how
significant are these compromises and what effect will they have on the
evolution of e-health?
It is highly unlikely that we would have an ISO standard for "Health
Informatics - Harmonized data types for information interchange" without the
monumental effort of Grahame Grieve in producing and managing the draft.
However, it is, first and foremost, an HL7 flavoured standard. The most recent
draft I have seen is, according to its foreword, "a shared document between
Health Level Seven (HL7) and ISO". ISO 21090 is undoubtedly complex. One has to
question the value of an international standard, if it is so complex that it
has to be 'profiled' by different organisations before it can be used. By whom,
for what purposes, and by what processes, will such profiling be managed?
ISO 21090 suffers from some of the significant flaws that permeate many of the
HL7 specifications. Tom has already cited the peculiar inheritance hierarchy
amongst others. Another engineering flaw is the pervasive use of cryptic, often
ad hoc enumerations. Even the names of the types wouldn't pass muster in most
quality engineering schools. Names like ENP, HXIT, CO, EN, EN.TN, CD.CV, URG
are simply inexcusable. Levels of indirection never aid readability, and lead
to difficulty in implementation and testing.
It is not necessarily sensible to compare openEHR datatypes with ISO 21090.
They are designed for different purposes. openEHR datatypes underpin openEHR's
reference implementation and archetype object models for building electronic
health record software and so can be augmented by these additional artefacts,
as described below. The ISO datatypes should be able to stand on their own in a
diverse range of implementation environments. This is a much harder task, and
bumps up against fundamental principles of information exchange, whereby the
assumptions of participating systems need to be carefully considered.
Constraints and constraint mechanisms are pivotal here.
A datatype embodies the "agreed" set of values and operations pertaining to
that type. If an item of received data "211414" has been denoted to be of type
integer, then the receiving system "knows" how to process it, and will process
it differently than if it had been denoted as a date (a.k.a. TS.DATE in
HL7/ISO/DIS 21090 HI-HDTII). Healthcare includes a very rich vocabulary, and
text-based value sets are common in information exchange. A datatype for coded
text, say, needs to convey the agreed set of values of that type. Let's firstly
consider values for "severity of adverse reaction to medication". Ideally, both
a sending and a receiving system need to agree on the set of values - and may
behave sub-optimally if one system uses the set { "undetectable", "mild",
"moderate" } and the other uses the set { "mild", "moderate", "severe",
"extreme", "almost inevitably fatal" } , even if these values all came from the
same terminology. In other words, the sending and receiving system are not
actually using the same datatypes in this case.
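The mismatch can be made concrete with a small sketch. This is purely illustrative - the names (accepts, the two value sets) are my own, not from ISO 21090 or openEHR - but it shows why two systems exchanging "coded text" are not really sharing a datatype unless they share the value set:

```python
# Hypothetical sketch: a "coded text" datatype is only meaningful relative
# to an agreed set of values. These sets mirror the severity example above.

SENDER_SEVERITY = {"undetectable", "mild", "moderate"}
RECEIVER_SEVERITY = {"mild", "moderate", "severe",
                     "extreme", "almost inevitably fatal"}

def accepts(value_set, code):
    """A system can only meaningfully process codes drawn from its own
    agreed value set - that set IS the effective datatype."""
    return code in value_set

# The systems nominally exchange the same "coded text" type, yet:
print(accepts(RECEIVER_SEVERITY, "undetectable"))  # False - sender's value rejected
print(accepts(SENDER_SEVERITY, "severe"))          # False - unknown to the sender
print(accepts(SENDER_SEVERITY, "mild"))            # True  - only the overlap is safe
```

Only the intersection of the two sets can be exchanged safely; everything else is where the sub-optimal behaviour creeps in.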
How do we deal with this in real systems? The United Kingdom's Connecting for
Health program has addressed this in their HL7 V3-based models by carrying
the constraint within the datatype - in the coding scheme's identifier. So
rather than say the values come from some specific version of SNOMED CT, they
constrain the values to a specific subset using a Refset Identifier. And this
can be carried in instance data.
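A rough sketch of that approach, assuming much: the field names are loosely modelled on a coded-value type (code, code system, value set), the refset identifier here is invented, and the severity codes are illustrative SNOMED CT concepts. The point is only that when the instance itself carries the refset identifier, the receiver can check the true datatype computably:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodedValue:
    code: str
    code_system: str               # e.g. the SNOMED CT OID
    value_set: Optional[str] = None  # refset id constraining the allowed values

# Hypothetical refset registry: refset id -> the agreed subset of codes
# (mild / moderate / severe severity concepts, for illustration).
REFSETS = {
    "999000123456789": {"255604002", "6736007", "24484000"},
}

def valid_in_refset(v):
    """If the instance carries a refset id, the receiver can verify the
    value against the agreed set without any out-of-band documents."""
    if v.value_set is None:
        return True  # no computable constraint carried - nothing to check
    return v.code in REFSETS.get(v.value_set, set())

v = CodedValue("6736007", "2.16.840.1.113883.6.96", "999000123456789")
print(valid_in_refset(v))  # True - and checkable from the instance alone
```

Contrast this with the conformance-document approach discussed next, where no such check is possible from the wire data.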
Now whilst ISO 21090 is capable of constraining text-based value sets, such
constraints are often done by other means - particularly through conformance
statements in non-computable documents, most notably HL7 CDA Implementation
Guides. We are seeing plenty of this in the US, as a result of their Meaningful
Use provisions. In these cases, the datatype does not necessarily carry the
constraint. It almost invariably doesn't. This means that in such transactions,
the receiving system has no way of knowing the true datatype - i.e. the set of
values - for each such data item. The only way for such constraints to be known
to the receiving system is through access to HL7 templates - thus violating THE
principal tenet of HL7's RIM-based information exchange paradigm.
This leads on to one of William Goosen's favourite topics - that of Coded
Ordinals. These have been introduced in ISO 21090 to meet the needs, often
encountered in patient assessment forms, whereby weights are given to
descriptive phrases to indicate the scope of functionality a patient has to
perform, say, activities of daily living (e.g. Barthel Index). The weights can
be used to derive an accumulated score for a collection of individual
activities. Unfortunately, ISO 21090 can't actually provide for this use case
via the CO ( that's code for Coded Ordinal ) datatype, because it has no way of
denoting the set of allowed values. Such a set might look like
[ { 0, "unable" }, { 5, "needs help (verbal, physical, carrying aid)" },
{ 10, "independent" } ]
i.e. a set of pairs of weights and phrases. A coded ordinal only describes one
value, not the set of permissible values! Now, of course it would be possible
to specify these sorts of sets, and to publish them for use in clinical systems
and information exchange. My point is that ISO 21090 doesn't support such a
type and there currently is not a mechanism for this within HL7 - the primary
standard for communicating clinical information. Even after all these years!
I'd like to know how William, for one, hopes to solve this problem? Perhaps Ed
Hammond has a solution in mind?
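To make the gap concrete, here is a sketch of the missing abstraction: an ordinal *scale* (the full set of permissible weight/phrase pairs), as opposed to a single coded ordinal value. Nothing like this exists in ISO 21090; the names and the Barthel feeding item are illustrative only:

```python
# The agreed set of weight/phrase pairs for one assessment item -
# this whole structure is what CO cannot express.
BARTHEL_FEEDING = [
    (0, "unable"),
    (5, "needs help (verbal, physical, carrying aid)"),
    (10, "independent"),
]

def score(scale, phrase):
    """Look up the weight for a recorded phrase, rejecting values outside
    the agreed set - exactly the check a lone CO value cannot support."""
    for weight, text in scale:
        if text == phrase:
            return weight
    raise ValueError(f"{phrase!r} is not in the permissible value set")

# Deriving an accumulated score across assessment items:
answers = ["independent", "needs help (verbal, physical, carrying aid)"]
total = sum(score(BARTHEL_FEEDING, a) for a in answers)
print(total)  # 15
```

A single CO instance can carry one weight and one phrase; what it cannot carry, and what the standard provides no companion artefact for, is the scale itself.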
So I contend that ISO 21090 cannot solve the typing problem on its own. Other
components are required. And this calls into question (and always has) the
scope of an ISO standard.
One of the many beauties of openEHR is that it long ago recognised the need for
a computable constraint mechanism that can help solve these problems. It not
only recognised the need, but it has produced excellent specifications that
marry datatypes, an information model, a constraint-based archetype object
model, and a companion syntax language (ADL) for archetype serialisation. And
particularly to Tom Beale's credit, it did so through a well-engineered process
that has produced a set of coherent, implementable artefacts for which the
health informatics community should be immensely grateful. And it did so openly
and for free to the community.
People who question the quality, safety, implementability, suitability of
clinical information artefacts should never be characterised as merely "taking
broadsides" at another organisation, whoever that may be. There are far too few
people of ability and commitment for that. Their skills and commitment are
sorely needed. The others can go off and sit in their committees to
compromise/compromize and harmonise/harmonize.
- eric
On 2010-11-08, at 12:23 AM, Stef Verlinden wrote:
> It looks like we're getting to the heart of the matter here.....
>
> What I really would like to know from the others what their opinion's on
> these subjects are?
>
> If it indeed turns out to be true that Tom don't understand how datatypes,
> RIM or data types are working, we, as the openEHR community, should ask him
> to shut up. If not we should find better ways to get the message across...
>
>
> Cheers,
>
> Stef
>
> Op 7 nov 2010, om 12:12 heeft Grahame Grieve het volgende geschreven:
>
>> hi Tom
>>
>
> .....
>
>>
>>> The context specific stuff is specific to HL7 only. It just doesn't apply
>>> elsewhere.
>>
>> not at all. And I'm surprised you still think this. HXIT is to do with
>> capturing
>> and managing foreign data. As is some of the II stuff. It doesn't and won't
>> arise in an EHR system for internal data, but it will for imported data. So
>> where it does arise is not HL7 specific.
>>
>> Flavors are a ISO 21090 thing. And optional - they aren't in the schema,
>> for instance.
>>
>> Update mode is transactional. Almost everybody will profile it out.
>>
>
> ......
>
>>
>>> There is not a close correspondence between the 21090 idea of
>>> "ANY" and the typical Any/Object or other root class of most
>>> object-oriented type systems - this name clash would have to be resolved in
>>> some way;
>>
>> It appears I will have to keep repeating this until I am blue in the face.
>> It is not a name clash, nor does it (or should it) correspond to a root class
>> in any other system - it is it's own class. The fact you think this indicates
>> that you are totally confused as to what ISO 21090 is. (Hint: look at how you
>> modeled your own data types...)
>>
>
> .......
>
>>
>>
>>> The modelling style seems to follow the strange HL7 obsession
>>> with non-object orientation, popularised in the RIM.
>>
>> which indicates that you don't understand the RIM or the data types,
>> and how they differ.
>>
>> Grahame
>>
>>