More on ISO 21090 complexity

Thomas Beale Thu, 18 Nov 2010 17:09:05 +0000

Apparently others, including the CDISC people, are very unhappy
about the ISO 21090 data types. This comment is from Barry Smith's blog
<http://hl7-watch.blogspot.com/2010/11/are-iso-21090-data-types-too-complex.html>
(emphasis added by me):

XML4Pharma said:

As an XML-guy, my major problem with all these 'standards' (as well
HL7, ISO21090, OpenEHR) is that they all start from UML modeling and
auto-generate the XML-schemas from that, ignoring any of the
advanced features (such as native XML datatypes) of XML.

I have no problem with UML modeling, but I have major problems with
the naïve believe that one can generate high quality XML-Schemas
automatically from UML diagrams.
Essentially, these 'standards' are abusing the XML standard itself.
*They even manage to reinvent basic datatypes such as integer, date,
time, duration which are already defined by XML and XML-schema
itself*. For example, HL7- and ISO21090 'date' is expressed as
YYYYMMDD, where there is already a base XML-schema datatype 'date'
expressed as YYYY-MM-DD. So when validating an HL7-v3-message
against the schema (and even against the schematron), the date
20070231 (February 31, 2007) is accepted as a correct date. If they
had respected the XML standard instead of abusing it, then the same
data in correct XML (2007-02-31) would have immediately been
rejected by XML-Schema as being an invalid date. The same applies
for many other HL7- and ISO21090 datatypes (e.g. date, time,
datetime, duration). *Each time HL7 as well as ISO21090 'reinvents'
these datatypes making life much more complicated than is necessary.*
I am one of the developers of the CDISC ODM standard used in
clinical research. We also use UML for modeling, but our schemas are
not automatically generated from them, but created manually. We use
native XML datatypes as much as possible, not trying to reinvent the
wheel. We try to make our standard so that instance files are even
understandable by non-specialist when looking at the XML itself (so
without stylesheet). They are also 'human-readable'. The latter
cannot be said of any HL7- or ISO21090 instance file.

Some time ago, our team was asked to enable ISO21090 in ODM. The
request came from one of the largest nonprofit research
organizations in the US. We had a teleconf with them and it soon
became clear that they wanted us to replace our own (XML-Schema
based) datatypes by the ISO21090 datatypes. *We refused*.

*Instead, we will develop an ODM-extension that allows to attach
ISO21090 data points* (in their own namespace) to ODM 'ItemData'
elements. Doing so, the ODM standard will support ISO21090 AND
remain ODM.
Somewhat more than a year ago, I developed a stylesheet to extract
information from HL7-CCD (health records) to prepopulate clinical
forms in ODM format. This stylesheet was soon regarded as a
key-enabler for integration between health records and clinical
research. However, I cannot guarantee that it will work for ANY
health record in CCD format. The reason is that CCD is so complex
that I fear that one can put the same information (for example a
systolic blood pressure) in very many ways in a CCD. So how can I
guarantee that the systolic blood pressure can be extracted in all
cases?

The ISO21090 'standard' is clearly a political compromise, not a
technical compromise. As such, from the technical point of view it
is probably not an improvement.
*In my personal opinion, the best that can be done for a data
standard for healthcare is to restart from nearly scratch*. Yes, UML
modeling can and should be used, but based on solid and agreed
principles, and taking into account that XML will later be the
transport format. So, no 'reinvention' of datatypes, but using the
XML native datatypes right from the start. No fully-automated
generation of XML-Schemas, but development of the schemas by
schemawriters (though part of the work can be automated).

And most of all, involve all eligible players (HL7, ISO, OpenEHR,
etc.) right from the start.

I am not involved with these people, but their response sounds entirely
rational. In my view this just shows that the 21090 data types in their
current form are not at all what we should be standardising on.
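
To make the date example from the quoted comment concrete, here is a
minimal sketch in Python. It is an illustration only: the regular
expression is a hypothetical digits-only pattern of the kind a schema
might apply to YYYYMMDD strings, not the actual HL7 v3 or ISO 21090
schema pattern, and the function names are my own. The calendar-aware
check is only similar in spirit to what an XML Schema validator does
for xs:date (it ignores time zones, for example).

```python
import re
from datetime import date

# Hypothetical pattern-only check for compact YYYYMMDD strings.
# Assumption: this stands in for a schema that constrains digits
# per field but knows nothing about month lengths.
COMPACT_DATE = re.compile(r"[0-9]{4}(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])")


def pattern_only_valid(value: str) -> bool:
    """Accept any string whose fields are individually in range."""
    return COMPACT_DATE.fullmatch(value) is not None


def calendar_valid(value: str) -> bool:
    """Accept YYYY-MM-DD only if it names a real calendar day."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", value)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # raises ValueError for e.g. 31 February
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(pattern_only_valid("20070231"))  # True: the pattern cannot see the error
    print(calendar_valid("2007-02-31"))    # False: 31 February does not exist
    print(calendar_valid("2007-02-28"))    # True
```

The point is the one the comment makes: a check that only looks at the
shape of the string lets 31 February through, whereas a type that
understands the calendar rejects it at validation time.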

- thomas