The Truth About XML was: openEHR Subversion => Github move progress

Thomas Beale Thu, 04 Apr 2013 16:50:50 +0100

On 04/04/2013 12:09, Tim Cook wrote:
>>> well, since the primary openEHR projects are in Java, Ruby, C#, PHP, etc, I
>>> don't see where the disconnect between the projects and the talent pool is.
>>> I think if you look at the 'who is using it' pages, and also the openEHR
>>> Github projects, you won't find much that doesn't connect to the mainstream.
> The discussion about talent pool is about the data representation and
> constraint languages: XML and ADL. The development languages are common
> across the application domain.
> I know that you believe that ADL is superior because it was designed
> specifically to support the openEHR information model. It is an impressive
> piece of work, but this is where its value falls off.


actually, ADL was specifically designed to not support any information 
model, and it doesn't. It's just an abstract syntax, free of the 
vagaries of any other syntax.

> XML has widespread industry acceptance and a plethora of development and
> validation tools against a global standard.

sure. In terms of being able to /serialise/ archetypes to XML, that has
been available for probably a decade, and is in wide use. Some users 
ignore ADL entirely. I don't think anyone has an issue with this.

>> <NB: in the below I am talking about the industry standard XSD 1.0, not the
>> 9-month old XML Schema 1.1 spec>
> The industry standard XML Schema Language is 1.1. The first draft was
> published in April 2004
> making it nine years old,

well, but it's been stillborn for years, everyone knows that...

>> But XML schema as an information modelling language has been of no serious
>> use, primarily because its inheritance model is utterly broken. There are
>> two competing notions of specialisation - restriction and extension.
> Interesting. I believe that the broader industry sees them as
> complementary, not competing.

if you mean the competing inheritance models - I have yet to meet any 
XML specialist who thinks they work. The maths are against it.


>> Restriction is not a tool you can use in object-land because the semantics
>> are additive down the inheritance hierarchy, but you can of course try and
>> use it for constraint modelling.
> Restriction, as its name implies, is exactly intended and very useful for 
> constraint modelling.
> Constraint modelling by restriction is, as you know, the corner-stone of 
> multi-level modelling.
> Not OO modelling. Which is, of course, why openEHR has a reference model and 
> a constraint model.
> They are used for the two complementary aspects of multi-level modelling.

but your original statement was (I thought) that you are using XML for 
the information model as well. That's where it breaks, because of the 
inability to represent basic concepts like inheritance in the way that 
is normally used in object modelling (and most database schema languages 
these days).
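
As a rough illustration of the 'additive' inheritance meant here, a minimal Python sketch follows. The class names Entry and Observation and their attributes are simplified stand-ins chosen for this example, not the actual openEHR RM definitions. The point is only that a subtype adds to what it inherits and never removes anything, so code written against the parent keeps working.

```python
from dataclasses import dataclass, fields


@dataclass
class Entry:
    # Hypothetical parent type with two attributes.
    subject: str
    language: str


@dataclass
class Observation(Entry):
    # Extension: the subtype ADDS an attribute; nothing from Entry is removed,
    # so an Observation can stand in wherever an Entry is expected.
    data: str


def describe(entry: Entry) -> str:
    # Written against the parent type only.
    return f"{entry.subject} ({entry.language})"


obs = Observation(subject="patient", language="en", data="120/80")
parent_fields = [f.name for f in fields(Entry)]
child_fields = [f.name for f in fields(Observation)]
summary = describe(obs)
```

Restriction works the other way round (it takes away permitted content), which is why it fits constraint modelling but not this kind of type hierarchy.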


>> Although it is generally too weak for
>> anything serious, and most projects I have seen going this route eventually
>> give in and build tools to interpolate Schematron statements to do the job
>> properly. Now you have two languages, plus you are mixing object (additive)
>> and constraint (subtractive) modelling.
>>
> Those examples you are referring to are not using XML Schema 1.1.
> Or at least not in its specified capacity. There is no longer a need
> for RelaxNG or Schematron to be mixed-in.
> Your information on XML technologies seems to be quite a bit out of date.

I'm just reporting what I know to be the case in various current 
national e-health modelling initiatives, none of which I am directly 
involved in... all the serious ones use XSD 1.0 + Schematron.


>> Add to this the fact that the inheritance rules for XML attributes and
>> Elements are different, and you have a modelling disaster area.
>>
> I will confess that XML attributes are, IMHO, overused and inject
> semantics into a model
> that shouldn't be there.  For example HL7v3 and FHIR use them extensively.
>
>
>> James Clark, designer of Relax NG, sees inheritance in XML as a design flaw
>> (from http://www.thaiopensource.com/relaxng/design.html#section:15 ):
> Of course! But then you are referencing an undated document by the
> author of a competing/complementary tool,
> that announces RelaxNG as new AND its most recent
> reference is 2001.
> So, my guess is that it is at least a decade old. Hardly a valid opinion 
> today.

I can't say whether it is valid with respect to XSD 1.1, but it remains 
valid with respect to 1.0. I don't see that XSD 1.1 has a healthier 
inheritance model, so it seems to me that anyone trying to do 
information modelling (not constraint modelling) is still going to get 
into trouble. I can't see anything that contradicts Clark's statements, 
even if they are not from last week.

But let's assume I don't know what I am talking about. It must 
(according to you) be easy to express e.g. this part of the openEHR RM 
<http://www.openehr.org/local/releases/1.0.1/uml/Browsable/_9_0_76d0249_1109599337877_94556_1510Report.html>
in XSD 1.1. I would be very interested to see how it deals with the
generic types and inheritance, both handled by any normal programming 
language.


>> Difficulties in using type restriction (i.e. subtyping) in XSD seem
>> well-known - here. Not to mention the inability to deal with generic types
>> of any kind, e.g. Interval<Date>, necessitating the creation of numerous
>> fake types.
>>
> Hmmmmm, what is wrong with xs:duration?
> I don't think I understand what you mean by "fake types".

You can't define the type Interval<Date> in XSD - it doesn't have
parameterised types, even though all programming languages and UML have 
them.
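
For anyone unsure what a parameterised type buys you, here is a minimal sketch in Python. The class Interval and its method has() are written for this example only and are an assumption, not the openEHR RM class definition. One generic definition covers dates, integers and any other ordered type.

```python
from dataclasses import dataclass
from datetime import date
from typing import Generic, TypeVar

T = TypeVar("T")  # stands for any type whose values can be compared with <=


@dataclass(frozen=True)
class Interval(Generic[T]):
    """Hypothetical generic interval, for illustration only."""
    lower: T
    upper: T

    def has(self, value: T) -> bool:
        # Closed interval: both bounds are included.
        return self.lower <= value <= self.upper


# One definition serves every ordered type; no separate Interval_of_Date,
# Interval_of_Integer, ... copies are needed.
date_range: Interval[date] = Interval(date(2013, 1, 1), date(2013, 12, 31))
int_range: Interval[int] = Interval(0, 10)

in_2013 = date_range.has(date(2013, 4, 4))
out_of_range = int_range.has(11)
```

Without type parameters, each of those instantiations has to be written out as its own named type, which is what 'fake types' refers to above.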


>> And of course, the underlying inefficiency and messiness of the data are
>> serious problems as well. Google and Facebook (and I think Amazon) don't
>> move data around internally in XML for this reason.
> That is kind of vague. Can you expand on this?
> The fact that any other domain does or doesn't use XML is really pretty
> irrelevant to multi-level modelling in healthcare. I am comfortable in
> assuming that none of them use ADL for anything. So the comparison is quite
> the red herring. I think that limiting the conversation to multi-level
> modelling in healthcare is an appropriate approach. Otherwise, it is kind of
> pointless.

what I was pointing out is that XML as a general technology is far from 
a 'final solution' in any area of application. In modelling it is well 
known as problematic, and in data representation as well.


>> None of this is to say that XML or XML-schema can't be 'used' - I don't know
>> of any product or project in openEHR space that doesn't use it somewhere,
>> and of course it's completely ubiquitous in the general IT world. What I am
>> saying here is that the minute you try to express your information model
>> primarily in XSD, you are in a world of pain.
>
> I will admit that expressing the MLHIM information model in XML Schema
> 1.1 terms and then developing the actual implementation was a challenge at 
> first.
> But if you take a look today you will see that it is quite easy to understand,
> standards compliant and fully functional.
> The original challenge was to overcome my prejudice against XML.

Can you point to some MLHIM models that show specialisation, 
redefinition, clarity of expression, that sort of thing? I tried to find 
some but ran into raw XML source.


>
>> but what we need are models that can describe data, software, documents,
>> documentation, interfaces, etc
>>
> But these are all VERY different artifacts and require different
> models, tools and languages.

well that's just the point, they don't - it's possible to define a
model so that an XSD form, a programming form, a display screen form and 
many others are all derivable from that source model. We only want to 
define the model of 'microbiology result' once, after all. This 
single-source modelling is a key goal of the approach.
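
A toy sketch of the single-source idea, in Python. Everything in it is an assumption made for illustration: the MODEL dictionary, the field names and the two generator functions are hypothetical and far simpler than any real tooling. It only shows one definition feeding both an XSD fragment and a programming-language form.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source model: one type, defined once.
MODEL = {
    "name": "MicrobiologyResult",
    "fields": [("organism", "string"), ("colony_count", "integer")],
}

XS = "http://www.w3.org/2001/XMLSchema"
PY_TYPES = {"string": "str", "integer": "int"}


def to_xsd(model: dict) -> str:
    """Derive a standalone XSD complexType (no inheritance) from the model."""
    ET.register_namespace("xs", XS)
    ctype = ET.Element(f"{{{XS}}}complexType", {"name": model["name"]})
    seq = ET.SubElement(ctype, f"{{{XS}}}sequence")
    for fname, ftype in model["fields"]:
        ET.SubElement(seq, f"{{{XS}}}element",
                      {"name": fname, "type": f"xs:{ftype}"})
    return ET.tostring(ctype, encoding="unicode")


def to_python(model: dict) -> str:
    """Derive dataclass source text from the same model."""
    lines = ["@dataclass", f"class {model['name']}:"]
    lines += [f"    {fname}: {PY_TYPES[ftype]}"
              for fname, ftype in model["fields"]]
    return "\n".join(lines)


xsd_text = to_xsd(MODEL)
py_text = to_python(MODEL)
```

Changing a field in MODEL changes both outputs together, which is the property that matters; the XSD here is an output, not the primary expression.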


>> get imported data out of XML as soon as possible, and into a tractable
>> computational formalism
> Very much like get data out of ADL as soon as possible. Once you build
> some tools to do that.

there is no data in ADL, only models. Not sure what you are trying to 
say here....


>> treat XSDs as interface specifications, to be generated from the underlying
>> primary information models, not as any kind of primary expression in their
>> own right
>> Define XSDs with as little inheritance as possible, avoid subtyping, i.e.
>> define types as standalone, regardless of the duplication.
> I am not sure you understand XML Schema 1.1. Again you seem to be
> approaching multi-level modelling in healthcare as if OO modelling were the
> only choice. This is the "I have a hammer, everything is a nail" approach.
> It isn't very effective in the real world where various tools are needed to
> solve various problems.

well, pretty much the whole world is using programming languages that 
are essentially object-oriented or object-enabled - even uber languages 
like Haskell do most OO tricks. You're using Python, that's an OO language.


>> Maximise the space optimisation of the data, no matter what it takes. It
>> usually requires all kinds of tricks, heavy use of XML attributes, structure
>> flattening from the object model and so on. If you don't do this, any XML
>> data storage will cost twice what it should and web services using XML will
>> be horribly slow.
> So, in your opinion, should you build your APIs in ADL?  Of course not.
> I fail to see your arguments against using XML for what it is designed for:
> data representation and constraint modelling. Of course then you have all of
> the related tools such as XSLT, SOAP, WSDL, XPath, XQuery, etc.
> for other tasks.
> A fairly complete suite. There isn't a real, practical reason to
> re-invent them. They make the interactions smoother and more easily
> understood by the IT industry as compared to using a domain specific language.

XML wasn't designed for data representation, it was designed for 
structured document mark-up. That's why it's so horrible for data 
representation.
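
The space point quoted above (flattening structure into XML attributes) can be seen in a small Python sketch. The record and its field names are invented for this example and are not taken from any openEHR schema; the sketch just serialises the same three values two ways and compares the byte counts.

```python
import xml.etree.ElementTree as ET

# Hypothetical record, for illustration only.
record = {"magnitude": "120", "units": "mm[Hg]", "precision": "0"}

# Element-per-field form, as a direct mapping from an object model tends to
# give: every field name appears twice, once in the open and once in the
# close tag.
nested = ET.Element("quantity")
for key, value in record.items():
    ET.SubElement(nested, key).text = value

# Flattened form: the same fields carried as XML attributes.
flat = ET.Element("quantity", record)

nested_bytes = ET.tostring(nested)
flat_bytes = ET.tostring(flat)
saved = len(nested_bytes) - len(flat_bytes)
```

How much is saved in practice depends entirely on the data; this only shows where the overhead of the mark-up heritage comes from.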

>
>> XML Schema 1.1 introduces useful things that may reduce some of the above
>> problems (good overview here), however as far as I can tell, its inheritance
>> model is not much better than XSD 1.0 (although you can now inherit
>> attributes properly, so that's good).
> It is not an OO language. If you are judging it based on those
> characteristics, please see my critical failure comment above. OO is not
> the be-all and end-all solution in computer science. Much less in
> multilevel modeling.

well let me just point to a single feature of object languages
(including ADL) - inheritance / specialisation. Are you saying that's of
no use? How do you propose to adapt an existing model to include
local needs, without breaking the parent model semantics?
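
To be concrete about 'without breaking the parent model semantics', here is a minimal Python sketch of a specialisation check on a single numeric constraint. The class RangeConstraint, its methods and the numbers are hypothetical, chosen for this example, and much simpler than what ADL actually expresses. The rule shown is that a child may narrow what the parent allows but never widen it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeConstraint:
    """Hypothetical constraint on a numeric value: lower <= value <= upper."""
    lower: float
    upper: float

    def allows(self, value: float) -> bool:
        return self.lower <= value <= self.upper

    def is_valid_specialisation_of(self, parent: "RangeConstraint") -> bool:
        # A child may only narrow the parent's range, never widen it, so any
        # data valid against the child is also valid against the parent.
        return parent.lower <= self.lower and self.upper <= parent.upper


parent = RangeConstraint(0, 1000)     # a general range
local = RangeConstraint(0, 300)       # a local, tighter variant
too_wide = RangeConstraint(-50, 300)  # widens the lower bound

local_ok = local.is_valid_specialisation_of(parent)
wide_ok = too_wide.is_valid_specialisation_of(parent)
```

Data created under the local variant still validates against the parent, so systems that only know the parent model can keep processing it.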

> Well, I can't predict how long it will take it to be used on a broader
> basis.
> Probably, like most things, as people need the capability. Sometimes,
> people resist change.
> It takes them outside their comfort zone and they don't like it.
>
> I can tell you that XML Schema 1.1 is very functional.  It is
> supported by open source and proprietary
> tools and it is working quite well, without tricks, in MLHIM.

if you can point to some online MLHIM models so we can see the result - 
the information model, and layers of MLHIM archetypes specialised based 
on that, it would be very helpful.



>
>>
> I am not sure that there is any requirement for mapping.
>
> While there are a number of people producing openEHR archetypes,
> AFAICT there are only a dozen or so
> that are in compliance with the openEHR specifications,
> specifically the "Knowledge Artefact Identification" document.
> To address a couple of issues I have with the current openEHR eco-system:
>
> Section 2.2 says:
> "It is possible to define an identification scheme in which either or both
> ontological and machine identifiers are used. If machine identification only
> is used, all human artefact 'identification' is relegated to meta-data
> description, such as names, purpose, and so on. One problem with such schemes
> is that meta-data characteristics are informal, and therefore can clash -
> preventing any formalisation of the ontological space occupied by the
> artefacts. Discovery of overlaps and in fact any comparative feature of
> artefacts cannot be formalised, and therefore cannot be made properly
> computable."
>
>
> I will argue that UUIDs are very definitely "computable"; without ambiguity.
> Metadata characteristics are very definitely formalized and have been
> since at least 1995 (DCMI), and have been an
> ISO standard since at least 2003. Therefore this paragraph is
> inaccurate in the description of the usefulness
> of machine processable identifiers and using metadata for formal descriptions.

Firstly, we are in the process of rewriting this, but also I think you
misread what it said (which might not have been very clear) - it's
saying that machine ids are computable (obviously they are, at a basic
level).

>
>
>
> Archetypes on the NEHTA CKM http://dcm.nehta.org.au/ckm/ carry only
> the openEHR RM namespace.
> So they are therefore uncontrolled and of unknown quality; by openEHR definition.

the spec you are quoting from is about the future of identification, not 
the past (or present).

>
>
>
> The openEHR eco-system is well engineered. It just isn't
> sociologically acceptable. People want to be free to
> design their concept models without top-down consensus. MLHIM allows
> that with industry standard, off the shelf tooling.

so does openEHR, that's what namespaces are about. If two groups both 
define a 'blood pressure' archetype today, there is an immediate 
problem. In the future with namespaced ids, the problem becomes 
manageable, since both forms can co-exist.
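
A small Python sketch of why the namespace matters. The registries, the group labels, the org.example namespaces and the 'namespace::id' form used here are assumptions made for illustration, not a statement of the final identification spec.

```python
# Hypothetical registries keyed by archetype id; the values are just labels
# for the group that authored each archetype.
bare_id = "openEHR-EHR-OBSERVATION.blood_pressure.v1"

without_namespaces = {}
without_namespaces[bare_id] = "group A"
without_namespaces[bare_id] = "group B"  # silently replaces group A's entry


def namespaced(namespace: str, archetype_id: str) -> str:
    # Assumed 'namespace::id' form, for illustration only.
    return f"{namespace}::{archetype_id}"


with_namespaces = {
    namespaced("org.example.group_a", bare_id): "group A",
    namespaced("org.example.group_b", bare_id): "group B",
}
```

With bare ids the second definition overwrites the first; with a namespace prefix both entries survive and can be compared or reconciled later.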


I followed some of your URLs, but I still can't locate a) the MLHIM
reference model or b) any MLHIM archetypes that I can understand / read.
I know they are lurking out there somewhere... can you provide some links?

- thomas
