This thread seemed to start with the assumptions that firstly, a DV_CODED_TEXT will inevitably be stored in XML and that, secondly, the mapping to XML will be a very direct one, even up to the names of the corresponding elements. Is either of these a valid assumption? Even if they are, should the low-level implementation details really be a concern in the ADL language design? I'm sure there are all sorts of clever tricks that will be used to persist openEHR data efficiently, and it's by no means a given that the persistence solution will involve XML when large datasets are being managed.

-----Original Message-----
From: [email protected] [mailto:owner-openehr-technical at openehr.org] On Behalf Of Thomas Beale
Sent: Monday, 23 January 2006 3:45 PM
To: openehr-technical at openehr.org
Subject: Re: Proposed slightly radical change to CODE_PHRASE in Text package in openEHR

sorry I missed this one - didn't get the message in fact - something odd going on with this list....

Heath Frankel wrote:

>Tom,
>My only comment is related to the resulting XML schema. Any reason we
>couldn't simplify the XML further to the following:
>
>  <name xsi:type="DV_CODED_TEXT">
>    <value>clinical finding</value>
>    <defining_code>SNOMED-CT::404684003</defining_code>
>  </name>
>

I don't care too much what the XML looks like - i.e. if it diverges from the form it should have if strictly following the object structure - which would be the following:

  <name xsi:type="DV_CODED_TEXT">
    <value>clinical finding</value>
    <defining_code><value>SNOMED-CT::404684003</value></defining_code>
  </name>

since CODE_PHRASE is still an object; whereas your (neater) XML treats it as a String field or function of DV_CODED_TEXT. Maybe this is the solution...to be technically correct, you could even have a new function or field defined on DV_CODED_TEXT called e.g. defining_code_string, then your XML would be exactly correct:

  <name xsi:type="DV_CODED_TEXT">
    <value>clinical finding</value>
    <defining_code_string>SNOMED-CT::404684003</defining_code_string>
  </name>

Even your initial XML is "correct", as long as we are happy to make the XML-schema diverge from the object model at that level. I think we have to be resigned to that kind of thing with XML, since otherwise it just generates too much garbage.

>Having data enclosed within the defining_code element indicates that this is
>the value anyway so we don't need the additional value child element. The
>only potential downside of this is if there are additional attributes or
>associations added to code_phrase later which would need to then be
>represented as follows:
>
>  <name xsi:type="DV_CODED_TEXT">
>    <value>clinical finding</value>
>    <defining_code>SNOMED-CT::404684003
>      <some_other_attribute>some data</some_other_attribute>
>    </defining_code>
>  </name>
>
>Even though the result is still valid XML it is not the normal
>representation in XML.
>

Is the above valid XML? I didn't know you could do that...But it seems pretty unorthodox, especially if we want to have a clear mapping to object structures, which I think we should consider the "statement of truth".

>We would then need to change the schema to
>include the value element again. What is the likelihood of additional
>attributes to Code_Phrase?
>
>Sorry for the discussion of what the XML looks like, but you started it :>.
>

touché ;-)

The thing to keep in mind is to do with paths. We will obviously use XPaths in XML data, but we also use XPath-like paths as logical paths against objects in memory, and non-XML forms of the data. We want all these paths to be the same (or let's say, the XPath-like openEHR paths to be a nearly strict syntactical subset of the W3C XPath syntax). In these paths, ideally we would be able to just reference something like

  SNOMED-CT::404684003

rather than having to write stuff like

  some_attr [ terminology_id = "SNOMED-CT" AND code_string = "404684003" ]

...well, it seems like a nicer idea, but maybe it doesn't matter that much? Doing the latter means you can more easily have expressions like:

  some_attr [ terminology_id = "SNOMED-CT" AND ( code_string = "404684003" OR code_string = "404684017" ) ]

and so on, which Hugh or someone else mentioned here....if we think this is a distinct possibility, then sticking with the current model may be better anyway. On the other hand, if we think that the XML instance will just be too heavy because of this, we should go with the new proposal.

Further thoughts? (Given that we are very close to our 1.0 release date, I am inclined to leave this as it is, and allow for a divergence in the XML-schema which enables fewer tags, as in Heath's original piece of XML).

- thomas beale
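
On the question of whether text followed by a child element inside defining_code is acceptable XML: that form is mixed content, which is well-formed XML (whether it is schema-valid depends on the schema allowing mixed content for that element). The following is only a rough sketch, using Python's standard xml.etree.ElementTree, of how such an instance could be read and how the "terminology::code" string could be split back into the two fields used in the path predicates. The helper names split_code_phrase and matches_any are made up for illustration and are not part of openEHR; the xsi namespace declaration is added here so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Heath's "downside" case: text plus a child element inside defining_code.
# The xmlns:xsi declaration is an addition so the fragment is self-contained.
MIXED = f"""
<name xmlns:xsi="{XSI}" xsi:type="DV_CODED_TEXT">
  <value>clinical finding</value>
  <defining_code>SNOMED-CT::404684003
    <some_other_attribute>some data</some_other_attribute>
  </defining_code>
</name>
""".strip()


def split_code_phrase(text):
    """Split 'TERMINOLOGY::CODE' into (terminology_id, code_string).

    Hypothetical helper: assumes '::' is the separator, as in the thread.
    """
    terminology_id, sep, code_string = text.strip().partition("::")
    if not sep or not terminology_id or not code_string:
        raise ValueError(f"not a terminology::code string: {text!r}")
    return terminology_id, code_string


def matches_any(code_phrase, terminology_id, code_strings):
    """Mimic the predicate: terminology_id = X AND (code_string = A OR ...)."""
    term, code = split_code_phrase(code_phrase)
    return term == terminology_id and code in code_strings


root = ET.fromstring(MIXED)
defining_code = root.find("defining_code")

# .text holds only the character data before the first child element.
terminology_id, code_string = split_code_phrase(defining_code.text)
extra = defining_code.find("some_other_attribute").text
xsi_type = root.get(f"{{{XSI}}}type")
```

Note that the parser hands back the leading text with its trailing whitespace, so any consumer has to trim and split it by hand, which is part of why the mixed-content form is awkward compared with a dedicated value element.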

