Adam Flinton wrote:

> I would like though to enquire wrt the rationale of containing _id info
> in a separate <value/> element.
>
> If you are being consistent
> instead of :
>
> <terminology_id>
>   <value>ISO_639-1</value>
> </terminology_id>
>
> it should be simply:
>
> <terminology_id>ISO_639-1</terminology_id>
>
> or <terminology_id value="ISO_639-1"/>

Adam, when you say it 'should' be - either pulled up a level, with an object attribute removed, OR represented as an XML attribute - what is the driver? Is it semantic (you think there is something wrong with the representation of the object structure defined by the specification) or is it to do with space/signal-to-noise (either of the last two methods uses fewer characters)?
It is the way it is because of a direct, machine-performed object serialisation process - in other words, it simply follows the same rules used for transforming any object data into XML. Your suggestion (I presume) is a special case of the general idea of representing all so-called basic types (Strings, Integers, dates etc) as XML attributes rather than as XML elements. But we have only just discussed and agreed that long text strings (especially those containing Unicode, backslash quoting and whitespace) should be XML elements.

As I have said before, what I think is most important is a regular encoding of data to and from XML, so that a) software is as simple and clean as possible and b) changes are not needed due to particular content (i.e. data). Now, ideally we would minimise use of bandwidth / space with the representation as well. The problem is that XML is pretty poorly designed for efficiently representing data, and has a poor signal-to-noise ratio. Making data serialise in a way that is either 'more aesthetic' or smaller always implies more complex software containing exceptional rules. Further, although XML isn't well designed for data representation, in its original design 'attributes' were intended for meta-data items rather than 'data'. Whether this semantic needs to be retained in the XML we are talking about here is an open question.

So the questions are: at what level do we include exceptional processing to reduce space wastage, since this complicates the software? How much do we compromise the intended semantics of XML, where attributes are designed for holding meta-data (including real meta-data, e.g. things like xsi:type etc)? Any attempt to save space has to be based on a study of high volumes of representatively diverse data. Saving 10 bytes is not interesting, but saving 10Gb/minute in a large data processing system is.
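To make the trade-off concrete, here is a minimal sketch in Python of what I mean by a regular encoding. It is an illustration only, not the serialiser actually used: the classes TerminologyId and CodePhrase and the functions to_element and serialise are hypothetical names invented for this example. One rule is applied everywhere (every field becomes a child element), and the output is compared by size with the two hand-written alternatives from the quoted message.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class TerminologyId:      # hypothetical stand-in for the specification's class
    value: str


@dataclass
class CodePhrase:         # hypothetical container with one object-valued field
    terminology_id: TerminologyId
    code_string: str


def to_element(name, obj):
    """Apply one rule everywhere: each field becomes a child element."""
    elem = ET.Element(name)
    if is_dataclass(obj):
        for f in fields(obj):
            elem.append(to_element(f.name, getattr(obj, f.name)))
    else:
        elem.text = str(obj)   # basic types (strings, integers, dates) become text
    return elem


def serialise(name, obj):
    """Return the XML for obj as a string, without an XML declaration."""
    return ET.tostring(to_element(name, obj), encoding="unicode")


regular = serialise("terminology_id", TerminologyId("ISO_639-1"))

# The two alternatives proposed in the quoted message, written out by hand.
pulled_up = "<terminology_id>ISO_639-1</terminology_id>"
as_attribute = '<terminology_id value="ISO_639-1"/>'

for label, xml in [("regular", regular),
                   ("pulled up", pulled_up),
                   ("attribute", as_attribute)]:
    print(f"{label:10} {len(xml.encode('utf-8')):3d} bytes  {xml}")
```

Note that to_element contains no knowledge of any particular class. Producing either of the shorter forms would mean adding a rule such as "if the object has a single field called value, collapse it", which is exactly the kind of exceptional processing in question. The byte counts for one element say nothing about real savings; that needs the study of representative data mentioned above.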
I will go out on a limb and say that 'style' has no place in good engineering; only engineering criteria do - correctness, performance, maintainability etc. With all that in mind - if the community wants to make the appropriate analysis of data and propose a more space-efficient schema, I am not against it. But the needs of correctness (= patient safety) must be satisfied.

- thomas beale

