Suggestion wrt XML Archetypes & Templates

Adam Flinton Mon, 03 Dec 2007 10:40:58 +0000

Gerard Freriks wrote:
> Thanks.
>
> But I'm curious in:
> Why?
>
> Why is your solution more safe?


A) You are definitively bookending the string.

This is exactly the same as you do within the ADL e.g.

                ["at0002"] = <
                    description = <"*">
                    text = <"Procedure started date time">
                >

The ADL above does not say:

description = *
text = Procedure started date time.

etc.

Why is that?

& would that be the same as:

description =                                 *
text =


Procedure started date time.

?
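
To make the bookending point concrete, here is a minimal sketch using the standard JAXP DOM parser. The element and attribute names (term, asText, asAttr, text) are made up for illustration and are not from any openEHR schema. It shows the same string carried once as a text child with stray layout whitespace and once as a quoted attribute value.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class BookendDemo {
    // Hypothetical markup: the same string as a text child and as an attribute.
    static final String XML =
        "<term>\n"
        + "  <asText>\n\n    Procedure started date time\n  </asText>\n"
        + "  <asAttr text=\"Procedure started date time\"/>\n"
        + "</term>";

    static Element root() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The text child comes back with whatever layout whitespace surrounds it.
    static String textChildValue() {
        return root().getElementsByTagName("asText").item(0).getTextContent();
    }

    // The attribute value is exactly what sits between the quotes.
    static String attributeValue() {
        Element e = (Element) root().getElementsByTagName("asAttr").item(0);
        return e.getAttribute("text");
    }

    public static void main(String[] args) {
        System.out.println("[" + textChildValue() + "]");
        System.out.println("[" + attributeValue() + "]");
    }
}
```

The consumer of the text child has to decide whether the surrounding newlines and indentation are part of the value; the consumer of the attribute does not.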

B) Even worse is the fact that an XML element can contain many text 
children even where it may look like there is just one. This can cause 
all sorts of fun.

e.g.

http://www.informit.com/articles/article.aspx?p=31273&seqNum=12&rl=1

"The text of an element is considered *normalized* when it contains no 
two adjacent Text nodes, as was shown above. In general, deserializing 
an XML document into a DOM will yield normalized elements. However, when 
new Text nodes are inserted into the hierarchy, one can wind up with a 
denormalized element. While completely legal, various XML technologies 
have a difficult time handling denormalized elements. XPath, for 
example, depends on a normalized document tree structure to behave 
properly. Performing an XPath traversal against a document with 
denormalized elements would yield unexpected results. This can be 
prevented using the Node.normalize method, which recursively normalizes 
all ancestor Text nodes. Consider the following Java code:

import org.w3c.dom.*;
void appendText(Document doc, Node elem) {
  int nChildren = elem.getChildNodes().getLength();
  Node text1 = doc.createTextNode("hello ");
  // splitText is declared on Text, not Node, so text2 must be typed as Text
  Text text2 = doc.createTextNode("world");
  elem.appendChild(text1);
  elem.appendChild(text2);
  text2.splitText(2);  // "world" becomes "wo" + "rld": three adjacent Text nodes
  assert(elem.getChildNodes().getLength() == nChildren + 3);
  elem.normalize();    // merges the adjacent Text nodes back into one
  assert(elem.getChildNodes().getLength() == nChildren + 1);
}

As shown in Figure 2.12,
after the call to Text.splitText, there are three new Text node 
children. However, after the call to Node.normalize, the three adjacent 
Text nodes are folded into a single node containing the string "hello, 
world"."
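
As a rough illustration of why this matters to calling code (my own sketch, not from the article above), the following builds an element with two adjacent Text nodes and reads it the naive way, via the first child. The element name "text" and the helper method names are assumptions made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DenormalizedDemo {
    // Builds <text>hello world</text> as two adjacent Text nodes.
    static Element build(boolean normalizeFirst) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
            Element elem = doc.createElement("text");
            doc.appendChild(elem);
            elem.appendChild(doc.createTextNode("hello "));
            elem.appendChild(doc.createTextNode("world"));
            if (normalizeFirst) {
                elem.normalize(); // fold adjacent Text nodes into one
            }
            return elem;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Naive read: assumes the element has exactly one text child.
    static String firstChildValue(boolean normalizeFirst) {
        return build(normalizeFirst).getFirstChild().getNodeValue();
    }

    static int childCount(boolean normalizeFirst) {
        return build(normalizeFirst).getChildNodes().getLength();
    }

    // Safe read: concatenates every text descendant.
    static String fullText() {
        return build(false).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("[" + firstChildValue(false) + "]");
        System.out.println("[" + firstChildValue(true) + "]");
        System.out.println("[" + fullText() + "]");
    }
}
```

Without the normalize call the naive read silently returns only part of the string, which is the kind of fun I mean.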



> Why is your solution more efficient?

A) File sizes are smaller/the XML is less verbose.
B) Because you know exactly where the string starts and finishes, 
processing with SAX etc. can be much faster as there is no need to 
normalize.
I.e. at present you already have more verbose XML, & then the only 
safe option is to always normalize the whole document before processing it.
C) In most of the XML processing languages, e.g. XSLT or XPath, an 
attribute value is addressed structurally, whereas a text child has to be 
reached via text().

e.g. compare /a/b/@c vs /a/b/text(), or /a/b[@c="bob"] vs /a/b[text() = "bob"]

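A small sketch of the XPath comparison, using the javax.xml.xpath API that ships with the JDK. The sample document is invented for the example: the same value "bob" is held both in attribute c and in an indented text child, and the class and method names are mine.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class AttrVsTextXPath {
    // Hypothetical document: "bob" as an attribute and as pretty-printed text.
    static final String XML = "<a>\n  <b c=\"bob\">\n    bob\n  </b>\n</a>";

    // Returns true when the expression selects at least one node.
    static boolean matches(String expr) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath().evaluate(
                expr,
                new InputSource(new StringReader(XML)),
                XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Attribute predicate: compares the quoted value only.
        System.out.println(matches("/a/b[@c='bob']"));
        // Text predicate: compares the text node including its whitespace.
        System.out.println(matches("/a/b[text()='bob']"));
        // Text predicate made safe, at the cost of an extra function call.
        System.out.println(matches("/a/b[normalize-space(text())='bob']"));
    }
}
```

The attribute test matches as written; the text() test only matches once the author remembers to wrap it in normalize-space().
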
> Why is your solution a better Best Practice?
>
In part for the reasons above.
In part because experience of failures caused by the ambiguities wrt 
the text child in XML has driven people to be pretty careful about 
using text unless they really need to.

If you want a single string containing a value which will not contain 
child elements, use an attribute & keep the text child for mixed content, e.g.

Good use for text child:

some <strong>bold text</strong> in some documentation

Bad use for text child:

at003

Again I would refer you to your very own ADL, which in essence has 
adopted the exact same solution to avoiding any textual ambiguities via 
markup such as:



                ["at0030"] = <
                    description = <"*">
                    text = <"Material used">
                >
                ["at0031"] = <
                    description = <"*">
                    text = <"Procedure comments">
                >
                ["at0032"] = <
                    description = <"*">
                    text = <"Procedure comments">
                >
                ["at0033"] = <
                    description = <"*">
                    text = <"Procedure end date time">
                >

Were, say, the first element above to be rewritten, it could be seen as

<at0030 description="*" text="Material used"/>
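
A minimal sketch of how a SAX handler could read terms written in that attribute style. It assumes a wrapper element (here called terms) around the at-coded elements, which is my invention for the example, as are the class and method names. The point is that every value arrives complete in a single startElement call, with no text events to collect and join.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TermSaxDemo {
    // Maps each element name (e.g. at0030) to the value of its text attribute.
    static Map<String, String> readTerms(String xml) {
        Map<String, String> terms = new LinkedHashMap<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        // The whole value is available here in one piece.
                        String text = atts.getValue("text");
                        if (text != null) {
                            terms.put(qName, text);
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return terms;
    }

    public static void main(String[] args) {
        String xml = "<terms>"
            + "<at0030 description=\"*\" text=\"Material used\"/>"
            + "<at0031 description=\"*\" text=\"Procedure comments\"/>"
            + "</terms>";
        System.out.println(readTerms(xml));
    }
}
```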

Adam


