Carriage returns in DV_TEXT not allowed

Colin Sutton Wed, 11 Jan 2012 10:24:24 +1100

Couldn't the text stored in the eHR include HTML paragraph separators, 
replacing Windows or Unix specific line separators?
And HTML escape sequences....


DV_HTMLTEXT?
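
A minimal sketch of what such a conversion could look like, assuming Python and its standard `html` module. The function name `to_html_paragraphs` is hypothetical, and dropping empty lines is an illustrative choice, not something the openEHR specifications define:

```python
import html
import re


def to_html_paragraphs(raw: str) -> str:
    """Replace platform-specific line separators with HTML paragraphs.

    Hypothetical helper: each non-empty line becomes one <p> element and
    special characters are replaced by HTML escape sequences.
    """
    # Handle Windows (CRLF), old Mac (CR) and Unix (LF) separators alike.
    lines = re.split(r"\r\n|\r|\n", raw)
    return "".join(f"<p>{html.escape(line)}</p>" for line in lines if line)
```

The stored string then contains no CR or LF at all, but a receiver has to understand HTML to display it.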

Regards,
Colin


From: openehr-technical-bounces at openehr.org 
[mailto:[email protected]] On Behalf Of Thomas Beale
Sent: Wednesday, 11 January 2012 1:19 AM
To: openehr-technical at openehr.org
Subject: Re: Carriage returns in DV_TEXT not allowed

On 10/01/2012 10:05, Leonardo Moretti wrote:
If DV_TEXT doesn't allow carriage returns, line feeds, or other 
non-printing characters, as stated in 
http://www.openehr.org/releases/1.0.2/architecture/rm/data_types_im.pdf, page 
29, is there a way to represent short text with minimal formatting characters 
(carriage returns)? Which data type should be used?


It would be interesting to know how many other implementers agree with this 
restriction. It was put in (from memory) in the very early days of modelling, 
based on GEHR, and possibly somewhat on 13606 - nearly 10 years ago!

The idea was that DV_TEXT models a 'text fragment', essentially the idea of a 
word, string of words, sentence or possibly a group of sentences. No CR/LF were 
allowed because this is taken as a paragraph delimiter, and the type 
DV_PARAGRAPH was defined to represent multiple DV_TEXTs making up a long tract 
of text like a report. In proper word processing & publishing, this is correct; 
a 'paragraph' has no CR/LF in it, which is what allows resizing to work 
properly in different screen / form widths.

Additionally, any 'atomic' text item, e.g. a single disease name, single 
sentence etc - which make up the majority of text data within structured data - 
should not have a CR/LF.
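
As a rough illustration of the restriction discussed here, the following Python sketch rejects any value containing CR or LF. The class `DvText` and the helper `is_valid_dv_text` are hypothetical stand-ins, not part of any openEHR reference implementation, and the check covers only CR and LF, not other non-printing characters:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DvText:
    """Hypothetical stand-in for an openEHR DV_TEXT 'text fragment'."""

    value: str

    def __post_init__(self):
        # Assumption: the restriction is modelled as "no CR or LF in value".
        if "\r" in self.value or "\n" in self.value:
            raise ValueError("DV_TEXT value must not contain CR or LF")


def is_valid_dv_text(value: str) -> bool:
    """Return True if the string could be stored in a single DvText."""
    try:
        DvText(value)
        return True
    except ValueError:
        return False
```

An atomic item such as a single disease name passes this check; a two-line note does not and would need another representation.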

This way of thinking may be dated, and it is a good question as to when a piece 
of text can't be a single DV_TEXT. If we stick with the current 
model<http://www.openehr.org/uml/release-1.0.1/Browsable/_9_0_76d0249_1109597816149_873308_762Report.html>
 (and remember, a DV_CODED_TEXT is-a DV_TEXT in openEHR), the openEHR RM is 
imposing a simple word processing model of 'paragraphs' made up of 'text 
fragments'. An alternative would be to allow anything in a DV_TEXT. The 
decision about when you have to have a new DV_TEXT is made on the basis of 
attributes other than the actual string value, i.e.:

 *   hyperlink: if there is a hyperlink, it applies to the entire DV_TEXT; 
therefore, if you only want a link to correspond to 2 words, then those 2 words 
= 1 DV_TEXT
 *   formatting: simple formatting like bolding, emphasis (about the same 
level as typical wiki markup) applies to the whole DV_TEXT;
 *   mappings: coded mappings, e.g. ICD code applies to whole DV_TEXT; need to 
use multiple DV_TEXTs if only some words are to have an associated code mapping
 *   formal coding: if a DV_CODED_TEXT is to be used - i.e. when the string 
value is the term for the code from its terminology (not just some mapping), 
then the DV_CODED_TEXT.value can only consist of the exact word string to which 
the code corresponds; more DV_TEXTs have to be added using a DV_PARAGRAPH to 
construct a whole paragraph
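
The splitting rules above can be sketched roughly as follows. This is an illustration only: `TextFragment` and `Paragraph` are hypothetical simplifications of DV_TEXT and DV_PARAGRAPH, the attribute names are assumptions, and the example code and URL are placeholders:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TextFragment:
    """Simplified DV_TEXT: every attribute applies to the whole value."""

    value: str
    hyperlink: Optional[str] = None   # link covers the entire fragment
    formatting: Optional[str] = None  # e.g. "bold"; covers the entire fragment
    mapped_code: Optional[str] = None  # hypothetical single code mapping


@dataclass
class Paragraph:
    """Simplified DV_PARAGRAPH: an ordered list of text fragments."""

    items: List[TextFragment] = field(default_factory=list)

    def as_plain_text(self) -> str:
        # Fragments are concatenated as-is; no CR/LF is introduced.
        return "".join(item.value for item in self.items)


# Only some words carry a code or a link, so they get their own fragments.
para = Paragraph([
    TextFragment("Patient has "),
    TextFragment("type 2 diabetes", mapped_code="E11"),  # illustrative code
    TextFragment(", see "),
    TextFragment("care plan", hyperlink="http://example.org/plan"),
    TextFragment("."),
])
```

Calling `para.as_plain_text()` rebuilds the sentence, while the per-fragment attributes stay attached to exactly the words they describe.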

The best approach with the current model is:

 *   for atomic text items, e.g. single word/sentence answers to questions, 
single coded terms like names of diseases, procedures etc, use a single 
DV_CODED_TEXT or DV_TEXT.
 *   for a tract of text containing some words that are hyperlinked / coded / 
formatted, use a DV_PARAGRAPH containing multiple DV_TEXTs.
 *   Or else you use a DV_PARSABLE, containing XML, HTML, RTF or whatever you 
like - but ... no guarantee the receiver can read it!

This does not properly solve the problem of how CR/LFs are added. If 
we assume one DV_PARAGRAPH = 1 CR/LF (as in word processing) then a report 
needs to consist of multiple DV_PARAGRAPHs, and we don't currently have a data 
type for that. To fix the current model we could add a new type DV_DOCUMENT, 
which contains multiple DV_PARAGRAPHs. Or we could remove the restriction on 
CR/LF on DV_TEXT, but that then would allow CR/LFs to occur in single 
DV_CODED_TEXT strings, which is almost certainly an error. But maybe we just 
assume that software never makes this error?
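
To make the "one DV_PARAGRAPH = 1 CR/LF" assumption concrete, here is a small Python sketch that breaks a raw report into paragraph strings, which a DV_DOCUMENT-like container could then hold. The function name `split_into_paragraphs` is hypothetical, and discarding blank lines is a design assumption rather than anything the current model prescribes:

```python
import re
from typing import List


def split_into_paragraphs(raw: str) -> List[str]:
    """Split raw text on CR/LF so that no resulting string contains one.

    Each returned string would correspond to one paragraph in a
    hypothetical document type made up of multiple paragraphs.
    """
    # Treat CRLF, lone CR and lone LF as equivalent paragraph delimiters.
    parts = re.split(r"\r\n|\r|\n", raw)
    # Assumption: blank lines carry no content and are dropped.
    return [part for part in parts if part.strip()]
```

Each resulting string satisfies the existing no-CR/LF restriction, so the restriction on DV_TEXT, and therefore on DV_CODED_TEXT, would not need to be relaxed.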

The real question is: do we want to have any explicit word-processor like model 
of text? 10 years ago, the answer seemed obvious: yes, because there is no 
reliable standard of marked up text (many variants of HTML, XML, wiki markup, 
etc). I am not sure the answer is any different today. From a clinical 
perspective, guaranteeing readability of text in a standard way is paramount.

- thomas

