character sets and languages in openEHR

Thomas Beale Sun, 07 Mar 2004 08:57:40 +1000

A couple of technical questions prior to declaring the 0.9 baseline in 
openEHR:


One of the major openEHR implementors here in Australia has suggested 
moving the attributes 'language' and 'charset' from the class DV_TEXT to 
some higher-level class, e.g. COMPOSITION, since almost all the time they 
are the same on every DV_TEXT item in a given EHR. We don't think it 
should be that high, since language cannot be guaranteed to be the same 
throughout a COMPOSITION. (In their scheme, you would set the attribute 
on COMPOSITION and then override it on lower nodes where it differed; 
however, I am very wary of this sort of logic - HL7 uses it a lot and it 
really complicates things for developers; at the moment we prefer to 
avoid it completely.) One possibility is to move the language attribute 
to the ENTRY class, on the basis that an ENTRY is the minimum indivisible 
unit of information in openEHR (this is true even for 'large' Entries 
like a microbiology test result). It was initially on DV_TEXT for safety 
reasons - you would always know what language a text fragment is in 
(this is important for words which have the same appearance but 
different meanings in different languages); however, ENTRY is probably 
just as safe from this point of view.
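
To make the "set it high, override it lower" logic concrete, here is a 
rough Python sketch. The Node class and the effective_language function 
are invented for illustration and are not part of the openEHR reference 
model; the point is only that every reader of the data has to walk up 
the tree to find out what language a text item is in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node (illustration only, not an openEHR class)."""
    name: str
    language: Optional[str] = None   # None means "inherit from parent"
    parent: Optional["Node"] = None


def effective_language(node: Node) -> str:
    """Resolve the language under the inherit-and-override scheme.

    Walks from the node towards the root and returns the first
    language that is explicitly set.
    """
    current: Optional[Node] = node
    while current is not None:
        if current.language is not None:
            return current.language
        current = current.parent
    raise ValueError(f"no language set on {node.name!r} or its ancestors")


# Language set once on the composition, overridden on one text item.
composition = Node("composition", language="en")
entry = Node("entry", parent=composition)
plain_text = Node("text_1", parent=entry)
german_text = Node("text_2", language="de", parent=entry)

print(effective_language(plain_text))   # inherited from the composition
print(effective_language(german_text))  # overridden locally
```

With the attribute mandatory on ENTRY instead, the lookup is a single 
attribute read on the enclosing Entry and no resolution rule is needed.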

Q: can anyone think of a scenario where there could be multiple 
languages inside an ENTRY?

Character set is more difficult to work out. So far, we have specified 
that Unicode should be used in all strings. This means that in theory 
there is no need to record the character set name (e.g. iso-latin-1, 
iso-greek, etc). However, there is still a need to choose between 
Unicode encodings such as UTF-8 and UTF-16. And in any case, I am unsure 
whether all implementation technologies support Unicode in strings; is 
there a legacy reason to store non-Unicode character set names anyway?
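
As a small illustration of why the encoding still matters even when 
everything is Unicode, the Python sketch below encodes one mixed 
Greek/French string in two Unicode encodings and tries two legacy 
character sets. The helper names (encoded_sizes, can_encode) and the 
sample string are my own, chosen only for the example.

```python
def encoded_sizes(text, encodings=("utf-8", "utf-16-le")):
    """Return the byte length of the same text in each encoding."""
    return {enc: len(text.encode(enc)) for enc in encodings}


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# "Ellinika" in Greek letters, a space, then "cafe" with e-acute,
# written with escapes so the source file encoding does not matter.
sample = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac caf\u00e9"

print(len(sample))                        # 13 characters
print(encoded_sizes(sample))              # 22 bytes in UTF-8, 26 in UTF-16
print(can_encode(sample, "iso-8859-1"))   # False: no Greek letters
print(can_encode(sample, "iso-8859-7"))   # False: no e-acute
```

The same 13 characters occupy a different number of bytes in each 
Unicode encoding, and neither legacy character set can hold the whole 
string, which is the case a per-item charset name cannot fix.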

- thomas beale


