RE: Diacritical marks: Single character or combined character?

Naz Gassiep Thu, 05 Dec 2013 14:24:57 -0800

Hi, does anyone have any answers to this question?

From: [email protected]
To: [email protected]
Subject: Diacritical marks: Single character or combined character?
Date: Fri, 8 Nov 2013 18:37:29 +1100

Hi all,
I would like to know if there is a best practice or recommendation as to which 
method to use when representing letters with diacritical marks. For example, 
take the following two characters:
ā
ā
They may look the same; however, the first is a single character, U+0101, while 
the second is a combination of two: a regular a (U+0061) followed by the 
combining macron (U+0304).
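
To make the difference between the two forms concrete, here is a minimal Python sketch using the standard library's unicodedata module. The variable names are illustrative only. It suggests that the two forms are distinct code point sequences, but that Unicode normalization (NFC to compose, NFD to decompose) converts one into the other.

```python
import unicodedata

# Illustrative names; the code points are the ones quoted in the message.
precomposed = "\u0101"   # LATIN SMALL LETTER A WITH MACRON (one code point)
combined = "a\u0304"     # 'a' followed by COMBINING MACRON (two code points)

# The strings render alike but differ in length and compare unequal.
print(len(precomposed), len(combined))   # 1 2
print(precomposed == combined)           # False

# NFC composes where a precomposed character exists; NFD decomposes.
print(unicodedata.normalize("NFC", combined) == precomposed)   # True
print(unicodedata.normalize("NFD", precomposed) == combined)   # True
```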

When producing content, which is the better one to use? When writing in 
languages such as Turkish, there is a limited set of letters with diacritical 
marks, all of which are represented in the Unicode character set.

However, when writing statistical formulae, every symbol used, including both 
Latin and Greek characters, can have a circumflex or overline added to it to 
denote a particular meaning. In that case, I found myself using the relevant 
character combined with U+0302 or U+0305 as needed.
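
The statistical case can be sketched the same way. The helper nfc_codepoints and the sample symbols below are assumptions chosen for illustration. The point is that NFC composes a base letter and a combining mark only when Unicode defines a precomposed character for that pair, so many symbols with a hat or overline can only be written as combining sequences.

```python
import unicodedata

def nfc_codepoints(text):
    """Return the code points of text after NFC normalization (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]

a_hat = "a\u0302"          # Latin a + COMBINING CIRCUMFLEX ACCENT
beta_hat = "\u03b2\u0302"  # Greek beta + COMBINING CIRCUMFLEX ACCENT
x_bar = "x\u0305"          # Latin x + COMBINING OVERLINE

print(nfc_codepoints(a_hat))     # ['U+00E2'], a precomposed form exists
print(nfc_codepoints(beta_hat))  # ['U+03B2', 'U+0302'], no precomposed form
print(nfc_codepoints(x_bar))     # ['U+0078', 'U+0305'], no precomposed form
```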

Now that I am switching between the two activities (writing stats stuff and 
publishing transliterated content), I am unsure which method is best, or 
whether one is better than the other at all.

I favour using a single method for all things, and so I am attracted to the 
idea of using combining characters for everything. However, parsing tools for 
languages that use those accented letters may be fooled when presented with 
U+0061 followed by U+0304 instead of the usual U+0101.
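
One possible way around that concern, sketched here under assumptions, is to normalize text to a single form before it reaches any tool that compares or looks up strings. The word list and the to_nfc helper below are hypothetical; only unicodedata.normalize is a real standard-library call.

```python
import unicodedata

def to_nfc(text):
    """Normalize text to NFC so equivalent sequences compare equal (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)

# Hypothetical word list stored in precomposed form (o with macron is U+014D).
lexicon = {to_nfc("T\u014dky\u014d")}

# The same word typed with combining macrons (o followed by U+0304).
typed = "To\u0304kyo\u0304"

print(typed in lexicon)          # False, the raw code points differ
print(to_nfc(typed) in lexicon)  # True, after normalization they match
```

With a step like this at the input boundary, the choice of what to type matters less, since both forms end up as the same sequence before comparison.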

Any advice or guidance on this issue would be greatly appreciated.
