On Sun, May 29, 2016 at 01:13:36PM +0000, Tobias M via Digitalmars-d wrote:
> On Sunday, 29 May 2016 at 12:41:50 UTC, Chris wrote:
> > Ok, you have a point there, to be precise <sh> is a multigraph (a
> > digraph)(cf. [1]). In French you can have multigraphs consisting of
> > three or more characters <eau> /o/, as in Irish <aoi> => /i:/.
> > However, a phoneme is not necessarily a spoken "character" as <sh>
> > represents one phoneme but consists of two "characters" or
> > graphemes. <th> can represent two different phonemes (voiced and
> > unvoiced "th" as in `this` vs. `thorough`).
>
> What I meant was, a phoneme is the "character" (smallest unit) in a
> spoken language, not that it corresponds to a character (whatever that
> means).
[...]

Calling a phoneme a "character" is misleading. A phoneme is a logical sound unit in a spoken language, whereas a "character" is a unit of written language. The two do not necessarily have a direct correspondence (or even any correspondence whatsoever).

In a language like English, whose writing system was codified many hundreds of years ago, the spoken language has diverged so far from the written language (specifically, in the way words are spelt) that the correspondence between the two is complex at best and downright arbitrary at worst. For example, the 'o' in "women" and the 'i' in "fish" map to the same phoneme, the short /i/, in (common dialects of) spoken English, in spite of being two completely different characters.

Therefore, conflating "character" and "phoneme" is misleading and only confuses the issue. As far as Unicode is concerned, it is a standard for representing *written* text, not spoken language, so concepts like phonemes aren't even relevant in the first place. Let's not get derailed from the present discussion by confusing the two.


T

--
What are you when you run out of Monet? Baroque.
