From: "Pim Blokland" <[EMAIL PROTECTED]>
To: "Unicode mailing list" <[EMAIL PROTECTED]>
Sent: Wednesday, May 28, 2003 11:45 AM
Subject: Re: Dutch IJ, again


> Philippe Verdy schreef:
> 
> > i+j is a single combined Dutch ij character only if it's not
> followed by a vowel
> 
> This is not true; where did you get that idea?
> It almost always IS a diphthong (cf. words like bijen, vrijaf, zijig)
> except where the i and the j happen to be in separate syllables
> (bijou, bijectie).

Do you mean that no inference rule is possible? I did not intend to be exhaustive 
there, because your sample words where ij is a diphthong could in fact be the 
exceptions (or the two other words may be exceptions to the "normal" Dutch rules). 
I am not enough of a Dutch expert to be affirmative; I just wanted to give an idea, 
with an example, of what such a rule could look like.

Well, it may appear that in general "ij" is a single diphthong, unless a hyphenation 
candidate falls between the i and the j, i.e. they belong to two different syllables. 
In that case the problem becomes as complex as determining syllable breaks for 
hyphenation.
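
As a rough illustration of such a lexical rule, here is a minimal Python sketch. It 
assumes a hand-maintained exception list of words in which i and j fall in separate 
syllables; the list contents come from the two examples quoted above, and the names 
SPLIT_IJ_WORDS and dutch_clusters are hypothetical, not part of any standard or 
library.

```python
# Hypothetical exception list: words in which i and j belong to separate
# syllables (the two examples from the quoted message).
SPLIT_IJ_WORDS = {"bijou", "bijectie"}


def dutch_clusters(word, split_words=SPLIT_IJ_WORDS):
    """Split a word into clusters, keeping "ij" together unless the
    whole word is listed as an exception."""
    keep_ij = word.lower() not in split_words
    clusters = []
    i = 0
    while i < len(word):
        # Treat i+j as one cluster only when the word is not an exception.
        if keep_ij and word[i:i + 2].lower() == "ij":
            clusters.append(word[i:i + 2])
            i += 2
        else:
            clusters.append(word[i])
            i += 1
    return clusters


print(dutch_clusters("bijen"))
print(dutch_clusters("bijou"))
```

Such a list only covers whole words it knows about. Inflected forms and compounds 
would need either more entries or a real syllable analysis, which is exactly the 
hyphenation-level complexity mentioned above.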

For now there does not seem to be a clear definition of what a good localized 
breaker for grapheme clusters would be, since it also implies an analysis of 
syllables in Dutch or other languages (for now, only abjads and Asian scripts seem 
to have a standardized algorithm for determining such grapheme clusters, and a lot 
of work remains to be done for languages written with alphabets, which seem to use 
letters in a much more complex way than expected).

Still, I'm not convinced that the explicit "ij" diphthong is really different from an 
i + j pair in Dutch, which relies on a lexical approach (so the combined character 
"ij" may be there only for compatibility with some legacy usages, as most renderings 
of Dutch text do not let a reader tell a combined ij cluster from separate i+j 
letters; the distinction does not come from the letters themselves but from the 
reader's lexical knowledge).
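
The compatibility status of the combined character can be checked directly: U+0133 
(LATIN SMALL LIGATURE IJ) has a compatibility decomposition to i + j, so 
compatibility normalization folds it into the plain pair while canonical 
normalization leaves it alone. A small Python sketch using the standard unicodedata 
module (the helper name fold_ij is only for illustration):

```python
import unicodedata

LIGATURE_IJ = "\u0133"  # LATIN SMALL LIGATURE IJ


def fold_ij(text):
    """Apply compatibility normalization (NFKC), which replaces the
    combined ij character by the two letters i + j."""
    return unicodedata.normalize("NFKC", text)


# Canonical normalization keeps the combined character as it is.
print(unicodedata.normalize("NFC", LIGATURE_IJ) == LIGATURE_IJ)
# Compatibility normalization yields the plain two-letter sequence.
print(fold_ij("b" + LIGATURE_IJ + "en") == "bijen")
```

After such folding, nothing in the text itself distinguishes a former combined ij 
from an ordinary i + j pair, which is consistent with the lexical view above.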

The special typographic case of inter-letter spacing for justification is not a 
serious problem (because other typographic rules already require that no excessive 
spacing be used). The exception is artificially expanded text, where the 
typographic effect is used to emphasize a title or a mark. That is very close to 
logographic design, where the form is considered more important than the meaning 
(is such usage still text? Shouldn't it be excluded from Unicode standardization, 
since it necessarily requires markup that is outside the scope of Unicode, and be 
handled as a form of typographic *art*?).

It would be interesting to analyze the way UCA behaves for the collation of Dutch 
text...
