On Wed, 30 Mar 2016 23:42:20 +0200, Philippe Verdy wrote:

> Note that the single letter "ĳ" in Dutch is often indistinguishable from "ÿ",
> which is also commonly found as a convenient substitute in many old documents
> not encoded with Unicode but with ISO8859-1: this has a caveat because the
> capitalization would produce "Y" (in ISO8859-1), possibly followed by a
> combining diaeresis (in Unicode-encoded documents) instead of "IJ" (more
> correct but not perfect) or the "Ĳ" letter (best choice).

In Dutch pre-computer text and signage, the uppercase ‘IJ’ was also quite
regularly represented as a ‘Y’.
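
To make the capitalization caveat concrete, here is a minimal Python sketch using only the standard library. The variable names are illustrative; it simply shows what default Unicode case mapping does with "ÿ" and with the single letter U+0133.

```python
import unicodedata

y_diaeresis = "\u00FF"  # ÿ, the Latin-1 stand-in discussed above
ij_letter = "\u0133"    # ĳ, LATIN SMALL LIGATURE IJ

# Default (untailored) Unicode uppercase mappings.
upper_y = y_diaeresis.upper()
upper_ij = ij_letter.upper()

# Canonical decomposition of the uppercased ÿ.
decomposed = unicodedata.normalize("NFD", upper_y)

print([f"U+{ord(c):04X}" for c in upper_y])     # ['U+0178']
print([f"U+{ord(c):04X}" for c in decomposed])  # ['U+0059', 'U+0308']
print([f"U+{ord(c):04X}" for c in upper_ij])    # ['U+0132']
```

So a Dutch word typed with "ÿ" uppercases to Ÿ (or Y plus combining diaeresis once decomposed), never to Ĳ or IJ.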

Sad to say, by excluding three French characters (Ÿ, œ, Œ) and missing four
Finnish ones, Latin-1 was not what could have been called a Western European
charset, even leaving aside the euro sign, which could not have been anticipated.
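
As a quick check of that claim, the sketch below tests which characters Latin-1 can encode. The helper name `encodable` is made up for illustration, and I am assuming the four Finnish characters meant are Š, š, Ž, ž; ISO 8859-15 is included only for comparison.

```python
def encodable(ch, codec):
    """Return True if the single character ch can be encoded with codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

FRENCH = "\u0178\u0153\u0152"         # Ÿ œ Œ
FINNISH = "\u0160\u0161\u017D\u017E"  # Š š Ž ž (assumed set)
EURO = "\u20AC"                       # €

candidates = FRENCH + FINNISH + EURO
missing_latin1 = [c for c in candidates if not encodable(c, "latin-1")]
missing_latin9 = [c for c in candidates if not encodable(c, "iso8859_15")]

print(len(missing_latin1), "of", len(candidates), "missing from Latin-1")
print(len(missing_latin9), "missing from ISO 8859-15")
print("ÿ itself in Latin-1:", encodable("\u00FF", "latin-1"))
```

Note that the lowercase "ÿ" is encodable in Latin-1; it is its uppercase partner that is not.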

> 
> The use of "ÿ" in Dutch should also be considered as an orthographic fault, 
> and it should be corrected into "ij" (to solve the capitalization problem), 
> but there are occurrences in Dutch of "ÿ" which are correct (notably in
> borrowed French toponyms such as "L’Haÿ-les-Roses").
> 
> There may be similar examples in Belgium with French toponyms, but I suspect 
> that those Belgian-French toponyms have their own Dutch "officialized" 
> variant which would be preferable without borrowing the Belgian-French 
> orthography, so that they will not need "ÿ", and they will likely use "ij" 
> instead, meaning that the autocorrection of "ÿ" from possible Belgian-French 
> toponyms into "ij" will also be correct for Dutch-Belgian toponyms ; it may 
> also be correct for French-French toponyms like "L’Haÿ-les-Roses" transformed 
> into "L’Haij-les-Roses" in Belgian-Dutch, or "L’HAIJ-LES-ROSES" if capitalized, 
> if autocorrected this way; it would however be incorrect to replace there the
> "ĳ" (or "Ĳ") letter by the two letters "ij" (or "IJ") without the orthographic
> ligature...
> 
> Out of curiosity, I looked into the Dutch Wikipedia to see how they wrote
> "L’Haÿ-les-Roses" and they don't transform the French "ÿ" into some Dutch "ij"
> (and they don't have any other "officialized" Dutch orthography).
> 
> For this reason, the autocorrection of the "ÿ" letter into the "ĳ" letter in
> Dutch is disabled by default (even if it would be needed to look into old
> documents encoded with ISO8859-1).
> 
> The situation is more complex for the autocorrection of the "ij" digram
> (extremely frequent in old documents encoded with ISO8859-1) into the plain
> "ĳ" letter, which seems to be active in various word processors (but which
> causes problems with borrowed non-Dutch names).

Yet another example of how relying on autocorrection to keep outdated keyboard
layouts in use risks ending in a mess.
> 
> 
> 2016-03-30 23:19 GMT+02:00 Philippe Verdy :
> 
> > In my opinion, the Dutch Ĳ/ĳ "ligature" is not really a ligature and should
> > be treated exactly like Æ/æ or Œ/œ as a plain single letter.

I fully agree that these are all plain letters. Consistently, Unicode originally
encoded them all as such: LATIN CAPITAL LETTER I J, LATIN CAPITAL LETTER A E,
LATIN CAPITAL LETTER O E. The misleading “LIGATURE” names were imposed by ISO
and subsequently partially corrected by Unicode at the request of the national
body chiefly concerned. ‘IJ’ too is considered a letter in Dutch. In French, the
administrative point of view is that ‘Œ’ and ‘OE’ are equivalent, and a
representative of the linguistic authority has agreed to that.
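
For reference, the current character names can be read from Python's `unicodedata` module. This small sketch (the `current_names` dictionary is just an illustrative name) shows that only one of the three capitals carries “LETTER” in its name today.

```python
import unicodedata

# Current Unicode names of the three capitals discussed above.
current_names = {
    f"U+{cp:04X}": unicodedata.name(chr(cp))
    for cp in (0x0132, 0x00C6, 0x0152)
}

for code_point, name in current_names.items():
    print(code_point, name)
# U+0132 LATIN CAPITAL LIGATURE IJ
# U+00C6 LATIN CAPITAL LETTER AE
# U+0152 LATIN CAPITAL LIGATURE OE
```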

The point is that (1) one cannot ask people to use letters that are not on 
their keyboard, (2) one cannot ask software providers to add them in the layout 
driver while they arenʼt printed on keycaps, and (3) one cannot ask 
manufacturers to add them on the keyboard as long as that is not specified by 
any official standard. But all that shall now change.

The same problem (presumably) exists on Dutch keyboards, and here again things
should soon be improved, when the future revised ISO/IEC 9995 includes a compose
key, at least on Right Alt + Space. Such a gateway can be added without altering
the space bar, which is the one key that does not need to be engraved, and
behind it, all characters of the current script can be added without sticking
anything more on the keycaps.

> >
> > The use of IJ/ij (encoded as separate letters) is actually an
> > orthographic fault, that a ligature will not help resolve.

As for the actual meaning of “ligature”, see above, but you are completely right.

> >
> > Thanks, the decomposition of the "Ĳ" letter or "ĳ" into separate letters is
> > only a compatibility decomposition, but it is not canonically equivalent.

That will help improve the cited Wikipedia article. Correcting documentation is 
actually a precondition for users to dare type U+0132/U+0133.
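
The difference between the compatibility and the canonical decomposition can be checked with a few lines of Python. This is only a sketch with illustrative variable names; it uses the standard `unicodedata` module.

```python
import unicodedata

ij_letter = "\u0133"  # ĳ as a single code point

# The raw decomposition field is tagged <compat>, so it is not canonical.
mapping = unicodedata.decomposition(ij_letter)

# Canonical forms (NFC, NFD) leave the letter alone;
# compatibility forms (NFKC, NFKD) split it into i + j.
forms = {
    form: unicodedata.normalize(form, ij_letter)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

print(mapping)  # <compat> 0069 006A
for form, result in forms.items():
    print(form, [f"U+{ord(c):04X}" for c in result])
```

In other words, text typed with U+0132/U+0133 survives NFC and NFD unchanged, but any NFKC or NFKD pass folds it to the two-letter spelling.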

> >
> > In such a case, the "ĳ" letter is soft-dotted also in Dutch and the two
> > dots disappear when it has diacritics above.
> >
> > For Lithuanian, the "ij" letter is not soft-dotted, but effectively 
> > hard-coded (meaning also that it is really a ligature, and that the 
> > single-letter should not be used at all, but encoded as i+j with a possible 
> > joiner...). In such a case, using the single letter "Ĳ/ĳ" meant only for
> > Dutch is also an orthographic fault. But this also means that when you add 
> > diacritics in Lithuanian, you'll need to encode explicit dots (like in 
> > Turkish) to keep these dots !

The oopsie is that in some implementations, you then get two stacked dots plus
the other diacritic…
We can only hope that this is now fixed.
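
For what "encoding explicit dots" means in practice, here is a rough Python sketch. The helper `keep_dot` is hypothetical, not part of any library: it inserts U+0307 COMBINING DOT ABOVE before the first above-mark (combining class 230) that follows a plain i or j, which is the kind of sequence the Lithuanian convention calls for. It only handles the base letters i and j and is not a full locale-aware implementation.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
ABOVE = 230           # canonical combining class "Above"

def keep_dot(text):
    """Insert an explicit dot above on i/j that carry an accent above."""
    chars = unicodedata.normalize("NFD", text)
    out = []
    pos = 0
    while pos < len(chars):
        ch = chars[pos]
        out.append(ch)
        pos += 1
        if ch not in "ij":
            continue
        # Gather the combining marks attached to this base letter.
        end = pos
        while end < len(chars) and unicodedata.combining(chars[end]) != 0:
            end += 1
        marks = chars[pos:end]
        above = [k for k, m in enumerate(marks)
                 if unicodedata.combining(m) == ABOVE]
        # Add the dot only if an above-mark exists and no dot is there yet.
        if above and DOT_ABOVE not in marks:
            k = above[0]
            marks = marks[:k] + DOT_ABOVE + marks[k:]
        out.append(marks)
        pos = end
    return "".join(out)

accented = keep_dot("\u00ED")  # í, precomposed i with acute
print([f"U+{ord(c):04X}" for c in accented])  # ['U+0069', 'U+0307', 'U+0301']
```

Whether the result is drawn as one dot plus the accent, or as the stacked dots mentioned above, depends on the font and rendering engine.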

Marcel
