On Tue, 11 Aug 2015 21:27:27 +0200 (CEST) Marcel Schneider <[email protected]> wrote:
> Iʼve tried to just remove the parentheses and let the string. This
> was compiled, but the keyboard test showed that in the keyboard
> driver DLL, UTF-16 strings with SMP characters arenʼt handled as
> such. Each surrogate code unit is considered as a single character
> even when itʼs followed by a trailing one. Only the code unit
> corresponding to the shift state (modification number) is taken, no
> matter if itʼs only a surrogate and the other half comes next.

This is exactly what one should expect. The data is an array of
UTF-16 code units rather than a UTF-16 string. Moreover, it was
probably written as UCS-2. I believe it is the application that has
the job of stitching the surrogate pairs together.

> Is this the reason why a Unicode character cannot be represented
> alternatively as a 32 bit integer on Windows?

They are, from time to time. There's a Windows message that delivers
a supplementary character rather than a UTF-16 code unit, and fairly
obviously they have to be handled as such when performing font
lookups. I've a suspicion that this message hit an interoperability
problem. A program that can handle pairs of surrogates but predates
the message will not work with the more recent message. Therefore
using the message type is deferred until applications can handle it.
Therefore applications don't need to handle it, and don't. Therefore
the message type doesn't get used.

> Being UTF-16, the OS
> could handle a complete surrogates pair in one single 32 bit integer.
> Couldn't this be performed on driver level by modifying a program and
> updating this when the driver is installed?

You are really talking about a parallel set of routines. I suspect
the answer is that Microsoft don't want to work on extending a
primitive keyboarding system when TSF is available.

You want to use dead keys. Why? Is it not that they are the only
mechanism you have experience of? Better systems can be built, in
which one sees what one is doing.
Is it not much better to type 'e' and then a circumflex, and see the
'e' and then the 'e' with a circumflex? Dead keys are an imitation of
a limitation of typewriter technology. If I were typing cuneiform,
I'd much rather type 'bi<COMMIT>' and see the growing sequence 'b',
'bi', '<CUNEIFORM SIGN BI>' as I typed. (What you have for a <COMMIT>
key is your choice.) TSF lets one do this. A simple extension of the
keyboard definition DLLs generated by MSKLC does not. What you should
be pressing for is a usable tutorial on how to do this in TSF.

> If yes, we must modify the interface so that keyboard driver DLLs are
> really read in UTF-16. And/or we must find another compiler.
>
> Must the Windows driver be compiled by a Microsoft compiler?

The compiler is not the issue. The point is that the 16-bit code
exists, and programs that use the 16-bit API exist. Language upgrades
may make supplementary characters easier to use in programs, but that
is all. They don't change existing binary interfaces.

Richard.
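
To illustrate the point above about an array of code units versus a
string, here is a minimal sketch in Python. It is not Windows code and
does not reflect the actual layout of a keyboard DLL: the names `row`
and `stitch`, and the idea of one list cell per shift state, are
assumptions made purely for illustration. It shows why indexing by
shift state yields a lone surrogate, and what the receiving side would
have to do to stitch pairs back together.

```python
HIGH_START, HIGH_END = 0xD800, 0xDBFF   # leading (high) surrogates
LOW_START, LOW_END = 0xDC00, 0xDFFF     # trailing (low) surrogates


def is_high_surrogate(unit):
    return HIGH_START <= unit <= HIGH_END


def is_low_surrogate(unit):
    return LOW_START <= unit <= LOW_END


def combine(high, low):
    """Combine a surrogate pair into one supplementary code point."""
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def stitch(units):
    """Turn a stream of UTF-16 code units into a list of code points.

    Unpaired surrogates are passed through unchanged (a choice made
    for this sketch; a real application might reject or replace them).
    """
    out = []
    pending = None  # a high surrogate waiting for its partner
    for unit in units:
        if pending is not None:
            if is_low_surrogate(unit):
                out.append(combine(pending, unit))
                pending = None
                continue
            out.append(pending)  # no partner arrived
            pending = None
        if is_high_surrogate(unit):
            pending = unit
        else:
            out.append(unit)
    if pending is not None:
        out.append(pending)
    return out


# Hypothetical table row: one 16-bit cell per shift state.
# Writing U+1D11E as a "string" fills two cells, not one.
row = [0xD834, 0xDD1E]
unshifted = row[0]   # only the high surrogate is taken
shifted = row[1]     # the low surrogate lands on the next shift state

# The receiving side can recover the character only if both
# code units reach it in sequence.
code_points = stitch([0x0065, 0xD834, 0xDD1E])
```

Here `code_points` holds U+0065 followed by U+1D11E, whereas a lookup
of `row[0]` alone gives an unpaired 0xD834, which is the behaviour
reported in the keyboard test.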

