On Tue, 13 Oct 2015 12:17:43 +0200 Philippe Verdy <[email protected]> wrote:
> 2015-10-13 8:36 GMT+02:00 Richard Wordingham <[email protected]>:
> > For example, a MSKLC keyboard will deliver a supplementary character in
> > two WM_CHAR messages, one for the high surrogate and one for the low
> > surrogate.

> I have not tested the actual behavior in 64-bit versions of Windows :
> is the message field of the WM_CHAR returned by the 64-bit version
> of the API still requires returning two messages and not a single one
> if that field has been extended to 64-bit ?

In Unicode applications, WM_CHAR still delivers one UTF-16 code unit
per message, so a supplementary character still arrives as two
messages. I suspect it delivers just one byte per message in multibyte
'ANSI' encodings. There is a WM_UNICHAR message that delivers whole
Unicode characters, but reportedly Microsoft does not use it.

> The actual behavior is also tricky as the basic layouts built with
> MSKLC will have its character data translated "transparently" to
> other "OEM" encodings according to the current input code page of the
> console (using one of the codepage mapping tables installed
> separately): the transcoder will also need to translate the 16-bit
> Unicode input from WM_CHAR messages into the 8-bit input stream used
> by the console, and this translation will need to read both
> surrogates at once before sending any output.

This only applies to 'ANSI' applications. I am not aware of any ANSI
codepages that contain supplementary characters. For a Unicode
application, no translation from Unicode occurs.

Richard.
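
P.S. To make the two-message delivery concrete, here is a minimal sketch in
Python of how a receiver might reassemble a supplementary character from
UTF-16 code units that arrive one at a time, as they do in successive
WM_CHAR messages. It does not call the Windows API. The class and method
names are hypothetical, and replacing an unpaired surrogate with U+FFFD is
my assumption, not documented Windows behaviour.

```python
HIGH_START, HIGH_END = 0xD800, 0xDBFF   # high (leading) surrogate range
LOW_START, LOW_END = 0xDC00, 0xDFFF     # low (trailing) surrogate range
REPLACEMENT = 0xFFFD                    # assumed policy for unpaired surrogates


class SurrogateCombiner:
    """Reassemble code points from UTF-16 code units fed one at a time."""

    def __init__(self):
        # High surrogate received but not yet matched with a low surrogate.
        self.pending_high = None

    def feed(self, unit):
        """Accept one 16-bit code unit; return the list of completed code points."""
        out = []
        if HIGH_START <= unit <= HIGH_END:
            if self.pending_high is not None:
                # The earlier high surrogate never got its partner.
                out.append(REPLACEMENT)
            self.pending_high = unit
            return out
        if LOW_START <= unit <= LOW_END:
            if self.pending_high is None:
                # Low surrogate with nothing before it.
                return [REPLACEMENT]
            high, self.pending_high = self.pending_high, None
            # Standard UTF-16 decoding of a surrogate pair.
            return [0x10000 + ((high - 0xD800) << 10) + (unit - 0xDC00)]
        if self.pending_high is not None:
            # A BMP character interrupted an incomplete pair.
            out.append(REPLACEMENT)
            self.pending_high = None
        out.append(unit)
        return out
```

Feeding 0xD83D returns an empty list, and feeding 0xDE00 next returns
[0x1F600]. That is the "read both surrogates before sending any output"
step that a transcoder would need.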

