On Fri, Feb 23, 2018 at 3:25 AM, Michael Van Canneyt via Lazarus
<lazarus@lists.lazarus-ide.org> wrote:
>
>
> On Fri, 23 Feb 2018, R0b0t1 via Lazarus wrote:
>
>> On Fri, Feb 23, 2018 at 2:29 AM, Ondrej Pokorny via Lazarus
>> <lazarus@lists.lazarus-ide.org> wrote:
>>>
>>> OK, you mean it's a wiki page issue. I just didn't understand how we
>>> could solve it in Lazarus :)
>>>
>>
>> I am interested in this thread because I was under the impression that
>> UTF-8 support in Windows is fundamentally broken and should not be
>> used (it interferes with the C libraries).
>>
>> Not to take the thread offtopic, but can anyone comment on this in
>> practice?
>
>
> Where did you get that from?
>
> You can perfectly well use UTF8 in FPC code, but when calling a Windows API,
> you should:
> a) Convert UTF8 to UTF16 (or WideString). If you use the correct types,
>    the compiler will do it for you most of the time.
> b) Use the *W variant of a Windows system call.
>

The combination of those is the largest part of why
http://utf8everywhere.org/ and many independent developers recommend
avoiding Windows' implementation of UTF-8. It doesn't end up doing you
any good, because you typically cannot set the whole system to use
UTF-8 (because it is broken).
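
As a rough sketch of what the conversion in (a) amounts to: keep text as
UTF-8 internally and re-encode it only at the API boundary. Python is used
here purely for brevity. `to_wide` is a made-up helper name, not an FPC or
Windows function, and the trailing NUL is an assumption about what a *W call
that takes a C-style wide string would expect.

```python
def to_wide(text_utf8: bytes) -> bytes:
    """Re-encode UTF-8 bytes as NUL-terminated UTF-16LE.

    Hypothetical helper: illustrates the boundary conversion only,
    it does not call any Windows API.
    """
    text = text_utf8.decode("utf-8")           # UTF-8 bytes -> code points
    return text.encode("utf-16-le") + b"\x00\x00"  # code points -> UTF-16LE + NUL


wide = to_wide("é".encode("utf-8"))
# "é" is two bytes in UTF-8 but one 16-bit unit in UTF-16LE,
# followed here by the two-byte terminator.
```

In FPC the compiler normally inserts this kind of conversion itself when the
string types are declared correctly, as Michael notes above.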

The brokenness (described at
https://social.msdn.microsoft.com/Forums/vstudio/en-US/e4b91f49-6f60-4ffe-887a-e18e39250905/possible-bugs-in-writefile-and-crt-unicode-issues?forum=vcgeneral)
is due to the UTF-8 codepage causing Windows to report multibyte
characters as a single character, and stdio assuming one byte per
character.
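
To make the mismatch concrete, here is a small simulation. It does not call
Windows or the C runtime. `reported_as_chars` is a hypothetical stand-in for
a write call that reports characters instead of bytes, and
`looks_like_short_write` mimics a caller that expects a byte count back.

```python
def reported_as_chars(buf: bytes) -> int:
    """Hypothetical stand-in: report how many characters were written."""
    return len(buf.decode("utf-8"))


def looks_like_short_write(buf: bytes) -> bool:
    """A caller expecting a byte count sees a smaller number than it sent."""
    return reported_as_chars(buf) < len(buf)


ascii_buf = b"abc"                                   # 3 characters, 3 bytes
cjk_buf = "\u65e5\u672c\u8a9e".encode("utf-8")       # 3 characters, 9 bytes

ascii_short = looks_like_short_write(ascii_buf)      # counts agree
cjk_short = looks_like_short_write(cjk_buf)          # counts disagree
```

For pure ASCII the two counts agree, so the problem stays hidden. As soon as
a multibyte character appears, the caller thinks the write was incomplete.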

The Chinese/Japanese mappings apparently had these problems as well,
but a workaround was added.

Cheers,
     R0b0t1
-- 
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
https://lists.lazarus-ide.org/listinfo/lazarus
