On Mon, 16 Nov 2009 12:36:30 +0300, Walter Bright <[email protected]> wrote:

Denis Koroskin wrote:
I'd like to raise 2 issues for a discussion.
First, Phobos calls different functions depending on the OS we are running on (e.g. CreateFileA vs. CreateFileW), and I wonder if that's *really* necessary, since Microsoft provides a Unicode layer for the Windows 9x family. All an application needs to do to call the W API on those OSes is link with unicows.lib (which could be a part of Phobos). It does nothing on Win2k+ and only kicks in on the 9x OS family.
A very good overview of it is written here:
http://msdn.microsoft.com/en-us/goglobal/bb688166.aspx

unicows doesn't do anything more than what Phobos already does in attempting to translate Unicode into the local code page. All that using unicows would do is cause confusion and installation problems, as the user would have to get a copy of unicows and install it. unicows doesn't exist on default Windows 9x installations.

There is simply no advantage to unicows.



End users don't have to worry about it at all. They will just use W functions all the time, and unicows will kick in and translate UTF-16 strings into ANSI strings automatically on those operating systems. The change would be transparent for them. There is also a redistributable version of unicows, so those who want to deploy their software on Win9x could ship it and not force a manual install of the .dll.

I was about to propose dropping Win9x support initially, but thought it might get a hostile reception...
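To make the "W functions all the time" idea concrete, here is a minimal sketch. It assumes a current druntime (the `core.sys.windows.windows` bindings) and `std.utf.toUTF16z`. The helper names `widePath` and `openForRead` and the file name are hypothetical, chosen only for illustration. It is not how Phobos is actually written.

```d
import std.utf : toUTF16z;

// Hypothetical helper: turn a D (UTF-8) string into a NUL-terminated
// UTF-16 string suitable for a W function.
const(wchar)* widePath(string path)
{
    return path.toUTF16z;
}

version (Windows)
{
    import core.sys.windows.windows;

    // Hypothetical wrapper: always calls CreateFileW, with no runtime
    // check of the OS version. On Win9x the call would only work if the
    // program were linked against unicows.lib, as proposed above.
    HANDLE openForRead(string path)
    {
        return CreateFileW(widePath(path), GENERIC_READ, FILE_SHARE_READ,
                           null, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, null);
    }
}

void main()
{
    version (Windows)
    {
        // "example.txt" is a placeholder file name.
        HANDLE h = openForRead("example.txt");
        if (h != INVALID_HANDLE_VALUE)
            CloseHandle(h);
    }
}
```

The point of the sketch is that the calling code contains a single code path; any ANSI fallback would live in the linked layer rather than in the library.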

Second, the "A" API accepts ANSI strings as parameters, not UTF-8 strings. I think this should be reflected in the function signatures, since D encourages distinguishing between UTF-8 and ANSI strings and not storing the latter as char[]. LPCSTR currently resolves to char*/const(char)*, but it would be better for it to be an alias to ubyte*/const(ubyte)*, so that a user couldn't pass a Unicode string to an API that doesn't expect one. The same applies to other APIs, too. For example, how does the C stdlib co-operate with Unicode? I.e. is core.stdc.stdio.fopen() Unicode-aware?

Calling C functions means one needs to pass them what the host C system expects. C itself doesn't define what character set char* is. If you use the Phobos functions, those are required to work with Unicode.

Since char*/char[] denotes a sequence of Unicode characters in D, I see no reason for an API that works with ANSI characters to accept it. For example, there is a std.windows.charset.toMBSz() function that returns an ANSI variant of a Unicode string. I think it would be preferable for it to return a ubyte sequence instead of a char sequence.

Ideally, I'd like to see all the functions that aren't guaranteed to work with UTF-8 strings accept ubyte*/ubyte[] instead.
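A small, portable sketch of what that typing would buy. Both `ansiLength` (standing in for an "A" function) and `toAnsiBytes` (standing in for a toMBSz-like converter) are hypothetical names made up for this example. The converter only handles ASCII and is not a real code-page conversion.

```d
import std.exception : enforce;
import std.string : representation;

// Hypothetical stand-in for an "A" API whose parameter is typed as
// bytes instead of const(char)*.
size_t ansiLength(const(ubyte)* s)
{
    size_t n = 0;
    while (s[n] != 0)
        ++n;
    return n;
}

// Hypothetical converter with the proposed return type. It only accepts
// ASCII; a real one would convert to the local code page.
immutable(ubyte)[] toAnsiBytes(string s)
{
    foreach (char c; s)
        enforce(c < 0x80, "non-ASCII input needs a real code-page conversion");
    return (s ~ '\0').representation;
}

void main()
{
    string utf8 = "hello";

    // A UTF-8 string pointer does not implicitly convert to a byte
    // pointer, so passing it directly is rejected at compile time.
    static assert(!__traits(compiles, ansiLength(utf8.ptr)));

    // The caller has to go through an explicit conversion first.
    auto ansi = toAnsiBytes(utf8);
    assert(ansiLength(ansi.ptr) == 5);
}
```

With LPCSTR typed this way, forgetting the conversion becomes a compile error rather than a silent mis-encoding at run time.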
