Andrej Mitrovic Wrote:

> On Wed, Jul 28, 2010 at 12:34 AM, Sean Kelly wrote:
>
> > Sean Kelly Wrote:
> >
> > > I think it's Windows integration that's the problem, on OSX I get:
> > >
> > > [H][a][l][l][?][?][,][ ][V][?][?][r][l][d][!]
> > > [H][a][l][l][å][,][ ][V][ä][r][l][d][!]
> > >
> > > which is essentially correct. The only difference between this and doing
> > > the same thing in C and using printf() in place of write() is that both
> > > lines display correctly in C. I think printf() must be detecting partial
> > > UTF-8 characters and buffering until the complete chunk has arrived.
> > > Interestingly, the C output can't even be broken by badly timed calls to
> > > fflush(), so the buffering is happening at a fairly high level. I'd be
> > > interested in seeing the same thing in write() at some point.
> >
> > Ah, write() already works that way. It was the brackets that were screwing
> > things up.
>
> You are right about printf(), I'm getting the correct output with this code:
>
> import std.stdio, std.stream;
>
> void main() {
>     string str = "Hall\u00E5, V\u00E4rld!";
>     foreach (dchar c; str) {
>         printf("%c", c);
>     }
>     writeln();
> }
>
> Hallå, Värld!
>
> Should I file this as a Windows bug for DMD?
Yes. I looked into this briefly, and after a bit of googling it looks like fwide() isn't implemented on Windows (unless Walter has done this himself in the DMC libraries). See here:

http://blogs.msdn.com/b/michkap/archive/2009/06/23/9797156.aspx

If I change std.stdio.LockingTextWriter.put(C)(C c) to always use the version(Windows) code for a 32-bit argument, it *almost* works correctly. Instead of garbage, the Unicode characters come out as a lowercase o with an accent above (U+01A1, I believe) and an uppercase sigma (U+01A9). I'll have to spend some more time later trying to figure out why it's these characters and not the intended ones. I wouldn't think that endian issues should be relevant, but that's the only thing I've come up with so far.
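
For anyone who wants to reproduce the bracketed output quoted above, here is a minimal sketch of what such a test might look like. The exact program wasn't posted, so the loop bodies are an assumption, and utf8Units is a hypothetical helper added only to show how many code units each character occupies. The first loop iterates by char, so the brackets land between the two UTF-8 code units of each non-ASCII character; the second iterates by dchar, so each character is decoded before it is written.

```d
import std.stdio;
import std.utf : codeLength;

// Hypothetical helper: number of UTF-8 code units needed for one code point.
size_t utf8Units(dchar c)
{
    return codeLength!char(c);
}

void main()
{
    string str = "Hall\u00E5, V\u00E4rld!";

    // By char: every UTF-8 code unit gets its own pair of brackets,
    // which splits the two-unit sequences for å and ä.
    foreach (char c; str)
        write('[', c, ']');
    writeln();

    // By dchar: each code point is decoded first, then written whole.
    foreach (dchar c; str)
        write('[', c, ']');
    writeln();

    // Print how many code units each non-ASCII character occupies.
    foreach (dchar c; str)
        if (c > 0x7F)
            writefln("U+%04X takes %s UTF-8 code units", cast(uint) c, utf8Units(c));
}
```

On a terminal that expects UTF-8, the first line should show the split sequences as unprintable characters and the second should show the characters intact, which matches the OSX result quoted above. What a Windows console displays will depend on its code page.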
