http://d.puremagic.com/issues/show_bug.cgi?id=10668
--- Comment #4 from Matt Carter <[email protected]> 2013-07-19 08:24:57 PDT ---
(In reply to comment #2)
> Well... what did you think it was going to print? You have a UTF-8 sequence.
> char c = s[0]; will extract the first code *unit* of your string. You want
> the first code *point*.
>
> http://www.fileformat.info/info/unicode/char/a3/index.htm
> E.g.: £ is the code point U+00A3.
> In UTF-8 it is represented by the sequence: [0xC2, 0xA3]
>
> When you write "char c = s[0];", you are extracting the first code unit,
> which is 0xC2. When you pass this to writeln, what happens will mostly
> depend on your locale/codepage. If it is set to UTF-8 (CP65001 on Windows),
> then it will print the "unknown character", since you passed an incomplete
> sequence.
>
> The correct code you want is:
> dchar c = s.front;
>
> (remember to import std.array to get front).
>
> Another alternative is to simply work from the ground up with dstrings.
>
> module main;
>
> import std.stdio;
>
> void main(string[] args) {
>     dstring s = "���";
>     writeln(s); // Output: ���
>
>     dchar c = s[0];
>     writeln(c); // Output: �
>
>     writeln(s[0]); // Output: �
> }
>
> Do you have access to "The D Programming Language"? It has the best
> introduction to Unicode/UTF I've read.

Thanks for the response! Yeah, after posting I converted my project to use dstrings on the off chance it would work and, lo and behold, that seems to be the fix.

I plan on eventually getting the book, although I've read some bad reviews of the e-book/Kindle version, so I'm waiting a little longer to get a hard copy.
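
For anyone landing on this issue later, here is a minimal sketch of the code unit versus code point distinction discussed above. The helper names firstCodeUnit and firstCodePoint and the sample string "£100" are made up for illustration and are not from the original report. The import assumes a recent Phobos, where front is defined in std.range.primitives; the comment above refers to std.array, which is where it was found in 2013.

```d
import std.range.primitives : front; // assumption: recent Phobos layout
import std.stdio : writefln;

/// Hypothetical helper: the first UTF-8 code unit, i.e. what s[0] yields.
ubyte firstCodeUnit(string s)
{
    return cast(ubyte) s[0];
}

/// Hypothetical helper: the first code point, decoded from the leading
/// UTF-8 code units by the range primitive front.
dchar firstCodePoint(string s)
{
    return s.front;
}

void main()
{
    // "£100": the pound sign (U+00A3) occupies two UTF-8 code units.
    string s = "\u00A3100";

    writefln("code unit:  0x%02X", firstCodeUnit(s));             // 0xC2
    writefln("code point: U+%04X", cast(uint) firstCodePoint(s)); // U+00A3

    // length counts code units, so a string reports 5 here ...
    assert(s.length == 5);

    // ... while a dstring stores one code point per element.
    dstring d = "\u00A3100"d;
    assert(d.length == 4);
    assert(d[0] == '\u00A3');
}
```

Switching the whole project to dstring, as described in the reply, makes indexing behave per code point at the cost of four bytes per element; keeping string and decoding with front is the other option mentioned in the quoted comment.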
