Re: Best way to read/write Chinese (GBK/GB18030) files?
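On Wednesday, 22 March 2023 at 15:23:42 UTC, Kagamin wrote:
> https://dlang.org/phobos/std_stdio.html#rawWrite

It's really amazing, it succeeded. Thank you!

```d
import std.file : read;
import std.stdio : stdout;

auto b = "test.txt"; // GBK-encoded file
void[] d = read(b);  // read the raw bytes without decoding
stdout.rawWrite(d);  // send the bytes to the console unchanged
```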
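For later readers, here is a minimal sketch of why `rawWrite` works where the formatted `write` did not. The byte values are the GBK encoding of "中文". The `SetConsoleOutputCP(936)` call is an assumption about a Windows console that is not already set to GBK; it is not something the posters mentioned.

```d
import std.format : format;
import std.stdio : stdout;

void main()
{
    // "中文" in GBK: 0xD6 0xD0 encodes 中, 0xCE 0xC4 encodes 文.
    immutable ubyte[] gbk = [0xD6, 0xD0, 0xCE, 0xC4];

    // Formatted output (write, writeln, format) renders a ubyte[] as a
    // list of numbers, which matches the "only displays bytes" symptom.
    assert(format("%s", gbk) == "[214, 208, 206, 196]");

    version (Windows)
    {
        // Assumption: ask the console to interpret output as code page 936 (GBK).
        import core.sys.windows.windows : SetConsoleOutputCP;
        SetConsoleOutputCP(936);
    }

    // rawWrite passes the bytes through without formatting or transcoding.
    stdout.rawWrite(gbk);
    stdout.rawWrite("\n");
}
```

Whether the characters then display correctly depends on the console's encoding and font, not on D.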
Re: Best way to read/write Chinese (GBK/GB18030) files?
https://dlang.org/phobos/std_stdio.html#rawWrite
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Tuesday, 14 March 2023 at 09:20:54 UTC, Kagamin wrote:
> I guess if your console is in gbk encoding, you can just write bytes with stdout.write.

Thank you for your reply, but that only displays the byte values, not the `gbk` text.
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Monday, 13 March 2023 at 00:32:07 UTC, zjh wrote:
> Thank you for your reply, but is there any way to output `gbk` code to the console?

I guess if your console is in gbk encoding, you can just write bytes with stdout.write.
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Monday, 13 March 2023 at 15:50:37 UTC, Steven Schveighoffer wrote:
> What is required is an addition to the `std.encoding` module, to allow such an encoding.

Thank you for your information.
Re: Best way to read/write Chinese (GBK/GB18030) files?
On 3/12/23 8:32 PM, zjh wrote:
> On Sunday, 12 March 2023 at 20:03:23 UTC, 0xEAB wrote:
>> ...
>
> Thank you for your reply, but is there any way to output `gbk` code to the console?

What is required is an addition to the `std.encoding` module, to allow such an encoding. Supporting an encoding simply means translating text between it (e.g. GBK) and another encoding (e.g. UTF). If you look at `std.encoding` you can get an idea of what it might require. It will take some effort, and especially some help from a knowledgeable user (such as yourself).

-Steve
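To give an idea of what such an addition would have to provide, here is a small sketch of `std.encoding.transcode` with an encoding that Phobos already supports (Latin-1). The helper name `latin1ToUtf8` is made up for illustration. A GBK addition would need an equivalent string type and conversion tables, which do not exist in Phobos.

```d
import std.encoding : Latin1String, transcode;

// Hypothetical helper: reinterpret raw bytes as Latin-1 and convert to UTF-8.
string latin1ToUtf8(const(ubyte)[] raw)
{
    string utf8;
    transcode(cast(Latin1String) raw.idup, utf8);
    return utf8;
}

void main()
{
    // "café" in Latin-1; 0xE9 is é.
    assert(latin1ToUtf8([0x63, 0x61, 0x66, 0xE9]) == "café");
}
```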
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Sunday, 12 March 2023 at 20:03:23 UTC, 0xEAB wrote:
> ...

Thank you for your reply, but is there any way to output `gbk` code to the console?
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Sunday, 12 March 2023 at 00:54:53 UTC, zjh wrote:
> On Saturday, 11 March 2023 at 19:56:09 UTC, 0xEAB wrote:
>> If you desire to use other encodings, how about using ubyte + ubyte[]?
>
> There is no example.

To read binary data from a file and dump it into another, you do:

```d
import std.file : read, write;

void[] data = read("infile.txt");
write("outfile.txt", data);
```

To write binary data to a file:

```d
import std.file : write;

ubyte[] data = [0xA0, 0x0A, 0x30, 0x01, 0xFF, 0x00, 0xFE];
write("myfile.txt", data);
```

`data` could contain GBK encoded text, for example. (Just don’t use `"Unicode literals"`.)
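As a concrete illustration of the warning about literals, the sketch below compares the bytes of the D string literal "中文" (always UTF-8) with the GBK bytes for the same text, then round-trips the GBK bytes through a file. The file name `gbk_demo.txt` is a placeholder.

```d
import std.file : read, remove, write;

void main()
{
    // A D string literal is UTF-8: "中文" occupies six bytes.
    assert(cast(const(ubyte)[]) "中文" == [0xE4, 0xB8, 0xAD, 0xE6, 0x96, 0x87]);

    // The same text in GBK occupies four bytes, spelled out by hand.
    immutable ubyte[] gbk = [0xD6, 0xD0, 0xCE, 0xC4];

    enum path = "gbk_demo.txt"; // placeholder file name
    write(path, gbk);           // stored byte for byte, no transcoding
    scope (exit) remove(path);

    auto back = cast(ubyte[]) read(path);
    assert(back == gbk);
}
```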
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Saturday, 11 March 2023 at 19:56:09 UTC, 0xEAB wrote:
> If you desire to use other encodings, how about using ubyte + ubyte[]?

There is no example. An example should be added in an obvious position. I tried for a long time, but couldn't output `gbk`, and I finally gave up.
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Friday, 10 March 2023 at 07:16:32 UTC, zjh wrote:
> `D language` is too unfriendly for Chinese users! You can't even write `gbk` files.

D’s char + string types are Unicode. To quote the tour, “In D, *all* strings are Unicode strings”. If you desire to use other encodings, how about using ubyte + ubyte[]?
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Friday, 10 March 2023 at 06:19:38 UTC, zjh wrote: `D language` is too unfriendly for Chinese users! You can't even write `gbk` files.
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Friday, 10 March 2023 at 02:48:43 UTC, John Xu wrote:

```d
module chinese;

import std.conv;
import std.stdio : writeln;
import std.windows.charset;

int main(string[] argv)
{
    auto s1 = "中文"; // UTF-8 string
    writeln("word:" ~ s1); // garbled
    writeln("word:" ~ to!string(toMBSz(text(s1)))); // displays correctly after conversion
    writeln("Hello D-World!");
    return 0;
}
```
Re: Best way to read/write Chinese (GBK/GB18030) files?
> I found this: https://github.com/meatatt/exCode/blob/master/source/excode/package.d There is mention of unicode/GBK conversion, maybe it could be helpful

Thanks for the quick answers. I have now found that I can read both UTF-8 and UTF-16LE Chinese files with `string txt = std.file.read(chineseFile).to!string;` and write to a UTF-8 file with `std.file.write(utf8ChineseFile, txt);`. But I still need to figure out how to read/write GBK directly.
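On Windows, one possible route for reading and writing GBK directly is `std.windows.charset`, the module used in the `toMBSz` example elsewhere in this thread. The sketch below assumes code page 936 (GBK) is installed; the helper names `gbkToUtf8` and `utf8ToGbk` are made up for illustration and are not part of Phobos.

```d
version (Windows)
{
    import core.stdc.string : strlen;
    import std.windows.charset : fromMBSz, toMBSz;

    // Hypothetical helper: decode GBK bytes (code page 936) into a UTF-8 string.
    string gbkToUtf8(const(ubyte)[] gbk)
    {
        // fromMBSz expects a zero-terminated buffer.
        immutable(ubyte)[] z = (gbk ~ cast(ubyte) 0).idup;
        return fromMBSz(cast(immutable(char)*) z.ptr, 936);
    }

    // Hypothetical helper: encode a UTF-8 string as GBK bytes.
    immutable(ubyte)[] utf8ToGbk(string utf8)
    {
        const(char)* z = toMBSz(utf8, 936);
        return cast(immutable(ubyte)[]) z[0 .. strlen(z)].idup;
    }
}

void main()
{
    version (Windows)
    {
        immutable ubyte[] gbk = [0xD6, 0xD0, 0xCE, 0xC4]; // "中文" in GBK
        assert(gbkToUtf8(gbk) == "中文");
        assert(utf8ToGbk("中文") == gbk);
    }
}
```

Combined with `std.file.read` and `std.file.write`, this would let a program load a GBK file, work on it as an ordinary `string`, and save it back as GBK. On other platforms the code compiles to an empty `main`.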
Re: Best way to read/write Chinese (GBK/GB18030) files?
On Tuesday, 7 March 2023 at 01:45:27 UTC, John Xu wrote:
> I'm new to dlang. I didn't find much tutorials on internet about how to read/write Chinese easily. std.encoding doesn't seem to support GBK or GB18030: "Encodings currently supported are UTF-8, UTF-16, UTF-32, ASCII, ISO-8859-1 (also known as LATIN-1), ISO-8859-2 (LATIN-2), WINDOWS-1250, WINDOWS-1251 and WINDOWS-1252."
>
> Then what is best way to read GBK/GB18030 contents ? Even GBK/GB18030 file names ?

I found this: https://github.com/meatatt/exCode/blob/master/source/excode/package.d There is mention of unicode/GBK conversion, maybe it could be helpful
Re: Best way to read/write Chinese (GBK/GB18030) files?
On 3/6/23 8:45 PM, John Xu wrote:
> I'm new to dlang. I didn't find much tutorials on internet about how to read/write Chinese easily. std.encoding doesn't seem to support GBK or GB18030: "Encodings currently supported are UTF-8, UTF-16, UTF-32, ASCII, ISO-8859-1 (also known as LATIN-1), ISO-8859-2 (LATIN-2), WINDOWS-1250, WINDOWS-1251 and WINDOWS-1252."

It appears that encoding is not supported. There is a scant mention of it, in the BOM detection. But I don't think there's any mechanism to encode/decode it.

> Then what is best way to read GBK/GB18030 contents ? Even GBK/GB18030 file names ?

D has direct bindings to C, so possibly using a C library. I don't see anything jumping out at me from code.dlang.org

-Steve
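One C library that fits the suggestion above is `iconv`. The following is a rough sketch for POSIX systems, assuming the platform's `iconv` knows the name "GB18030" (glibc and GNU libiconv do) and that the library is linked (on some systems this needs `-L-liconv`). The prototypes are declared by hand, and `gbkToUtf8` is a made-up helper name.

```d
import std.exception : enforce;

// Hand-written bindings to the C iconv API (an assumption about the platform).
extern (C) nothrow @nogc
{
    alias iconv_t = void*;
    iconv_t iconv_open(const(char)* tocode, const(char)* fromcode);
    size_t iconv(iconv_t cd, char** inbuf, size_t* inleft,
                 char** outbuf, size_t* outleft);
    int iconv_close(iconv_t cd);
}

// Hypothetical helper: decode GBK/GB18030 bytes into a UTF-8 string.
string gbkToUtf8(const(ubyte)[] gbk)
{
    auto cd = iconv_open("UTF-8", "GB18030");
    enforce(cd != cast(iconv_t) -1, "iconv_open failed");
    scope (exit) iconv_close(cd);

    auto src = gbk.dup;                    // iconv wants mutable pointers
    auto output = new char[gbk.length * 4]; // generous upper bound
    char* inPtr = cast(char*) src.ptr;
    size_t inLeft = src.length;
    char* outPtr = output.ptr;
    size_t outLeft = output.length;

    auto rc = iconv(cd, &inPtr, &inLeft, &outPtr, &outLeft);
    enforce(rc != cast(size_t) -1, "invalid GBK/GB18030 input");
    return output[0 .. output.length - outLeft].idup;
}

void main()
{
    // 0xD6 0xD0 0xCE 0xC4 is "中文" in GBK.
    assert(gbkToUtf8([0xD6, 0xD0, 0xCE, 0xC4]) == "中文");
}
```

Swapping the two names passed to `iconv_open` gives the reverse direction, from UTF-8 to GB18030.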
Best way to read/write Chinese (GBK/GB18030) files?
I'm new to dlang. I didn't find many tutorials on the internet about how to read/write Chinese easily. std.encoding doesn't seem to support GBK or GB18030: "Encodings currently supported are UTF-8, UTF-16, UTF-32, ASCII, ISO-8859-1 (also known as LATIN-1), ISO-8859-2 (LATIN-2), WINDOWS-1250, WINDOWS-1251 and WINDOWS-1252."

Then what is the best way to read GBK/GB18030 contents? Or even GBK/GB18030 file names?