Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-22 Thread zjh via Digitalmars-d-learn

On Wednesday, 22 March 2023 at 15:23:42 UTC, Kagamin wrote:

https://dlang.org/phobos/std_stdio.html#rawWrite



It's really amazing, it succeeded. Thank you!
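```d
import std.file : read;
import std.stdio : stdout;

auto b = "test.txt"; // GBK-encoded file
void[] d = read(b);  // raw bytes, no decoding
stdout.rawWrite(d);  // bytes go to the console unchanged
```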
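
For anyone landing here later, a minimal sketch of why `stdout.write` showed "only bytes" while `rawWrite` worked: `write` formats a `ubyte[]` as a list of numbers, whereas `rawWrite` passes the bytes through unchanged. The four bytes below are the GBK encoding of "中文". The helper name `asFormatted` is made up for illustration, and the raw output is only readable if the console itself uses a GBK code page.

```d
import std.format : format;
import std.stdio : stdout;

// Hypothetical helper: the text that stdout.write(data) would print.
string asFormatted(const(ubyte)[] data)
{
    return format("%s", data);
}

void main()
{
    // GBK encoding of "中文" (normally these bytes come from a file).
    ubyte[] gbk = [0xD6, 0xD0, 0xCE, 0xC4];

    stdout.writeln(asFormatted(gbk)); // [214, 208, 206, 196]
    stdout.rawWrite(gbk);             // the four bytes, untouched
    stdout.writeln();
}
```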



Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-22 Thread Kagamin via Digitalmars-d-learn

https://dlang.org/phobos/std_stdio.html#rawWrite


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-14 Thread zjh via Digitalmars-d-learn

On Tuesday, 14 March 2023 at 09:20:54 UTC, Kagamin wrote:

I guess if your console is in gbk encoding, you can just write 
bytes with stdout.write.



Thank you for your reply, but that only displays bytes, not GBK text.


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-14 Thread Kagamin via Digitalmars-d-learn

On Monday, 13 March 2023 at 00:32:07 UTC, zjh wrote:
Thank you for your reply, but is there any way to output `gbk` 
code to the console?


I guess if your console is in gbk encoding, you can just write 
bytes with stdout.write.


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-13 Thread zjh via Digitalmars-d-learn
On Monday, 13 March 2023 at 15:50:37 UTC, Steven Schveighoffer 
wrote:


What is required is an addition to the `std.encoding` module, 
to allow such an encoding.




Thank you for your information.


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-13 Thread Steven Schveighoffer via Digitalmars-d-learn

On 3/12/23 8:32 PM, zjh wrote:

On Sunday, 12 March 2023 at 20:03:23 UTC, 0xEAB wrote:

...


Thank you for your reply, but is there any way to output `gbk` code to 
the console?




What is required is an addition to the `std.encoding` module, to allow 
such an encoding.


Supporting an encoding simply means translating it to and from another 
one (e.g. UTF to GBK). If you look at `std.encoding` you can get an idea 
of what it might require.


It will take some effort and especially some help from a knowledgeable 
user (such as yourself).


-Steve


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-12 Thread zjh via Digitalmars-d-learn

On Sunday, 12 March 2023 at 20:03:23 UTC, 0xEAB wrote:

...


Thank you for your reply, but is there any way to output `gbk` 
code to the console?




Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-12 Thread 0xEAB via Digitalmars-d-learn

On Sunday, 12 March 2023 at 00:54:53 UTC, zjh wrote:

On Saturday, 11 March 2023 at 19:56:09 UTC, 0xEAB wrote:

If you desire to use other encodings, how about using ubyte + 
ubyte[]?



There is no example.


To read binary data from a file and dump it into another, you do:

```d
import std.file : read, write;

void[] data = read("infile.txt");
write("outfile.txt", data);
```

To write binary data to a file:

```d
import std.file : write;

ubyte[] data = [0xA0, 0x0A, 0x30, 0x01, 0xFF, 0x00, 0xFE];
write("myfile.txt", data);
```

`data` could contain GBK encoded text, for example. (Just don’t 
use `"Unicode literals"`.)
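
To make the remark about literals concrete, here is a small sketch (the file names are placeholders). A D string literal is always stored as UTF-8, so writing `"中文"` to a file produces UTF-8 bytes; a GBK file needs the GBK byte values spelled out or obtained from a converter.

```d
import std.file : write;
import std.string : representation;

void main()
{
    // A string literal is UTF-8: "中文" is the six bytes E4 B8 AD E6 96 87.
    immutable(ubyte)[] utf8 = "中文".representation;

    // The same two characters in GBK are the four bytes D6 D0 CE C4.
    ubyte[] gbk = [0xD6, 0xD0, 0xCE, 0xC4];

    // File names are placeholders.
    write("utf8.txt", utf8);
    write("gbk.txt", gbk);
}
```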




Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-11 Thread zjh via Digitalmars-d-learn

On Saturday, 11 March 2023 at 19:56:09 UTC, 0xEAB wrote:

If you desire to use other encodings, how about using ubyte + 
ubyte[]?



There is no example. An example should be added in an obvious 
position.
I tried for a long time, but couldn't output `gbk`, and I finally 
gave up.




Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-11 Thread 0xEAB via Digitalmars-d-learn

On Friday, 10 March 2023 at 07:16:32 UTC, zjh wrote:

`D language` is too unfriendly for Chinese users!
You can't even write `gbk` files.


D’s char + string types are Unicode.
To quote the tour, “In D, *all* strings are Unicode strings”.

If you desire to use other encodings, how about using ubyte + 
ubyte[]?
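
A short sketch of why `ubyte[]` is the safer container: GBK bytes are generally not well-formed UTF-8, so treating them as `char[]` or `string` breaks anything that decodes. The helper `isValidUtf8` is a hypothetical name; it only wraps `std.utf.validate`.

```d
import std.utf : UTFException, validate;

// Hypothetical helper: true if the bytes form well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] data)
{
    try
    {
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    import std.stdio : writeln;

    ubyte[] gbk = [0xD6, 0xD0, 0xCE, 0xC4];  // "中文" in GBK
    auto utf8 = cast(const(ubyte)[]) "中文";  // same text in UTF-8

    writeln(isValidUtf8(gbk));  // false
    writeln(isValidUtf8(utf8)); // true
}
```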


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-09 Thread zjh via Digitalmars-d-learn

On Friday, 10 March 2023 at 06:19:38 UTC, zjh wrote:


`D language` is too unfriendly for Chinese users!
You can't even write `gbk` files.


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-09 Thread zjh via Digitalmars-d-learn

On Friday, 10 March 2023 at 02:48:43 UTC, John Xu wrote:


```d
module chinese;
import std.stdio : writeln;
import std.conv;
import std.windows.charset; // Windows only

int main(string[] argv)
{
    auto s1 = "中文"; // UTF-8 string
    writeln("word:" ~ s1); // garbled
    writeln("word:" ~ to!string(toMBSz(text(s1)))); // displays correctly after conversion
    writeln("Hello D-World!");
    return 0;
}
```


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-09 Thread John Xu via Digitalmars-d-learn
I found this: 
https://github.com/meatatt/exCode/blob/master/source/excode/package.d


There is mention of unicode/GBK conversion, maybe it could be 
helpful


Thanks for the quick answers. I have now found that I can read both 
UTF-8 and UTF-16LE Chinese files:

string txt = std.file.read(chineseFile).to!string;

and write to a UTF-8 file:

std.file.write(utf8ChineseFile, txt);

But I still need to figure out how to read/write GBK directly.



Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-06 Thread ryuukk_ via Digitalmars-d-learn

On Tuesday, 7 March 2023 at 01:45:27 UTC, John Xu wrote:
I'm new to dlang. I didn't find many tutorials on the internet 
about how to read/write Chinese easily. std.encoding doesn't 
seem to support GBK or GB18030:


"Encodings currently supported are UTF-8, UTF-16, UTF-32, 
ASCII, ISO-8859-1 (also known as LATIN-1), ISO-8859-2 
(LATIN-2), WINDOWS-1250, WINDOWS-1251 and WINDOWS-1252."


Then what is the best way to read GBK/GB18030 contents? Or even 
GBK/GB18030 file names?


I found this: 
https://github.com/meatatt/exCode/blob/master/source/excode/package.d


There is mention of unicode/GBK conversion, maybe it could be 
helpful


Re: Best way to read/write Chinese (GBK/GB18030) files?

2023-03-06 Thread Steven Schveighoffer via Digitalmars-d-learn

On 3/6/23 8:45 PM, John Xu wrote:
I'm new to dlang. I didn't find many tutorials on the internet about how 
to read/write Chinese easily. std.encoding doesn't seem to support GBK or 
GB18030:


"Encodings currently supported are UTF-8, UTF-16, UTF-32, ASCII, 
ISO-8859-1 (also known as LATIN-1), ISO-8859-2 (LATIN-2), WINDOWS-1250, 
WINDOWS-1251 and WINDOWS-1252."


It appears that encoding is not supported.

There is a scant mention of it, in the BOM detection. But I don't think 
there's any mechanism to encode/decode it.




Then what is the best way to read GBK/GB18030 contents? Or even 
GBK/GB18030 file names?





D has direct bindings to C, so possibly using a C library. I don't see 
anything jumping out at me from code.dlang.org
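
As a rough sketch of the C-library route, assuming a POSIX system whose C library provides `iconv` and knows the name "GBK" (true for glibc; other platforms may need to link `-liconv`), the three functions can be declared by hand and wrapped. The helper `transcode` and its parameter names are hypothetical, and error handling is kept minimal.

```d
import std.exception : enforce;
import std.string : toStringz;

// POSIX iconv, declared by hand; on glibc these live in libc itself.
extern (C) nothrow @nogc
{
    void* iconv_open(const(char)* tocode, const(char)* fromcode);
    size_t iconv(void* cd, char** inbuf, size_t* inleft,
                 char** outbuf, size_t* outleft);
    int iconv_close(void* cd);
}

/// Hypothetical helper: converts bytes from one named encoding to another.
ubyte[] transcode(const(ubyte)[] input, string fromEnc, string toEnc)
{
    auto cd = iconv_open(toEnc.toStringz, fromEnc.toStringz);
    enforce(cd != cast(void*) size_t.max, "iconv does not know this conversion");
    scope (exit) iconv_close(cd);

    // Generous bound for GBK/GB18030 <-> UTF-8; not a general guarantee.
    auto output = new ubyte[](input.length * 4 + 4);

    auto src = cast(char*) input.ptr; // iconv's prototype is not const-correct
    auto dst = cast(char*) output.ptr;
    size_t srcLeft = input.length;
    size_t dstLeft = output.length;

    auto rc = iconv(cd, &src, &srcLeft, &dst, &dstLeft);
    enforce(rc != size_t.max, "input is not valid " ~ fromEnc);

    return output[0 .. $ - dstLeft];
}

void main()
{
    import std.stdio : writeln;

    ubyte[] gbk = [0xD6, 0xD0, 0xCE, 0xC4]; // "中文" in GBK
    auto utf8 = cast(string) transcode(gbk, "GBK", "UTF-8");
    writeln(utf8);
}
```

Reading a GBK file would then be something like `transcode(cast(ubyte[]) std.file.read(name), "GBK", "UTF-8")`, and writing is the same call with the encoding names swapped.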


-Steve


Best way to read/write Chinese (GBK/GB18030) files?

2023-03-06 Thread John Xu via Digitalmars-d-learn
I'm new to dlang. I didn't find many tutorials on the internet about 
how to read/write Chinese easily. std.encoding doesn't seem to 
support GBK or GB18030:


"Encodings currently supported are UTF-8, UTF-16, UTF-32, ASCII, 
ISO-8859-1 (also known as LATIN-1), ISO-8859-2 (LATIN-2), 
WINDOWS-1250, WINDOWS-1251 and WINDOWS-1252."


Then what is the best way to read GBK/GB18030 contents? Or even 
GBK/GB18030 file names?