Re: Reading unicode chars..

Ali Çehreli via Digitalmars-d-learn Tue, 02 Sep 2014 10:15:36 -0700

On 09/02/2014 07:06 AM, seany wrote:

How do I read Unicode chars that have code points \u1FFF and higher from
a file?


file.getcw() reads only part of the char, and D identifies this
character as an array of three or four characters.

Importing std.uni does not change the behavior.

Thank you.


One way is to use std.stdio.File just like you would use stdin and stdout:

import std.stdio;

void main()
{
    string fileName = "unicode_test_file";
    doWrite(fileName);
    doRead(fileName);
}

void doWrite(string fileName)
{
    auto file = File(fileName, "w");
    file.writeln("abcçdef");
}

void doRead(string fileName)
{
    auto file = File(fileName, "r");

    foreach (line; file.byLine) {        // (1)
        foreach (dchar c; line) {        // (2)
            writeln(c);
        }

        import std.range;
        foreach (c; line.stride(1)) {    // (3)
            writeln(c);
        }
    }
}

Notes:

1) To avoid a common gotcha, note that 'line' is reused at every iteration here. You must make copies of portions of it if you need to keep them.


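2) dchar is important there: it makes foreach decode the UTF-8 code units of 'line' into whole code points. Without it, c would be a char (a single UTF-8 code unit), which is likely the "array of three or four characters" effect described in the question.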

3) Any algorithm that turns a string into a range exposes decoded dchars. Here, I used stride.
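
To make notes 1 and 2 concrete, here is a minimal sketch. The helper names (countCodeUnits, countCodePoints, readAllLines) are made up for illustration and are not part of Phobos; the file name is the same placeholder as in the program above. It uses U+20AC, which is above \u1FFF and takes three UTF-8 code units.

```d
import std.stdio;

// Hypothetical helper: iterates by char, so there is no decoding and
// each step is one UTF-8 code unit.
size_t countCodeUnits(const(char)[] s)
{
    size_t n = 0;
    foreach (char c; s) {
        ++n;
    }
    return n;
}

// Hypothetical helper: iterates by dchar, so foreach decodes the
// code units and each step is one whole code point.
size_t countCodePoints(const(char)[] s)
{
    size_t n = 0;
    foreach (dchar c; s) {
        ++n;
    }
    return n;
}

// Hypothetical helper: byLine reuses its buffer, so each line is
// copied with .idup before it is stored.
string[] readAllLines(string fileName)
{
    string[] lines;
    foreach (line; File(fileName, "r").byLine) {
        lines ~= line.idup;
    }
    return lines;
}

void main()
{
    string euro = "\u20AC";    // three UTF-8 code units, one code point
    writeln(countCodeUnits(euro), " ", countCodePoints(euro)); // prints "3 1"

    string fileName = "unicode_test_file";
    {
        auto file = File(fileName, "w");
        file.writeln("abc");
        file.writeln("\u20ACdef");
    }   // file is closed when it goes out of scope

    writeln(readAllLines(fileName));
}
```

If the .idup were dropped in readAllLines, the code would not even compile with string[] as the element type, because 'line' is a mutable char[] slice of the reused buffer; storing it as char[] instead would compile but could leave every stored element aliasing the same, later overwritten, memory.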

Ali
