Re: Reading binary streams with decoding to Unicode

Nicholas Wilson via Digitalmars-d-learn Mon, 15 Oct 2018 13:00:36 -0700

On Monday, 15 October 2018 at 18:57:19 UTC, Vinay Sajip wrote:

On Monday, 15 October 2018 at 17:55:34 UTC, Dukc wrote:
This is done automatically for character arrays, which includes strings. wchar arrays will iterate by UTF-16, and dchar arrays by UTF-32. If you have a byte/ubyte array you know to be Unicode-encoded, convert it to char[] to iterate by code points.
Thanks for the response. I was looking for something where I don't have to manage buffers myself (e.g. when handling buffered file or socket I/O). It's really easy to find this functionality in e.g. Python, C#, Go, Kotlin, Java etc. but I'm surprised there doesn't seem to be a ready-to-go equivalent in D. For example, I can find D examples of opening files and reading a line at a time, but no examples of opening a file and reading Unicode chars one at a time. Perhaps I've just missed them?


import std.file : readText;
import std.uni : byCodePoint, byGrapheme;

// or import std.utf : byCodeUnit, byChar /* UTF-8 */, byWchar /* UTF-16 */, byDchar /* UTF-32 */, byUTF /* byUTF!C, target character type C chosen by the caller */;

void main()
{
    // readText loads the whole file into memory and validates it as UTF
    string a = readText("foo");

    foreach (cp; a.byCodePoint)
    {
        // do stuff with code point 'cp'
    }

    foreach (g; a.byGrapheme)
    {
        // do stuff with grapheme 'g'
    }
}
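
Note that readText reads the entire file before any decoding happens. If the point is to decode lazily from buffered I/O without managing the buffer yourself, something along these lines may be closer to what you want. This is only a rough sketch: decodeChunks is a name I made up for illustration, the 4096-byte chunk size is arbitrary, "foo" is a placeholder path, and it assumes the input is UTF-8. As far as I know byDchar substitutes U+FFFD for invalid sequences rather than throwing.

```d
import std.algorithm.iteration : joiner, map;
import std.stdio : File, writeln;
import std.utf : byDchar;

// Hypothetical helper (not part of Phobos): lazily turns a range of
// ubyte[] chunks into a range of dchar code points, assuming UTF-8 input.
auto decodeChunks(Chunks)(Chunks chunks)
{
    return chunks
        .joiner                      // flatten the chunks into a range of ubyte
        .map!(b => cast(char) b)     // reinterpret each byte as a UTF-8 code unit
        .byDchar;                    // decode code units to code points on demand
}

void main()
{
    // byChunk reuses one internal buffer, so only one chunk is held at a time.
    // "foo" is a placeholder; the file has to exist for this to run.
    foreach (dchar cp; decodeChunks(File("foo").byChunk(4096)))
    {
        writeln(cp); // do stuff with code point 'cp'
    }
}
```

Because the chunks are joined before decoding, a multi-byte sequence that straddles a chunk boundary should still decode as a single code point.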
