On Monday, 15 October 2018 at 18:57:19 UTC, Vinay Sajip wrote:
On Monday, 15 October 2018 at 17:55:34 UTC, Dukc wrote:
This is done automatically for character arrays, which includes strings. wchar arrays will iterate by UTF-16, and dchar arrays by UTF-32. If you have a byte/ubyte array you know to be Unicode-encoded, convert it to char[] to iterate by code points.

Thanks for the response. I was looking for something where I don't have to manage buffers myself (e.g. when handling buffered file or socket I/O). It's really easy to find this functionality in e.g. Python, C#, Go, Kotlin, Java etc. but I'm surprised there doesn't seem to be a ready-to-go equivalent in D. For example, I can find D examples of opening files and reading a line at a time, but no examples of opening a file and reading Unicode chars one at a time. Perhaps I've just missed them?

import std.file : readText;
import std.uni : byCodePoint, byGrapheme;
// Alternatively, std.utf offers byCodeUnit, byChar (UTF-8), byWchar (UTF-16),
// byDchar (UTF-32) and byUTF!T (T = char, wchar or dchar).

void main()
{
    // readText loads the whole file into memory and validates it as UTF-8
    string a = readText("foo");

    foreach (cp; a.byCodePoint)
    {
        // do stuff with code point 'cp' (a dchar)
    }

    foreach (g; a.byGrapheme)
    {
        // do stuff with grapheme 'g' (a std.uni.Grapheme)
    }
}
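
If loading the whole file with readText is the concern, a lazy pipeline can decode as it reads. What follows is only a sketch, assuming the file is UTF-8; the file name "foo" and the 4096-byte chunk size are placeholders, not anything prescribed by Phobos. As far as I know, byDchar substitutes U+FFFD for invalid sequences rather than throwing.

```d
import std.algorithm.iteration : joiner, map;
import std.stdio : File, writeln;
import std.utf : byDchar;

void main()
{
    // Assumption: "foo" is a UTF-8 encoded file (placeholder name).
    auto codePoints = File("foo")
        .byChunk(4096)               // ubyte[] chunks; File manages the buffer
        .joiner                      // flatten the chunks into a range of ubyte
        .map!(b => cast(char) b)     // treat each byte as a UTF-8 code unit
        .byDchar;                    // lazily decode code units into dchar

    size_t count;
    foreach (dchar cp; codePoints)
    {
        // do stuff with code point 'cp'
        ++count;
    }
    writeln(count, " code points read");
}
```

Because the decoding sits on top of the flattened byte range, a multi-byte sequence that straddles a chunk boundary should still decode correctly, and no buffer handling appears in user code.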
