byCodePoint for a range of chars
Given a range with element type char, what's the best way of iterating over it by code-point, without filling an array first? Related to this: What's the status of std.utf and std.encoding? The comments in std.encoding say that some functions supersede their std.utf counterparts.
Re: byCodePoint for a range of chars
On Tue, 20 May 2014 17:59:07 +, John Colvin wrote:
> Given a range with element type char, what's the best way of iterating over it by code-point, without filling an array first?
>
> Related to this: What's the status of std.utf and std.encoding? The comments in std.encoding say that some functions supersede their std.utf counterparts.

Foreach on narrow strings automatically decodes, so it's as simple as:

    // assume UTF-8 encoded
    char[] myData = ...
    foreach (dchar codePoint; myData)
        ...
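To make the decoding behaviour concrete, here is a minimal sketch in D. The helper names `countCodeUnits` and `countCodePoints` and the sample text are made up for illustration; the only language feature relied on is that foreach over a char array decodes UTF-8 when the loop variable is declared as dchar.

```d
import std.stdio : writeln;

// Hypothetical helper: loop variable is char, so foreach visits raw code units.
size_t countCodeUnits(const(char)[] data)
{
    size_t n = 0;
    foreach (char c; data)
        ++n;
    return n;
}

// Hypothetical helper: loop variable is dchar, so foreach decodes UTF-8.
size_t countCodePoints(const(char)[] data)
{
    size_t n = 0;
    foreach (dchar codePoint; data)
        ++n;
    return n;
}

void main()
{
    // 'é' is encoded as two UTF-8 code units, the other letters as one each.
    char[] myData = "héllo".dup;
    writeln(countCodeUnits(myData), " code units, ",
            countCodePoints(myData), " code points");
}
```

For this sample the two counts differ (6 code units, 5 code points) because only the dchar loop groups the two bytes of 'é' into one element.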
Re: byCodePoint for a range of chars
On Tuesday, 20 May 2014 at 17:59:09 UTC, John Colvin wrote:
> Given a range with element type char, what's the best way of iterating over it by code-point, without filling an array first?
>
> Related to this: What's the status of std.utf and std.encoding? The comments in std.encoding say that some functions supersede their std.utf counterparts.

FYI, Walter just wrote byDchar, which does what you want:
https://github.com/D-Programming-Language/phobos/pull/2043

It's about to be merged.
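As a rough sketch of how this could be used, assuming a Phobos release that ships `std.utf.byDchar` and `std.utf.byCodeUnit` (neither was in a release at the time of this thread): the helper `decodesTo` and the sample text are invented for the example, and `byCodeUnit` is only used here to obtain a range whose element type is char.

```d
import std.algorithm : equal;
import std.utf : byCodeUnit, byDchar;

// Hypothetical helper: lazily decodes the code units of `text` and compares
// the resulting code points with `expected`. No intermediate array is built.
bool decodesTo(string text, const(dchar)[] expected)
{
    auto codeUnits = text.byCodeUnit;          // range with element type char
    return codeUnits.byDchar.equal(expected);  // decoded one code point at a time
}

void main()
{
    assert(decodesTo("héllo", "héllo"d));
}
```

The same `.byDchar` call should apply to any input range of char, not only to one derived from a string.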
Re: byCodePoint for a range of chars
On Tuesday, 20 May 2014 at 18:06:09 UTC, Justin Whear wrote:
> Foreach on narrow strings automatically decodes, so it's as simple as:
>
>     // assume UTF-8 encoded
>     char[] myData = ...
>     foreach (dchar codePoint; myData)
>         ...

I think the point of his question is what happens when you have an actual non-array range of chars, in which case foreach does NOT decode. It simply iterates over the code units.
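A small sketch of that difference, using a made-up `CharSource` struct as a stand-in for any non-array range of chars (the struct, `countSteps`, and the sample text are all hypothetical):

```d
// Hypothetical minimal input range of char, standing in for a non-array source.
struct CharSource
{
    const(char)[] data;

    @property bool empty() const { return data.length == 0; }
    @property char front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Counts foreach iterations. Even with a dchar loop variable, each char is
// only widened to dchar; multi-byte sequences are not combined.
size_t countSteps(CharSource source)
{
    size_t n = 0;
    foreach (dchar c; source)
        ++n;
    return n;
}

void main()
{
    // 5 code points, but 6 UTF-8 code units because of 'é'.
    assert(countSteps(CharSource("héllo")) == 6);
}
```

The loop runs once per code unit, so the two bytes of 'é' arrive as two separate, meaningless dchar values rather than one code point.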
Re: byCodePoint for a range of chars
On Tuesday, 20 May 2014 at 19:58:17 UTC, monarch_dodra wrote:
> On Tuesday, 20 May 2014 at 17:59:09 UTC, John Colvin wrote:
>> Given a range with element type char, what's the best way of iterating over it by code-point, without filling an array first?
>>
>> Related to this: What's the status of std.utf and std.encoding? The comments in std.encoding say that some functions supersede their std.utf counterparts.
>
> FYI, Walter just wrote byDchar, which does what you want:
> https://github.com/D-Programming-Language/phobos/pull/2043
>
> It's about to be merged.

Sweet! I was vaguely aware of this but didn't know what the progress was. Thanks :)

In the meantime I hand-rolled my own, but I'm sure Walter's is infinitely better, so I'll swap over ASAP.