I need to manage Unicode text, but in many cases I have a lot of 7-bit or 8-bit ASCII text to process. This has led to this discussion, and for some time now, thanks to Jonathan Davis, we have had an efficient translate() again:

http://d.puremagic.com/issues/show_bug.cgi?id=7515


The s2 array generated by this code is a dchar[] (if array() becomes pure, you will probably be able to declare s2 as a dstring):

string s = "test string"; // UTF-8, but also 7-bit ASCII
dchar[] s2 = map!(x => x)(s).array(); // Uses the Id function
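The dchar[] result comes from the way Phobos range primitives treat narrow strings: they decode them, so the element type seen by map is dchar. A minimal sketch that checks this at compile time (the helper name viaMap is only an illustration, not something from Phobos):

```d
import std.algorithm : map;
import std.array : array;
import std.range : ElementType;

// Hypothetical helper: the same identity map as above, wrapped in a function.
dchar[] viaMap(string s)
{
    return map!(x => x)(s).array();
}

void main()
{
    // As a range, a string has dchar elements, not char elements.
    static assert(is(ElementType!string == dchar));

    string s = "test string";
    auto s2 = viaMap(s);
    static assert(is(typeof(s2) == dchar[]));

    // The lengths match only because every code point here is ASCII.
    assert(s2.length == s.length);
}
```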

To produce a char[] (or string, using assumeUnique), you are free to use a cast:

auto s3 = map!(x => cast(char)x)(s).array();
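The cast is lossless only if every code point is below 0x80. One way to make that assumption explicit is to check it before casting. This is just a sketch; toAsciiChars is a hypothetical helper, not a Phobos function:

```d
import std.algorithm : all, map;
import std.array : array;
import std.exception : enforce;

// Hypothetical helper: verifies the input first, then narrows dchar to char.
char[] toAsciiChars(string s)
{
    // all() sees decoded code points, so any non-ASCII character is >= 0x80.
    enforce(s.all!(c => c < 0x80), "input is not 7-bit ASCII");
    return s.map!(c => cast(char) c).array();
}

void main()
{
    char[] s3 = toAsciiChars("test string");
    assert(s3 == "test string");

    // Non-ASCII input is rejected instead of being silently truncated.
    bool rejected = false;
    try
        toAsciiChars("t\u00E9st");
    catch (Exception e)
        rejected = true;
    assert(rejected);
}
```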

But D casts are unsafe, and one thing I'm learning from Haskell is how important it is to give types to your code to prevent bugs. So maybe an AsciiString wrapper range (a subtype of string) can be invented for Phobos. Its constructor verifies that the input is 7-bit ASCII and its "front" method yields chars, so map.array() gives a char[]:

astring a1 = "test string"; // enforced 7-bit ASCII
char[] s4 = map!(x => x)(a1).array();

This makes some algorithms working on ASCII text cleaner and safer, avoiding the need for casts.
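To make the idea more concrete, here is a rough sketch of what such a wrapper could look like. Everything in it is an assumption made for illustration: the struct name, its layout, and the use of enforce for validation are not an existing Phobos API, and the subtyping of string is left out to keep the sketch short.

```d
import std.algorithm : map;
import std.array : array;
import std.exception : enforce;
import std.range : ElementType, isInputRange;

// Hypothetical type illustrating the proposal; not part of Phobos.
struct AsciiString
{
    private string data;

    this(string s)
    {
        // Iterate over code units: ASCII text has no byte >= 0x80.
        foreach (char c; s)
            enforce(c < 0x80, "input is not 7-bit ASCII");
        data = s;
    }

    // Range primitives that yield char instead of decoding to dchar.
    @property bool empty() const { return data.length == 0; }
    @property char front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

void main()
{
    static assert(isInputRange!AsciiString);
    static assert(is(ElementType!AsciiString == char));

    auto a1 = AsciiString("test string"); // enforced 7-bit ASCII
    char[] s4 = map!(x => x)(a1).array(); // no cast needed
    assert(s4 == "test string");
}
```

Because front returns char, map and array never see a dchar, so the result is a char[] with no cast in user code.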

Is creating something like this possible, and would it be welcome in Phobos?

Bye,
bearophile
