Re: Checking function parameters in Phobos

Lars T. Kyllingstad Wed, 20 Nov 2013 02:52:20 -0800

On Wednesday, 20 November 2013 at 00:01:00 UTC, Andrei Alexandrescu wrote:

(c) A variety of text functions currently suffer because we don't make the difference between validated UTF strings and potentially invalid ones.

I think it is fair to always assume that a char[] is a valid UTF-8 string, and instead perform the validation when creating/filling the string from a non-validated source.

Take std.file.read() as an example; it returns void[], but has a validating counterpart in std.file.readText().
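A minimal sketch of that difference, assuming a scratch file can be written to the current directory. The helper name loadsAsText and the file name are made up for illustration; only read(), readText(), write() and remove() are actual std.file functions.

```d
import std.file : read, readText, remove, write;
import std.utf : UTFException;

// Hypothetical helper: true if readText() accepts the file as UTF-8.
bool loadsAsText(string path)
{
    try
    {
        readText(path);   // validates while loading
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

unittest
{
    enum path = "read_vs_readtext.tmp";   // scratch file, name is arbitrary
    ubyte[] bytes = [0x61, 0xFF, 0x62];   // 0xFF never occurs in valid UTF-8
    write(path, bytes);
    scope(exit) remove(path);

    // read() returns the bytes untouched, typed as void[].
    auto raw = cast(ubyte[]) read(path);
    assert(raw == bytes);

    // readText() refuses the same file.
    assert(!loadsAsText(path));
}
```

Note the cast needed before the result of read() can be compared or indexed.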

I think we should use ubyte[] to a greater extent for data which is potentially *not* valid UTF. Examples include interfacing with C functions, where I think there is a tendency towards always translating C char to D char, when they are in fact not equivalent. Another example is, again, std.file.read(), which currently returns void[]. I guess it is a matter of taste, but I think ubyte[] would be more appropriate here, since you can actually use it for something without casting it first.

The transition from string to ubyte[] is already made simple by std.string.representation. We should offer an equally simple and convenient way to do the opposite transformation. In one of my current projects, I am using this function:


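  // Reinterprets raw bytes as UTF-8 text, preserving the qualifier
  // (mutable, const or immutable) of the input.
  inout(char)[] asString(inout(ubyte)[] data) @safe pure
  {
    auto s = cast(typeof(return)) data;
    import std.utf: validate;
    validate(s);  // throws UTFException if data is not valid UTF-8
    return s;
  }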

This could easily be written as a template, to accept wider encodings as well, and I think it would be a nice addition to Phobos.
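One way such a template might look is sketched below. This is only a rough illustration and not a proposed Phobos API: the helper template CharOf is hypothetical, and @safe/pure are left to template attribute inference rather than spelled out.

```d
import std.utf : validate;

// Hypothetical helper: maps an unsigned code-unit type to the
// character type of the same width.
template CharOf(T)
{
    static if (is(T == ubyte))       alias CharOf = char;
    else static if (is(T == ushort)) alias CharOf = wchar;
    else static if (is(T == uint))   alias CharOf = dchar;
    else static assert(false, "no UTF code unit type for " ~ T.stringof);
}

// Generalised asString: reinterprets the code units, then validates them.
inout(CharOf!T)[] asString(T)(inout(T)[] data)
{
    auto s = cast(typeof(return)) data;
    validate(s);   // throws std.utf.UTFException on malformed input
    return s;
}

unittest
{
    import std.exception : assertThrown;
    import std.string : representation;
    import std.utf : UTFException;

    // Round trip: string -> immutable(ubyte)[] -> string.
    immutable(ubyte)[] bytes = "héllo".representation;
    string s = asString(bytes);
    assert(s == "héllo");

    // UTF-16 code units are accepted as well.
    ushort[] units = [0x0068, 0x0069];
    wchar[] w = asString(units);
    assert(w == "hi"w);

    // A lone 0xFF byte is rejected.
    ubyte[] bad = [0xFF];
    assertThrown!UTFException(asString(bad));
}
```

Because the parameter is inout, an immutable(ubyte)[] comes back as a string and a mutable ubyte[] as a char[], mirroring what representation does in the other direction.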


Lars
