The current std.utf is a bit messy and lacks attributes, so I wrote a patch.
This patch passes Phobos's unittests.

Changes:

* Remove UtfError

UtfError has been deprecated since Phobos 0.140 (according to the revision log on dsource).
I think removing UtfError should cause no problems.

* Add @safe, @trusted, pure and nothrow attributes

I think Unicode operations should be @safe and pure, but some of the functions they depend on are not.
So, some functions are only @trusted and are not pure.
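
For illustration, here is a minimal sketch of why an unannotated dependency forces @trusted instead of @safe. All names are hypothetical and are not taken from the patch.

```d
// Hypothetical stand-in for a dependency that carries no attributes.
// A non-template function like this is @system and impure by default.
size_t legacyLength(const(char)[] s)
{
    return s.length;
}

// @trusted lets the author vouch for memory safety by hand, so @safe
// callers can use it. There is no equivalent escape hatch for purity
// here, so this wrapper cannot be marked pure.
size_t wrappedLength(const(char)[] s) @trusted
{
    return legacyLength(s);
}

// A function with no unannotated dependencies can carry the full set.
bool isContinuationByte(char c) @safe pure nothrow
{
    return (c & 0xC0) == 0x80;
}

// @safe code may call @safe and @trusted functions, but not @system ones.
size_t safeCaller(const(char)[] s) @safe
{
    return wrappedLength(s);
}
```

Once the dependency itself is annotated, the wrapper could presumably be tightened to @safe pure as well.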

* char version of stride

I removed the assert because the comment says "0xFF meaning s[i] is not the start of of UTF-8 sequence.", so 0xFF is a documented return value and not an error.
Until now, my library has been checking for 0xFF :(
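
A rough sketch of the behaviour described by that comment, assuming a simple bit-test implementation. The function name is hypothetical and this is not the actual std.utf.stride code.

```d
// Hypothetical sketch: number of bytes in the UTF-8 sequence starting at
// s[i], or 0xFF if s[i] cannot start a sequence. No assert is involved,
// so the caller decides how to handle the 0xFF case.
uint strideSketch(const(char)[] s, size_t i) @safe pure nothrow
{
    immutable uint b = s[i];
    if (b < 0x80)           return 1; // ASCII
    if ((b & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((b & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((b & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0xFF;                      // continuation byte or invalid lead byte
}
```

With this shape, a caller that tests the result against 0xFF actually reaches that test instead of tripping an assert first.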

* validate

Added a template constraint.
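
As an illustration only, a constrained validate could look like the sketch below. It assumes the constraint is isSomeString from std.traits; the function name is hypothetical.

```d
import std.traits : isSomeString;
import std.utf : decode;

// Hypothetical sketch: accept only string types, so a call with an
// unrelated type fails at the template constraint, not inside the body.
void validateSketch(S)(in S str)
    if (isSomeString!S)
{
    for (size_t i = 0; i < str.length; )
        decode(str, i); // advances i and throws on malformed input
}
```

With the constraint in place, something like validateSketch(42) is rejected at compile time.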

* toUTF* functions

Unified the argument types using 'in'.
The current implementation mixes "in char[]" and "const(char)[]".

Removed some overloads that take string, wstring and dstring.
The bodies of these functions only call validate. Are they needed?

* count supports dchar

I wrote the following code in my library:

static if (is(Char == dchar))
    immutable num = text.length;
else
    immutable num = text.count();

Why doesn't count support dchar?

In addition, why does count depend on walkLength?
count's call graph is:

std.utf.count -> std.range.walkLength -> std.array.empty, front, popFront -> std.utf.stride

This seems weird. I think it would be better if count itself calculated the total number of code points and walkLength depended on count. The patch doesn't include this proposal.
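
A minimal sketch of a count that computes the number of code points itself and also handles dchar, assuming the input is already valid UTF. The name countSketch is hypothetical and this code is not part of the patch.

```d
import std.traits : Unqual;

// Hypothetical sketch: count code points without going through
// walkLength / front / popFront / stride. Assumes well-formed input.
size_t countSketch(Char)(const(Char)[] text) @safe pure nothrow
{
    static if (is(Unqual!Char == dchar))
    {
        // UTF-32: one code unit per code point.
        return text.length;
    }
    else
    {
        size_t n = 0;
        foreach (c; text) // iterates code units, no decoding
        {
            static if (is(Unqual!Char == char))
            {
                // UTF-8: skip continuation bytes (10xxxxxx).
                if ((c & 0xC0) != 0x80) ++n;
            }
            else
            {
                // UTF-16: skip low surrogates (0xDC00 to 0xDFFF).
                if ((c & 0xFC00) != 0xDC00) ++n;
            }
        }
        return n;
    }
}
```

This version never decodes a code point, so it does no validation; malformed input would simply give a meaningless count.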

What do you think?


Masahiro

Attachment: utf.patch

_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos