Re: Thin UTF8 string wrapper

Joseph Rushton Wakeling via Digitalmars-d-learn Sat, 07 Dec 2019 09:35:49 -0800

On Saturday, 7 December 2019 at 15:57:14 UTC, Jonathan M Davis wrote:

There may have been some tweaks to std.encoding here and there, but for the most part, it's pretty ancient. Looking at the history, it's Seb who marked some of it as being a replacement for std.utf, which is just plain wrong.

Ouch! I must say it was a surprise to read, precisely because std.encoding seemed weird and clunky. Good to know that it's misleading.

Unfortunately that adds to the list I have of weirdly misleading docs that seem to have crept in over the last months/years :-(

std.utf.validate does need a replacement, but doing so gets pretty complicated. And looking at std.encoding.isValid, I'm not sure that what it does is any better than simply wrapping std.utf.validate and returning a bool based on whether an exception was thrown.
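
For illustration, the wrapper described above might look like the following minimal sketch. The name `isValidViaValidate` is made up here and is not a Phobos function; only `std.utf.validate` and `std.utf.UTFException` are taken from the library.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper (not in Phobos): reports validity as a bool
/// by swallowing the exception that std.utf.validate throws.
bool isValidViaValidate(const(char)[] s)
{
    try
    {
        validate(s);   // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;  // the exception object has still been allocated
    }
}
```

Note that this only hides the exception from the caller: on invalid input one is still thrown and caught internally, so it does not help where throwing or allocation must be avoided.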

Unfortunately I'm dealing with a use case where exception throwing (and indeed, anything that generates garbage) is best avoided. That's why I was looking for a function that returned a bool ;-)
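
To make that requirement concrete: a check that neither throws nor allocates has to walk the bytes itself. The following is only a rough sketch of the idea, assuming UTF-8 input as `const(char)[]`. It is not the implementation from any Phobos PR, and `isWellFormedUTF8` is a hypothetical name.

```d
/// Hypothetical helper (not in Phobos): true if `s` is well-formed UTF-8.
/// Never throws an exception and never touches the GC.
bool isWellFormedUTF8(const(char)[] s) @safe pure nothrow @nogc
{
    size_t i = 0;
    while (i < s.length)
    {
        immutable uint b0 = s[i];
        size_t extra;  // continuation bytes expected after the lead byte
        uint cp;       // code point being assembled
        uint minCp;    // smallest code point allowed for this sequence length

        if (b0 < 0x80) { ++i; continue; }  // ASCII
        else if ((b0 & 0xE0) == 0xC0) { extra = 1; cp = b0 & 0x1F; minCp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; minCp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; minCp = 0x1_0000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (s.length - i <= extra) return false;  // sequence is truncated

        foreach (k; 1 .. extra + 1)
        {
            immutable uint b = s[i + k];
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < minCp) return false;                    // overlong encoding
        if (cp > 0x10_FFFF) return false;                // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        i += extra + 1;
    }
    return true;
}
```

Because the function is marked `nothrow @nogc`, the compiler itself rejects any accidental throw or GC allocation in the body. Like std.utf.validate, it stops at the first malformed sequence instead of scanning the rest of the string.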

Depending on the string, it would actually be faster to use validate, because std.encoding.isValid iterates through the entire string regardless. The way it checks validity is also completely different from what std.utf does. Either way, some of the std.encoding internals do seem to be an alternate implementation of what std.utf has, but outside of std.encoding itself, std.utf is what Phobos uses for UTF-8, UTF-16, and UTF-32, not std.encoding.


Thanks -- good to know.

I did do a PR at one point to add isValidUTF to std.utf so that we could replace std.utf.validate, but Andrei didn't like the implementation, so it didn't get merged, and I haven't gotten around to figuring out how to implement it more cleanly.

Thanks for the attempt, at least! While I get the reasons it was rejected, it feels a bit of a shame -- surely it's easier to do a more major under-the-hood rewrite with the public API (and tests) already in place ... :-\
