On Friday, February 07, 2014 23:01:46 Meta wrote:
> On Friday, 7 February 2014 at 22:57:26 UTC, Jonathan M Davis wrote:
> > On Friday, February 07, 2014 20:43:38 Dmitry Olshansky wrote:
> >> 07-Feb-2014 20:29, Andrej Mitrovic wrote:
> >> > On Friday, 7 February 2014 at 16:27:35 UTC, Andrei Alexandrescu wrote:
> >> >> Add a bugzilla and let's define isValid that returns bool!
> >> >
> >> > Add std.utf.decode() to that as well. IOW, it should have an
> >> > overload which returns a status code
> >>
> >> Much simpler - it returns a special dchar to designate bad
> >> encoding. And there is one defined by Unicode spec.
> >
> > Isn't that actually worse? Unless you're suggesting that we stop
> > throwing on decode errors, then functions like std.array.front will
> > have to check the result on every call to see whether it was valid
> > or not and thus whether they should throw, which would mean extra
> > overhead over simply having decode throw on decode errors. validate
> > has no business throwing, and we definitely should add
> > isValidUnicode (or isValid or whatever you want to call it) for
> > validation purposes. Code can then call that to validate that a
> > string is valid and not worry about any UTFExceptions being thrown
> > as long as it doesn't manipulate the string in a way that could
> > result in its Unicode becoming invalid.
> >
> > However, I would argue that assuming that everyone is going to
> > validate their strings and that pretty much all string-related
> > functions shouldn't ever have to worry about invalid Unicode is
> > just begging for subtle bugs all over the place IMHO. You're
> > essentially dealing with error codes at that point, and I think
> > that experience has shown quite clearly that error codes are
> > generally a bad way to go. Almost no one checks them unless they
> > have to. I think that having decode throw on invalid Unicode is
> > exactly what it should be doing. The problem is that validate
> > shouldn't.
> >
> > - Jonathan M Davis
>
> You could always return an Option!char. Nullable won't work
> because it lets you access the naked underlying value.

How is that any better than returning an invalid dchar with a specific value? In either case, you have to check the value. With the exception, code doesn't have to care. If the string is invalid, it'll get a UTFException, and it can handle it appropriately. Having to check the return value just adds overhead (albeit minimal) and is error-prone, because it generally won't be checked (and if it is checked, it complicates the calling code, because it has to do the check).

Code that doesn't want to risk a UTFException being thrown can validate up front - and that validator function should return bool and _not_ throw. But having decode not throw is going to be error-prone. It also doesn't help performance-wise, because it still has to do all of the same validity checks as it decodes. It's just that instead of throwing, it returns an error value.

I really think that having decode throw on invalid Unicode is the right decision, and I don't see what we gain by making it not throw.

- Jonathan M Davis
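
To make the two approaches argued for above concrete, here is a minimal sketch in D. It assumes only what the thread describes: std.utf.validate and std.utf.decode throw UTFException on invalid input. The names isValidUtf and countCodePoints are hypothetical helpers invented for this illustration; they are not the eventual Phobos API.

```d
import std.stdio : writeln;
import std.utf : decode, validate, UTFException;

// Hypothetical helper (name is an assumption): a bool-returning validator
// built on top of the throwing std.utf.validate.
bool isValidUtf(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

// Hypothetical helper: walks the string with decode and does no checking
// of its own. Invalid input surfaces as a UTFException from decode.
size_t countCodePoints(const(char)[] s)
{
    size_t index = 0;
    size_t count = 0;
    while (index < s.length)
    {
        decode(s, index); // advances index past one code point
        ++count;
    }
    return count;
}

void main()
{
    // 'a' followed by a 2-byte lead byte with no continuation byte.
    immutable(ubyte)[] raw = [0x61, 0xC3, 0x28];
    auto bad = cast(string) raw;

    // Option 1: validate up front, with no exception handling needed.
    writeln("valid up front? ", isValidUtf(bad));

    // Option 2: just decode, and handle the exception where it matters.
    try
    {
        writeln(countCodePoints(bad));
    }
    catch (UTFException e)
    {
        writeln("decode threw: ", e.msg);
    }
}
```

Note that countCodePoints contains no per-call validity check; under a sentinel-returning decode, every such loop would need one to avoid silently processing bad data.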
