On Saturday, July 8, 2017 5:16:51 PM MDT kdevel via Digitalmars-d-learn wrote:
> Yesterday I noticed that std.uri.decodeComponent does not 'preserve' the
> nullity of its argument:
>
>     1 void main ()
>     2 {
>     3    import std.uri;
>     4    string s = null;
>     5    assert (s is null);
>     6    assert (s.decodeComponent);
>     7 }
>
> The assertion in line 6 fails. This failure gave rise to a more general
> investigation on strings. After some research I found that one
> "cannot implicitly convert expression (s) of type string to bool" as in
>
>     void main ()
>     {
>        string s;
>        bool b = s;
>     }
>
> Nonetheless in certain boolean contexts strings convert to bool as here:
>
>     void main ()
>     {
>        import std.stdio;
>        string s; // equivalent to s = null
>        writeln (s ? true : false);
>        s = "";
>        writeln (s ? true : false);
>     }
>
> The code prints
>
>     false
>     true
>
> to the console. This led me to the insight that in D there are two
> distinct kinds of empty strings: Those having a ptr which is null and
> the other. It seems that this ptr nullity not only determines whether
> the string compares equal to null in an IdentityExpression [1] but also
> the result of the above mentioned conversion in the boolean context.
>
> I wonder if this distinction is meaningful and---if not---why it is
> exposed to the application programmer so prominently.
>
> Then today I found this piece of code
>
>     void main ()
>     {
>        string s = null;
>        string t = "";
>        assert (s is t);
>     }
>
> which, according to the wording in [1]
>
>     "For static and dynamic arrays, identity is defined as referring to
>     the same array elements and the same number of elements."
>
> shall succeed but its assertion fails [2]. I anticipate the
> implementation compares the ptrs even in the case of zero elements.
>
> A last example of 'deviant behavior' I found is this:
>
>     import std.stdio;
>     import std.file;
>     void main ()
>     {
>        string s = null;
>        try
>           mkdir (s);
>        catch (Exception e)
>           e.msg.writeln;
>
>        s = "";
>        try
>           mkdir (s);
>        catch (Exception e)
>           e.msg.writeln;
>     }
>
> Using DMD v2.073.2 the first expression terminates the program with a
> segmentation fault. With 2.074.1 the program prints
>
>     : Bad address
>     : No such file or directory
>
> I find that a bit confusing.
>
> [1] https://dlang.org/spec/expression.html#identity_expressions
> [2] https://issues.dlang.org/show_bug.cgi?id=17623

A dynamic array in D is essentially

    struct DynamicArray(T)
    {
        size_t length;
        T* ptr;
    }

That's not _exactly_ what it is at the moment (it actually does stuff with
void* rather than templates unfortunately), but essentially, that's what it is
and what it behaves like.

In the case of dynamic arrays, null is a dynamic array whose ptr is null and
whose length is 0.

The empty property for arrays checks whether the length of the array is 0. So,
any array with a length of 0 (regardless of its ptr) is considered empty.

The is expression checks for bitwise equality. So, arr is null checks for
whether the array has a null ptr and a 0 length. In _most_ circumstances,
that's equivalent to checking that the array's ptr is null, but if you do
something screwy with uninitialized memory, then you could end up with a ptr
value of null and a non-zero length, and arr is null would be false.

The == expression, on the other hand, checks that the elements are equal. So,
it does something similar to

    if(lhs.length != rhs.length)
        return false;
    for(size_t i = 0; i < lhs.length; ++i)
    {
        if(lhs.ptr[i] != rhs.ptr[i])
            return false;
    }
    return true;

So, if the lengths are 0, no iterating happens, and the two arrays are
considered equal. This means that a null array is equal to any other empty
array, regardless of the value of ptr. It's also why I would consider
arr == null to be a code smell. IMHO, if you want to check for empty, then you
should use the empty property or check length directly, since those are clear
about your intent, whereas with arr == null you always have the question of
whether they should have used an is expression or whether they were simply
checking for an empty array.

If you understand all of this, it is perfectly possible to write code which
treats null arrays as distinct from empty arrays. However, it's _very_ easy to
get into a situation where you have an empty array rather than a null one.
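
To make the difference between empty, is, and == concrete, here is a minimal sketch. The describe helper and the variable names are made up for illustration and are not part of Phobos. It assumes that a "" literal has a non-null ptr, which matches the output shown in the quoted message.

```d
import std.range.primitives : empty;

// Hypothetical helper (not from the post): classifies a string using the
// checks discussed above.
string describe(const(char)[] s)
{
    if (s is null)   // bitwise comparison against null: ptr null, length 0
        return "null";
    if (s.empty)     // length is 0, but ptr is not null
        return "empty but not null";
    return "non-empty";
}

void main()
{
    string n = null;
    string e = "";

    assert(n.empty && e.empty);   // both have length 0
    assert(n == e);               // == compares elements; there are none
    assert(e == null);            // so "" even compares equal to null
    assert(n !is e);              // is compares ptr and length; the ptrs differ
    assert(describe(n) == "null");
    assert(describe(e) == "empty but not null");

    // Slicing is one easy way to end up with an empty, non-null array.
    string word = "abc";
    string tail = word[$ .. $];   // length 0, ptr just past the last char
    assert(tail.empty && tail !is null);
    assert(describe(tail) == "empty but not null");
    assert(describe(word) == "non-empty");
}
```

The order of the checks in describe matters: a null array is also empty, so the is null test has to come first if the two cases are to be told apart.
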
Pretty much as soon as you do anything to a null array other than pass it
around or compare it, trusting that it's still null can get error-prone. And
that's why a number of folks think that it's just plain error-prone to try and
treat null arrays as special - but some folks who understand the issues
continue to do so anyway, because they know enough to make it work and
consider the distinction valuable.

Personally, I think that it can make sense to have a function explicitly
return null to indicate something, but beyond that, I'd actually consider
using std.typecons.Nullable to make the whole thing clear, even if it is a bit
dumb to have to wrap a nullable type in a Nullable to treat it as null.

As for conversions to bool, not much implicitly converts to bool - dynamic
arrays included. However, conditional expressions in if statements, loops,
ternary expressions, and assertions actually insert an invisible, explicit
cast. So, even though the conversion _looks_ implicit, it's actually explicit.
So,

    if(cond)
    {
    }

is actually

    if(cast(bool)cond)
    {
    }

For user-defined types, that means that the way to affect how they're treated
in conditional expressions is to overload opCast for bool. For built-in types,
the result varies depending on how it was decided that casting that type to
bool would work. For pointers,

    cast(bool)ptr

becomes

    ptr !is null

which makes a lot of sense. Unfortunately, because dynamic arrays were just
pointers in C, D has historically treated dynamic arrays as pointers under
certain circumstances and implicitly converted them to the value of their ptr
property. Fortunately, in many cases, that has been fixed, and the compiler
has gotten stricter. Unfortunately, however, it is still the case that casting
a dynamic array to bool checks its ptr value for null. This works fine if you
know what you're doing but is frequently surprising to folks and is arguably
error-prone.
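
The three cases of conditions described above (pointers, dynamic arrays, user-defined types) can be sketched as follows. Buffer and truthy are hypothetical names invented for this example, and the choice to have opCast report "has elements" is just one possible design, not something the language prescribes.

```d
// Hypothetical user-defined type (not from the post): opCast!bool decides
// how it behaves in conditions. Here it reports "has elements".
struct Buffer
{
    int[] data;

    bool opCast(T : bool)() const
    {
        return data.length != 0;
    }
}

// Hypothetical helper: uses the string directly as a condition, which
// tests whether its ptr is null rather than whether it is empty.
bool truthy(const(char)[] s)
{
    return s ? true : false;
}

void main()
{
    // Pointers: the condition is equivalent to ptr !is null.
    int x;
    int* p = null;
    assert(!p);
    p = &x;
    assert(p);

    // Dynamic arrays: the condition looks at ptr, not at length.
    assert(!truthy(null));
    assert(truthy(""));        // empty, yet true
    assert(truthy("abc"));

    // User-defined type: the condition goes through opCast!bool.
    int[] backing = [1, 2, 3];
    int[] emptySlice = backing[0 .. 0];
    assert(emptySlice ? true : false);       // raw slice: ptr is non-null
    assert(!cast(bool) Buffer(emptySlice));  // wrapped: length is 0
    Buffer full = Buffer(backing);
    assert(full ? true : false);
}
```

The empty slice is the surprising case: used directly as a condition it is true, while emptySlice.length != 0 is false.
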
That behavior _was_ temporarily fixed at one point by deprecating the use of
arrays in conditional expressions, but some major D contributors (Andrei
included) who understood how to correctly treat null dynamic arrays as special
did not like the change, and it was reverted.

So, basically, you should be _very_ wary of ever using a dynamic array in a
conditional expression directly. If you know what you're doing, it can be done
correctly, but it's error-prone, and it's arguably a code smell, because folks
reading your code don't necessarily know that you know what you're doing well
enough to get it right.

- Jonathan M Davis