On Fri, Oct 18, 2013 at 02:04:41PM -0400, Jonathan M Davis wrote:
> On Friday, October 18, 2013 10:38:12 H. S. Teoh wrote:
[...]
> > IMO, distinguishing between null and empty arrays is bad
> > abstraction. I agree with D's "conflation" of null with empty,
> > actually. Conceptually speaking, an array is a sequence of values of
> > non-negative length. An array with non-zero length contains at least
> > one element, and is therefore non-empty, whereas an array with zero
> > length is empty. Same thing goes with a slice. A slice is a view
> > into zero or more array elements. A slice with zero length is empty,
> > and a slice with non-zero length contains at least one element.
> > There's nowhere in this conceptual scheme for such a thing as a
> > "null array" that's distinct from an empty array. This distinction
> > only crops up in implementation, and IMO leads to code smells
> > because code should be operating based on the conceptual behaviour
> > of arrays rather than on the implementation details.
>
> In most languages, an array is a reference type, so there's the
> question of whether it's even _there_. There's a clear distinction
> between having a null reference to an array and having a reference to
> an empty array. This is particularly clear in C++ where an array is
> just a pointer, but it's true in plenty of other languages that don't
> treat arrays as pointers (e.g. Java).

To me, these are just implementation details. Conceptually speaking, D
arrays are actually slices, so that gives them reference semantics.
Being slices, they refer to zero or more elements, so either their
length is zero, or not. There is no concept of nullity here. That only
comes because we chose to implement slices as pointer + length, so
implementation-wise we can distinguish between a null .ptr and a
non-null .ptr. But from the conceptual POV, if we consider slices as a
whole, they are just a sequence of zero or more elements. Null has no
meaning here.

Put another way, slices themselves are value types, but they refer to
their elements by reference. It's a subtle but important difference.


> The problem is that D put the length on the stack alongside the
> pointer, making it so that D arrays are sort of reference types and
> sort of not. The pointer is a reference type, but the length is a
> value type, making the dynamic array half and half. If it were fully a
> reference type, then there would be no problem with distinguishing
> between null and empty arrays. A null array is simply a null reference
> to an array. But since D arrays aren't quite reference types, that
> doesn't work.
[...]

I think the issue comes from the preconceived notion acquired from other
languages that arrays are some kind of object floating somewhere out
there on the heap, for which we have a handle here. Thus we have the
notion of null, being the case when we have a handle here but there's
actually nothing out there.

But if we consider the slice as being a thing right *here* and now,
referencing some sequence of elements out there, then we arrive at D's
notion of null and empty being the same thing, because while there may
be no elements out there being referenced, the handle (i.e. slice) is
always *here*. In that sense, there's no distinction between an empty
slice and a null slice: either there are elements out there that we're
referring to, or there are none.
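
To make the implementation-level distinction concrete, here is a minimal
D sketch. The variable names and the backing array are made up for
illustration, and I'm assuming a reasonably recent compiler and Phobos.
Both slices behave as the same empty sequence; only looking at .ptr, or
comparing with `is`, tells them apart:

```d
import std.range.primitives : empty;

void main()
{
    int[] backing = [1, 2, 3];   // hypothetical data, for illustration only

    int[] a = null;              // .ptr is null, .length is 0
    int[] b = backing[0 .. 0];   // .ptr points into backing, .length is 0

    // Conceptually, both are just empty sequences:
    assert(a.length == 0 && b.length == 0);
    assert(a.empty && b.empty);
    assert(a == b);              // == compares length and elements
    assert(b == null);           // so an empty slice compares equal to null

    // Only the implementation detail tells them apart:
    assert(a.ptr is null);
    assert(b.ptr !is null);
    assert(a is null);           // `is` compares .ptr and .length
    assert(b !is null);
}
```

Code that sticks to .length, .empty, or == never sees the difference
between the two.
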
There is no third "null" case.

There's no reason why we should adopt the first notion if the second
one works just as well, if not better. I argue that the second notion is
conceptually cleaner, because it eliminates an unnecessary distinction
between an empty sequence and a non-existent sequence (a distinction
which leads to issues similar to those one encounters with null
pointers).


T

-- 
Answer: Because it breaks the logical sequence of discussion. / Question: Why is top posting bad?
