At 2:27 AM -0500 12/17/05, Gordon Henriksen wrote:
I find it useful to distinguish between unassigned and undefined (null).

I think that the whole point of having "undef" in the first place was to indicate when a container hasn't been assigned to yet, and hence has no usable value. Likewise, explicitly assigning undef to a variable returns it to the state it was in before it was first assigned a value. When an undef is used in a larger expression, it is an anonymous container which is undefined.

Put another way, both undef and SQL null are what we call the state of a container for which we don't yet know the value we want to store in it.

Undef means "don't know", which is distinct from "zero", because in the latter case we explicitly have a value of zero.

The fact that we have undef as distinct from zero is a huge plus of Perl and friends over C, where you have to use some actual number (often -1) to mean "this isn't actually a number". But undef is by design outside the domain of numbers, and of everything else; it is distinct. Very, very useful, and I hate to see that compromised.

"None" is very often a valid value, especially for primitive types, and
especially where databases are involved. i.e., the range of a variable might
be {undef, -2^31..2^31-1}.

Yes, and I have no problem with 'none' aka 'no explicit value'. What I have a problem with is undef being considered equal to zero or the empty string.

In my experience:

  99 + undef -> 99         # Permissive. Stable. Useful. [Perl]
  99 + undef -> undef      # Pedantic. Error-prone. Annoying. [SQL, C# 2.0]
  99 + undef -> die        # Anal retentive. Crash-prone. Infuriating. [Obj-C]
  99 + undef is impossible # Ill-advised. Unusable. [C#, C]

I find null propagation frustrating; it's more useful for my code to keep data
than to throw it away on the theory that "undef means maybe, and
anything combined in any fashion with maybe results in more maybe".

Well, that theory seems the most logical in practice. The whole point of having different words for undef and zero is that they mean different things.
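
To make the contrast between the first two rows of that table concrete, here is a minimal sketch. It is written in Python rather than Perl purely for illustration, with None standing in for undef; the function names add_permissive and add_propagating are invented for this example and are not anyone's actual API.

```python
from typing import Optional


def add_permissive(a: Optional[int], b: Optional[int]) -> int:
    """Treat an undefined operand as 0, the way Perl 5 numifies undef."""
    return (0 if a is None else a) + (0 if b is None else b)


def add_propagating(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Return None if either operand is None, like SQL NULL arithmetic."""
    if a is None or b is None:
        return None
    return a + b


print(add_permissive(99, None))   # 99
print(add_propagating(99, None))  # None
```

The permissive version silently decides that "don't know" means zero; the propagating version keeps "don't know" visible until the caller chooses a default explicitly.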

I just
wind up writing defined(expr)?expr:0 over and over to avoid throwing away
the other part of the expression.

Your example shows a non-strategic way of supplying defaults. It's much more concise to say "expr // 0" instead of "expr.defined ?? expr !! 0"; the defaulting adds only 3 characters plus spaces. The // and //= operators were created intentionally so that explicit defaulting can be done in a very concise manner. In fact, they were even back-ported to Perl 5.9.x+.

FYI, SQL:2003 has something similar, with a few more characters: "COALESCE". Perl's "expr // 0" is SQL:2003's "COALESCE(expr,0)", except that COALESCE takes N args and returns the first not-null one, like a chain of //. Oracle calls that, or the 2 arg variant anyway, NVL().

The key thing is that elegant defaulting solutions exist, elegant because <expr> appears exactly once, so they can be taken advantage of.
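
As a rough illustration of that "first defined value wins" behaviour, here is a small Python sketch, with None again standing in for undef/NULL. The coalesce function is a hypothetical helper written for this example, not a built-in of Python or Perl.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


expr = None
print(coalesce(expr, 0))           # 0
print(coalesce(None, None, 7, 8))  # 7
print(coalesce(0, 5))              # 0
```

Note the last line: a defined zero is kept rather than replaced. That is the same distinction Perl draws between // (tests definedness) and || (tests truth).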

The third and fourth options are just progressively more destructive
forms of the same logic. Succinctly, 'use crash_on_every_use_of_undef' is a
pragma I'd want to opt out of almost globally.

No, it's just crash_on_dirty_code.

And there is the 'no strict undef' pragma if you really want it.

An unassigned variable is very different, and is a compile-time concept.
Static flow control can find accesses of not definitely assigned local
variables, like this:

  my Animal $pet;
  given $kind {
      when 'dog': $pet = new Dog;
      when 'cat': $pet = new Cat;
      when 'none': $pet = undef;
  }
  return $pet;

Static flow control analysis can see that, where $kind not in ('dog', 'cat',
'none'), $pet will not be definitely assigned in the return statement. To
ensure definedness, there must be a default case. Perhaps $pet's
compiler-supplied default value is okay, but the programmer's intent isn't
explicit in the matter. Note that in the case of $kind == 'none', $pet IS
assigned: It's assigned undef.
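
The difference between "unassigned" and "assigned undef" can be seen in a rough Python analogue of the snippet above. This is only a sketch: Python detects the problem at run time (UnboundLocalError) rather than at compile time, the function name pick_pet is made up, and strings stand in for the Dog and Cat objects.

```python
def pick_pet(kind):
    """Mirror the given/when example: no default branch is provided."""
    if kind == "dog":
        pet = "Dog"
    elif kind == "cat":
        pet = "Cat"
    elif kind == "none":
        pet = None          # assigned, but assigned "undef"
    return pet              # fails if no branch above assigned pet


print(pick_pet("none"))     # None

try:
    pick_pet("hamster")
except UnboundLocalError:
    print("pet was never assigned")
```

A static definite-assignment check would reject pick_pet before it ever ran, which is the stronger guarantee being asked for here.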

While flow control analysis requires some additional work to avoid reliance
on default values, I find that work to be less than the work debugging the
bugs introduced because such checks aren't performed in the first place. It
also allows for very strong guarantees; i.e., "I know this variable cannot
be undefined because I never assign undef to it, and the compiler would tell
me if I accessed it without assigning to it."

This is what 'use strict' should evolve toward, in my mind.

If you see 'never-assigned' and 'assigned-but-unset' as distinct in practical use, then maybe we need to add a method/property to all containers that is used like .defined(), such as .unassigned() ... but I can't say it's needed.
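
For what it's worth, such a container could look roughly like the following Python sketch. Everything here is hypothetical: the Container class, its assign/defined/unassigned methods, and the private sentinel are invented for illustration, and this is a run-time check, not the compile-time analysis described above.

```python
_UNASSIGNED = object()  # private sentinel meaning "never assigned"


class Container:
    """A variable-like box that tracks whether it was ever assigned."""

    def __init__(self):
        self._value = _UNASSIGNED

    def assign(self, value):
        self._value = value

    def unassigned(self):
        """True only if assign() has never been called."""
        return self._value is _UNASSIGNED

    def defined(self):
        """True if the box holds an actual value (not None, not unassigned)."""
        return self._value is not _UNASSIGNED and self._value is not None


pet = Container()
print(pet.unassigned(), pet.defined())  # True False
pet.assign(None)
print(pet.unassigned(), pet.defined())  # False False
pet.assign("cat")
print(pet.unassigned(), pet.defined())  # False True
```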

-- Darren Duncan
