Something else I've been thinking about, as a tangent to the
relational data models discussion, concerns Perl's concept of
"undef", which I see as being fully equivalent to the relational
model's concept of "null".
The root question is: what does "undef" mean to you?
To me, it means "unknown", utterly and completely.
Therefore, it does not make logical sense for any expression to
return a defined value if it expects defined arguments and is given
undefined ones.
For example:
    $foo = 4 + 3;      # 7
    $bar = 5 + 0;      # 5
    $baz = 6 + undef;  # should be an error
What Perl 5 and Pugs both currently do in the last example is
automagically convert the undef to a zero and put 6 in $baz.
Likewise, with string examples:
    $f = 'hello' ~ 'world';  # 'helloworld'
    $r = 'one' ~ '';         # 'one'
    $z = 'beer' ~ undef;     # should be an error
But Perl 5 (with its equivalent '.' operator) and Pugs will cast the
undef to an empty string.
I see the behaviour of Perl 5, which Pugs currently emulates, as
being very, very wrong, and I think it should be changed in Perl 6.
An undefined value is NOT the same as zero or an empty string
respectively; the latter two are very specific and defined values,
just like 7 or 'foo'.
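For reference, the coercion described above can be observed with a
short Perl 5 script. This is only a sketch: it uses Perl 5's '.'
concatenation operator in place of '~', the variable names are made
up for illustration, and the 'uninitialized' warnings are switched
off locally so that only the silent conversion remains.

```perl
use strict;
use warnings;

my $unknown;    # never assigned, so it holds undef

my ($sum, $joined);
{
    # Without this line Perl 5 would warn about an uninitialized value,
    # but it would still compute the same results.
    no warnings 'uninitialized';
    $sum    = 6 + $unknown;         # undef is treated as 0
    $joined = 'beer' . $unknown;    # undef is treated as ''
}

print "sum=$sum joined=$joined\n";    # prints "sum=6 joined=beer"
```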
Undef, by definition, is different from and non-equal to everything
else: any defined value, and any other undef.
Therefore, I propose that the default behaviour of Perl 6 be changed
or maintained such that:
0. An undefined value should never magically change into a defined
value, at least by default.
1. Any expression that expects a defined value as an argument, such
as typical mathematical or string operations, and gets an undefined
argument, will as a whole have undef as its value, or it will fail.
Examples are the expressions "$anything + undef" and "$anything ~
undef".
1a. If such an expression will always return a value, the value is undef.
1b. If the expression is allowed to fail, it can do that instead.
2. Any boolean-returning expression should return undef or false or
fail if given an undef.
2a. At the very least, "undef <equality-test-op> undef" should NEVER
return true, because an unknown quantity can not be claimed to be
equal to an unknown quantity. Rather, the defined() method, which is
analogous to 'IS NOT NULL', and similar tests are the proper way to
check whether a variable is unknown.
2b. As a pseudo-exception, while undef/unknown values are
conceptually all unequal to each other, they should all sort
together; eg, calling sort() on an array of values where some are
defined and some not, should group all the undefs together. I leave
it up to discussion as to whether they should sort before or after
all the defined values, but one of those choices should be picked for
predictability.
3. In specific debugging situations, displaying an undefined value
could print some user-visible text, such as '<undef>' (similar to
Pugs' <obj:Foo>), so it is clearly distinguishable from a defined but
empty string. But I'm not wedded to this; some other specific
solution would do as well.
4. Plain assignments or value passing will still work the same as
before; it is perfectly valid to pass around values that we know are
unknown, or that might be unknown, as long as we're not trying to
actually use them. This includes assignment into composite type
attributes.
5. In any situation where a developer wants an undefined value to
become a zero or empty string or something else, they should say so
explicitly, such as with:
    $foo = undef // 0;
    $bar = undef // '';
    $baz = undef // $MY_DEFAULT;
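Points 2a, 2b, 3 and 5 can be approximated in Perl 5 as it stands
(5.10 or later for '//'). A rough sketch follows; the sample data is
invented, and placing undefs after the defined values is an arbitrary
assumption, since the ordering question is left open above.

```perl
use strict;
use warnings;

my @values = (3, undef, 1, undef, 2);

# 2a: test for "unknown" with defined(), not with an equality operator.
my $known = grep { defined $_ } @values;    # number of defined elements

# 2b: sort defined values numerically and group all undefs at the end.
my @sorted = sort {
      !defined($a) ? (!defined($b) ? 0 : 1)    # undef goes after defined
    : !defined($b) ? -1                        # defined goes before undef
    :                $a <=> $b                 # both defined: compare
} @values;

# 3: make undef visible when displaying values.
my $shown = join ' ', map { $_ // '<undef>' } @sorted;
print "$known known: $shown\n";    # prints "3 known: 1 2 3 <undef> <undef>"

# 5: '//' replaces only undef, whereas '||' also replaces a defined 0.
my %config  = (retries => 0, name => undef);
my $retries = $config{retries} // 5;              # 0
my $name    = $config{name}    // 'anonymous';    # 'anonymous'
my $clobber = $config{retries} || 5;              # 5
```

The last three lines show why '//' is the right tool for explicit
defaults: a defined zero survives it, which matches the point that
zero is a specific, defined value.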
The fact is that in any normal program, using an undefined value as
if it were a defined one is a bug. Normally there will be a point
where such a variable should be tested for definedness and either be
given a default value explicitly or fail. Checking your input at the
gates is good programming practice.
Going further, I propose that perhaps the standard math, string, etc.
functions should simply throw exceptions if given undefined input,
similarly to when one tries to call a function with a mismatched
argument signature. Bring it to a programmer's immediate attention
that an undef is invalid for the operation, so they can fix it. A
simple warning that then merrily lets them go on their way with a
slap on the wrist is too weak.
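Perl 5 already allows a program to opt in to something close to this
by promoting the 'uninitialized' warning category to a fatal error.
The sketch below shows that existing opt-in mechanism, not the
proposed Perl 6 default; the variable names are illustrative.

```perl
use strict;
use warnings FATAL => 'uninitialized';    # this warning now throws

my $unknown;    # undef

# The addition dies instead of quietly yielding 6; eval traps the error.
my $result = eval { 6 + $unknown };
my $error  = $@;

if (defined $result) {
    print "result=$result\n";
}
else {
    print "failed: $error";    # the message mentions "uninitialized value"
}
```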
But if the undefined value warnings are not made fatal as I suggest,
then my earlier (#1) suggestion of returning undef should be what is
done when the program is allowed to merrily continue, rather than
returning a defined value. This is because a defined value doesn't
actually make sense as the result of an operation on unknown input.
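The non-fatal alternative (#1a) can be modelled in Perl 5 with a
small wrapper. This is a sketch only: propagate_undef() is a
hypothetical helper invented for illustration, not an existing
function, and a real implementation would live in the operators
themselves.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a binary operation, but yield undef
# when either operand is undef.
sub propagate_undef {
    my ($op, $left, $right) = @_;
    return undef unless defined($left) && defined($right);
    return $op->($left, $right);
}

my $add = sub { $_[0] + $_[1] };
my $cat = sub { $_[0] . $_[1] };

my $foo = propagate_undef($add, 4, 3);             # 7
my $baz = propagate_undef($add, 6, undef);         # undef, not 6
my $z   = propagate_undef($cat, 'beer', undef);    # undef, not 'beer'

# prints "7 <undef> <undef>"
print join(' ', map { $_ // '<undef>' } $foo, $baz, $z), "\n";
```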
Now, in the spirit of TMTOWTDI, such as for people that like to turn
strictures or warnings off, I suggest that there can be an optional
feature, perhaps a pragma or better a core module, where a developer
can say that they want undefs to automatically become zero in numeric
contexts or an empty string in string contexts, or false in boolean
contexts, etc. But they should have to explicitly activate that
feature, like saying "treat undef as 0 in all my code", and this
treating would not happen by default.
Alternately, the meta-class that usual or standard classes are based
on could include a property or trait or something that lets users
explicitly say what happens when a container/variable of that class
is undefined and one tries to use it in a defined context; eg:
By default:
    submethod value_when_undef() { return undef; }
For a number class, it could be overridden with:
    submethod value_when_undef() { return 0; }
Or with a string:
    submethod value_when_undef() { return ''; }
So users could get such defaulting behaviour automatically, but it
doesn't happen by surprise because they still explicitly overrode
that method.
This approach is better yet because users can then get that
functionality with arbitrary other classes for which Perl 5 has no
magical conversion.
Installing a method like that is like SQL's "DEFAULT" clause in its
domain/data-type definitions.
Note that what value_when_undef() actually returns is an object.
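To make the idea concrete, here is a rough Perl 5 model of it (the
'package NAME { ... }' syntax needs 5.14 or later). Everything except
the name value_when_undef is an assumption made for illustration: the
My::* classes and the value_or_default() helper are hypothetical, and
plain scalars stand in for the objects that would really be returned.

```perl
use strict;
use warnings;

# Hypothetical base class: by default an undef stays undef.
package My::Value  { sub value_when_undef { return undef } }

# Classes that explicitly opt in to a default.
package My::Number { our @ISA = ('My::Value'); sub value_when_undef { return 0  } }
package My::String { our @ISA = ('My::Value'); sub value_when_undef { return '' } }

# A class that declares nothing, so it inherits "stay undef".
package My::Date   { our @ISA = ('My::Value'); }

package main;

# Hypothetical helper: what a defined context would see for a value
# held in a container of the given class.
sub value_or_default {
    my ($class, $value) = @_;
    return $value // $class->value_when_undef;
}

my $n = value_or_default('My::Number', undef);    # 0
my $s = value_or_default('My::String', undef);    # ''
my $d = value_or_default('My::Date',   undef);    # undef, no default declared
my $k = value_or_default('My::Number', 42);       # 42, defined values pass through
```

The default only appears where a class author wrote it down, which is
the property the proposal asks for.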
But still, the default action should be that undef never becomes
anything magically, which aids in avoiding bugs, such as when you're
using a value that you thought you tested but actually didn't.
Automatic changes of undef into defined values are non-intuitive, and
can confuse people who don't expect them. Less is more.
Having users explicitly set defaults with //, or by setting a
defaults method, makes the program more self describing, as you can
see what it is doing in the code, and it isn't doing anything without
saying so.
My suggestions should not make Perl slower or more difficult to use.
They should in fact make it easier to use, and not significantly
more verbose.
Feedback?
FYI, I feel more strongly about this issue than about the other
relational things I mentioned, since the undef thing is more low
level and very pervasive, not to mention quite simple to fix.
-- Darren Duncan