Something else I've been thinking about, as a tangent to the relational data models discussion, concerns Perl's concept of "undef", which I see as being fully equivalent to the relational model's concept of "null".

The root question of the matter is, what does "undef" mean to you?

To me, it means "unknown", utterly and completely.

Therefore, it does not make logical sense for any expression to return a defined value if it expects defined arguments and is given undefined ones.

For example:

 $foo = 4 + 3; # 7
 $bar = 5 + 0; # 5
 $baz = 6 + undef; # error

What Perl 5 and Pugs both currently do in the last example is automagically convert the undef to a zero and put 6 in $baz.

Likewise, with string examples:

 $f = 'hello' ~ 'world'; # 'helloworld'
 $r = 'one' ~ ''; # 'one'
 $z = 'beer' ~ undef; # error

But Perl 5 (with its equivalent '.' operator) and Pugs will cast the undef to an empty string.

I see the behaviour of Perl 5, which Pugs currently emulates, as being very, very wrong, and I think it should be changed in Perl 6.

An undefined value is NOT the same as zero or an empty string respectively; the latter two are very specific and defined values, just like 7 or 'foo'.

Undef, by definition, is different from and non-equal to everything else: both any defined value and any other undef.

Therefore, I propose that the default behaviour of Perl 6 be changed or maintained such that:

0. An undefined value should never magically change into a defined value, at least by default.

1. Any expression that expects a defined value as an argument, such as typical mathematical or string operations, and gets an undefined argument, will as a whole have undef as its value, or it will fail. Examples are the expressions "$anything + undef" and "$anything ~ undef".

1a. If such an expression will always return a value, the value is undef.

1b. If the expression is allowed to fail, it can do that instead.

2. Any boolean-returning expression should return undef or false or fail if given an undef.

2a. At the very least, "undef <equality-test-op> undef" should NEVER return true, because an unknown quantity cannot be claimed to be equal to another unknown quantity. Rather, the defined() method, which is analogous to SQL's 'IS NOT NULL', and similar tests are the proper way to check whether a variable is unknown.

2b. As a pseudo-exception, while undef/unknown values are conceptually all unequal to each other, they should all sort together; eg, calling sort() on an array of values where some are defined and some not, should group all the undefs together. I leave it up to discussion as to whether they should sort before or after all the defined values, but one of those choices should be picked for predictability.

3. In specific debugging situations, displaying an undefined value could print some user-visible text, such as '<undef>' (similar to Pugs' <obj:Foo>), so it is clearly distinguishable from a defined but empty string. But I'm not a stickler for this or any other specific solution.

4. Plain assignments or value passing will still work the same as before; it is perfectly valid to pass around variables whose values we definitely or possibly don't know, as long as we're not trying to actually use them. This includes assignment into composite type attributes.

5. In any situation where a developer wants an undefined value to become a zero or empty string or something else, they should say so explicitly, such as with:

 $foo = undef // 0;
 $bar = undef // '';
 $baz = undef // $MY_DEFAULT;

The fact is, that in any normal program, using an undefined value as if it were a defined one is a bug. Normally there will be a point where such a variable should be tested for definedness and either be given a default value explicitly or fail. Checking your input at the gates is good programming practice.
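
The numbered rules above can be modelled in a few lines. What follows is only a sketch of the proposed semantics, written in Python with None standing in for undef; it is not Perl 6 or Pugs code, and every function name in it is hypothetical. It assumes the "return undef" variant of rule 1 and picks "undefs sort last" for rule 2b, which the proposal leaves open.

```python
# Model of the proposed rules; None plays the role of Perl's undef.
# All names below are illustrative assumptions, not an existing API.

def add(a, b):
    """Rule 1a: arithmetic on an unknown operand yields unknown."""
    if a is None or b is None:
        return None
    return a + b


def concat(a, b):
    """Rule 1a again, for string concatenation."""
    if a is None or b is None:
        return None
    return a + b


def equals(a, b):
    """Rule 2a: an equality test involving an unknown is never true."""
    if a is None or b is None:
        return None
    return a == b


def defined(x):
    """The proper test for 'unknown', analogous to IS NOT NULL."""
    return x is not None


def sort_undefs_last(values):
    """Rule 2b: all unknowns group together (here, after defined values)."""
    return sorted(values, key=lambda v: (v is None, 0 if v is None else v))


def defined_or(value, default):
    """Rule 5: explicit defaulting, like the // operator."""
    return default if value is None else value
```

Under this model, add(6, None) and concat('beer', None) both come back as None rather than 6 or 'beer', and equals(None, None) is None rather than true. Note that defined_or(0, 7) keeps the 0, because zero is a defined value.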

Going further, I propose perhaps that the standard math, string, and similar functions simply throw exceptions if given undefined input, similarly to when one tries to call a function with a mismatching argument signature. Bring it to a programmer's immediate attention that an undef is invalid for the operation, so they can fix it. A simple warning that then merrily has them go on their way with a slap on the wrist is too weak.

But if you don't decide to make the undefined value warnings fatal like I suggest, then my earlier (#1) suggestion of returning undef should be what is done when the program is allowed to merrily continue, rather than returning a defined value. This is because a defined value doesn't actually make sense.
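
The two acceptable outcomes described here, failing loudly or propagating the unknown, can be sketched as a single operation with a switch. This is a Python model with None standing in for undef; the exception class and the fatal parameter are hypothetical names chosen for illustration.

```python
class UndefinedOperandError(TypeError):
    """Hypothetical error raised when an operand is undefined (None)."""


def add(a, b, fatal=True):
    """Add two numbers; an undefined operand never turns into zero."""
    if a is None or b is None:
        if fatal:
            # Preferred behaviour: bring the bug to immediate attention.
            raise UndefinedOperandError("undefined operand given to add()")
        # Fallback behaviour: the result is itself unknown.
        return None
    return a + b
```

Either way, the one outcome that is ruled out is quietly returning 6 for add(6, None).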

Now, in the spirit of TMTOWTDI, such as for people that like to turn strictures or warnings off, I suggest that there can be an optional feature, perhaps a pragma or better a core module, where a developer can say that they want undefs to automatically become zero in numeric contexts or an empty string in string contexts, or false in boolean contexts, etc. But they should have to explicitly activate that feature, like saying "treat undef as 0 in all my code", and this treating would not happen by default.
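
One way to picture such an opt-in feature is as a scoped switch that is off unless the developer turns it on. The sketch below models that in Python, with None standing in for undef and a context manager playing the part of the pragma; UndefPolicy and undef_as_default are hypothetical names, not a proposal for the actual Perl 6 spelling.

```python
from contextlib import contextmanager


class UndefPolicy:
    """Holds the (hypothetical) coercion switch; off by default."""
    coerce = False


@contextmanager
def undef_as_default():
    """Scoped opt-in: treat undef as 0 or '' only inside this block."""
    previous = UndefPolicy.coerce
    UndefPolicy.coerce = True
    try:
        yield
    finally:
        UndefPolicy.coerce = previous  # restore on exit, even on error


def add(a, b):
    """Numeric context: undef becomes 0 only when the policy is on."""
    if a is None or b is None:
        if not UndefPolicy.coerce:
            return None
        a = 0 if a is None else a
        b = 0 if b is None else b
    return a + b


def concat(a, b):
    """String context: undef becomes '' only when the policy is on."""
    if a is None or b is None:
        if not UndefPolicy.coerce:
            return None
        a = '' if a is None else a
        b = '' if b is None else b
    return a + b
```

Inside "with undef_as_default():" the call add(6, None) gives 6 and concat('beer', None) gives 'beer'; outside the block both give None again, so the old behaviour only ever appears where someone asked for it.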

Alternately, the meta-class that usual or standard classes are based on could include a property or trait or something that lets users explicitly say what happens when a container/variable of that class is undefined and one tries to use it in a defined context; eg:

By default:

 submethod value_when_undef() { return undef; }

For a number class, could be overridden with:

 submethod value_when_undef() { return 0; }

Or with a string:

 submethod value_when_undef() { return ''; }

So users could get such defaulting behaviour automatically, but it doesn't happen by surprise because they still explicitly overrode that method.

But this approach is better yet because users can then get that functionality with arbitrary other classes for which Perl 5-ish behaviour has no magical conversion.

Installing a method like that is like SQL's "DEFAULT" clause in its domain/data-type definitions.

Note that what value_when_undef() actually returns is an object.
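
To make the shape of that hook concrete, here is a minimal sketch in Python, with None standing in for undef. The class names Value, Num, and Str and the get() accessor are hypothetical; only value_when_undef comes from the proposal above, and a Python classmethod is used here as a rough stand-in for a submethod.

```python
class Value:
    """Hypothetical base class: wraps a payload that may be undefined."""

    def __init__(self, payload=None):
        self.payload = payload

    @classmethod
    def value_when_undef(cls):
        # Default: an undefined value stays undefined.
        return None

    def get(self):
        """The value seen when this object is used in a defined context."""
        if self.payload is None:
            return self.value_when_undef()
        return self.payload


class Num(Value):
    @classmethod
    def value_when_undef(cls):
        return 0  # explicit, per-class opt-in to "undef means zero"


class Str(Value):
    @classmethod
    def value_when_undef(cls):
        return ''  # explicit, per-class opt-in to "undef means empty"
```

With these definitions, Value().get() is None, Num().get() is 0, and Str().get() is ''. A class for some other domain could return whatever default suits it.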

But still, the default action should be that undef never becomes anything magically, which aids in avoiding bugs, such as when you're using a value that you thought you tested but actually didn't.

Automatic changes of undef into defined are non-intuitive, and can confuse people who don't expect them. Less is more.

Having users explicitly set defaults with //, or by setting a defaults method, makes the program more self describing, as you can see what it is doing in the code, and it isn't doing anything without saying so.

My suggestions should not make Perl slower or more difficult to use. They should in fact make it easier to use. And not significantly more verbose.

Feedback?

FYI, I feel more strongly about this issue than about the other relational things I mentioned, since the undef thing is more low level and very pervasive, not to mention quite simple to fix.

-- Darren Duncan
