[Tutor] SENTINEL, & more

spir ☣ Sat, 29 May 2010 01:31:42 -0700

Hello,

from the thread: "class methods: using class vars as args?"

On Sat, 29 May 2010 11:01:10 +1000
Steven D'Aprano <[email protected]> wrote:

> On Fri, 28 May 2010 07:42:30 am Alex Hall wrote:
> > Thanks for all the explanations, everyone. This does make sense, and
> > I am now using the
> > if(arg==None): arg=self.arg
> > idea. It only adds a couple lines, and is, if anything, more explicit
> > than what I was doing before.
> 
> You should use "if arg is None" rather than an equality test.
> 
> In this case, you are using None as a sentinel value. That is, you want 
> your test to pass only if you actually receive None as an argument, not 
> merely something that is equal to None.
> 
> Using "arg is None" as the test clearly indicates your intention:
> 
> The value None, and no other value, is the sentinel triggering special 
> behaviour
> 
> while the equality test is potentially subject to false positives, e.g. 
> if somebody calls your code but passes it something like this:
> 
> class EqualsEverything:
>     def __eq__(self, other):
>         return True
> 
> instead of None.

I'll try to clarify the purpose and use of sentinels with an example. Advanced 
programmers, please correct me. One point is that, in languages like Python, 
sentinels are under-used, because everybody tends to use None instead, as an 
all-purpose sentinel.

Imagine you're designing a kind of database of books; with a user interface to 
enter new data. What happens when an author is unknown? A proper way, I guess, 
to cope with this case, is to define a sentinel object, eg:
    UNKNOWN_AUTHOR = object()
There are many ways to define a sentinel; one could have defined "=0" or 
"=False" or whatever. But this choice is simple, clear, and safe because a 
plain object in Python compares equal only to itself -- by default. 
Sentinels are commonly written in uppercase because they are constant, 
predefined elements.
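
A minimal sketch of why a bare object is a safer choice than 0 or False. The 
checks below only illustrate Python's default comparison behaviour; nothing 
here is specific to the book database.

```python
UNKNOWN_AUTHOR = object()   # a fresh object, distinct from every other value

# Built-in candidates can collide with legitimate data:
assert 0 == False           # 0 and False compare equal
assert 0 == 0.0             # so does the float zero

# The bare object compares equal only to itself (default behaviour):
assert UNKNOWN_AUTHOR == UNKNOWN_AUTHOR
assert UNKNOWN_AUTHOR != 0
assert UNKNOWN_AUTHOR != False
assert UNKNOWN_AUTHOR != ""
assert UNKNOWN_AUTHOR != object()   # even another bare object is different
```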

Say, when users have to deal with an unknown author, they press a special 
button or enter a special value, e.g. '*'; the software then silently converts 
it to UNKNOWN_AUTHOR. Now, all cases of unknown authors are *marked* with the 
same mark UNKNOWN_AUTHOR; this mark only occurs in this very case, and thus 
only means this. In other words, this is a clear & safe *semantic* mark.
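
The conversion step could be sketched as follows. The function name 
parse_author and the constant UNKNOWN_AUTHOR_INPUT are hypothetical, chosen 
only for illustration; a real interface would have its own input handling.

```python
UNKNOWN_AUTHOR = object()
UNKNOWN_AUTHOR_INPUT = "*"   # hypothetical: what the user types for "unknown"

def parse_author(raw_text):
    """Convert raw user input into the value stored in the author field."""
    text = raw_text.strip()
    if text == UNKNOWN_AUTHOR_INPUT:
        return UNKNOWN_AUTHOR      # every unknown author gets the same mark
    return text

authors = [parse_author(s) for s in ["Jane Austen", "*", " * "]]
unknown_count = sum(1 for a in authors if a is UNKNOWN_AUTHOR)
print(unknown_count)
```

Note that the text is compared with "==" (it is ordinary data), while the 
sentinel is detected with "is".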

Later, when the application operates on data, it can compare the value stored 
in the "author" field, to catch the special mark case UNKNOWN_AUTHOR. Eg

class Book(object):
    ...
    AUTHOR_DEFAULT_TEXT = "<unknown>"
    def write(self):
        ...
        if self.author is UNKNOWN_AUTHOR:
            author_text = Book.AUTHOR_DEFAULT_TEXT
            ...
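
For readers who want something to run, here is one possible way to fill in the 
outline above. The title field, the constructor, and the format of the returned 
text are assumptions made for this sketch, not part of the original design.

```python
UNKNOWN_AUTHOR = object()

class Book(object):
    AUTHOR_DEFAULT_TEXT = "<unknown>"

    def __init__(self, title, author=UNKNOWN_AUTHOR):
        # Assumed fields: a title, and an author that defaults to the sentinel.
        self.title = title
        self.author = author

    def write(self):
        """Return a one-line description of the book."""
        if self.author is UNKNOWN_AUTHOR:
            author_text = Book.AUTHOR_DEFAULT_TEXT
        else:
            author_text = self.author
        return "%s, by %s" % (self.title, author_text)

print(Book("Beowulf").write())               # Beowulf, by <unknown>
print(Book("Emma", "Jane Austen").write())   # Emma, by Jane Austen
```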

Hope I'm clear. In the specific case of UNKNOWN_AUTHOR, using "==" instead of 
"is" as the comparison operator would hardly have any consequence, because, as 
said above, plain objects compare equal only to themselves by default in 
Python. But
* This default behaviour can be overridden, as shown by Steven above.
* Using "is" clarifies your intent to the reader, including yourself.
* Not all languages make a difference between "==" and "is". (Actually, very 
few do it.) Good habits...
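
The first point can be checked directly with Steven's EqualsEverything class. 
The two helper functions are hypothetical names used only for this comparison.

```python
UNKNOWN_AUTHOR = object()

class EqualsEverything:
    def __eq__(self, other):
        return True

def is_unknown_by_equality(author):
    return author == UNKNOWN_AUTHOR    # can be fooled by a custom __eq__

def is_unknown_by_identity(author):
    return author is UNKNOWN_AUTHOR    # true for the sentinel object only

odd_value = EqualsEverything()
print(is_unknown_by_equality(odd_value))       # True  (false positive)
print(is_unknown_by_identity(odd_value))       # False
print(is_unknown_by_identity(UNKNOWN_AUTHOR))  # True
```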

=== additional stuff -- more personal reflection -- criticism welcome ===

Sentinels belong to a wider category of programming elements, or objects, I 
call "marks". (Conventional term for this notion welcome.) Marks are elements 
that play a role in a programmer's model, but have no value. What is the value 
of NOVICE_MODE for a game? of the SPADE card suit? of the character 'ø'? These 
are notions, meaning semantic values, that must exist in an application but 
have no "natural" value -- since they are not values semantically, unlike a 
position or a color.
In C, one could use a preprocessor flag for this:
   #define NOVICE_MODE
   ...
   #ifdef NOVICE_MODE ... #endif
NOVICE_MODE is here like a value-less symbol in the program: precisely what we 
mean. But not all languages have such features. (Indeed, there is a value 
behind the scene, but it is not accessible to the programmer; so, the semantics 
is correct.)

Thus, we need to _arbitrarily_ assign marks values. Commonly, natural numbers 
are used for that: they are called "nominals" (--> 
http://en.wikipedia.org/wiki/Nominal_number) precisely because they act like 
symbol names for things that have no value.
The case of characters is typical: that 'ø' is represented by 248 is just 
arbitrary; we just need something, and software can only deal with values; so, 
we need a value. The only subset of a character set that is not arbitrarily 
ordered is precisely the sequence of digits: because they are used to compose 
ordinals, which themselves form the archetype of every order.
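
In Python 3 the character-to-number mapping can be inspected with the builtins 
ord() and chr(). This is only an illustration of the point about arbitrary 
codes versus the ordered digits.

```python
# The code of 'ø' is an arbitrary nominal: it names the character, nothing more.
assert ord("ø") == 248
assert chr(248) == "ø"

# The digits are the exception: their codes follow the order of their values.
digits = "0123456789"
codes = [ord(d) for d in digits]
assert codes == list(range(48, 58))
assert all(ord(d) - ord("0") == int(d) for d in digits)
```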

In the case of card suits, I could define an independent mark for each suit. 
But the 4 of them also build a whole, namely the set of card suits. For such a 
notion, some languages introduce a builtin feature; for instance Pascal has 
"enumerations" for this 
(http://en.wikipedia.org/wiki/Enumeration_%28programming%29):
    var suit : (clubs, diamonds, hearts, spades);
A side-advantage of a nominal enumeration is that, each mark silently mapping 
to an ordinal number, marks happen to be ordered. Then, it's possible to 
compare them for inequality like in the game of bridge: 
clubs<diamonds<hearts<spades. Enumerations are thus mark _sequences_. Pascal 
calls this an ordinal type.
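
A comparable mark sequence can be sketched in today's Python with the standard 
enum module (added in Python 3.4, so after this message was written). The class 
name Suit and the numbering from 1 are choices made for this example.

```python
from enum import IntEnum

class Suit(IntEnum):
    # Each mark silently maps to an ordinal, so the marks are ordered.
    CLUBS = 1
    DIAMONDS = 2
    HEARTS = 3
    SPADES = 4

# Bridge ordering falls out of the underlying ordinals.
assert Suit.CLUBS < Suit.DIAMONDS < Suit.HEARTS < Suit.SPADES

hand = [Suit.SPADES, Suit.CLUBS, Suit.HEARTS]
print([s.name for s in sorted(hand)])   # ['CLUBS', 'HEARTS', 'SPADES']
```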

Pascal also has a notion of mark set (not collection set like in python). A bit 
too complicated to introduce here, maybe.

An interesting exercise is to define, in and for Python, practical types for 
isolated marks (sentinels), mark sequences (enumerations), and mark sets.
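
As one possible starting point for the exercise, here is a rough sketch of an 
isolated mark and a mark set. The Mark class and the Style example are 
hypothetical; the mark set relies on enum.Flag from the standard library 
(Python 3.6 and later), which is only one of several ways to do it.

```python
from enum import Flag, auto

class Mark(object):
    """An isolated mark: no value of its own, only an identity and a name."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name    # prints as its own name instead of <object at 0x...>

UNKNOWN_AUTHOR = Mark("UNKNOWN_AUTHOR")

class Style(Flag):
    # Hypothetical mark set: any combination of members is itself a Style.
    BOLD = auto()
    ITALIC = auto()
    UNDERLINE = auto()

heading = Style.BOLD | Style.UNDERLINE

print(repr(UNKNOWN_AUTHOR))       # UNKNOWN_AUTHOR
print(Style.BOLD in heading)      # True
print(Style.ITALIC in heading)    # False
```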

Hope it's clear,

Denis
________________________________

vit esse estrany ☣

spir.wikidot.com
_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
