numification and stringification of objects

Juerd Sun, 25 Sep 2005 17:24:48 -0700

Whenever possible, objects should have useful numeric and string
representations. These are generally lossy, but this is not a problem, because
a scalar stays a scalar even after being used in a certain context, and the
object isn't lost.


When a protocol or data format that already has a string format is represented
as an object, it should of course evaluate to its common string form when used
in string context. Good examples of this can be found in the LWP package.

    Class           Num                   Str

    HTTP::Headers   number of headers     "Foo: bar{crlf}Bar: baz{crlf}"
    HTTP::Status    200                   "HTTP/1.1 200 OK"
    HTTP::Date      universal epochtime   "Sun, 06 Nov 1994 08:49:37 GMT"
    HTML::Form      number of elements    "<form ...>...</form>"
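
As a rough illustration of one row of the table above, here is a minimal
sketch written in Python purely for illustration (it is not Perl 6 and not
LWP). The class name HttpDate is hypothetical; Python's __int__ and __str__
hooks stand in for numeric and string context.

```python
from email.utils import formatdate


class HttpDate:
    """Hypothetical stand-in for a date object; not the real HTTP::Date."""

    def __init__(self, epoch: int) -> None:
        self.epoch = epoch

    def __int__(self) -> int:
        # Numeric context: seconds since the Unix epoch.
        return self.epoch

    def __str__(self) -> str:
        # String context: the common HTTP date format.
        return formatdate(self.epoch, usegmt=True)


date = HttpDate(784111777)
print(int(date))  # 784111777
print(str(date))  # Sun, 06 Nov 1994 08:49:37 GMT
```

Both representations are lossy views of the same object, which is still
available afterwards.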

One must be careful NOT to pick a certain numification or stringification just
because a certain number or string found in the object will be useful. For code
to be understandable, the numification or stringification must really BE what
the object represents. Again, LWP provides good examples.

    Class           Num                   Str
    
    HTTP::Request   -                     "GET / HTTP/1.1{crlf}..."
    LWP::UserAgent  -                     -

There's no single obvious meaningful number that represents HTTP::Request, but
a careless designer could guess that people would be interested in the HTTP
version number, the number of headers, or the number of bytes in the body. It
should therefore produce a warning when it's used in numeric context. What it
returns is mostly irrelevant, but I'd go as far as returning a random number,
just to discourage people from actually doing this. (This is no problem.
Compare it to Perl 5's habit of returning the memory address.) There is,
however, a good string representation of an HTTP message. Whether or not this
includes the body is irrelevant at this point, but if the body is known, it
probably should. It can hopefully do so lazily.
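
The behaviour described above could look roughly like the following sketch,
again in Python purely for illustration. HttpRequest, its constructor
arguments and the warning text are all hypothetical and are not the real
HTTP::Request API; the random return value simply follows the suggestion made
above.

```python
import random
import warnings


class HttpRequest:
    """Hypothetical request object; not the real HTTP::Request API."""

    def __init__(self, method, path, headers):
        self.method = method
        self.path = path
        self.headers = dict(headers)

    def __str__(self):
        # String context: the request as it would appear on the wire.
        lines = [f"{self.method} {self.path} HTTP/1.1"]
        lines += [f"{name}: {value}" for name, value in self.headers.items()]
        return "\r\n".join(lines) + "\r\n\r\n"

    def __int__(self):
        # Numeric context has no obvious meaning, so warn loudly and
        # return a value nobody can rely on.
        warnings.warn("HttpRequest used in numeric context", stacklevel=2)
        return random.randrange(2**31)


request = HttpRequest("GET", "/", {"Host": "example.org"})
print(str(request).splitlines()[0])  # GET / HTTP/1.1
```

Calling int(request) still works, but it emits a warning and yields a
different number each time, so code cannot come to depend on it.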

A UserAgent object has no single obvious meaningful number that it represents,
and it's hard to express a machine as a string too. Still, someone who feels a
need to use every feature that Perl provides might use the number of requests
and the last requested URL, thinking these would be very popular. In a good
design, it shouldn't be a number or a string at all, because that would lead to
non-obvious code that requires a comment or a dive into the documentation. An
explicit method name serves both ease of programming and readability much
better.

However, I do think there should be some kind of useful stringification for ALL
objects, because objects are often printed for debugging purposes. But I
suggest that this be a global method that all objects implicitly inherit, and
that it not be defined in the object itself. This helps to make all these
stringified-for-debugging strings look the same (one programmer could, for
example, implement a colour scheme) and enables us to make using them fatal.
Because every object may have its own attributes or even other calculations
that will be useful for debugging, there must be a way to specify which ones
are used. I think a simple method that returns a string is most appropriate.

One example of what this debugging output could be is:

    { LWP::UserAgent(aen3kx) }

aen3kx being the id of the object, and {} being simple delimiters to visually
group. Another example, this time with some attributes that a certain method in
LWP::UserAgent told Perl to use:

    { LWP::UserAgent(c23hee) libwww-perl/6.00; 200 }

Or, for example, a database connection object:

    { DBI(938eo) connected; dbi:SQLite:foo.db; in transaction }

But arrays do have a useful way to be represented as a string:

    element1 element2 element3

and not { Array(123abc) }.
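
The shared debugging format described above could be sketched like this, in
Python purely for illustration. Debuggable, debug_info, as_debug and the two
example classes are hypothetical names, and the object id here is simply
Python's id() printed in hex rather than anything Perl would produce.

```python
class Debuggable:
    """Hypothetical base class that every object would inherit from."""

    def debug_info(self):
        # Hook: subclasses return extra text for the debug string.
        return ""

    def as_debug(self):
        # One shared format, so all debug strings look alike.
        ident = format(id(self), "x")
        extra = self.debug_info()
        body = f"{type(self).__name__}({ident})"
        if extra:
            body += " " + extra
        return "{ " + body + " }"


class UserAgent(Debuggable):
    def __init__(self, agent, last_status):
        self.agent = agent
        self.last_status = last_status

    def debug_info(self):
        # This class chooses which attributes are useful for debugging.
        return f"{self.agent}; {self.last_status}"


class Plain(Debuggable):
    pass


print(UserAgent("libwww-perl/6.00", 200).as_debug())
print(Plain().as_debug())
```

Only the hook that returns a string is defined per class; the delimiters and
the id formatting live in one place and can be changed for every object at
once.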

It's all very Perl 5-y, and that's because that is a good way to do it: the
default is useful for debugging, but you can specify different stringification
in case there's an obvious way to stringify. It just gets much less scary to
actually override stringification.

In any case, I do think that everything should be fetchable explicitly as well
as implicitly. This means that I want .as_html, and not just the HTML::Element
in string context. The debugging info mentioned above could be .as_debug, for
example, and then we could get { Array(123abc) 0..15 contiguous } from an Array
object. I personally like to work without the as_ prefix, so that I get
.celsius and .html instead of .as_celsius and .as_html.
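
A minimal sketch of having both an explicit and an implicit route to the same
representation, in Python purely for illustration. The Element class and its
html method are hypothetical and are not the real HTML::Element API.

```python
class Element:
    """Hypothetical HTML element; not the real HTML::Element API."""

    def __init__(self, tag, text):
        self.tag = tag
        self.text = text

    def html(self):
        # Explicit accessor for the HTML form.
        return f"<{self.tag}>{self.text}</{self.tag}>"

    def __str__(self):
        # Implicit string context delegates to the explicit method.
        return self.html()


p = Element("p", "hello")
print(p.html())  # <p>hello</p>
print(f"{p}")    # <p>hello</p>
```

Because the implicit form just calls the explicit one, the two can never
disagree.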

String context should always be primarily concerned with use, not debugging or
storage (serialization). Whether this use is presentation to the user, sending
over a wire or storage, the object can't and shouldn't know. It should have
one use that is most important and very obviously connected to the object, and
that one should be used in all stringification. Most objects won't ever need to
be presented to the user, and others won't be part of a protocol. In fact, I
cannot think of any object class that would have multiple possible
stringifications with none of them obviously more important than the others.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html
