On 7/13/2011 1:23 PM, Jukka K. Korpela wrote:
I don’t see that biologists use the word “life” in any confusing manner comparable to the Unicode confusion around “character.” “Life” isn’t really a central concept in biology, and its use in biology hardly differs much from everyday use. Defining “life” might be a problem to philosophers, politicians, etc., but not that much in biology.

Actually, I picked that example advisedly. First of all, to say that "Life" isn't really a central concept in biology is more than a little misleading. Biology is usually defined as "the science concerned with the study of life." The fact that "life" is hard to pin down exactly and cannot really be used as an axiomatic definitional basis is part of the problem.

And to say it isn't used in "any confusing manner" is also problematical. In fact, the definition of what is and isn't life is a serious issue for virologists and exobiologists, at least. If you use the usual criteria for defining living organisms (metabolism, homeostasis, growth, response to stimuli, reproduction, and adaptation through natural selection), viruses and viroids fail on several of those criteria.

So a virus is to life kind of what a control code is to a character. ;-)


You might try “species” instead.

Nah. The point was about "What is it about?"

Character encoding is about characters. But if one tries to force too clean a definition on "character", one gets into trouble. As Asmus was at pains to point out, the character encoders are essentially engaged in an operational discovery process regarding "what characters there are". That in turn leads to a definition by enumeration: What characters are consists of the list of what characters there are.

One can then go back over the list looking for recurring attributes, in an attempt to organize and classify the resulting zoo. But the results tend not to make any axiomatic sense, both because of the complexity of writing systems through history (which Asmus alluded to) and because all kinds of technical artifacts got added to the zoo, many of which have little or nothing to do with writing systems per se. (OBJECT REPLACEMENT CHARACTER, anyone?)
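
To make the zoo a little more concrete, here is a minimal sketch using Python's standard unicodedata module. The sample code points and their labels are my own picks for illustration, not anything from the standard's text, and the output depends on the Unicode data version bundled with the Python build. It simply looks up the General_Category and name of a few encoded "things" of very different kinds:

```python
import unicodedata

# Illustrative sample only: the labels are informal, chosen for this sketch.
samples = {
    "control code": "\u0007",        # BEL, a C0 control
    "letter": "A",
    "musical symbol": "\U0001D11E",  # MUSICAL SYMBOL G CLEF
    "object replacement": "\uFFFC",  # OBJECT REPLACEMENT CHARACTER
    "noncharacter": "\uFFFE",        # permanently reserved noncharacter
}

def describe(ch):
    """Return (code point, General_Category, name) for one code point."""
    # unicodedata.name() has no entry for controls or noncharacters,
    # so supply a placeholder default instead of letting it raise.
    name = unicodedata.name(ch, "<no name>")
    return f"U+{ord(ch):04X}", unicodedata.category(ch), name

report = {label: describe(ch) for label, ch in samples.items()}
for label, (cp, cat, name) in report.items():
    print(f"{label:20} {cp:8} {cat}  {name}")
```

The control code comes back as category Cc and the noncharacter as Cn, both without a name in this module, while the G clef and the object replacement character both land in So ("Symbol, other"). The categories describe what is on the list after the fact; they do not say what a character is.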

Early on, the Unicode Standard tried to provide a clear scope for the standard by enumeration of some of the characteroids that we didn't consider candidates for encoding as characters. But that list has been whittled away, as musical symbols were added, for example, and more recently large sets of pictographs and the first set of symbols for a shorthand system. What it really comes down to is that "characters" will end up being those "things" that somebody persistent wants to embed as units in a digital plain text string, and which the gatekeepers in the character encoding committees consider not too crazy to standardize.

But to get a more reasonable comparison, consider “force” and “energy” in physics. They are surely very different from the everyday meanings. When an ad says that some drink is “low energy,” it hardly makes much sense physically without clarification. But in physics, people need not worry about such issues. Physics does not deal much with things where the varying everyday meanings of “force” and “energy” could be confused with the physical meanings.

I considered physics for a comparison, too. For that matter: "matter", "space", and "time". One could argue the comparison, but it doesn't work as well, IMO. The formal definitions are all embedded in mathematics in a way that "character" is not.

Although the mind-boggling things that happen to "force", "energy", "matter", "space" and "time" at the Planck scale do bring to mind the Unicode concept of a "noncharacter". ;-)


But in the Unicode Standard, in the discussion around it, and in applying it, uses of “character” in everyday sense are common and essential.

Yep, no quarrel there.

--Ken

