On Thu, 24 Jan 2013 20:05:41 -0300
Andrés Sanhueza <[email protected]> wrote:

Do you think that an "end of story" symbol may be feasible/useful?

My position is that attempting to encode such "semantics", which are defined at the level of the whole text, is a mistake. In fact, it is a common mistake that keeps resurfacing in proposals and tentative proposals.

When Unicode encodes semantics, it's on the level of individual symbols. If there were a recognized notation that defined an "end of text" symbol, then you could encode that in Unicode, and expect that to be rendered with ordinary stylistic variations (governed by font selection - with the font not selected just for that symbol, but once, for all aspects of that notation).

Such a use would then be analogous to something like the integral sign, which has a (small) range of customary and conventional shapes, e.g. upright or slanted, bulky or slender, which fall into what anyone would consider stylistic variations. The precise variation is usually selected by choice of font not just for the integral, but a whole set of other mathematical symbols as well (the full notation in fact).

Placing a symbol of some sort at the end of a text is a fairly widespread convention, but there is no agreement on any set or range of customary shapes for that purpose. In a way, that makes this convention less a notation than something different. In some ways it's more similar to the way that languages agree on representing the concept "house" as a noun, albeit with completely different sets of shapes ("house", "Haus", "hus", "maison" etc.).

For languages, those representations would be called spellings, and I think that's the appropriate concept as well for the "end of story" convention. Rather than conceiving of it as a single "character" with a range of "glyphs", it's a convention on the whole text level that is customarily expressed by different spellings (choice of abstract or pictorial symbol).

Just as Unicode does not unify spellings, the different choices of symbol for "end of story" should remain disunified. Each user of the convention decides on an appropriate character or symbol for the purpose. (Another analogy would be list item markers, which likewise are not unified into a generic control code with "glyph variants", but are separate characters.)

Because the semantics of the convention are not directly represented / representable on the encoding level, there's also no need to encode multiple characters of different shapes such as "end of story-1", "end of story-2" etc. Instead, like the use of "." or "," for decimal point, the semantics of "end of story" come from context. Whenever a symbol is placed consistently at the end of every story in a collection, that symbol acquires the "end of story" semantics.
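To make the "semantics from context" point concrete, here is a minimal Python sketch. The function name trailing_mark and the sample stories are made up for illustration, and U+25A0 is just one plausible choice of existing character. The role of the mark is inferred purely from where it appears, not from any dedicated code point.

```python
import unicodedata

def trailing_mark(stories):
    """Return the last non-whitespace character if every story shares it, else None.

    Hypothetical helper: the "end of story" role is inferred from position only.
    """
    endings = {story.rstrip()[-1] for story in stories if story.strip()}
    return endings.pop() if len(endings) == 1 else None

# Made-up collection in which each story closes with an ordinary, already
# encoded symbol (here U+25A0).
collection = [
    "The fox went home. \u25A0",
    "Nobody saw the ship again. \u25A0\n",
]

mark = trailing_mark(collection)
print(f"U+{ord(mark):04X}", unicodedata.name(mark))  # U+25A0 BLACK SQUARE
```

A real detector would need to distinguish such a mark from ordinary sentence punctuation. The sketch only shows that nothing in the character itself has to say "end of story".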

There are cases where Unicode has duplicated characters (using the same shape) based on which convention they happen to be used with. All these duplications are problematic in many contexts, however well intentioned they may have been. These cases make poor precedents and must be properly understood as the exceptions they are. The general encoding principle in Unicode remains that Unicode does not encode spelling - which means that symbols and other characters can be put into new contexts and there acquire new semantics to the human reader - without requiring the addition of dedicated code points.

With this, we can turn back to the original question. Should an "end of story" character be encoded? The answer must be negative. However, if particular shapes have been in widespread enough use for that purpose, but are not yet encoded in Unicode as their own symbol, then encoding such symbols for general use would be appropriate.

Some of the fancier symbols used for "end of story", on the other hand, might be better implemented as private use characters. An example would be the corporate logos that some magazines place at the end of their articles.
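As a rough sketch of what the private use route looks like in practice (the assignment of U+E000 to a logo is purely hypothetical, and only has meaning by private agreement together with a font that supplies the glyph):

```python
import unicodedata

# Hypothetical private agreement: U+E000 is rendered as the magazine's logo
# by the magazine's own font. Unicode itself assigns no meaning to it.
LOGO = "\uE000"

def is_private_use(ch):
    """True if the character has General_Category Co (private use)."""
    return unicodedata.category(ch) == "Co"

article = "And that was the last anyone heard of it. " + LOGO

print(is_private_use(article[-1]))           # True
print(unicodedata.name(LOGO, "<no name>"))   # <no name>
```

Outside that agreement the character carries no interpretation at all, which is the trade-off compared with reusing an ordinary encoded symbol.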

A./
