Per continued:

> I know it's not a name. My question was *why* control characters don't
> *have* names like
>
> CONTROL CHARACTER NULL
> CONTROL CHARACTER START OF HEADING
> CONTROL CHARACTER START OF TEXT
> etc.
>
> It would be so obvious to have it like that, so I assume there is some
> specific reason not to, but I still can't figure it out. For me there is
> not less reason for these characters to have names than any others, so
> for me it's like Linear B characters didn't have names, and I got the
> answer "no problem, they have aliases, so that's OK!" This is just
> strange to me. If names aren't needed, why do almost all characters have
> them?

Ah, so this is a "Why is the sky blue?" kind of question. ;-) And perhaps the correct response is then a Just So story...

Once upon a time, there was an ISO framework for character encoding. Officially his name was ISO 2022 Information technology -- Character code structure and extension techniques. But we'll think of him as the troll that lives under the bridge and just call him "2022" for short.

Now 2022 had his favorite collection of code points that he kept in buckets under the bridge. But he was very, very particular about how he organized his collection. All the code points 00 to 1F had to go in the bucket labeled "C0", and all the code points 20 to 7E had to go in the bucket labeled "G0" (or "GL" -- sometimes the troll would get confused). He had other, even bigger code points, too, but we can save those for another story.

2022 said all the code points in the "G0" bucket could get names. In fact they could get lots of names, if they wanted. So 2022 also started collecting sets of characters, where all those names were written down. Sometimes he would "escape" to one set and admire all those pretty names, and then he would "escape" to another set and admire other pretty names. 2022 was a great admirer of escaping, by the way, as well as pretty names.

But the code points in the "C0" bucket were different. 2022 insisted that those code points weren't like the ones in the "G0" bucket, and they couldn't have names at all. Indeed, these were very odd code points -- 2022 called them "control functions". Sometimes when the troll took one out of the "C0" bucket and examined it, it did one thing, but the next time it might do something completely different. Only 2022's friend, the troll named 6429 living under the next bridge to the north, really understood what they might be doing from one week to the next.

One day an aspiring young wizard named Unicode was crossing the bridge. As an aspiring young wizard, he was rather observant.
And he noticed that there was a troll living under the bridge and that that troll had stolen all the code points and was hoarding them in strangely labeled buckets under the bridge.

Being a wizard and all, he knew that it was his duty to slay the troll and free all the code points. So he set about writing down the appropriate spell in his brand new spellbook.

Now Unicode was a very egalitarian wizard -- it just seemed right to him that all code points should be able to have names, and it would be better if each one had just one, unique name. That way, none of them would get jealous of all the names some other code point had acquired, and besides, each code point would know its name and could come when you called it.

So in the first version of Unicode's spellbook, he wrote the spell down just that way. He called his spell "Unicode 1.0", because, well, it was his spell, after all, and the very first complete spell that he would be trying to use. 00 could be called "NULL" and 01 could be called "START OF HEADING", just like 20 could be called "SPACE" and 2D could be called "HYPHEN-MINUS". You may be wondering why Unicode would use such odd names for all the code points, but then there is no accounting for the whims of wizards, I guess.

Well, once Unicode had finished writing down the "Unicode 1.0" spell, he started casting it on the troll:

Shazaaaam! Ffffppfft!

To Unicode's surprise, the spell only partly worked, but then fizzled. The troll had been badly hurt, but he was still limping around under the bridge, and he still clung tightly to his buckets of code points.

Unicode looked around to see what the problem could be, and noticed that there was a warlock at the other end of the bridge. It was an infamous warlock who had taken to calling himself "10646", and from all appearances he was *also* trying to cast a spell to kill the troll and free all the code points. Apparently, casting the two spells at the same time had resulted in interference in the ley lines.
That was why neither spell had fully worked, and was why the troll 2022 was still limping around with his code point buckets.

The wizard Unicode headed across the bridge to speak to the warlock 10646: "Look, we both want to slay that troll and free his code points. Why don't we team up and cast synchronized spells?"

But 10646 was a suspicious warlock. He wasn't sure that *all* of the code points could be freed safely. Who knows what mischief they might get up to if left on their own.

"Those code points that the troll keeps in the C0 bucket are very dangerous," said 10646. "We can't let them just be like all the others and get ordinary names. After all, they seem to do different things in alternate weeks, and if we give them regular names, they might come when we call them, even if they are doing the wrong things that week."

The wizard Unicode heaved a sigh. That seemed so silly to him. But after all, it was important to kill the troll and save all the code points. So he pulled out his quill and scratched lines through all the names for the code points from the C0 bucket in his spellbook, and decided he would call the revised spell "Unicode 1.1". It was only a little different from his first spell -- but it is important to keep track of these things. Spells can be dangerous things, after all.

"How does this look to you, Master Warlock?" he asked. And 10646 nodded his cautious approval at the revision.

So then the wizard Unicode and the warlock 10646 started casting their spells together.

Shazaamaazama! Pockety spoketi! Keeeraack!

The troll 2022 was dead! His buckets fell out of his grasp, and all the code points were freed!

But the ones that rolled out of the C0 bucket didn't have names, because Unicode had scratched out all of their names in the Unicode 1.1 spell he cast, just so the warlock 10646 wouldn't interfere by casting a counterspell for them.

And that is why control characters don't have names.
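
For anyone who wants to poke at the freed code points themselves, here is a minimal sketch of how the outcome tends to look from a program. It assumes Python 3.3 or later and its standard `unicodedata` module; `formal_name` is a helper name made up for this illustration, not anything from the standard. The C0 code points report no formal name, while their old names remain reachable as aliases.

```python
import unicodedata

def formal_name(cp):
    """Return the formal Unicode name of code point cp, or None if it has none.

    Hypothetical helper for illustration; it only wraps unicodedata.name().
    """
    return unicodedata.name(chr(cp), None)

# Two code points from the C0 bucket and two from the G0 bucket.
for cp in (0x00, 0x01, 0x20, 0x2D):
    print(f"U+{cp:04X}", unicodedata.category(chr(cp)), formal_name(cp))

# The scratched-out names survive as formal aliases, and lookup() accepts
# aliases (assumption: Python 3.3+, where alias support was added).
print(repr(unicodedata.lookup("START OF HEADING")))
```

The C0 entries come back with general category `Cc` and a name of `None`, whereas 20 and 2D report `SPACE` and `HYPHEN-MINUS`. The alias lookup is the "no problem, they have aliases" answer in executable form.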
_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode

