Doug Ewell wrote as follows.

>Kenneth Whistler <kenw at sybase dot com> wrote:
[snipped]
>> These animals are more like U+FFFC -- they are internal anchors
>> that should not be exported, as there is no general expectation
>> that once exported to plain text, a receiver will have sufficient
>> context for making sense of them in the way the originator was
>> dealing with them internally.
>> [snipped]

>This moves the entire issue out of the realm of poor support and into
>the big, dark, scary cavern of pre-deprecation.
>
>Unicode 3.0 doesn't say exactly what Ken says. Unicode 3.0 (p. 326)
>says the annotation characters should only be used under "prior
>agreement between the sender and the receiver because the content may be
>misinterpreted otherwise." Fine, no problem; those are the same rules
>that apply to the PUA. Ken, though, seems to say they shouldn't be
>exported at all, and furthermore they shouldn't even have been encoded
>in the first place, except that the noncharacters (which explicitly
>mustn't be interchanged) hadn't been invented yet.

It occurs to me that a convention could be introduced, either as part of the Unicode specification or simply as a generally known practice. It would apply to a plain text Unicode file whose file name has some particular extension (any ideas? something like .uof, for Unicode object file) and which accompanies another plain text Unicode file whose file name has an extension such as .txt, or indeed any extension other than .uof (or whatever is chosen after discussion). The convention could be that the .uof file has, on successive lines of text, the name of the text file and then, in order, the names of the files which contain each object for which a U+FFFC character provides the anchor.

For example, a file with a name such as story7.uof might have the following lines of text as its contents.

story7.txt
horse.gif
dog.gif
painting.jpg

The file story7.uof could thus be used with a file named story7.txt so as to indicate which objects were intended for the three uses of U+FFFC in the file story7.txt, in the order in which they are to be used. I have used .gif and .jpg graphics files for my example, but the format could be left open so that a Java class file or anything else could be used as the object that is anchored within the document. There is no obligation that the first part of the file name of the .uof file and of the .txt file should be the same, yet that would typically be a useful thing to do.

I can imagine that such a practice, if widely used, might be helpful in bridging the gap between being able to use a plain text file and having to use some expensive word processing package.

I am not saying that this suggestion fully solves all of the possible implications of rendering and so forth. I am simply suggesting that having such a convention would be a useful facility. Such a convention, because it uses a special file extension, would not intrude upon the right of anybody to devise their own convention.

As this concerns the U+FFFC character and the Unicode Technical Committee is due to meet next week, I think it might be helpful if this idea is discussed before the meeting, as a straightforward idea like this might mean that the possibility of exchanging U+FFFC characters at all, if people want to do so, is not lost.

>Everybody will welcome the new conventional, graphical-type characters
>and scripts that are coming with Unicode 4.0.

What are those please?

William Overington

14 August 2002
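
P.S. To make the suggestion concrete, here is a minimal sketch in Python of how a receiver might read a .uof file and match each U+FFFC in the text, in order, to its object file name. The function names parse_uof and pair_anchors are hypothetical, and treating a mismatch between the number of anchors and the number of listed objects as an error is my assumption; the convention as described above does not say what should happen in that case.

```python
OBJECT_REPLACEMENT = "\ufffc"  # U+FFFC OBJECT REPLACEMENT CHARACTER


def parse_uof(uof_text):
    """Return (text file name, list of object file names) from .uof contents.

    The first non-empty line names the text file; each following line
    names the object for the next U+FFFC in that file, in order.
    """
    names = [line.strip() for line in uof_text.splitlines() if line.strip()]
    if not names:
        raise ValueError("empty .uof file: the first line must name the text file")
    return names[0], names[1:]


def pair_anchors(text, object_names):
    """Pair the offset of each U+FFFC in text with its object file name."""
    offsets = [i for i, ch in enumerate(text) if ch == OBJECT_REPLACEMENT]
    # Assumption: a count mismatch is an error (the convention leaves this open).
    if len(offsets) != len(object_names):
        raise ValueError(
            f"{len(offsets)} U+FFFC anchors but {len(object_names)} objects listed"
        )
    return list(zip(offsets, object_names))


# Contents of story7.uof as in the example above.
uof = "story7.txt\nhorse.gif\ndog.gif\npainting.jpg\n"
# A made-up story7.txt containing three anchors.
story = "A horse \ufffc met a dog \ufffc beside a painting \ufffc."

text_name, objects = parse_uof(uof)
anchors = pair_anchors(story, objects)
for offset, name in anchors:
    print(f"U+FFFC at offset {offset} in {text_name} -> {name}")
```

Nothing here addresses rendering; the sketch only shows that the pairing itself needs no more than line order in the .uof file and character order in the text file.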

