I was interested to read the comment by Rick McGowan. Thank you for your note.

I found the MARC system described at the following place on the web.

http://www.loc.gov/marc/

It is very interesting and I have started to read about it.

I looked back at what I had written and found the following.

quote

Books in libraries are often classified with a code consisting of digits and a full stop character. For example, the number 515.53 is on a label which is still on the spine of a book which I bought in a sale of withdrawn books from a library. So, if U+E0002 were used to introduce a tag for the library book classification code, then a sequence starting with U+E0002 and using some other tag characters could be used to classify the subject matter of any document which is stored in computerized form.

end quote

I also found the following about the Dewey Decimal Classification system.

http://www.oclc.org/dewey/about/

I realize, in rereading what I wrote in the light of the comment by Rick, that I may well not have expressed my meaning correctly. My intention was to convey the type of use shown in the following example.

Suppose that there is a plain text document written in Cyrillic script. If at the start of that document there is a U+E0001 character, then some tag characters indicating the language, then a U+E0002 character, and then the characters U+E0036 U+E0030 U+E0038, then someone could look at the document using a suitable computer system and find out, from the few plane 14 characters at the start of the document, in which particular language the document is written and also that the general topic area of the document is inventions and patents, because 608 is the Dewey Decimal Classification for inventions and patents. However, in an ordinary document viewing package, the tags would not be displayed, so they would not get in the way.

My suggestion about using International Standard Book Numbers with a tag type code, which could perhaps be U+E0003, perhaps needs looking at further.
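
The Cyrillic document example above, with a language tag followed by U+E0002 and the tag digits for 608, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: U+E0001 and the tag characters U+E0020 to U+E007E exist in Unicode, but U+E0002 as a classification introducer is the suggestion made in this posting and is not an assigned character. The function names and the language code "ru" are made up for the example.

```python
# Sketch only. U+E0001 LANGUAGE TAG and the tag characters U+E0020..U+E007E
# are real; U+E0002 as a "classification" introducer is the proposal in this
# posting, not an assigned Unicode character.
TAG_OFFSET = 0xE0000
LANGUAGE_TAG = "\U000E0001"
CLASSIFICATION_TAG = "\U000E0002"  # hypothetical introducer


def to_tag_chars(text: str) -> str:
    """Map printable ASCII to the corresponding plane 14 tag characters."""
    out = []
    for ch in text:
        cp = ord(ch)
        if not 0x20 <= cp <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
        out.append(chr(TAG_OFFSET + cp))
    return "".join(out)


def from_tag_chars(tags: str) -> str:
    """Map plane 14 tag characters back to ASCII."""
    return "".join(chr(ord(ch) - TAG_OFFSET) for ch in tags)


def build_prefix(language: str, dewey: str) -> str:
    """Language tag run followed by the hypothetical classification tag run."""
    return (LANGUAGE_TAG + to_tag_chars(language)
            + CLASSIFICATION_TAG + to_tag_chars(dewey))


def read_prefix(document: str) -> dict:
    """Collect the leading tag runs, keyed by their introducer character.

    Reading stops at the first character that is neither an introducer
    nor a tag character, that is, at the start of the visible text.
    """
    fields = {}
    current = None
    for ch in document:
        cp = ord(ch)
        if ch in (LANGUAGE_TAG, CLASSIFICATION_TAG):
            current = ch
            fields[current] = ""
        elif current is not None and 0xE0020 <= cp <= 0xE007E:
            fields[current] += chr(cp - TAG_OFFSET)
        else:
            break
    return fields


# "ru" is an assumed language code for the Cyrillic example.
doc = build_prefix("ru", "608") + "Пример текста"
fields = read_prefix(doc)
print(fields[LANGUAGE_TAG], fields[CLASSIFICATION_TAG])    # ru 608
print([f"U+{ord(c):05X}" for c in to_tag_chars("608")])    # ['U+E0036', 'U+E0030', 'U+E0038']
```

A viewer that ignores plane 14 characters would show only the visible text, while a cataloguing tool could call something like read_prefix to recover the language and the classification.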
Does the tag code mean "This is the start of the text of the book with the following ISBN" or does it mean "Here is a reference to an ISBN for a book to which I am referring"? Can the two meanings be distinguished, perhaps by putting a tag R after the U+E0003 and in front of the tag digits for a reference to the book, and not using a tag R if the use is at the start of the text of the electronic book itself? Or how?

There are possibilities for progress here, provided that tags are continued, on the basis of being reserved for use in particular protocols, and provided that the Unicode Technical Committee is willing to consider the defining of additional tag types at some time in the future.

My suggestion for U+E0004 could be very useful. Suppose that the haiku which I included at the end of the document had an International Literary Work Number, if such a system of International Literary Work Numbers comes into existence in the future. I could produce a plain text file which starts with U+E0004 and a number of tag characters and then the text of the haiku. I could place that file somewhere on the web. Search engines might log it. If someone is then writing an article about the topic of poetry and Unicode, he or she might refer to that haiku and include a tag-encoded reference to it, using its International Literary Work Number. A reader of that article could decide to have a look at the text and could then search the internet for the text of the haiku, knowing that the search is made easier by the fact that the International Literary Work Number is unique to that haiku, whereas searching for Phaistos Disc might not find it at all, or might find it as but one of many search engine matches for the term Phaistos Disc.

All of these things and maybe many more will be possible if tag characters are not fully deprecated and the possibility of defining more tag character types exists.

In my posting I wrote the following.

quote

Perhaps all of plane 14 needs to be declared an area considered as deprecated in general terms, yet where codes for use with particular protocols can be defined by the Unicode Technical Committee, so that the potential for using such futuristic developments and encoding them within the Unicode framework is preserved?

end quote

I feel that that is the way forward. In some ways it would be a compromise, yet it is more than a compromise: it is a far-reaching, forward-looking policy option which would both protect the present mainstream use of Unicode and provide for futuristic possibilities within the context of conveying information in Unicode compatible files in a precise, formally-defined manner.

At present, characters are either regular Unicode codes or Private Use Area codes. This could be changed so that characters are regular Unicode codes, reserved Unicode codes or Private Use Area codes, with reserved Unicode codes all being in plane 14.

William Overington

15 February 2003

