Peter Constable included the following in his post.

>As for
>PUA, many people have their own plans regarding U+F300..U+F3FF. For my own
>part, my plans for U+F300..U+F3FF almost certainly do not involve padlock
>symbols.

Thank you for your email.

As is well known, the Unicode Consortium will not endorse any code point
allocations in the Private Use Area and everyone has the right to allocate
none, some or all code points in the Private Use Area as he or she chooses,
and to publish them if he or she so chooses.

This is an interesting situation.  If one views the situation from the
inside looking out, then it is impossible to be certain, merely by examining
the code points, what meaning is intended for a code point from the Private
Use Area which is used in a Unicode plain text file.

However, if one views the situation from the outside looking in, a somewhat
different situation arises.

Suppose that I define a .eut file format to be structurally a Unicode plain
text file with the added feature that all code points that are within the
Unicode Private Use Area are defined to have the meanings which I give them
in my eutocode set of code point allocations.

So, a .eut file could be a rigorously defined file format, just as is .bmp
or .png.  If a wordprocessing package were to have a selection option for
reading in files of a .eut format, then there would be no confusion
whatsoever about the meaning of, say, a U+E707 character: it would be a ct
ligature.

Now, suppose I define a .uto file format to be structurally a Unicode plain
text file with the added feature that all code points that are within the
U+F3.. block of the Private Use Area have the meanings of a set of codes
called Courtyard Codes, and all other code points that are within the
Private Use Area have an undefined meaning, unless a sequence of some of the
Courtyard Codes has indicated from which type tray all subsequent Private
Use Area codes which are not in the U+F3.. block are to be regarded as
coming.

A wordprocessing package could be programmed by its manufacturer to accept
input in .uto file format, with accuracy of meaning for every code point
used in the file, even if some Private Use Area code points were used to
have two different meanings in two parts of the same document.
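
As a minimal sketch of the first step such a package might take, here is
some Python that decides whether a code point is an ordinary Unicode
character, a Courtyard Code in the U+F3.. block, or some other Private Use
Area code.  The function name classify and the three labels are assumptions
made for illustration only and are not part of any published definition.
The range U+E000..U+F8FF is the Private Use Area of plane 0.

```python
# Private Use Area of plane 0, as defined by Unicode.
PUA_START, PUA_END = 0xE000, 0xF8FF
# The U+F3.. block requested for the Courtyard Codes.
COURTYARD_START, COURTYARD_END = 0xF300, 0xF3FF


def classify(ch: str) -> str:
    """Return 'courtyard', 'private' or 'standard' for one character.

    The label names are hypothetical, chosen only for this sketch.
    """
    cp = ord(ch)
    if COURTYARD_START <= cp <= COURTYARD_END:
        return "courtyard"
    if PUA_START <= cp <= PUA_END:
        return "private"
    return "standard"


if __name__ == "__main__":
    for sample in ("A", "\uE707", "\uF3C1"):
        print(f"U+{ord(sample):04X}", classify(sample))
```

A reader of a .uto file would act on the "courtyard" code points itself and
would interpret the "private" ones only according to whatever the Courtyard
Codes earlier in the file have indicated.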

----

I like to imagine an analogy of the way that Unicode code points can be
defined as if there is a large kitchen table which is plane 0.  Onto most
parts of the table, pieces of coloured paper are laid, always taking care
that no piece of paper overlaps any other piece of paper, so that the table
surface is only covered by one thickness of paper.  On an area about one
tenth of the total area of the table is an area called the Private Use Area,
and here paper can be piled.  Perhaps 500 sheets of paper could be piled
upon this area.  So, if someone says, of some particular place on the
surface of the table "What colour is the paper?" then for parts of the table
that are not in the Private Use Area, the colour of the paper can be stated.
However, for the Private Use Area, the colour of the paper cannot be stated
with certainty.  It depends upon which piece of paper is being viewed at any
one time.  Suppose, however, that the people who are placing the paper onto
the Private Use Area agree amongst themselves that they like the look of
that nice yellow square of paper that takes up a small part of the Private
Use Area and will voluntarily avoid placing any paper on top of it.  One
would then end up with a Private Use Area that has coloured paper piled up
all over it, except for in one small area where there is a yellow square.
The net effect would be that the area covered by the yellow square would be
as uniquely defined as to the colour of paper upon it as anywhere not in the
Private Use Area.

Now, the question that naturally arises is as follows.  Will all end users
agree to keep the U+F3.. area only for the Courtyard Codes?  Who knows?  I
suggest, however, that it is possible that they will, because I hope that,
when they consider the matter, people will feel that it is to their own
advantage to do so.

I feel that if everybody who wishes to make definitions in the Private Use
Area learns of the existence of the Courtyard Codes and finds that the
features that they could provide are extremely useful and may, in time,
become built into widely used software packages, then they might well agree
to do so.

What would this take?

Ease of use.  Where a wordprocessing package or a desktop publishing package
or whatever has an option for reading in a Unicode plain text file it would
also have an option for reading in a .uto file.  The Courtyard Codes would
need to be well defined, publicly available, free to use and free of legal
entanglements.  Please note that I have chosen the name Courtyard Codes for
the system as Courtyard and Codes are two English words, not words specially
coined.  I got the idea of using the word Courtyard from the notion of a
courtyard garden.  If people like to think of the imagery of a courtyard
garden with various items within it which is a nice place to be in, then
fine.

Potential benefit.  The Courtyard Codes provide facilities which may be
useful for fairly simple widely available software.

Here are the codes which I have defined so far, except for the
classification codes and the padlock codes which I have previously
published.  Hopefully other codes will be added gradually.  However, I feel
that as this topic is current I shall take the opportunity to post those
codes that I have already defined in the hope that end users of the Unicode
system around the world may become aware of them, have a look at them and
hopefully feel that avoiding making any definition in the U+F3.. block of
the Private Use Area would be to their advantage, so that they keep open the
possibility of using Courtyard Codes in conjunction with their own use of
the Private Use Area.

Please know, for the avoidance of doubt, that although I am carrying out my
research for the eutocode system and am elsewhere defining uses of codes
within the Private Use Area for specific purposes, including graphics,
ligatures such as ct and long s ligatures, mouse events and push button
pushes on a hand held infra-red control device of a multimedia television
and embedding 1456 object code into a Unicode plain text file, I am not
asking that those codes are not overlapped.  So, although I use U+E707 to
mean a ct ligature within eutocode, I am fully expecting and am entirely
happy that other people define U+E707 to mean something else.  I am simply
asking end users not to overlap the U+F3.. block, please, if that is
possible.  If end users keep the U+F3.. block to one meaning, then all end
users can use the features provided by the Courtyard Codes in conjunction
with any character sets that they design in the Private Use Area, with the
hope that software packages will in the future understand those codes and
that all uses of the Private Use Area can be classified using the
classification codes.  I feel that if end users choose to have this way of
using the U+F3.. block widely accepted amongst themselves, then that will be
to everybody's advantage.

Readers might like to know that eutocode is being designed primarily for use
in applications involving the broadcasting of digital multimedia on digital
television channels.  The DVB-MHP (Digital Video Broadcasting - Multimedia
Home Platform) system, details at the http://www.mhp.org website, uses Java
for broadcasting software, and Java uses Unicode.  Those Java programs
could be written so as to accept .eut and .uto file formats as data, so my
initiative in asking end users to agree on the classification system and
basic formatting codes will hopefully have far-reaching implications for
the use of Private Use Area codes with software that recognizes these
formatting codes.

Please note that the formatting codes are not detailed as to width of table
cells or specific fount and so on.  The idea is as if someone has gone to a
print shop and generally explained to the printer what he or she is looking
to achieve with the layout.  The person at the print shop then does his or
her best with what he or she has available.  Courtyard Codes are not
intended to be a full markup system; they are intended to be a fairly basic
system that is highly portable, yet which does have scope to provide layout
effects which are both practical and indeed potentially quite stylish.

Courtyard Codes could also be used with just regular Unicode, with no
Private Use Area codes except for the Courtyard Codes themselves.  This
usage would allow stylish layout to be achieved using an almost plain text
file.

----

U+F3A2 PLEASE LIGATE THE NEXT TWO CHARACTERS
U+F3A3 PLEASE LIGATE THE NEXT THREE CHARACTERS
U+F3A4 PLEASE LIGATE THE NEXT FOUR CHARACTERS

U+F3A8 PLEASE SWASH THE NEXT PRINTABLE ITEM
U+F3A9 PLEASE ALTERNATIVE SWASH THE NEXT PRINTABLE ITEM

In the event of requesting a swash version of a ligature where the ligature
is requested using U+F3A2 or U+F3A3 or U+F3A4, the swash request code
precedes the ligature request code.

Swash versions of any character can be requested, not just ligatures: indeed
a swash request for a ligature is thought to be a potentially rare
occurrence.

If a U+F3A8 is obeyed and there is no swash character available, then the
ordinary version of the letter is displayed.

The U+F3A9 code point exists because some founts may have two swash versions
of a particular letter.  The U+F3A9 code allows access to the second swash
version of a particular letter.  If an alternative swash character cannot be
displayed, the U+F3A9 code acts as if it were a U+F3A8 code.
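
The rules above for ligature and swash requests can be sketched in Python
as follows.  This is only an illustration under stated assumptions: the
names parse_items and choose_variant, the variant labels and the idea of a
set of available glyph variants are all hypothetical, and what should happen
when a requested ligature is not available in the fount is not defined
above, so the sketch does not attempt it.

```python
LIGATE = {0xF3A2: 2, 0xF3A3: 3, 0xF3A4: 4}        # code -> number of characters
SWASH = {0xF3A8: "swash", 0xF3A9: "alt_swash"}    # code -> requested variant

# Fallback order: alternative swash -> swash -> ordinary version.
FALLBACK = {
    "alt_swash": ("alt_swash", "swash", "plain"),
    "swash": ("swash", "plain"),
    "plain": ("plain",),
}


def parse_items(text):
    """Split text into (item, requested_variant) pairs.

    A swash request applies to the next item only, and it precedes a
    ligature request when both are used together.
    """
    items = []
    variant = "plain"
    i = 0
    while i < len(text):
        cp = ord(text[i])
        if cp in SWASH:
            variant = SWASH[cp]
            i += 1
        elif cp in LIGATE:
            n = LIGATE[cp]
            # A request cut short by the end of the text yields a shorter item.
            items.append((text[i + 1:i + 1 + n], variant))
            variant = "plain"
            i += 1 + n
        else:
            items.append((text[i], variant))
            variant = "plain"
            i += 1
    return items


def choose_variant(item, variant, available):
    """Pick the best variant present in `available`, a set of (item, variant)."""
    for candidate in FALLBACK[variant]:
        if (item, candidate) in available:
            return candidate
    return "plain"  # the ordinary version is always the last resort
```

For example, parse_items("\uF3A8\uF3A2ct") gives a single item "ct" with a
swash request, and choose_variant then falls back to the ordinary version if
the fount offers no swash form of it.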

----

Here are eight code points for signalling italic and bold.  The idea is that
the processor will have two Boolean variables, ITALIC and BOLD, which are by
default false.

U+F3C0 PLAIN - ITALIC:=false; BOLD:=false;
U+F3C1 ITALIC - ITALIC:=true; BOLD:=false;
U+F3C2 BOLD - ITALIC:=false; BOLD:=true;
U+F3C3 BOLD ITALIC - ITALIC:=true; BOLD:=true;
U+F3C4 REMOVE ITALIC - ITALIC:=false;
U+F3C5 ADD ITALIC - ITALIC:=true;
U+F3C6 REMOVE BOLD - BOLD:=false;
U+F3C7 ADD BOLD - BOLD:=true;
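
A processor obeying these eight codes might be sketched in Python as
follows.  The two Boolean variables and their default of false come from the
definitions above; the names Style and styled_runs, and the choice to return
the text as a list of runs, are assumptions made only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Style:
    italic: bool = False   # ITALIC, false by default
    bold: bool = False     # BOLD, false by default


# code point -> (new ITALIC, new BOLD); None means "leave unchanged".
STYLE_CODES = {
    0xF3C0: (False, False),  # PLAIN
    0xF3C1: (True, False),   # ITALIC
    0xF3C2: (False, True),   # BOLD
    0xF3C3: (True, True),    # BOLD ITALIC
    0xF3C4: (False, None),   # REMOVE ITALIC
    0xF3C5: (True, None),    # ADD ITALIC
    0xF3C6: (None, False),   # REMOVE BOLD
    0xF3C7: (None, True),    # ADD BOLD
}


def styled_runs(text):
    """Return a list of (run_text, italic, bold) tuples."""
    style = Style()
    runs, buf = [], []
    for ch in text:
        change = STYLE_CODES.get(ord(ch))
        if change is None:
            buf.append(ch)       # ordinary character: keep collecting
            continue
        if buf:                  # a style code ends the current run
            runs.append(("".join(buf), style.italic, style.bold))
            buf = []
        italic, bold = change
        if italic is not None:
            style.italic = italic
        if bold is not None:
            style.bold = bold
    if buf:
        runs.append(("".join(buf), style.italic, style.bold))
    return runs
```

Note that U+F3C0 to U+F3C3 set both variables at once, whereas U+F3C4 to
U+F3C7 each change one variable and leave the other as it was.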

Here are some codes for type face choice.

U+F3C8 PLEASE USE DEFAULT FACE
U+F3C9 PLEASE USE SERIFED FACE
U+F3CA PLEASE USE SANS SERIF FACE
U+F3CB PLEASE USE ORNATE FACE
U+F3CC PLEASE USE FORMAL SCRIPT FACE
U+F3CD PLEASE USE INFORMAL SCRIPT FACE
U+F3CE PLEASE USE MONOSPACED FACE

Here are some codes for formatting.

U+F3D0 LEFT ALIGN
U+F3D1 RIGHT ALIGN
U+F3D2 CENTRE
U+F3D3 JUSTIFY
U+F3D4 SINGLE COLUMN
U+F3D5 DOUBLE COLUMN FOR THE REST OF THIS PAGE

U+F3D8 TABLE START
U+F3D9 TABLE END
U+F3DA START THE NEXT TABLE ROW
U+F3DB START THE NEXT TABLE COLUMN IN THE PRESENT ROW
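
As a rough sketch of how the four table codes might be read, here is some
Python that collects the text between TABLE START and TABLE END into rows
and cells.  It assumes that the first row and its first column begin
implicitly at TABLE START, which the definitions above do not state; the
function name parse_table is also hypothetical.

```python
TABLE_START = "\uF3D8"
TABLE_END = "\uF3D9"
NEXT_ROW = "\uF3DA"      # START THE NEXT TABLE ROW
NEXT_COLUMN = "\uF3DB"   # START THE NEXT TABLE COLUMN IN THE PRESENT ROW


def parse_table(text):
    """Return the first table in text as a list of rows of cell strings.

    Assumption: the first row starts at TABLE START without needing a
    NEXT_ROW code.  Raises ValueError if either delimiter is missing.
    """
    start = text.index(TABLE_START) + 1
    end = text.index(TABLE_END, start)
    body = text[start:end]
    return [row.split(NEXT_COLUMN) for row in body.split(NEXT_ROW)]
```

No cell widths appear anywhere in the data, in keeping with the idea that
the receiving software does its best with what it has available.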

Here are some codes for text colour.

U+F3E0 BLACK
U+F3E1 BROWN
U+F3E2 RED
U+F3E3 ORANGE
U+F3E4 YELLOW
U+F3E5 GREEN
U+F3E6 BLUE
U+F3E7 MAGENTA
U+F3E8 GREY
U+F3E9 WHITE
U+F3EA CYAN
U+F3EB PINK
U+F3EC DARK GREY
U+F3ED LIGHT GREY
U+F3EE LAVENDER
U+F3EF MINT

Here are some codes for type sizes.

U+F3F0 DEFAULT SIZE
U+F3F1 6 POINT
U+F3F2 8 POINT
U+F3F3 10 POINT
U+F3F4 12 POINT
U+F3F5 14 POINT
U+F3F6 18 POINT
U+F3F7 24 POINT
U+F3F8 30 POINT
U+F3F9 36 POINT
U+F3FA 48 POINT
U+F3FB 60 POINT
U+F3FC 72 POINT
U+F3FD 96 POINT
U+F3FE 144 POINT
U+F3FF 192 POINT
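
The sixteen size codes map naturally onto a lookup table, sketched here in
Python.  The sizes are those listed above; the names point_size and
nearest_size, the default of 12 point for DEFAULT SIZE and the rule of
choosing the closest available size are assumptions made for this sketch,
the last being one possible reading of "doing one's best with what is
available".

```python
# Index 0 is U+F3F0 DEFAULT SIZE; the rest are sizes in points.
SIZES = [None, 6, 8, 10, 12, 14, 18, 24, 30, 36, 48, 60, 72, 96, 144, 192]


def point_size(ch, default=12):
    """Return the size in points requested by a size code.

    `default` is what DEFAULT SIZE means to this particular renderer;
    12 is an arbitrary choice for illustration.
    """
    cp = ord(ch)
    if not 0xF3F0 <= cp <= 0xF3FF:
        raise ValueError(f"U+{cp:04X} is not a type size code")
    size = SIZES[cp - 0xF3F0]
    return default if size is None else size


def nearest_size(wanted, available):
    """Pick the available size closest to the one requested."""
    return min(available, key=lambda size: abs(size - wanted))
```

A renderer with only a few sizes installed could call
nearest_size(point_size(code), installed_sizes) for each size code it meets.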

I hope that these Courtyard Codes will be of interest to end users.  I am
hoping to add some more codes gradually and then hopefully add a full
document to the web at http://www.users.globalnet.co.uk/~ngo which is our
family webspace in England.

William Overington

24 May 2002