Re: The golden ligatures collection ct ligature code in use.

William Overington Tue, 04 Jun 2002 07:04:19 -0700

>> I then formatted the text in PowerPoint to 200 points, italic and green.
>>
>> So, it appears that SC UniPad used in conjunction with Word and
>> PowerPoint can be used to prepare elegant presentations in the languages of
>> the world. Wow!


SC UniPad provides excellent inputting facilities for Unicode code points,
making available a selection of virtual keyboards for various languages and
scripts.  However, it does not seek to have a display other than one fount
in one small size in one colour, black.  PowerPoint, by contrast, does have
display capabilities in many founts, many sizes and many colours, yet is
quite tedious when it comes to entering text which uses accented characters.

There were problems with copying and pasting directly from SC UniPad to
PowerPoint.  So I tried a copy and paste from SC UniPad to Word and then
carried out a copy and paste from Word to PowerPoint.  The display was still
small and still black, and the fount, though different, was not that
different.  So, to complete my experiment, I formatted the text which was now
in PowerPoint using a 200 point size, setting italics and setting the colour
as green, thus satisfying myself that the complete process from keying in
the text in SC UniPad to viewing a PowerPoint presentation was possible.

Having done that, I was then confident that, as long as a correct TrueType
fount for the required script is supplied for the PowerPoint program to use,
elegantly set out PowerPoint presentations in various scripts can be
produced straightforwardly by keying in the text in SC UniPad using a
virtual keyboard, then copying and pasting it to Word, then copying and
pasting it from Word to PowerPoint, then formatting the text for size and
colour of lettering using the facilities of PowerPoint and producing a
PowerPoint display.  The finished product would be a PowerPoint file, with
the use of SC UniPad in its preparation not apparent to the end user of the
PowerPoint presentation.

----

>> U+E7C1 WATERMARK-LIKE MEMORY THAT A WHOLE LIGATURE WAS ORIGINALLY USED
>> FOR THE FOLLOWING LIGATURE

The idea is as follows.

Firstly, the background.  In The Respectfully Experiment, Mr James Kass
utilised the golden ligatures collection code for a ct ligature, U+E707,
within a fount which he himself authors, to designate the glyph for a ct
ligature which is normally accessed indirectly using a sequence of
characters, thereby also allowing direct access to the glyph as a U+E707
character.  In the light of that, I feel that there is scope for both
indirect and direct access to coexist using the same founts, with advantages
for both methods: they are not conflicting methods of using ligatures but
complementary ones.  For example, for work using
sorting, indexing a book, authoring a dictionary and so on one would ideally
use a c ZWJ t approach whereas for situations where someone does not have
the more modern facilities available, or is just setting, say, one page of
text in a black letter face so as to produce a page of text suitable for
printing out and framing as a picture, using the golden ligatures collection
codes would not be unacceptable.  Thus founts could be fully modern, yet
also have a standardized way of assigning code points to the glyphs used for
ligatures.  Indeed, this approach would also be helpful for people with
older equipment, as ligature characters could be entered using whole code
points and then a standard software utility could be used to convert the
resulting file to a format where all of the ligatures were broken down into
the indirect format using ZWJ characters.  It seems to me to be a very
beneficial solution all round.  If a fount designer needed a very special
ligature not in the standard set of regular Unicode, then he or she could
still resort to using the Private Use Area.
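
A minimal sketch of such a conversion utility, in Python, might look like
the following.  The function name expand_ligatures and the mapping table are
illustrative assumptions, not an existing utility.  The table covers only a
few Latin ligatures already present in the U+FB.. block, plus the golden
ligatures collection code U+E707 for ct mentioned above.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Assumed mapping: whole ligature code point -> its component letters.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb06": "st",
    "\ue707": "ct",  # golden ligatures collection code (Private Use Area)
}


def expand_ligatures(text):
    """Replace each whole ligature code point with its letters joined by ZWJ."""
    out = []
    for ch in text:
        letters = LIGATURES.get(ch)
        # "st" becomes s ZWJ t; any other character passes through unchanged.
        out.append(ZWJ.join(letters) if letters else ch)
    return "".join(out)
```

For example, expand_ligatures applied to a U+FB06 rolabe would give
a s ZWJ t rolabe, which sorts with the ordinary letters s and t in place.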

Ideally these code points would be part of regular Unicode.  I am aware that
current policy is not to add any more ligature codes to regular Unicode,
yet, in view of this new approach of using code points for whole ligatures
in conjunction with the ZWJ method, then maybe the matter might reach the
agenda again and, in the light of this new scientific evidence, the matter
be reconsidered.  If the matter were reconsidered in this manner, then
perhaps a number of ligature characters, using some or all of those in the
golden ligatures collection, together with any others that the committee
thought it desirable to include, such as those used for calligraphy, might
be added into the U+FB.. block.  This would then allow fount designers to
standardize on official Unicode and ISO codes, producing rigorous founts,
with this extra facility, for the future.

I feel that it is idle to speculate as to whether the committees will
actually consider this matter, or as to what they will or will not agree, or
as to the likelihood of whether they will agree and so on.  The important
matter is what actually happens.  The fact of the matter is that, in the
light of the golden ligatures collection list having been published and Mr
James Kass using the code for a ct ligature from the list in conjunction
with an OpenType fount, there is new scientific evidence available now which
was not available at the time when the decision not to encode any further
ligature characters was made.  Thus the decision that led to the present
policy was based on evidence available at the time and not on the present
evidence.  I feel that it is important to note specifically that, since
this new scientific evidence became available, the Unicode Consortium has
not, as far as I am aware, made any statement as to whether it will or will
not consider again the matter of ligatures.  Not that I would myself expect
the Unicode Consortium to make any such statement: my expectation is that,
if the Unicode Consortium at some future time receives a formal proposal,
then it will consider any such proposal at that time in the light of the
scientific evidence available at that time.

Suppose please for this next section that a large collection of such
ligatures has been encoded in the U+FB.. block.  In the event of someone
posting a document to the Unicode list and including a ligature character in
the posting, suppose that the software system producing the archiving
automatically converts any U+FB.. character into a sequence of single
letters with ZWJ characters between them and stores them in the archive in
that format.  Any end user accessing the archive, perhaps using older
equipment, could request that documents viewed in the archive or saved from
it be presented not in the normal "ZWJ format being used for ligatures"
way but in a "U+FB.. codes used for ligatures" way.  This would be a quite
straightforward option for the software system to offer to end users, that
is, ZWJ as the default, U+FB.. block code by special request.
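
The "by special request" direction could be sketched in Python roughly as
follows.  This is only an illustration under stated assumptions: the
function name to_whole_codes and the table are hypothetical, and the table
lists only a few ligatures that exist in the U+FB.. block today rather than
the larger collection supposed in this section.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Assumed table: component letters -> whole ligature code point.
WHOLE_CODES = {
    "ffi": "\ufb03",
    "ffl": "\ufb04",
    "ff": "\ufb00",
    "fi": "\ufb01",
    "fl": "\ufb02",
    "st": "\ufb06",
}


def to_whole_codes(text):
    """Turn stored ZWJ sequences back into whole ligature code points."""
    # Longest sequences first, so that f ZWJ f ZWJ i becomes U+FB03
    # rather than U+FB00 followed by a stray ZWJ and an i.
    for letters in sorted(WHOLE_CODES, key=len, reverse=True):
        text = text.replace(ZWJ.join(letters), WHOLE_CODES[letters])
    return text
```

The archive itself would keep the ZWJ form; a function of this kind would
be applied only when a document is served to a user who asks for it.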

Now, in relation to having a WATERMARK-LIKE MEMORY THAT A WHOLE LIGATURE WAS
ORIGINALLY USED FOR THE FOLLOWING LIGATURE code.  That code would be a
regular Unicode code and would display as zero width and would be ignored as
regards significance in sorting and collating and so on.  My reasoning for
suggesting such a code is that if an archive is taking in ligatures
expressed in ZWJ format and storing them directly and is also taking in
ligatures expressed in U+FB.. format, converting them to ZWJ format and then
storing them, it could possibly be the case that the owner of the database
might like to keep a record of whether the ligature arrived in one form or
the other.  Now, it might be that the owner of the database would not care
how the original coding was made, but he or she might!  So, in order to
provide for the possibility that the owner of such a database did wish to
preserve a record that the original document used a whole ligature code
rather than a ZWJ sequence, I suggested the WATERMARK-LIKE MEMORY THAT A
WHOLE LIGATURE WAS ORIGINALLY USED FOR THE FOLLOWING LIGATURE code.  If that
code is ever implemented in regular Unicode it will probably have a
different, shorter, name.  Yet for this discussion and for experiments,
where experimental software needs to have clearly commented source code,
such a name for the code point is not unreasonable.

So, suppose that someone posts a message to the Unicode list containing the
word astrolabe including a ligature for the st.  Please note that the st
ligature is U+FB06.  For the purpose of this discussion let us please use
WLMTAWL to stand for the WATERMARK-LIKE MEMORY THAT A WHOLE LIGATURE WAS
ORIGINALLY USED FOR THE FOLLOWING LIGATURE code point value.

My thinking is that if the word astrolabe arrived as asZWJtrolabe then it is
stored as asZWJtrolabe in the archive, yet if it arrived as aU+FB06rolabe
then it is stored as aWLMTAWLsZWJtrolabe in the archive.  Thus either method
of sending the st ligature can be used: both methods result in the archive
storing alphabetically sortable text and, in addition, the fact that a whole
ligature character was used in the original document is recorded in the
archive.
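
The storage rule just described can be sketched in Python as below.  The
function name to_archive_form and the small ligature table are assumptions
made for illustration; U+E7C1 is the Private Use Area code suggested for
WLMTAWL in this discussion and is not part of regular Unicode.

```python
ZWJ = "\u200d"      # ZERO WIDTH JOINER
WLMTAWL = "\ue7c1"  # suggested Private Use Area code for the watermark

# Assumed table: whole ligature code point -> its component letters.
WHOLE_LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb06": "st",
}


def to_archive_form(text):
    """Store ZWJ sequences as they arrive; expand whole ligatures and mark them."""
    out = []
    for ch in text:
        letters = WHOLE_LIGATURES.get(ch)
        if letters is None:
            out.append(ch)  # ordinary text, including existing ZWJ sequences
        else:
            # Watermark first, then the letters joined by ZWJ.
            out.append(WLMTAWL + ZWJ.join(letters))
    return "".join(out)
```

Text that already uses s ZWJ t passes through unchanged, so the two arrival
forms differ in the archive only by the presence of the watermark.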

The database manager could, if it were so desired, search the archive files
with a specially written program so as to find out the answer to such a
question as the following.

For all of the ligature codes used in postings to the Unicode list, how many
were sent using ZWJ codes and how many were sent using U+FB.. codes?

In order to find the answer to this question the software would simply look
for ZWJ occurrences and determine whether or not a WLMTAWL code was present
immediately preceding the first character of the ligature sequence.
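
A minimal sketch of such a counting program, in Python, is given below.  The
function name count_ligature_origins is hypothetical, and the sketch assumes
that each ligature is stored as letters joined by ZWJ, optionally preceded
by the suggested Private Use Area code U+E7C1.

```python
import re

ZWJ = "\u200d"      # ZERO WIDTH JOINER
WLMTAWL = "\ue7c1"  # suggested Private Use Area code for the watermark

# An optional watermark, then a letter, then one or more (ZWJ, letter) pairs.
LIGATURE_SEQUENCE = re.compile(f"({WLMTAWL})?\\w(?:{ZWJ}\\w)+")


def count_ligature_origins(text):
    """Return (sent as ZWJ sequences, sent as whole ligature codes)."""
    sent_as_zwj = 0
    sent_as_whole = 0
    for match in LIGATURE_SEQUENCE.finditer(text):
        if match.group(1):
            sent_as_whole += 1  # watermark immediately precedes the sequence
        else:
            sent_as_zwj += 1
    return sent_as_zwj, sent_as_whole
```

Each ligature sequence is counted once, however many ZWJ characters it
contains, so a three-letter ligature is not counted twice.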

So, my idea for a WATERMARK-LIKE MEMORY THAT A WHOLE LIGATURE
WAS ORIGINALLY USED FOR THE FOLLOWING LIGATURE code is basically quite
straightforward and could be easily used to good advantage.  However, its
use would not be obligatory, so that if, say, a database manager has no
interest in whether the original of a document used a ZWJ sequence or a
U+FB.. code for a ligature, then the WATERMARK-LIKE MEMORY THAT
A WHOLE LIGATURE WAS ORIGINALLY USED FOR THE FOLLOWING LIGATURE code need
not be used at all in that particular database application.

Naturally, it would be best if such a code were part of regular Unicode.
If, at some future time, more ligatures are encoded in regular Unicode, then
maybe it would be added as part of the same process as the adding of the
ligatures.  Yet, thinking that perhaps some people might like to try out some
programming experiments with the technique now, I suggested a particular
code within the Private Use Area, in the hope that, if various people try
out such programming experiments, any files produced could be interchanged
from experimenter to experimenter as part of the research process.  Also,
suggesting a particular code does provide a stepping stone so that an
experimenter has a definite place to start.

----

The golden ligatures collection documents are available on the web at the
following address.

http://www.users.globalnet.co.uk/~ngo/golden.htm

William Overington

4 June 2002
