Here is my proposed writeup, so far, on the oversized sigma (Σ)
character.  I got a response from Markus Kuhn who suggested increased
unification in order to avoid context dependencies in legacy->Unicode
conversion (the legacy charsets being more a collection of glyphs than
anything else), but from looking at the items that are already
approved for Unicode 3.2, I believe we already have to deal with such
context dependencies, and I would rather stick to one design.

The main open issue relates to the oversized bra and ket symbols in HP
Math8; that character set seems to include the capability to
synthesize oversized bra and ket symbols (〈 ... 〉) using the
oversized sigma middle and diagonal glyphs, plus one additional glyph
looking like the sigma middle reversed.  At this point I haven't
complicated the proposal by trying to add those characters.


EXTENSIBLE SUMMATION GRAPHIC SYMBOL FOR UNICODE

  H. Peter Anvin
  Transmeta Corporation
  3940 Freedom Circle
  San Jose CA 95054
  [EMAIL PROTECTED]

  2 June 2001

Format:   Plain text with line breaks
Encoding: UTF-8


STATUS

  Being developed


ABSTRACT

  A set of symbols for creating an arbitrarily large summation sign on
  a monospaced terminal.  This proposal, in conjunction with the STIX
  [1] and SUPPLEMENTAL TERMINAL GRAPHICS FOR UNICODE [2] proposals,
  completes Unicode coverage of the DEC Technical Character Set
  (TCS)[3][4], thus allowing terminal emulators and terminal
  applications currently using this character set to migrate to
  Unicode, which will promote interoperability of terminal emulators
  with other Unicode applications and with each other.


INTRODUCTION

  The DEC VT100 series of terminals was one of the first
  implementations of the ECMA-48/ISO 6429[5] terminal standards, and
  quickly became one of the most widely used and, perhaps more
  importantly, most widely emulated terminals ever.  Countless
  emulation programs for this series of terminals have been written,
  and are still in very wide use today.

  It is highly desirable for the full capabilities of these terminals
  to be available in Unicode, both for implementing an emulator for
  the legacy encoding on a Unicode-based system, and for migrating
  applications to using Unicode encodings for these symbols.

  The STIX[1] and SUPPLEMENTAL TERMINAL GRAPHICS FOR UNICODE
  (STGU)[2] proposals, both slated for inclusion in Unicode 3.2,
  include most of the necessary symbols not yet included in Unicode
  3.1[6].  However, the symbols for the extensible summation character
  present in the DEC Technical Character Set[3][4] are not included.
  A similar character group is present in the HP Math8[7] character
  set, which is available on, among others, the widely used HP
  LaserJet series of printers.

  This proposal aims to remedy that omission.


PROPOSED NEW CHARACTERS

  Proposed character names should be changed as needed to conform to
  UTC and WG2 naming rules or conventions.  References to STIX or STGU
  are based on allocations as of 2 June 2001, and are subject to
  change.

  These symbols can be combined to form an upper case Greek letter
  sigma (Σ, U+03A3) of any square size, 2x2 or larger, on a monospaced
  terminal.

  Unifications are discussed later in this document.

  #1. LARGE SUMMATION SYMBOL TOP LEFT

    This symbol represents the upper left corner of the summation
    symbol.  It joins to the right with #2 or #3, and joins diagonally
    down-right with #4 or #5, or downward with #7.

    This corresponds to 03/01 in the DEC TCS.

  #2. LARGE SUMMATION SYMBOL UPPER HORIZONTAL EXTENSION

    This symbol represents the middle of the upper horizontal stroke
    of the summation symbol, and can be extended indefinitely.  It
    joins to the left with #1 or itself and to the right with #3 or
    itself.

    This corresponds to 02/03 in the DEC TCS.

  #3. LARGE SUMMATION SYMBOL TOP RIGHT

    This symbol represents the upper right corner of the summation
    symbol, and joins to the left with #2 or #1.

    This corresponds to 03/05 in the DEC TCS.

  #4. LARGE SUMMATION SYMBOL UPPER DIAGONAL EXTENSION

    This symbol represents the upper diagonal part of the summation
    symbol, and can be extended indefinitely.  It joins diagonally
    up-left with #1 or itself, joins diagonally down-right with #5 or
    itself, or joins downward with #6.

    This corresponds to 03/03 in the DEC TCS.

  #5. LARGE SUMMATION SYMBOL MIDDLE

    This symbol represents the middle of the summation symbol when the
    size is an odd number of characters.  It joins diagonally up-left
    with #1 or #4 and diagonally down-left with #6 or #7.

    This corresponds to 03/07 in the DEC TCS.

  #6. LARGE SUMMATION SYMBOL LOWER DIAGONAL EXTENSION

    This symbol represents the lower diagonal part of the summation
    symbol, and can be extended indefinitely.  It joins diagonally
    up-right with #5 or itself, joins diagonally down-left with #7 or
    itself, or joins upward with #4.

    This corresponds to 03/04 in the DEC TCS.

  #7. LARGE SUMMATION SYMBOL BOTTOM LEFT

    This symbol represents the lower left corner of the summation
    symbol.  It joins to the right with #8 or #9, and joins diagonally
    up-right with #5 or #6, or upward with #1.

    This corresponds to 03/02 in the DEC TCS.

  #8. LARGE SUMMATION SYMBOL BOTTOM HORIZONTAL EXTENSION

    This symbol represents the middle of the lower horizontal stroke
    of the summation symbol, and can be extended indefinitely.  It
    joins to the left with #7 or itself and to the right with #9 or
    itself.

    This corresponds to 02/03 in the DEC TCS.

  #9. LARGE SUMMATION SYMBOL BOTTOM RIGHT

    This symbol represents the lower right corner of the summation
    symbol, and joins to the left with #7 or #8.

    This corresponds to 03/06 in the DEC TCS.
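
  The code positions above use column/row notation, in which position
  cc/rr denotes the byte value 16*cc + rr.  As a rough sketch, the
  correspondences listed in this section could be tabulated as shown
  below.  The names TCS_POSITION and tcs_byte are made up for
  illustration and are not part of this proposal or of any standard.

```python
# Illustrative table only: proposed symbol number -> DEC TCS code
# position (column, row), copied from the descriptions in this section.
TCS_POSITION = {
    1: (3, 1),  # top left
    2: (2, 3),  # upper horizontal extension
    3: (3, 5),  # top right
    4: (3, 3),  # upper diagonal extension
    5: (3, 7),  # middle
    6: (3, 4),  # lower diagonal extension
    7: (3, 2),  # bottom left
    8: (2, 3),  # bottom horizontal extension (same position as #2)
    9: (3, 6),  # bottom right
}


def tcs_byte(symbol):
    """Return the 7-bit byte value for a proposed symbol number."""
    column, row = TCS_POSITION[symbol]
    return 16 * column + row


for number in sorted(TCS_POSITION):
    print("#%d -> 0x%02X" % (number, tcs_byte(number)))
```

  Note that #2 and #8 share a single DEC TCS position, so a converter
  from the legacy encoding would need context to choose between them
  if they are encoded separately.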


SAMPLE USAGE

  A 5x5 summation symbol can be constructed using the following
  symbols (blank squares contain a whitespace character, normally
  U+0020):

  #1 #2 #2 #2 #3
     #4
        #5
     #6
  #7 #8 #8 #8 #9

  Similarly, a 6x6 summation symbol can be constructed using the
  following symbols:

  #1 #2 #2 #2 #2 #3
     #4
        #4
        #6
     #6
  #7 #8 #8 #8 #8 #9
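
  The two layouts above follow a regular pattern, so a grid of any
  size can be generated mechanically.  The following is a minimal
  sketch of one way to do so.  The function names summation_grid and
  render, and the use of 0 for a blank cell, are assumptions made for
  illustration; the numbers are the proposed symbol numbers from this
  document, not code points.

```python
def summation_grid(n):
    """Return an n x n grid of proposed symbol numbers (0 = blank)."""
    if n < 2:
        raise ValueError("size must be 2 or larger")
    grid = [[0] * n for _ in range(n)]
    # Top and bottom strokes: corner, horizontal extensions, corner.
    grid[0] = [1] + [2] * (n - 2) + [3]
    grid[n - 1] = [7] + [8] * (n - 2) + [9]
    # Interior rows: the stroke moves one cell right per row in the
    # upper half and back one cell per row in the lower half.
    for r in range(1, n - 1):
        mirror = n - 1 - r
        if r < mirror:
            symbol = 4      # upper diagonal extension
        elif r > mirror:
            symbol = 6      # lower diagonal extension
        else:
            symbol = 5      # middle; only occurs when n is odd
        grid[r][min(r, mirror)] = symbol
    return grid


def render(grid):
    """Format a grid in the same style as the samples in this section."""
    lines = []
    for row in grid:
        cells = ["#%d" % s if s else "  " for s in row]
        lines.append(" ".join(cells).rstrip())
    return "\n".join(lines)


print(render(summation_grid(5)))
```

  For the smallest sizes the diagonal extensions drop out: a 2x2
  symbol uses only the four corners, and a 3x3 symbol places #5
  directly between #1 and #7.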


UNIFICATIONS

  Using the specific glyphs from the DEC TCS, #2 and #8 could be
  unified with either U+2500 or U+23AF; however, such unification
  would be inappropriate for the HP Math8 version of these symbols,
  which do not align the horizontal part of the summation symbol with
  the middle of the character cell.  In order to provide a
  representation which is applicable to both these character sets, I
  recommend that the full set of nine characters be encoded
  independently.


REFERENCES

[1] Unicode Consortium Document L2/00-033R, STIX Math Symbols, 9 Feb 2000.

[2] Supplemental Terminal Graphics for Unicode, Frank da Cruz, 31
    March 2000.  Available at
    <ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt>.

[3] Digital Equipment Corporation, Installing and Using the VT420
    Video Terminal, EK-VT420-UG.002, Maynard, MA, 1988.  Available in
    reproduction at <http://vt100.net/docs/vt420-uu/>.

[4] DEC Technical Character Set, VT100.net, Paul Williams;
    <http://vt100.net/charsets/technical.html>.

[5] ECMA-48, ECMA, currently in Fifth Edition, June 1991,
    <http://www.ecma.ch/ecma1/STAND/ECMA-048.HTM>.

[6] The Unicode Standard, Version 3.0, Addison-Wesley, 2000, as amended
    by Unicode Standard Annexes (UAX) 9, 11, 13, 14, 15, 19 and 27,
    <http://www.unicode.org/unicode/reports/index.html>.

[7] PCL 5 Comparison Guide, Hewlett-Packard, HP part number 5961-0510,
    October 1992.  PCL Symbol Set ID: 8M.


-- 
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt