Re: [NTG-context] unicode and out-of-box usability

Hans Hagen Sun, 04 Jan 2004 13:58:31 -0800

At 18:59 02/01/2004, you wrote:

I've been struggling through, trying to learn Unicode in ConTeXt. It's
been instructive, at least. (Hope to make a MyWay about it...)

good

There are a few weird things that made it difficult to learn, and I was
wondering if someone could help explain why things are the way they are.

In unic-ini:
\chardef\utfunihashmode=0 % 1 = enabled

Actually, if I understand things correctly, '1' means "disabled", which
is what I preferred, having not yet created any unicode vectors. So the
internal documentation there seems wrong, and I would argue the default
case (0) makes it harder for beginners.


Hm, did you look at the unic-001 etc. files? The trick is fast and efficient
expansion without the need to define lots of named glyphs.

More confusingly, in font-uni:

Forget about that one. Although it's called unicode, it's actually a mechanism for the many vectors that are derived from or related to Unicode without being entirely Unicode, i.e. CJK fonts.

\def\enableunicodefont#1%
  {\definefontsynonym[\s!Unicode][\getvalue{\??uc#1\c!file}]%
   \def\unicodescale             {\getvalue{\??uc#1\c!schaal}}%
   \def\unicodeheight            {\getvalue{\??uc#1\c!hoogte}}%
   \def\unicodedepth             {\getvalue{\??uc#1\c!diepte}}%
   \def\unicodedigits            {\getvalue{\??uc#1\c!conversie}}%
   \def\handleunicodeglyph       {\getvalue{\??uc#1\c!commando}}%
%%%%%%%%%%% NEXT LINE
   \enableregime[unicode]% the following \relax's are really needed
   \doifvalue{\??uc#1\c!interlinie}\v!ja\setupinterlinespace\relax
   \getvalue{\??uc#1\c!commandos}\relax}

The \enableregime[unicode] runs in direct opposition with the
\enableregime[utf] that normally goes at the start of (some of my)
documents. As it stands, with the regime hard-coded, users have to put an
\enableregime[utf] *after* the font declaration. That's awkward.

So don't use that mechanism; stick to the utf mechanism.
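
For illustration only, a minimal sketch of what sticking to the utf mechanism might look like in a document. It assumes the source file is saved as UTF-8 and that the font in use provides the accented glyphs; the sample sentence is made up and not taken from this thread.

```tex
% Minimal sketch: select the utf input regime once, before any text.
% Assumes the file itself is saved as UTF-8; the sample sentence is
% illustrative only.
\enableregime[utf]

\starttext
A UTF-8 encoded line: café, naïve.
\stoptext
```

With this approach no \enableunicodefont call is involved, so nothing switches the regime behind the user's back.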

The last proposed change/complaint is back in unic-ini, and came from my
attempts to match the main body font with the unicode font.

\def\utfunifontglyph#1%
  {\xdef\unidiv{\number\utfdiv{#1}}%
   \xdef\unimod{\number\utfmod{#1}}%
   \ifnum#1<[EMAIL PROTECTED]
%%%% \unicodeasciicharacter\unimod
     \char\unimod % \unicodeascii\unimod
   \else\ifcsname\@@univector\unidiv\endcsname
     \csname\doutfunihash{\unidiv}{#1}\endcsname
   \else % so, these can be different fonts !
     \unicodeglyph\unidiv\unimod % no \uchar (yet)
   \fi\fi}

Basically, I'd like to use the \unicodeasciicharacter hook with this
definition:

\def\unicodeasciicharacter{\uchar{0}}

(I'm not certain the above is release-quality code, but I've been testing
it with a stripped down \utfunifontglyph that should be functionally
equivalent.)

play with it and we'll see
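
As a rough aid to reading \utfunifontglyph: the \utfdiv and \utfmod calls appear to split a code point into a vector number and a position inside that vector. The plain TeX sketch below shows that arithmetic under the assumption that a vector holds 256 characters. The names \splitcodepoint, \sketchdiv and \sketchmod are hypothetical and are not part of unic-ini.

```tex
% Plain TeX sketch; all names below are hypothetical helpers,
% not ConTeXt's own \utfdiv / \utfmod.
\newcount\sketchdiv
\newcount\sketchmod

% #1 = non-negative code point, written as a decimal number
\def\splitcodepoint#1{%
  \sketchdiv=#1\relax
  \divide\sketchdiv by 256        % vector number (integer division)
  \sketchmod=\sketchdiv
  \multiply\sketchmod by -256
  \advance\sketchmod by #1\relax} % position inside the vector

\splitcodepoint{8364}% U+20AC, the euro sign
\message{div=\the\sketchdiv, mod=\the\sketchmod}% div=32, mod=172
\bye
```

Here 8364 = 32 * 256 + 172, so the character would sit at slot 172 of vector 32, which corresponds to the \unicodeglyph\unidiv\unimod branch in the macro above.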

Working with the unicode code makes me appreciate that it's a really
powerful part of ConTeXt. Thanks, Hans!

how about the following:

There are many font encodings around, but none is really complete enough to deal with basic unicode (0/1/2 range).

Why not define a new font encoding with characters only, so that we can have as many chars as needed in a 0-255 vector? All those special characters (registered, and so on) are (1) seldom used and (2) not related to hyphenation and kerning. It is also a way to get rid of some 'ligatures' like --- becoming an emdash (in ConTeXt and XML we can comfortably call symbols directly, and these may come from a different instance of the font).

Hans
