Re: [NTG-context] Character names (was: Context 2005.12.19 released)

2005-12-22 Thread Mojca Miklavec
Taco Hoekwater wrote:

> Here's what I can come up with. At least a few are acceptable, like the
> horizontal bar. \textnumero exists, but is only reachable in cyrillic
> encodings (fixable, I guess?), and the greek and vietnamese accents
> are also only usable in the correct encoding. I've used the \text...
> versions of the accents, but perhaps the actual commands are more
> correct (like \' and \~).
>
> Cheers, Taco
>
> \starttext
> \definecharacter texthorizontalbar {{--\kern 0pt--}}
> \definecharacter textdong  {\underbar{\dstroke}}

Thanks for those ...

> \NC 0300 COMBINING GRAVE ACCENT \NC \textgrave     \NC \NR
> \NC 0309 COMBINING HOOK ABOVE   \NC \texthookabove \NC \NR
> \NC 0303 COMBINING TILDE        \NC \texttilde     \NC \NR
> \NC 0301 COMBINING ACUTE ACCENT \NC \textacute     \NC \NR
> \NC 0323 COMBINING DOT BELOW    \NC \textbottomdot \NC \NR

I may be wrong, but aren't those used only in combination with other
characters? I don't know whether TeX (ConTeXt) can handle this (at least
not yet). When I wrote the list a couple of days ago I forgot about
that fact. If the accent came before the character, this could
be replaced by \buildtextaccent..., but since a combining accent
follows its base character, there's perhaps no solution without some
additional macros. (And since the Vietnamese users seem to be satisfied
with viscii and utf for now, supporting cp1258 is not crucial.)
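
To make the ordering problem concrete, here is a small Python sketch. It is not part of ConTeXt; it assumes only Python's standard cp1258 codec and the unicodedata module, and the helper name combining_slots is made up for illustration. It lists the bytes that decode to combining marks and shows that the mark follows its base letter in the byte stream:

```python
import unicodedata

def combining_slots(encoding):
    """Return {byte: character} for every byte that decodes to a combining mark."""
    slots = {}
    for byte in range(256):
        try:
            char = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this encoding
        if unicodedata.combining(char):  # non-zero canonical combining class
            slots[byte] = char
    return slots

cp1258_marks = combining_slots("cp1258")
for byte, char in sorted(cp1258_marks.items()):
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")

# The base letter comes first and the combining accent second.
raw = "a\u0301".encode("cp1258")            # two bytes: letter, then accent
decomposed = raw.decode("cp1258")           # two code points
composed = unicodedata.normalize("NFC", decomposed)  # one precomposed letter
print(raw, len(decomposed), composed)
```

A regime would have to do something similar to the NFC step: look at the byte after a letter before deciding which glyph to typeset, which is why a plain byte-to-command table is not enough here.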

I double-checked the differences between the existing regimes and the
ones that were automatically produced by a script. The list of regimes
that are ready to be supported is thus:

cp125[ 0 | *1 | *2 | 3 | 4 | 7 ]
iso-8859-[ *1 | *2 | 3 | 4 | *5 | *7 | 9 | 13 | *15 | 16 ]
*viscii (with glyph names instead of \\u\...)

(The ones marked with a star are already supported, perhaps with some
inconsistencies. Not supported: Hebrew, Arabic and (?) Vietnamese for
cp125X; Arabic, Thai and Celtic for iso-8859-X.)
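
For anyone who wants to repeat such a comparison, the following Python sketch shows one way to do it. It is not the script mentioned above: the function names regime_table, regime_diff and fmt are hypothetical, and the byte tables come from Python's own codecs rather than from the ConTeXt regime files.

```python
def regime_table(encoding):
    """Map each byte 0x00-0xFF to its Unicode code point (None if unassigned)."""
    table = {}
    for byte in range(256):
        try:
            table[byte] = ord(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            table[byte] = None
    return table

def regime_diff(first, second):
    """Return {byte: (code point in first, code point in second)} where they differ."""
    a, b = regime_table(first), regime_table(second)
    return {byte: (a[byte], b[byte]) for byte in range(256) if a[byte] != b[byte]}

def fmt(code_point):
    """Format a code point for printing."""
    return "unassigned" if code_point is None else f"U+{code_point:04X}"

# Example: which slots differ between ISO-8859-1 and ISO-8859-15?
diff = regime_diff("iso8859_1", "iso8859_15")
for byte, (old, new) in sorted(diff.items()):
    print(f"0x{byte:02X}: {fmt(old)} -> {fmt(new)}")
```

The same diff run against a table parsed from an existing regime file would point at the slots where the hand-written and generated definitions disagree.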

I'll send the files (the full content is already on my page), but I need
to know how to split/group them (I guess it would be a bad idea to
have one file for each encoding). Should there be one file for
iso-8859 and one for the Windows encodings? What about the regimes that
are already supported? I would like to move at least regi-win
(with 8 wrong definitions anyway) to a less discriminating place, but I
don't know what to do with Greek and Cyrillic.

And another set of questions:
1. Can someone check for (in)consistencies for
greekupsilondiaeresis vs. greekupsilondialytika?
Looks like the same glyph named differently at different places
(functionality may break).

2. What to do with
{\cyrillicGJE}   {\'\cyrillicG} % 0403 CYRILLIC CAPITAL LETTER GJE
{\cyrillicgje}   {\'\cyrillicg} % 0453 CYRILLIC SMALL LETTER GJE
{\cyrillicKJE}   {\'\cyrillicK} % 040C CYRILLIC CAPITAL LETTER KJE
{\cyrillickje}   {\'\cyrillick} % 045C CYRILLIC SMALL LETTER KJE
{\cyrillicgheupturn} {\cyrillicgup} % 0491 CYRILLIC SMALL LETTER GHE WITH UPTURN
Which variant is better?

Would it make sense to define
\definecharacter cyrillicGJE {\buildtextaccent\textacute\cyrillicG}
\defineaccent ' \cyrillicG {\cyrillicGJE}
and then use \cyrillicGJE consistently?

3.
PLEASE FIX:
in enco-def.tex replace \cdots by something (\dots, I suppose, but I'm not sure)
\definecharacter textellipsis {\mathematics\cdots}
(I guess this bug was the reason for changing some definitions in
regimes/encodings elsewhere.)

Should \textellipsis be used for 2026 HORIZONTAL ELLIPSIS or anything else?

4. \softhyphen, \hyphen or \- for 00AD SOFT HYPHEN?

5. Urgently: what to do with quotations (without language
discriminations if possible)?

% 201A SINGLE LOW-9 QUOTATION MARK
\quotesinglebase vs. \lowerleftsingleninequote
% 201E DOUBLE LOW-9 QUOTATION MARK
\quotedblbase vs. \lowerleftdoubleninequote
% 2018 LEFT SINGLE QUOTATION MARK
\quoteleft vs. \upperleftsinglesixquote
% 2019 RIGHT SINGLE QUOTATION MARK
\quoteright vs. \upperrightsingleninequote

% 201C LEFT DOUBLE QUOTATION MARK
\quotedblleft vs. \upperleftdoublesixquote
% 201D RIGHT DOUBLE QUOTATION MARK
\quotedblright vs. \upperrightdoubleninequote

% 2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
\guilsingleleft vs. \leftsubguillemot
% 203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
\guilsingleright vs. \rightsubguillemot
% 00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
\leftguillemot vs. \greekleftquot
(are Greek quotations treated specially or what is this doing in regi-grk?)
% 00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
\rightguillemot vs. \greekrightquot vs. \prewordbreak\rightguillemot
(In my view the last one may be better, but it's not fair since
it's language dependent: it may be OK for French but not for German, or
vice versa. Perhaps a language-sensitive macro could be inserted at
this place?)

6. \textnumero, 0x2116 (and perhaps some other characters) should be
added to unicode vector 33.

7. The files regi-il1 and regi-win have many inconsistencies. I would
like to suggest the following renamings:

windows - cp1252
il1 - 

[NTG-context] Character names (was: Context 2005.12.19 released)

2005-12-21 Thread Mojca Miklavec
Hans Hagen wrote:
> Mojca Miklavec wrote:
>> Taco Hoekwater wrote:
>>>
>>> New features since 2005.12.18:
>>>
>>> * Support for the latin-9 regime (latin-1 + euro)
>>
>> There are some more (automatically generated) regime definitions at
>> http://pub.mojca.org/tex/enco/contextbase/
>> (only from the glyph names that I was able to extract from the
>> existing files, so it's only OK for some of the regimes mentioned
>> there).
>>
>> If possible, I would like to ask for core support for windows-1250
>> (perhaps other users may find some other regimes useful as well).
>
> just send me the files you feel confident with

(I'll send the good files soon.)

Except for Celtic, Thai, Arabic and Hebrew (although the letter names
for Hebrew are almost completely defined), almost all the Windows and
ISO regimes are OK; just some glyphs are missing (which are, or at least
were, missing in the Unicode vectors as well). If anyone has suggestions
for names for the following characters, 6 additional regimes can be
fully supported:

windows-1251 and iso-8859-5
2116 NUMERO SIGN

windows-1253
0385 GREEK DIALYTIKA TONOS
2015 HORIZONTAL BAR
0384 GREEK TONOS

windows-1258
0300 COMBINING GRAVE ACCENT
0309 COMBINING HOOK ABOVE
0303 COMBINING TILDE
0301 COMBINING ACUTE ACCENT
0323 COMBINING DOT BELOW
20AB DONG SIGN

iso-8859-7
20AF DRACHMA SIGN
037A GREEK YPOGEGRAMMENI
2015 HORIZONTAL BAR
0384 GREEK TONOS
0385 GREEK DIALYTIKA TONOS

iso-8859-10
2015 HORIZONTAL BAR
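
The lists above can be cross-checked mechanically. The Python sketch below is only an illustration: the MISSING table is copied from this message, the codec names are Python's (not ConTeXt regime names), and slot_of is a made-up helper. It reports the byte slot of each character in its encoding together with the official Unicode name:

```python
import unicodedata

# Code points copied from the lists above, keyed by Python codec name.
MISSING = {
    "cp1251": [0x2116],
    "iso8859_5": [0x2116],
    "cp1253": [0x0385, 0x2015, 0x0384],
    "cp1258": [0x0300, 0x0309, 0x0303, 0x0301, 0x0323, 0x20AB],
    "iso8859_7": [0x20AF, 0x037A, 0x2015, 0x0384, 0x0385],
    "iso8859_10": [0x2015],
}

def slot_of(encoding, code_point):
    """Return the byte that encodes code_point, or None if the codec lacks it."""
    try:
        return chr(code_point).encode(encoding)[0]
    except UnicodeEncodeError:
        return None

# (encoding, code point) -> (byte slot or None, Unicode character name)
report = {}
for encoding, code_points in MISSING.items():
    for cp in code_points:
        report[(encoding, cp)] = (slot_of(encoding, cp), unicodedata.name(chr(cp)))

for (encoding, cp), (byte, name) in report.items():
    slot = "n/a" if byte is None else f"0x{byte:02X}"
    print(f"{encoding:<11} {slot}  U+{cp:04X} {name}")
```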

Mojca
___
ntg-context mailing list
ntg-context@ntg.nl
http://www.ntg.nl/mailman/listinfo/ntg-context


Re: [NTG-context] Character names (was: Context 2005.12.19 released)

2005-12-21 Thread Taco Hoekwater


Here's what I can come up with. At least a few are acceptable, like the
horizontal bar. \textnumero exists, but is only reachable in cyrillic
encodings (fixable, I guess?), and the greek and vietnamese accents
are also only usable in the correct encoding. I've used the \text...
versions of the accents, but perhaps the actual commands are more
correct (like \' and \~).

Cheers, Taco

\starttext
\definecharacter texthorizontalbar {{--\kern 0pt--}}
\definecharacter textdong  {\underbar{\dstroke}}

\starttabulate[|c|c|]
\NC 0300 COMBINING GRAVE ACCENT \NC \textgrave           \NC \NR
\NC 0309 COMBINING HOOK ABOVE   \NC \texthookabove       \NC \NR
\NC 0303 COMBINING TILDE        \NC \texttilde           \NC \NR
\NC 0301 COMBINING ACUTE ACCENT \NC \textacute           \NC \NR
\NC 0323 COMBINING DOT BELOW    \NC \textbottomdot       \NC \NR
\NC 037A GREEK YPOGEGRAMMENI    \NC \unknownchar         \NC \NR  % prime?
\NC 0384 GREEK TONOS            \NC \greektonos          \NC \NR
\NC 0385 GREEK DIALYTIKA TONOS  \NC \greekdialytikatonos \NC \NR
\NC 2015 HORIZONTAL BAR         \NC \texthorizontalbar   \NC \NR
\NC 20AB DONG SIGN              \NC \textdong            \NC \NR
\NC 20AF DRACHMA SIGN           \NC \unknownchar         \NC \NR
\NC 2116 NUMERO SIGN            \NC \textnumero          \NC \NR
\stoptabulate
\stoptext




Re: [NTG-context] Character names (was: Context 2005.12.19 released)

2005-12-21 Thread Hans Hagen

Taco Hoekwater wrote:

> \definecharacter texthorizontalbar {{--\kern 0pt--}}
> \definecharacter textdong  {\underbar{\dstroke}}


ok, i added those to enco-def.tex (at the end of the file):

\startencoding[\s!default]

\definecharacter texthorizontalbar {{--\kern\zeropoint--}}
\definecharacter textdong  {\underbar{\dstroke}}

\stopencoding

Hans