Hello,

Some time ago there was a discussion about extending support for different regimes in ConTeXt. The list of (to-be-)supported regimes probably depends strongly on the implementation (ruby+iconv?). I collected a preliminary list of candidate regimes and possible synonyms (some synonyms are listed for backward compatibility and have to stay), leaving out most of the eastern encodings (not because they shouldn't be on the list, but because I'm completely ignorant about them).

Hans suggested posting this to the mailing list first to get some useful comments and suggestions.

#####

The following question should probably go in a separate thread, but the topic is very similar. In July 2006 Ljubljana will host people from around 85 countries. One of the very ambitious organizers has been dreaming for a couple of years of printing the participants' names (on honourable mentions, for example) both in Latin transcription and in the original script (assuming that the names are properly entered in a UTF-8 database). This is probably not possible for every single obscure language, but does it in general sound like:
a) Good luck (I wouldn't want to be in your place)!
b) Take a good (commercial) program.
c) If you're ready to invest the rest of your time (forget about hobbies!), it's probably doable in LaTeX or ConTeXt by then.
č) Forget about TeX - one day it will be possible to solve this problem with Unicode & one of the new TeX engines. Until then it's not worth the effort, because any effort you invest will become obsolete in a couple of years.

To be honest, even some of the people who will translate the materials into their native languages will probably do that with paper, pencil & scanner.

#####


Mojca

And here are the encodings:

# ISO
    ISO-8859-1  Western
    ISO-8859-2  Central European
    ISO-8859-3  South European
    ISO-8859-4  Baltic
    ISO-8859-5  Cyrillic
    ISO-8859-6  Arabic
    ISO-8859-7  Greek
    ISO-8859-8  Hebrew Visual
    ISO-8859-8-I Hebrew (???) What is that?
    ISO-8859-9  Turkish
    ISO-8859-10 Nordic
    ISO-8859-11 Thai
    ISO-8859-13 Baltic
    ISO-8859-14 Celtic
    ISO-8859-15 Western
    ISO-8859-16 Romanian

    \defineregimesynonym[il*][iso-8859-*], *=1-16\12
    \defineregimesynonym[latin*][iso-8859-*], *=1-16\12
    \defineregimesynonym[cp819][iso-8859-1]

    % I'm not sure that anyone needs these:
    \defineregimesynonym[iso-ir-100][iso-8859-1]
    \defineregimesynonym[iso-ir-101][iso-8859-2]
    \defineregimesynonym[iso-ir-109][iso-8859-3]
    \defineregimesynonym[iso-ir-110][iso-8859-4]
    \defineregimesynonym[iso-ir-144][iso-8859-5]
    \defineregimesynonym[iso-ir-127][iso-8859-6]
    \defineregimesynonym[iso-ir-126][iso-8859-7]
    \defineregimesynonym[iso-ir-138][iso-8859-8]
    \defineregimesynonym[iso-ir-148][iso-8859-9]
    \defineregimesynonym[iso-ir-157][iso-8859-10]
    \defineregimesynonym[iso-ir-179][iso-8859-13]
    \defineregimesynonym[iso-ir-199][iso-8859-14]
    \defineregimesynonym[iso-ir-203][iso-8859-15]
    \defineregimesynonym[iso-ir-226][iso-8859-16]

    % backward compatibility
    \defineregimesynonym[iso88595][iso-8859-5]
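
The `*` shorthand in the synonym lines above is only my notation, not something ConTeXt understands. A minimal sketch of how it could be expanded into explicit lines, assuming that `*=1-16\12` means "1 to 16 without 12"; Python is used purely for illustration, and the helper name `expand_synonyms` is made up:

```python
def expand_synonyms(alias_pattern, target_pattern, numbers):
    """Replace '*' in both patterns with each number and return the macro lines."""
    lines = []
    for n in numbers:
        alias = alias_pattern.replace("*", str(n))
        target = target_pattern.replace("*", str(n))
        lines.append("\\defineregimesynonym[%s][%s]" % (alias, target))
    return lines


# "*=1-16\12" read as: 1 to 16, leaving out 12 (there is no ISO-8859-12 in the list above)
iso_parts = [n for n in range(1, 17) if n != 12]
il_lines = expand_synonyms("il*", "iso-8859-*", iso_parts)

if __name__ == "__main__":
    for line in il_lines:
        print(line)
```

The generated lines could then be pasted into a regime definition file, whatever the final implementation looks like.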

(recode also recognises "arabic", "greek", "cyrillic", "hebrew" as aliases for those encodings: I don't know if this is a good idea, as there are other charsets covering the same language groups as well)
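
To illustrate why a bare alias like "cyrillic" is ambiguous, here is a small sketch that encodes the same letter in several of the Cyrillic charsets from this mail. It is only an illustration: the names are Python's codec names (not ConTeXt regime names), and the helper `encode_everywhere` is made up.

```python
# Encodings from the lists in this mail that all cover Cyrillic
# (Python codec names, assumed to correspond to the charsets listed here).
CYRILLIC_CODECS = ["iso8859_5", "cp855", "cp866", "cp1251", "mac_cyrillic"]


def encode_everywhere(text, codecs):
    """Return a dict mapping each codec name to the bytes of `text` in that codec."""
    return dict((name, text.encode(name)) for name in codecs)


encoded = encode_everywhere("Я", CYRILLIC_CODECS)
distinct = set(encoded.values())

if __name__ == "__main__":
    for name in CYRILLIC_CODECS:
        print(name, encoded[name].hex())
    print(len(distinct), "different byte values for the same letter")
```

Since the same letter ends up at different byte positions, a synonym that names only the script cannot tell which of these charsets is meant.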

# APPLE
    MacArabic
    MacCeltic
    MacCentralEuropean
% CentEur, CentralEurope or CentralEuropean? or all of them?
    MacChineseSimplified
    MacChineseTraditional
    MacCroatian
    MacCyrillic
    MacDevanagari
    MacDingbats
    MacFarsi
    MacGaelic
    MacGreek
    MacGujarati
    MacGurmukhi
    MacHebrew
    MacIcelandic
    MacInuit
    MacJapanese
    MacKeyboard
    MacKorean
    MacRoman
    MacRomanian
    MacSymbol
    MacThai
    MacTurkish
    MacUkrainian

    \defineregimesynonym[MacCE][MacCentralEuropean]
    \defineregimesynonym[mac][MacRoman]
    \defineregimesynonym[maccyr][MacCyrillic]
    \defineregimesynonym[macukr][MacUkrainian]

(I also need some help here: sometimes Mac encodings are named using adjectives, sometimes using nouns, like Ukraine/Ukrainian. Should only one of them (which?) be used, or both? On the Unicode page, Mac encodings appear twice. The second time under Microsoft/Apple, containing MacCyrillic, MacGreek, MacIceland, MacLatin2, MacRoman, MacTurkish. I didn't really get the point of that.)

# IBM
% essentially the same as under Microsoft, with some minor changes (to be processed manually, if these are to be supported)
# MICROSOFT
    EBCDIC % plenty of them are missing on the web
        cp037
        cp500
        cp875
        cp1026
    PC
        cp437 LatinUS
        cp737 Greek
        cp775 BaltRim
        cp850 Latin1
        cp852 Latin2
        cp855 Cyrillic
        cp857 Turkish
        cp860 Portuguese
        cp861 Icelandic
        cp862 Hebrew
        cp863 CanadaF
        cp864 Arabic
        cp865 Nordic
        cp866 Cyrillic - Russian
        cp869 Greek
        cp874 Thai
    WINDOWS
        cp874  Thai (repeated for some unknown reason)
        cp932  Japanese
        cp936  PRC GBK
        cp949  Korean
        cp950  Chinese

        cp1250 Central European
        cp1251 Cyrillic
        cp1252 Western
        cp1253 Greek
        cp1254 Turkish
        cp1255 Hebrew
        cp1256 Arabic
        cp1257 Baltic
        cp1258 Vietnamese

    \defineregimesynonym[cp125*][windows-125*], *=0-8

    % backward compatibility
    \defineregimesynonym[windows][cp1252]

    % there are some other possibilities:
    % ms-ee, ms-cyrl, ms-ansi, ms-greek, ms-turk, ms-hebr, ms-arab, ...
    % does anyone think that they are needed?

% These are not online at Unicode, but they are already available somewhere:
    VISCII
    TCVN
    isoir111

\defineregimesynonym[isoir111][iso-ir-111]

#### Some very confusing part (I should leave it out) ####

# MISC (? probably none of them to be processed)
    AtariST
    cp424 Hebrew
    cp856 Hebrew
    cp1006 Arabic

# NEXT
    NextStep (What's that???)
        next

% Missing in Unicode mapping (online)
    TIS-620 Thai