On Sat, Nov 30, 2002 at 09:15:45PM +0100, Gour wrote:
> Simon Pepping ([EMAIL PROTECTED]) wrote:
> 
> > I would like to know that too :-) I have not yet found the time to
> > find out how Context deals with encodings. I only have a note that
> > says that one should do \useXMLfilter [utf], and that I should have a
> > look at the xtag-utf (which is input by the above command) or enco
> > files.
> 
> As far as I can see ConTeXt does not understand utf-8 encoding.
> 
> Where did you find this note mentioning utf?

On my computer :-) I collected remarks made on this list in that
document.

> Some time ago I saw a post on DocBook list from Sebastian Rahtz who is 
> considering to rewrite PassiveTex with ConTeXt support instead of LaTeX.

That would be very good; much better than just doing DocBook.
Sometimes I think my time would be better spent on such an effort,
but I am afraid it is a huge task.
 
> The question remains, how to do it with multi-lingual document
> encoded in utf-8? 
>
> Any hint?

As is often the case in open source: do it yourself. Hans has not
taken part in this discussion, so I think he does not feel like
embarking on an effort in this area.

The basic mechanism to make TeX work with encodings is to declare all
characters above 127 active, and map them to a suitable control
sequence. But that only works with single-byte encodings.
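
To make the limitation concrete, here is a rough sketch in Python (not
TeX) of what such a per-byte mapping amounts to. The table is a
hypothetical three-entry excerpt for Latin-1, and the function name is
made up for this illustration; it is not taken from any enco file.

```python
# Hypothetical excerpt of a Latin-1 table: byte value -> TeX control sequence.
# (Assumption for illustration only; a real table covers all of 128..255.)
LATIN1_TO_TEX = {
    0xE9: r"\'e",    # e with acute accent
    0xFC: r'\"u',    # u with diaeresis
    0xDF: r"\ss{}",  # sharp s
}


def translate_single_byte(data: bytes) -> str:
    """Translate each byte on its own, as an active character would."""
    out = []
    for b in data:
        if b < 128:
            out.append(chr(b))  # plain ASCII passes through unchanged
        else:
            # One byte, one lookup: there is no way to look at the next byte.
            out.append(LATIN1_TO_TEX.get(b, f"<?{b:02X}>"))
    return "".join(out)
```

With Latin-1 input the byte 0xE9 is found in the table. The same letter
in utf-8 arrives as the two bytes 0xC3 0xA9, and each of them is looked
up separately, so neither lookup yields the intended character.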

xmltex, David Carlisle's XML parser in TeX, which is used by
PassiveTeX, can read and interpret utf-8 encoding. I think it
applies the utf-8 rules to the sequences of single bytes. It should be
easy to transfer this to ConTeXt, because it should not depend on the
macro package.
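
The utf-8 rules themselves are simple: the lead byte tells how many
continuation bytes follow, and each continuation byte contributes six
bits. The following Python sketch only illustrates those rules; it is
not xmltex's code, and the function name is an assumption. It is also
simplified: it does not reject overlong three- and four-byte forms or
surrogates.

```python
def decode_utf8(data: bytes) -> list:
    """Turn utf-8 bytes into a list of code points, lead byte by lead byte."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte fixes the sequence length and carries the top bits.
        if lead < 0x80:
            length, value = 1, lead
        elif 0xC2 <= lead <= 0xDF:
            length, value = 2, lead & 0x1F
        elif 0xE0 <= lead <= 0xEF:
            length, value = 3, lead & 0x0F
        elif 0xF0 <= lead <= 0xF4:
            length, value = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
        tail = data[i + 1:i + length]
        # Every continuation byte must look like 10xxxxxx.
        if len(tail) != length - 1 or any((b & 0xC0) != 0x80 for b in tail):
            raise ValueError(f"truncated or malformed sequence at offset {i}")
        for b in tail:
            value = (value << 6) | (b & 0x3F)  # append six payload bits
        code_points.append(value)
        i += length
    return code_points
```

For example, the bytes 0xC3 0xA9 give code point 0xE9, and the three
bytes 0xE2 0x82 0xAC give 0x20AC (the euro sign). In TeX the same
effect would need the lead bytes to be active and to read their
continuation bytes as arguments.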

The other options are: use an input filter, like the program that was
mentioned in this thread. Or use NTS, the Java-based TeX
implementation. Currently it does not deal with multibyte encodings
because it is artificially restricted to 256 characters (if I remember
correctly) and because there are no input encoding macro packages for
higher character codes.

Sebastian's PassiveTeX has long mapping tables from Unicode to LaTeX
control sequences. These can be translated to ConTeXt. (And they
could be made to work with NTS.)

While I am writing this, I am beginning to think that copying xmltex's
algorithm to ConTeXt is the best way to go.

Regards, Simon

-- 
Simon Pepping
email: [EMAIL PROTECTED]

_______________________________________________
ntg-context mailing list
[EMAIL PROTECTED]
http://www.ntg.nl/mailman/listinfo/ntg-context
