... Well, my information was not correct.

The files contain characters with codes above 127, such as "ř" and "š".

Each character is stored as one byte, and since I'm using Windows with CP 1250,
the characters are displayed correctly.

But I have a problem loading them into ConTeXt.

I need to convert the bytes > 127 to UTF-8 sequences that ConTeXt accepts.
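
A minimal sketch of such a conversion in plain Lua, assuming the input really is CP 1250. The names cp1250_to_unicode, utf8_char and cp1250_to_utf8 are made up for this illustration (they are not ConTeXt functions), and the table covers only the Czech letters; a complete table would need every CP 1250 position above 127. It is written as a standalone Lua file and avoids the utf8 library so that it does not depend on the Lua version.

```lua
-- Partial CP 1250 -> Unicode code point table (Czech letters only; illustrative).
local cp1250_to_unicode = {
  [0x8A] = 0x0160, [0x9A] = 0x0161, -- Š š
  [0x8D] = 0x0164, [0x9D] = 0x0165, -- Ť ť
  [0x8E] = 0x017D, [0x9E] = 0x017E, -- Ž ž
  [0xC1] = 0x00C1, [0xE1] = 0x00E1, -- Á á
  [0xC8] = 0x010C, [0xE8] = 0x010D, -- Č č
  [0xCF] = 0x010E, [0xEF] = 0x010F, -- Ď ď
  [0xC9] = 0x00C9, [0xE9] = 0x00E9, -- É é
  [0xCC] = 0x011A, [0xEC] = 0x011B, -- Ě ě
  [0xCD] = 0x00CD, [0xED] = 0x00ED, -- Í í
  [0xD2] = 0x0147, [0xF2] = 0x0148, -- Ň ň
  [0xD3] = 0x00D3, [0xF3] = 0x00F3, -- Ó ó
  [0xD8] = 0x0158, [0xF8] = 0x0159, -- Ř ř
  [0xDA] = 0x00DA, [0xFA] = 0x00FA, -- Ú ú
  [0xD9] = 0x016E, [0xF9] = 0x016F, -- Ů ů
  [0xDD] = 0x00DD, [0xFD] = 0x00FD, -- Ý ý
}

-- Encode one code point below 0x10000 as a UTF-8 byte sequence.
local function utf8_char(cp)
  if cp < 0x80 then
    return string.char(cp)
  elseif cp < 0x800 then
    return string.char(0xC0 + math.floor(cp / 0x40), 0x80 + cp % 0x40)
  else
    return string.char(0xE0 + math.floor(cp / 0x1000),
                       0x80 + math.floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  end
end

-- Replace every mapped byte above 127; unmapped bytes are left untouched,
-- so the result is valid UTF-8 only if the table covers all bytes that occur.
local function cp1250_to_utf8(str)
  return (str:gsub("[\128-\255]", function(c)
    local cp = cp1250_to_unicode[c:byte()]
    return cp and utf8_char(cp) or c
  end))
end
```

Bytes in the ASCII range pass through unchanged, so the function can be applied to whole files regardless of whether they contain accented letters.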

@Thomas:

The table looks nice but there are no entries for CP 1250 to UTF conversion.

I prepared some tables for character conversion and removal of diacritics (see
the attachment); maybe it would be handy to include them in ConTeXt somehow.
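
For readers without the attachment, removal of diacritics with such a table could look roughly like the sketch below. This is not the content of Cz2UTF.lua; the names plain and strip_diacritics are hypothetical, the table covers Czech letters only, and the code assumes that both the Lua source and the input string are UTF-8.

```lua
-- Illustrative table: accented Czech letter (UTF-8) -> plain ASCII letter.
local plain = {
  ["á"] = "a", ["č"] = "c", ["ď"] = "d", ["é"] = "e", ["ě"] = "e",
  ["í"] = "i", ["ň"] = "n", ["ó"] = "o", ["ř"] = "r", ["š"] = "s",
  ["ť"] = "t", ["ú"] = "u", ["ů"] = "u", ["ý"] = "y", ["ž"] = "z",
  ["Á"] = "A", ["Č"] = "C", ["Ď"] = "D", ["É"] = "E", ["Ě"] = "E",
  ["Í"] = "I", ["Ň"] = "N", ["Ó"] = "O", ["Ř"] = "R", ["Š"] = "S",
  ["Ť"] = "T", ["Ú"] = "U", ["Ů"] = "U", ["Ý"] = "Y", ["Ž"] = "Z",
}

-- Match each multi-byte UTF-8 sequence (lead byte plus continuation bytes)
-- and look it up in the table; sequences without an entry are kept as-is.
local function strip_diacritics(str)
  return (str:gsub("[\194-\244][\128-\191]*", plain))
end
```

Because gsub keeps the original match when the table has no entry, characters outside the table survive unchanged instead of being dropped.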

Best regards,

Lukas


On Fri, 10 Feb 2012 11:57:32 +0100, Philipp Gesang 
<ges...@stud.uni-heidelberg.de> wrote:

On 2012-02-10 11:22, Procházka Lukáš Ing. - Pontex s. r. o. wrote:
Hello,

I have many files with ASCII encoding; this encoding must be kept, as these
files are also processed by another program.

When I work with them in ConTeXt, I need to convert them to UTF.

Not needed, as every ASCII string is a valid UTF-8 string:
   “The UTF encoding has several good properties. By far the most
    important is that a byte in the ASCII range 0-127 represents
    itself in UTF. Thus UTF is backward compatible with ASCII.”
    http://doc.cat-v.org/plan_9/4th_edition/papers/utf
You can use them in LuaTeX without further conversion.

Regards
Philipp
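
Whether a given file really is pure ASCII, and so needs no conversion, can be checked with a one-line test. A small sketch in plain Lua; the function name first_non_ascii is made up for this illustration:

```lua
-- Return the position of the first byte above 127, or nil if the string
-- is pure ASCII (and therefore already valid UTF-8).
local function first_non_ascii(str)
  return (str:find("[\128-\255]"))
end
```

A nil result means the string can go to LuaTeX as it is; any other result points at the first byte that needs converting from the file's 8-bit code page.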



Does Lua (in ConTeXt scope) offer a transformation function or a table of chars 
[ASCII-code] -> [UTF-code] or anything to provide the conversion?

Something like:

\startluacode
  local str = loadFile("a.txt") -- ASCII coded

  str = context.ASCII2UTF(str) -- Or something like this
\stopluacode

Best regards,

Lukas


--
Ing. Lukáš Procházka [mailto:l...@pontex.cz]
Pontex s. r. o.      [mailto:pon...@pontex.cz] [http://www.pontex.cz]
Bezová 1658
147 14 Praha 4

Tel: +420 244 062 238
Fax: +420 244 461 038

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________




Attachment: Cz2UTF.lua
Description: Binary data

