On Thu, Oct 28, 2010 at 08:48:46AM -0400, Tzadik Vanderhoof wrote:
> I have a binary data file, in a format used by a relatively ancient program,
> which I am trying to convert into something sane. With the help of a Hex
> editor I have basically worked out the file format except that it contains
> Hebrew characters with an odd encoding.
>
> All characters are 8 bits. The "standard" 27 consonants (including "final"
> consonants) go from hex *80* to *9A*. Then there are *vowels* that seem to
> start around hex *9B* or so (I'm guessing right after the standard
> consonants end). Then there are *"dotted" consonants* that seem to start at
> hex *E0*.
>
> If I remember correctly, I think this is some sort of DOS encoding (perhaps
> connected to the old "Hebrew chip"). Does anyone have a table of this
> character mapping or a tool that will translate this mapping into a more
> normal Hebrew encoding like Unicode?

It's called CP862 and is supported by iconv. Try:

    iconv -f cp862 -t utf-8 < inp > out

Not sure about the vowels/dotted consonants/etc.
-- 
Didi
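
For anyone who would rather check this from a script than from iconv, here is a minimal Python sketch of the same conversion. It relies on Python's built-in "cp862" codec, which maps bytes 0x80 to 0x9A onto the 27 Hebrew letters U+05D0 to U+05EA. In standard CP862 the bytes from 0x9B upward are currency signs, accented Latin letters, box-drawing characters and Greek/math symbols, not vowels or dotted consonants, so the file probably uses a vendor extension for those. The helper names below are made up for illustration, and the second one only counts such bytes so that a custom table can be built by hand; it does not guess what they mean.

```python
from collections import Counter


def decode_cp862(data: bytes) -> str:
    """Decode raw bytes using Python's built-in CP862 codec."""
    return data.decode("cp862")


def bytes_above_letters(data: bytes) -> Counter:
    """Count bytes above 0x9A, i.e. past the 27 Hebrew letters.

    Assumption: in the file described above these are the vowels and
    dotted consonants, which plain CP862 will turn into other symbols.
    """
    return Counter(b for b in data if b > 0x9A)


# Four letters from the 0x80-0x9A range: shin, lamed, vav, final mem.
sample = bytes([0x99, 0x8C, 0x85, 0x8D])
text = decode_cp862(sample)

# Print code points rather than the letters themselves, so this also
# works on a terminal that cannot display Hebrew.
print([hex(ord(ch)) for ch in text])
```

Running bytes_above_letters() over the Hebrew fields of the real file should show which byte values still need a hand-made mapping after the plain CP862 pass.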
