On Thu, 2010-10-28 at 08:46 -0400, Tzadik Vanderhoof wrote:
> I have a binary data file, in a format used by a relatively ancient
> program, which I am trying to convert into something sane. With the
> help of a Hex editor I have basically worked out the file format
> except that it contains Hebrew characters with an odd encoding.
>
> All characters are 8 bits. The "standard" 27 consonants (including
> "final" consonants) go from hex 80 to 9A. Then there are vowels that
> seem to start around hex 9B or so (I'm guessing right after the
> standard consonants end). Then there are "dotted" consonants that seem
> to start at hex E0.
>
> If I remember correctly, I think this is some sort of DOS encoding
> (perhaps connected to the old "Hebrew chip"). Does anyone have a table
> of this character mapping or a tool that will translate this mapping
> into a more normal Hebrew encoding like Unicode?

The encoding looks like CP862 (http://en.wikipedia.org/wiki/Code_page_862 or
http://www.ascii-codes.com/cp862.html), modified to incorporate vowels and
"dotted" consonants.

The standard iconv tool doesn't seem to support this modified CP862 encoding.
However, it should be simple to whip up a short tool or script for converting
from this encoding into one of the Unicode encodings, such as UTF-8.

--- Omer

-- 
What happens if one mixes together evolution with time travel to the past?
See: http://www.zak.co.il/a/stuff/opinions/eng/evol_tm.html
My own blog is at http://www.zak.co.il/tddpirate/

My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS: at http://www.zak.co.il/spamwarning.html
_______________________________________________
Perl mailing list
Perl@perl.org.il
http://mail.perl.org.il/mailman/listinfo/perl
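
As a rough illustration of the "short tool or script" idea, here is a minimal sketch in Python (chosen only for brevity; the same table-driven approach works in Perl). It assumes the file follows standard CP862 for bytes 0x80 to 0x9A, which map in order onto U+05D0 (alef) through U+05EA (tav). The thread does not give the values of the vowel and "dotted" consonant bytes, so `EXTRA_MAP` is a hypothetical placeholder that has to be filled in by inspecting the actual file. The function names are illustrative as well.

```python
# Sketch only: decode a CP862-like byte stream into a Unicode string.
# Assumption: bytes 0x80-0x9A follow standard CP862 (alef..tav in order).

HEBREW_BASE = 0x05D0  # U+05D0 HEBREW LETTER ALEF

# Hypothetical table for the non-standard bytes (vowels from about 0x9B,
# "dotted" consonants from 0xE0). The real values are not known from the
# thread, so this starts empty and must be filled in from the data file.
EXTRA_MAP = {}


def decode_byte(b):
    """Map one byte of the legacy encoding to a Unicode string."""
    if b < 0x80:
        return chr(b)                         # plain ASCII passes through
    if 0x80 <= b <= 0x9A:
        return chr(HEBREW_BASE + (b - 0x80))  # the 27 standard consonants
    if b in EXTRA_MAP:
        return EXTRA_MAP[b]                   # vowels / dotted consonants
    return "\ufffd"                           # unknown byte: replacement char


def decode(data):
    """Decode a bytes object in the legacy encoding."""
    return "".join(decode_byte(b) for b in data)


if __name__ == "__main__":
    # Bytes for shin, lamed, vav, final mem under the assumption above.
    print(decode(b"\x99\x8c\x85\x8d"))
```

Unknown bytes come out as U+FFFD, which makes it easy to spot the values still missing from `EXTRA_MAP`. Writing the result with `open(path, "w", encoding="utf-8")` gives a UTF-8 file. Note that old DOS programs sometimes stored Hebrew in visual order, in which case the decoded text may also need reordering.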