On Sun, Feb 22, 2009 at 5:01 PM, Dmitry Polevoy <[email protected]> wrote:
> The initial version of hOcr output was created by Rene Rebe (look at history
> of \cuneiform-linux\cuneiform_src\Kern\rout\src\html.cpp) and I am not a
> specialist with html encoding format.

I added the UTF-8 encoding. The writer always outputs UTF-8 because Unicode is the recommended encoding for HTML and covers all the letters, so there is no need to add support for legacy character sets.

I guess we could change the html writer function so that it no longer accepts output charset information at all. Currently the only caller is the Cuneiform command line binary, which always passes UTF-8 as the output format.

