Package: xml2
Version: 0.4-3.1

The html2 executable built from this package has a hard-coded input
character set: it is always declared as ISO-8859-1.

The relevant code is as follows:

        init(&sax);

        if (1 == argc && !strcmp(name,"html2")) {
                ctxt = htmlCreatePushParserCtxt(&sax,NULL,NULL,0,"stdin",
                                                XML_CHAR_ENCODING_8859_1);
                parseChunk = htmlParseChunk;
                freeCtxt = htmlFreeParserCtxt;
                do_compress_whitespace = 1;
        } else if (1 == argc && !strcmp(name,"xml2")) {
                ctxt = xmlCreatePushParserCtxt(&sax,NULL,NULL,0,"stdin");
                parseChunk = xmlParseChunk;
                freeCtxt = xmlFreeParserCtxt;
                do_ignore_whitespace = 1;

IMO this should now be changed to UTF-8, i.e. XML_CHAR_ENCODING_UTF8
in place of XML_CHAR_ENCODING_8859_1 in the htmlCreatePushParserCtxt call.

Note: A complete solution would probably allow the character set to be
chosen from the command line. However, this tiny change would be
sufficient, because input in any other character set could then be
handled with a simple workaround using iconv.
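
For the command-line idea, here is a minimal sketch of how the encoding
name might be picked up. The "-e NAME" option and the helper name
requested_encoding are assumptions made for illustration; html2 does not
accept such an option today. In html2 the returned name could then be
converted with libxml2's xmlParseCharEncoding() and passed to
htmlCreatePushParserCtxt in place of the hard-coded constant.

```c
#include <string.h>

/*
 * Return the encoding name given with a hypothetical "-e NAME" option,
 * or `fallback` when the option is absent or has no argument.
 * The "-e" option is an assumption; xml2/html2 do not currently accept it.
 */
static const char *requested_encoding(int argc, char **argv,
                                      const char *fallback)
{
        int i;

        /* Stop one short of argc so that argv[i + 1] is always valid. */
        for (i = 1; i + 1 < argc; i++) {
                if (!strcmp(argv[i], "-e"))
                        return argv[i + 1];
        }
        return fallback;
}
```

With only the one-line change to UTF-8, the iconv workaround would look
something like "iconv -f ISO-8859-1 -t UTF-8 page.html | html2", where
page.html is a placeholder file name.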

--
Rob.                          (Robert de Bath <robert$ @ debath.co.uk>)
                                             <http://www.debath.co.uk/>

