On 18.05.2004 00:25, Upayavira wrote:

I use the HTML generator without configuration and the XHTML serializer
with its encoding set to UTF-8. Could you tell me where the problem may be?

The remote web page has a specific encoding. I guess the HTML generator ignores it and parses the remote page with some other encoding, probably UTF-8. I don't know the details or how to solve it. Maybe you can get JTidy to output XML in the encoding that the parser reading the JTidy output expects.
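To illustrate the suspected mismatch, here is a minimal sketch using only the Java standard library. It assumes the remote page is served as ISO-8859-1 and is then decoded as UTF-8; the class and method names are made up for the example and are not part of Cocoon or JTidy.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Hypothetical helper: decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "Joerg" written with o-umlaut, escaped so the source file encoding does not matter.
        String original = "J\u00f6rg";

        // Assumption: the remote server sends the page as ISO-8859-1.
        byte[] raw = original.getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the wrong charset damages the non-ASCII character.
        String wrong = decode(raw, StandardCharsets.UTF_8);
        // Decoding with the charset the bytes were written in restores the text.
        String right = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 matches original:      " + wrong.equals(original));
        System.out.println("decoded as ISO-8859-1 matches original: " + right.equals(original));
    }
}
```

If the real cause is of this kind, the fix is to make the decoding step use the encoding the remote page declares, rather than to change the serializer's output encoding.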


I've recently tried to change the encoding on JTidy. It doesn't seem to work. I followed it right through in a debugger: the configured locale was set correctly inside JTidy, but it still output ISO-8859-1, not UTF-8.

I'm thinking of extending the HTML generator to use something like NekoHTML. I'm using it right now for a work project, and I reckon it'd be pretty easy to do (like 10 lines of code). The generator would then be configurable as to which tool it uses.
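A rough sketch of what "configurable as to which tool it uses" could look like, in plain Java. Everything here is hypothetical: the `HtmlBackend` interface, the class name, and the backend keys are invented for illustration, and the two backends are placeholders that only decode the bytes rather than calling JTidy or NekoHTML.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class PluggableHtmlGenerator {

    /** Hypothetical abstraction over an HTML clean-up tool. */
    public interface HtmlBackend {
        String parse(byte[] rawHtml, Charset sourceEncoding);
    }

    private final Map<String, HtmlBackend> backends = new HashMap<>();
    private final String selected;

    public PluggableHtmlGenerator(String selected) {
        this.selected = selected;
        // Placeholders only: a real backend would delegate to JTidy or NekoHTML here.
        backends.put("jtidy", (raw, cs) -> "[jtidy] " + new String(raw, cs));
        backends.put("neko", (raw, cs) -> "[neko] " + new String(raw, cs));
    }

    /** Runs the backend chosen in the constructor on the raw page bytes. */
    public String generate(byte[] rawHtml, Charset sourceEncoding) {
        HtmlBackend backend = backends.get(selected);
        if (backend == null) {
            throw new IllegalArgumentException("Unknown backend: " + selected);
        }
        return backend.parse(rawHtml, sourceEncoding);
    }

    public static void main(String[] args) {
        byte[] page = "<p>hi</p>".getBytes(StandardCharsets.US_ASCII);
        PluggableHtmlGenerator generator = new PluggableHtmlGenerator("neko");
        System.out.println(generator.generate(page, StandardCharsets.US_ASCII));
    }
}
```

Passing the source encoding into the backend explicitly keeps the encoding decision in one place, whichever tool ends up doing the parsing.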

And is configuring the parser instead of JTidy not possible?

Joerg
