Re: output-encoding in HTMLGenerator, please help!

2003-01-14 Thread Joerg Heinicke
Hi Yury, so we agree? The bug is in HTMLGenerator. The encoding it expects isn't UTF-8, though (reading from http://www.w3.org/ fails for me with a NullPointerException); it is ISO-8859-1, or maybe the default encoding of the JVM. Can you file a bug in Bugzilla? Regards, Joerg
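One way to tell the two candidates apart (ISO-8859-1 versus the JVM default) is to check what the JVM default actually is on the machine running Cocoon. The following is only a sketch under assumptions: the class `DefaultCharsetCheck` and its `read` helper are hypothetical names, and it does not touch HTMLGenerator itself. It relies on the fact that `new InputStreamReader(InputStream)` decodes with the JVM default charset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetCheck {

    /** Reads the whole stream as text; a null charset means "use the JVM default". */
    public static String read(InputStream in, Charset charset) throws IOException {
        Reader reader = (charset == null)
                ? new InputStreamReader(in)            // JVM default charset
                : new InputStreamReader(in, charset);  // explicit charset
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // The first two bytes of the KOI8-R sample from this thread, plus '!'.
        byte[] bytes = {(byte) 0xF0, (byte) 0xD2, '!'};
        String viaDefault = read(new ByteArrayInputStream(bytes), null);
        String viaLatin1 = read(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);

        // true only if the default charset decodes these bytes like ISO-8859-1
        System.out.println("default decodes like ISO-8859-1: " + viaDefault.equals(viaLatin1));
    }
}
```

If the second line prints `true`, the two explanations cannot be told apart on that machine; starting the JVM with a different default charset would be needed to separate them.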

Re: output-encoding in HTMLGenerator, please help!

2003-01-14 Thread Yury Mikhienko
Hi Joerg! Thanks for your reply. Pure Tidy works properly (the output stream encoding is the same as the input stream encoding). The problem, from my point of view, is in the input stream encoding used by the transformer (or the streamer, if the XPath is null) in HTMLGenerator, because the Tidy DOM parser returns KOI

Re: output-encoding in HTMLGenerator, please help!

2003-01-13 Thread Joerg Heinicke
Hello Yury, I can only confirm the bug in the HTML generator. It seems it cannot read the KOI8-R encoded file correctly. I tested it with your HTML snippet saved to a static file. Calling serializer.setOutputProperty(OutputKeys.ENCODING, "KOI8-R"); of course does not help, because that only sets the output encoding.
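Why the output setting cannot help can be sketched in plain Java, outside Cocoon. This is a minimal illustration, not the HTMLGenerator code: the class `EncodingRoundTrip` and its `transcode` helper are hypothetical, and it assumes the JVM provides the KOI8-R charset (standard JDK builds normally do). Once KOI8-R bytes have been decoded as ISO-8859-1, the characters held in memory are already wrong, so encoding them as KOI8-R on output cannot bring the original text back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingRoundTrip {

    public static final Charset KOI8_R = Charset.forName("KOI8-R");

    // The Russian word "Privet!" written with Unicode escapes so this file stays ASCII.
    public static final String TEXT = "\u041f\u0440\u0438\u0432\u0435\u0442!";

    /** Decodes the input bytes with inputCharset, then encodes the characters with outputCharset. */
    public static byte[] transcode(byte[] input, Charset inputCharset, Charset outputCharset) {
        String chars = new String(input, inputCharset);
        return chars.getBytes(outputCharset);
    }

    public static void main(String[] args) {
        byte[] source = TEXT.getBytes(KOI8_R);

        // Input misread as ISO-8859-1: each KOI8-R byte becomes an unrelated Latin-1 character.
        String misread = new String(source, StandardCharsets.ISO_8859_1);
        System.out.println("misread text equals original: " + misread.equals(TEXT));

        // Setting only the output encoding does not undo the wrong input decoding.
        byte[] wrongInput = transcode(source, StandardCharsets.ISO_8859_1, KOI8_R);
        System.out.println("wrong input, KOI8-R output restores bytes: "
                + Arrays.equals(wrongInput, source));

        // Decoding the input as KOI8-R is what keeps the text intact.
        byte[] rightInput = transcode(source, KOI8_R, KOI8_R);
        System.out.println("right input, KOI8-R output restores bytes: "
                + Arrays.equals(rightInput, source));
    }
}
```

The Latin-1 characters produced by the misread have no KOI8-R equivalents, so the encoder substitutes its replacement byte for them. The fix therefore has to be on the reading side, where the generator decodes the input stream.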

output-encoding in HTMLGenerator, please help!

2003-01-12 Thread Yury Mikhienko
Hi all! Can anyone help me with the following problem? I have a KOI8-R encoded HTML document. After processing this document with HTMLGenerator, the output is an ISO-8859-1 encoded document :(( For example, the source document (from URL: /test): Привет! Привет! (in sitemap.xmap):