Martin Vysny wrote:
I had the same problem as well. When you save a file in notepad.exe as UTF-8, it places a 3-byte invisible UTF-8 character (the byte order mark) at the start of the XML file. That is what causes that goddamn "Content is not allowed in prolog" message.
That's probably not the problem, because Xerces has a custom UTF-8 reader that knows how to skip the BOM. Unless, of course, the application doesn't give the parser the chance to pick the proper java.io.Reader for the input. This can happen when the application constructs an input source with a Reader object instead of an InputStream. For example:

    Reader reader = new InputStreamReader(stream);
    InputSource source = new InputSource(reader);

In this case, the input stream reader will use the default system encoding, usually ISO Latin 1 on English systems. This is normally OK because every byte (even with the high bit on) is valid in that encoding. The exception is the UTF-8 byte order mark, whose three bytes are decoded as three stray characters ahead of the XML declaration and end up looking like "content [that] is not allowed in [the] prolog".

Even constructing an input stream reader with the encoding set to "UTF-8" doesn't help, because that will use the Java UTF-8 reader, which doesn't understand the BOM: it decodes it as the character U+FEFF and hands it to the parser as if it were content.

-- Andy Clark * [EMAIL PROTECTED]
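
The two workarounds implied above can be sketched roughly as follows. This is only an illustration, not code from Xerces or from the original application: the class name BomSafeParsing and both method names are made up for the example, and it assumes a JAXP parser (such as Xerces or the one bundled with the JDK) is available. The first method hands the parser an InputStream so it can detect the encoding and skip the BOM itself; the second is for code that really must supply a Reader, and strips a leading UTF-8 BOM by hand before decoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
public class BomSafeParsing {

    // Option 1: give the parser the raw byte stream so that it, not the
    // application, chooses the reader and deals with any byte order mark.
    static Document parseFromStream(InputStream in) throws Exception {
        InputSource source = new InputSource(in);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
    }

    // Option 2: if a Reader is unavoidable, peek at the first three bytes
    // and drop them only when they are the UTF-8 BOM (EF BB BF).
    // Assumes the input is known to be UTF-8.
    static Reader utf8ReaderSkippingBom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream is shorter than three bytes
            }
            n += r;
        }
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM: put the bytes back
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // A tiny document as Notepad might save it: BOM followed by XML.
        byte[] xml = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'r', '/', '>'};

        Document viaStream = parseFromStream(new ByteArrayInputStream(xml));
        System.out.println(viaStream.getDocumentElement().getNodeName());

        Reader reader = utf8ReaderSkippingBom(new ByteArrayInputStream(xml));
        Document viaReader = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(reader));
        System.out.println(viaReader.getDocumentElement().getNodeName());
    }
}
```

Where possible, the first option is probably the better choice, since it also lets the parser honor the encoding named in the XML declaration instead of hard-coding UTF-8.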
