Either way, you're probably right that entering it as a bug makes sense. That 
way we can track it and investigate further.
G

On Nov 14, 2010, at 8:03 PM, Greg Brown wrote:

> The problem is that, even if the PI specifies UTF-8 for example, the file 
> itself may be saved with a different encoding (so they may not match).
> 
> On Nov 14, 2010, at 8:00 PM, Niclas Hedhman wrote:
> 
>> On Mon, Nov 15, 2010 at 8:52 AM, Greg Brown <[email protected]> wrote:
>>>> Doesn't the XML deserializer you use just work correctly if you pass
>>>> an InputStream instead of a Reader??
>>> 
>>> 
>>> Actually, I think a Reader would work, but we don't currently expose that 
>>> API. We use javax.xml.stream.XMLInputFactory#createXMLStreamReader() to 
>>> process the XML, which takes an InputStream as an argument. What we should 
>>> probably do is allow the caller to specify the character set to read (there 
>>> is another version of createXMLStreamReader() that takes both an 
>>> InputStream and an encoding name as a String).
>> 
>> That is incorrect. The XML specification says that the <?xml?> processing
>> instruction is in (IIRC) ASCII and that it contains the encoding of the
>> rest of the document, such as <?xml version="1.0" encoding="UTF-8"
>> ?>, and compliant parsers should understand this. So, for instance, if
>> the document is in UTF-16, the <?xml?> PI is NOT, and a regular text
>> editor would have a problem handling that. For UTF-8, ISO-8859-X
>> and others, the encoding coincides with ASCII, so this is not so obvious.
>> 
>> Cheers
>> -- 
>> Niclas Hedhman, Software Developer
>> http://www.qi4j.org - New Energy for Java
>> 
>> I  live here; http://tinyurl.com/2qq9er
>> I  work here; http://tinyurl.com/2ymelc
>> I relax here; http://tinyurl.com/2cgsug
> 
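
For anyone following the thread, here is a rough sketch of the two 
createXMLStreamReader() overloads discussed above and of the mismatch case 
(declaration says UTF-8, file saved as ISO-8859-1). The class name, helper 
methods and sample document are made up for illustration and are not part of 
any existing API. Whether an explicitly passed encoding overrides a 
conflicting declaration depends on the parser implementation, so the sketch 
does not assume a result for that case.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; the names here are illustrative only.
public class XmlEncodingSketch {

    // Builds a small document whose declaration names declaredEncoding
    // but whose bytes are actually written using actualCharset.
    public static byte[] document(String declaredEncoding, Charset actualCharset) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declaredEncoding + "\"?>"
                + "<greeting>caf\u00e9</greeting>";
        return xml.getBytes(actualCharset);
    }

    // Parses the bytes and returns the concatenated character data.
    // If explicitEncoding is null, the parser detects the encoding itself;
    // otherwise the (InputStream, String) overload is used.
    // Returns null if the parser rejects the input.
    public static String readText(byte[] bytes, String explicitEncoding) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            XMLStreamReader reader = (explicitEncoding == null)
                    ? factory.createXMLStreamReader(in)
                    : factory.createXMLStreamReader(in, explicitEncoding);
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                // Character data may arrive in several events, so append each one.
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
            return text.toString();
        } catch (Exception e) {
            // Malformed input (e.g. bytes that do not match the declared encoding).
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = document("UTF-8", StandardCharsets.UTF_8);
        byte[] latin1 = document("ISO-8859-1", StandardCharsets.ISO_8859_1);
        byte[] mismatched = document("UTF-8", StandardCharsets.ISO_8859_1);

        // Declaration and bytes agree: the parser picks up the encoding itself.
        System.out.println("UTF-8, detected:        " + readText(utf8, null));
        System.out.println("ISO-8859-1, detected:   " + readText(latin1, null));

        // Caller names the encoding explicitly (consistent with the declaration).
        System.out.println("ISO-8859-1, explicit:   " + readText(latin1, "ISO-8859-1"));

        // Declaration says UTF-8 but the bytes are ISO-8859-1.
        System.out.println("Mismatched, detected:   " + readText(mismatched, null));
    }
}
```

The first three cases show the parser honouring the declaration without any 
help from the caller. The last one is the situation described earlier in the 
thread, where the declaration and the saved bytes disagree and the text cannot 
be recovered correctly by detection alone.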
