[ 
https://issues.apache.org/jira/browse/ANY23-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282027#comment-13282027
 ] 

Peter Ansell commented on ANY23-99:
-----------------------------------

It seems like it would be better to read it all in as UTF-8, as you say, and 
then handle the exceptions when the data comes back in via a parser, so that 
users have a chance to fix the document. Silent corruption is never good. 

It does violate the general rule to write strictly according to the 
specification and read somewhat liberally, within reason, but if people are not 
generally aware of the ASCII encoding rules then it may be more useful to 
support them than to exclude them.
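
To illustrate the silent-corruption concern, here is a minimal standalone 
sketch. It is not Any23 code: the class and method names are hypothetical, and 
it assumes Java 7+ for StandardCharsets. An OutputStreamWriter built from a 
Charset replaces unmappable characters with '?', while one built from a 
CharsetEncoder set to CodingErrorAction.REPORT throws instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Any23.
public class AsciiEncodingSketch {

    // Lenient: a writer created from a Charset silently substitutes the
    // charset's replacement byte ('?' for US-ASCII) for unmappable characters.
    public static String writeLenient(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.US_ASCII)) {
            w.write(text);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    // Strict: a writer created from an encoder configured with REPORT throws
    // a CharacterCodingException when a character cannot be encoded.
    public static String writeStrict(String text) throws IOException {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, encoder)) {
            w.write(text);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        String literal = "caf\u00e9"; // contains one non-ASCII character
        System.out.println(writeLenient(literal)); // prints "caf?"
        try {
            writeStrict(literal);
        } catch (CharacterCodingException e) {
            System.out.println("strict writer rejected the input: " + e);
        }
    }
}
```

Whether the writer should fail loudly like this, or emit UTF-8 and leave 
validation to the parser, is the open question in this issue; the sketch only 
shows that simply passing US-ASCII to OutputStreamWriter takes the silent path.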
                
> NQuadsWriter should force ASCII in OutputStream constructor
> -----------------------------------------------------------
>
>                 Key: ANY23-99
>                 URL: https://issues.apache.org/jira/browse/ANY23-99
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.0
>            Reporter: Peter Ansell
>
> The NQuads specification states that all NQuads documents must be ASCII 
> encoded. [1] The current NQuadsWriter(OutputStream) constructor does not 
> enforce this when creating the OutputStreamWriter to wrap up the given 
> OutputStream. If it is not enforced, then the user's locale will be used to 
> create the OutputStreamWriter, which may not enforce US-ASCII.
> Patch is to replace the constructor with:
>         this( new OutputStreamWriter(os, Charset.forName("US-ASCII")) );
> [1] http://sw.deri.org/2008/07/n-quads/#mediatype
