[ 
https://issues.apache.org/jira/browse/DAFFODIL-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Beckerle updated DAFFODIL-2600:
------------------------------------
    Fix Version/s: 3.2.1
                       (was: 3.3.0)

> encoding varies with environment - UTF-8 not properly set somewhere
> -------------------------------------------------------------------
>
>                 Key: DAFFODIL-2600
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2600
>             Project: Daffodil
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: 3.1.0, 3.2.0
>            Reporter: Mike Beckerle
>            Assignee: Steve Lawrence
>            Priority: Major
>             Fix For: 3.2.1
>
>
> DFDL schemas and the behavior of parsers/unparsers are NOT supposed to be 
> dependent on environment variables like LANG.
> Our diagnostic messages might be affected, but infoset contents and data 
> contents should not be. So only negative tests which are checking 
> error/warning messages should be sensitive to environmental things like LANG. 
> However, positive tests fail if UTF-8 is not properly specified 
> environmentally. This is a bug because it means somewhere we're getting a 
> default (environmentally specified) character set encoding, when we should be 
> specifying the encoding. 
> In addition, Daffodil does require that systems are set up to enable Unicode.  
> A clear diagnostic is needed if, when building Daffodil, the UTF-8 
> capabilities are not properly set up. This otherwise leads to a long list of 
> errors that are not easily interpreted.
> Note that LANG=en_US isn't sufficient. On some systems Unicode/UTF-8 is the 
> default charset for en_US; on others it is some other charset. A portable 
> check here may be somewhat challenging, given that different systems have 
> different defaults (e.g., Linux Mint vs. Red Hat Linux, and that's just 
> considering Linux). We know MS-Windows also requires specific UTF-8 
> configuration. So likely we need a test that
> (1) runs very early or first, so that the error message isn't lost in the mix
> (2) checks that UTF-8 behaviors are working properly for Daffodil, regardless 
> of how that particular operating system variant must be configured to get 
> those settings. 
>  
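
One way the early check described in (1) and (2) might look is sketched below. It is a minimal illustration only, not Daffodil's actual fix: the class name Utf8EnvironmentCheck and the method diagnose are hypothetical. Rather than inspecting LANG or any OS-specific setting, it compares how the JVM's default charset encodes a few non-ASCII characters against an explicit UTF-8 encoding, and reports a single readable diagnostic when they differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: this class and its method names are assumptions
// made for illustration, not part of Daffodil.
public class Utf8EnvironmentCheck {

    // Returns null when the given charset encodes the sample text exactly
    // as UTF-8 does; otherwise returns a one-line diagnostic message.
    static String diagnose(Charset defaultCharset) {
        // Sample covering 2-byte, 3-byte, and 4-byte UTF-8 sequences:
        // U+00E9, U+4E2D, and U+1F600 (written as a surrogate pair).
        String sample = "\u00e9\u4e2d\ud83d\ude00";
        byte[] viaDefault = sample.getBytes(defaultCharset);
        byte[] viaUtf8 = sample.getBytes(StandardCharsets.UTF_8);
        if (Arrays.equals(viaDefault, viaUtf8)) {
            return null;
        }
        return "Default charset is " + defaultCharset.name()
            + ", which does not encode text as UTF-8. "
            + "Configure the environment for UTF-8 before building or testing.";
    }

    public static void main(String[] args) {
        // Charset.defaultCharset() reflects whatever the environment gave the JVM.
        String problem = diagnose(Charset.defaultCharset());
        if (problem != null) {
            System.err.println(problem);
        } else {
            System.out.println("UTF-8 environment check passed");
        }
    }
}
```

In a real build this would presumably run before any other test and fail the build when diagnose returns a message, so the diagnostic is not buried under later failures. The sketch only detects a misconfigured environment; the underlying bug in the description is addressed separately by passing an explicit charset (for example StandardCharsets.UTF_8) wherever bytes and strings are converted, instead of relying on the default. Note also that on JDK 18 and later the default charset is UTF-8 unless overridden, so the check mainly matters on older JDKs.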



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
