[ https://issues.apache.org/jira/browse/DAFFODIL-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517006#comment-16517006 ]
Steve Lawrence edited comment on DAFFODIL-1957 at 6/19/18 12:26 PM:
--------------------------------------------------------------------

Looked into this a little, and I think this is actually the correct behavior. The documentPart element has an "encoding" attribute that specifies how the documentPart content is converted to bytes. It defaults to UTF-8, which is why the character 0x80 ends up as the two bytes 0xC280. So I think the tests just need to be modified to something like this:

{code:java}
<tdml:documentPart type="text" encoding="ISO-8859-1">...Q...</tdml:documentPart>
{code}

That way ISO-8859-1 encoding is used instead. The encoding specified in the <?xml ...?> declaration doesn't affect this and should probably still be UTF-8 just for consistency.

was (Author: slawrence):

Looked into this a little, and I think this is actually the correct behavior. The documentPart element has an "encoding" attribute that specifies how the documentPart content is converted to bytes. It defaults to UTF-8. So I think the tests just need to be modified to something like this:

{code:java}
<tdml:documentPart type="text" encoding="ISO-8859-1">...Q...</tdml:documentPart>
{code}

> Unicode control charachters not handled correctly by TDMLRunner
> ---------------------------------------------------------------
>
> Key: DAFFODIL-1957
> URL: https://issues.apache.org/jira/browse/DAFFODIL-1957
> Project: Daffodil
> Issue Type: Bug
> Components: TDML Runner
> Affects Versions: 2.1.0
> Reporter: Josh Adams
> Priority: Minor
>
> Ran into an issue where passing in a Unicode control character, in this case
> 0x80, was not being converted to its correct hex value. Instead it ended up
> being 0xC280, which is how this character is represented in UTF-8.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
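
For reference, the byte-level difference discussed in this comment can be reproduced outside of Daffodil. The following is only a minimal sketch in plain Python; it does not use the TDML runner, and the helper name encode_hex is hypothetical. It shows the bytes produced for the character U+0080 under the two encodings mentioned here.

```python
# Illustration only: plain Python, not Daffodil or TDML runner code.
# CONTROL_CHAR and encode_hex are hypothetical names chosen for this sketch.

CONTROL_CHAR = "\u0080"  # the control character from the issue report


def encode_hex(text: str, encoding: str) -> str:
    """Return the bytes of `text` in `encoding` as an uppercase hex string."""
    return text.encode(encoding).hex().upper()


# UTF-8 needs two bytes for U+0080; ISO-8859-1 maps it to the single byte 0x80.
utf8_hex = encode_hex(CONTROL_CHAR, "utf-8")
latin1_hex = encode_hex(CONTROL_CHAR, "iso-8859-1")

print("UTF-8:     ", utf8_hex)    # C280
print("ISO-8859-1:", latin1_hex)  # 80
```

This matches the report: with the default UTF-8 encoding the text content becomes 0xC280, while ISO-8859-1 yields the single byte 0x80 that the test expected.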