[ 
https://issues.apache.org/jira/browse/DAFFODIL-1386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Beckerle closed DAFFODIL-1386.
--------------------------------------
    Resolution: Won't Fix

Pointless wish list item. 
Nobody is asking for this, and it's not really feasible to fix. Closing. 

> single utf-8 4-byte character becomes surrogate character pairs in scala/java 
> string
> ------------------------------------------------------------------------------------
>
>                 Key: DAFFODIL-1386
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-1386
>             Project: Daffodil
>          Issue Type: Wish
>          Components: Back End
>            Reporter: Michael Beckerle
>            Priority: Major
>
> Recent changes in 1.2.0 to the data input layers removed a feature: the 
> ability to treat surrogate-pair characters as single characters.
> See test_encodingNoError. 
> This test has a TDML representation in which a single character with a 
> 4-byte utf-8 encoding has to become a surrogate pair (two UTF-16 code 
> units) in a java/scala string, but the data input stream's char iterator 
> returns only one code unit per call to next(). There is no accommodation in 
> the data input stream layers for the possibility of a single character 
> needing 2 code units.
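
The behavior described in the issue can be reproduced with plain JDK calls, independent of Daffodil. The sketch below is only an illustration: the class name SurrogatePairDemo, the helper decode(), and the choice of U+1F600 as the sample character are assumptions, not taken from test_encodingNoError. It decodes one 4-byte utf-8 sequence and shows that the resulting string holds one code point stored as two chars.

```java
import java.nio.charset.StandardCharsets;

public class SurrogatePairDemo {
    // U+1F600 encoded in utf-8 occupies 4 bytes: F0 9F 98 80.
    // (Sample character chosen for illustration only.)
    static final byte[] UTF8_BYTES = {
        (byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80
    };

    // Decode the 4 bytes into a java.lang.String (UTF-16 internally).
    static String decode() {
        return new String(UTF8_BYTES, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = decode();
        // Number of 16-bit chars (code units) in the string.
        System.out.println("chars:       " + s.length());
        // Number of Unicode code points in the string.
        System.out.println("code points: " + s.codePointCount(0, s.length()));
        // The two chars are the high and low halves of a surrogate pair.
        System.out.println("high surrogate: " + Character.isHighSurrogate(s.charAt(0)));
        System.out.println("low surrogate:  " + Character.isLowSurrogate(s.charAt(1)));
        // Reading by code point recovers the original single character.
        System.out.println("code point: U+" + Integer.toHexString(s.codePointAt(0)).toUpperCase());
    }
}
```

An iterator that hands out one char per call to next() would return the two halves of this pair separately, which is the mismatch the issue describes.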



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
