[jira] [Commented] (CB-13570) FileReader#readAsText fails with multi-byte UTF-8 characters

ASF GitHub Bot (JIRA) Thu, 23 Aug 2018 09:35:23 -0700


    [ https://issues.apache.org/jira/browse/CB-13570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590484#comment-16590484 ]


ASF GitHub Bot commented on CB-13570:
-------------------------------------

TimHambourger commented on issue #242: CB-13570 -- (android, ios, windows) 
Handle multi-byte UTF-8 characters that cross a chunk boundary
URL: https://github.com/apache/cordova-plugin-file/pull/242#issuecomment-415482339

   UPDATE: I had inadvertently broken compatibility with Java < 8 on Android. 
I have just updated the PR with a fix.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> FileReader#readAsText fails with multi-byte UTF-8 characters
> ------------------------------------------------------------
>
>                 Key: CB-13570
>                 URL: https://issues.apache.org/jira/browse/CB-13570
>             Project: Apache Cordova
>          Issue Type: Bug
>          Components: cordova-plugin-file
>    Affects Versions: 5.0.0, 4.2.0
>         Environment: Tested on:
>  * iOS 10.2
>  * cordova-ios 4.3.0
>  * UIWebView (not WKWebView)
>  * cordova 6.4.0
>  * cordova-plugin-file 4.2.0 and 5.0.0
> (Slightly old cordova version, but the issue seems to be in the plugin)
>            Reporter: Ralf Kistner
>            Priority: Major
>
> `FileReader#readAsText` reads the file in chunks of 256KB. If the file 
> contains a multi-byte UTF-8 character that is split into two separate chunks, 
> reading fails with an encoding error (ENCODING_ERR: 5).
> For many apps this is not an issue. However, if a file is larger than 256KB 
> and contains many multi-byte characters, this is likely to happen.
> I have not experienced this issue on Android yet.
> Code that demonstrates the issue: 
> https://gist.github.com/anonymous/0fdc1ec212be1e29309820477257a0c3
> In the example, the reading splits the two-byte UTF-8 encoding of the 
> '\u0153' character (0xC5 0x93) across two chunks, and neither half decodes 
> as valid UTF-8 on its own.
> A workaround is to use readAsArrayBuffer instead, and do the decoding in 
> JavaScript. However, the decoding can be quite slow on iOS where a native 
> TextDecoder is not available.
> One solution would be to make the chunk sizes semi-flexible, to ensure that 
> it ends on a character boundary (make the chunk larger until decoding 
> succeeds).
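
The "semi-flexible chunk size" idea in the description can be sketched in JavaScript as follows. This is only an illustration under stated assumptions, not the code from the pull request: `extendToCharBoundary` and `chunkRanges` are hypothetical helper names, the input is assumed to be well-formed UTF-8 held in a `Uint8Array`, and `chunkSize` is assumed to be at least 1. Rather than retrying the decode, the sketch grows a chunk while the next byte is a UTF-8 continuation byte (bit pattern `10xxxxxx`), which has the same effect for valid input.

```javascript
// Hypothetical helper, not part of cordova-plugin-file.
// Returns an end offset >= nominalEnd (capped at bytes.length) such that
// the chunk ending there stops on a UTF-8 character boundary.
// Assumes `bytes` holds well-formed UTF-8.
function extendToCharBoundary(bytes, nominalEnd) {
    var end = Math.min(nominalEnd, bytes.length);
    // Continuation bytes have the bit pattern 10xxxxxx. If the byte that
    // would start the next chunk is one, the current chunk ends mid-character.
    while (end < bytes.length && (bytes[end] & 0xC0) === 0x80) {
        end++;
    }
    return end;
}

// Hypothetical helper: split a byte array into [start, end) ranges of
// roughly chunkSize bytes that never cut a character in two.
// Assumes chunkSize >= 1.
function chunkRanges(bytes, chunkSize) {
    var ranges = [];
    var start = 0;
    while (start < bytes.length) {
        var end = extendToCharBoundary(bytes, start + chunkSize);
        ranges.push([start, end]);
        start = end;
    }
    return ranges;
}

// 'a', then U+0153 (UTF-8 bytes 0xC5 0x93), then 'b'.
var sample = new Uint8Array([0x61, 0xC5, 0x93, 0x62]);
// A naive 2-byte split would cut between 0xC5 and 0x93.
var ranges = chunkRanges(sample, 2);
```

With a nominal chunk size of 2, the first chunk is extended by one byte so that both bytes of U+0153 stay together. Because a UTF-8 character is at most 4 bytes long, a chunk grows by at most 3 bytes under this scheme.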



