[ 
https://issues.apache.org/jira/browse/FLUME-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208218#comment-14208218
 ] 

Johny Rufus commented on FLUME-2538:
------------------------------------

At last I was able to find the JDK bug that caused this change:
https://bugs.openjdk.java.net/browse/JDK-7096080

The portion relevant to this issue, taken from
http://mail.openjdk.java.net/pipermail/core-libs-dev/2011-September/007722.html
(a link mentioned in that bug report):

"Another corner case is how to deal with the old 5-6 bytes byte sequence, 
such as "fc 80 80 8f bf bf", we are now treating them as 1 malformed utf-8
byte sequence, so any of these 5-6 bytes "old" formed will be treated one
malformed character and then replaced by one "\ufffd". But according to the
new "best practice" recommendation, it probably should be replaced by
6 \ufffd."
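
To see the difference on a given JDK, a minimal sketch along these lines can
be used. It decodes the 6-byte sequence from the quote with
CodingErrorAction.REPLACE and counts the resulting replacement characters.
The class name ReplaceDemo and the helper decodeWithReplace are hypothetical
names chosen for illustration and are not part of Flume or the patch. The
count is printed rather than assumed, since it depends on the JDK version.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Flume.
public class ReplaceDemo {

    // Decode bytes as UTF-8, substituting the decoder's replacement string
    // (U+FFFD by default) for each malformed or unmappable sequence.
    static String decodeWithReplace(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not expected when both actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The old-style 6-byte sequence quoted from the core-libs-dev mail.
        byte[] old6 = {
            (byte) 0xfc, (byte) 0x80, (byte) 0x80,
            (byte) 0x8f, (byte) 0xbf, (byte) 0xbf
        };
        String decoded = decodeWithReplace(old6);
        // The number of U+FFFD characters varies by JDK version.
        System.out.println("U+FFFD count: " + decoded.length());
    }
}
```

Running this on JDK 7 and on JDK 8 should show the two different counts. A
test that hard-codes one count, as testUtf8DecodeErrorHandlingReplace does,
will fail on the other JDK.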

> TestResettableFileInputStream fails on JDK 8
> --------------------------------------------
>
>                 Key: FLUME-2538
>                 URL: https://issues.apache.org/jira/browse/FLUME-2538
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.5.0.1
>            Reporter: Johny Rufus
>            Assignee: Johny Rufus
>             Fix For: v1.6.0
>
>         Attachments: FLUME-2538.patch
>
>
> TestResettableFileInputStream.testUtf8DecodeErrorHandlingReplace fails in JDK 8
> "testUtf8DecodeErrorHandlingReplace(org.apache.flume.serialization.TestResettableFileInputStream)
>   Time elapsed: 6 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<...(���)
> NonUnicode: (�[])
> > but was:<...(���)
> NonUnicode: (�[����]) "
> CharsetDecoder.decode has changed its behavior in how it handles the
> CodingErrorAction.REPLACE policy.
> Will submit a patch today.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
