[
https://issues.apache.org/jira/browse/FLUME-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208218#comment-14208218
]
Johny Rufus commented on FLUME-2538:
------------------------------------
At last I was able to find the bug that caused this change:
https://bugs.openjdk.java.net/browse/JDK-7096080
The portion relevant to this issue is taken from
http://mail.openjdk.java.net/pipermail/core-libs-dev/2011-September/007722.html
[a link mentioned in that bug report]:
"Another corner case is how to deal with the old 5-6 bytes byte sequence, such as
"fc 80 80 8f bf bf", we are now treating them as 1 malformed utf-8 byte sequence,
so any of these 5-6 bytes "old" formed will be treated one malformed character
and then replaced by one "\ufffd". But according to the new "best practice"
recommendation, it probably should be replaced by 6 \ufffd."
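To see the difference locally, a minimal sketch along these lines can be run on each JDK. It uses nothing from Flume: the class name Utf8ReplaceDemo and the helper decodeWithReplace are made up for illustration, and only standard java.nio.charset calls are used. It decodes the byte sequence from the quote above with CodingErrorAction.REPLACE and prints how many replacement characters come back.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Flume or its test suite.
public class Utf8ReplaceDemo {

    // Decodes the bytes as UTF-8, substituting the decoder's replacement
    // string (U+FFFD by default) for malformed or unmappable input.
    public static String decodeWithReplace(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but decode() declares it.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The old-style 6-byte sequence quoted from the core-libs-dev mail.
        byte[] oldSixByteForm = {
            (byte) 0xfc, (byte) 0x80, (byte) 0x80,
            (byte) 0x8f, (byte) 0xbf, (byte) 0xbf
        };
        String decoded = decodeWithReplace(oldSixByteForm);
        System.out.println("java.version      = " + System.getProperty("java.version"));
        System.out.println("replacement chars = " + decoded.length());
    }
}
```

The printed count depends on the JDK in use: going by the mail quoted above, the older behavior would give one replacement character for the whole sequence and the newer behavior one per byte.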
> TestResettableFileInputStream fails on JDK 8
> --------------------------------------------
>
> Key: FLUME-2538
> URL: https://issues.apache.org/jira/browse/FLUME-2538
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.5.0.1
> Reporter: Johny Rufus
> Assignee: Johny Rufus
> Fix For: v1.6.0
>
> Attachments: FLUME-2538.patch
>
>
> TestResettableFileInputStream.testUtf8DecodeErrorHandlingReplace fails in JDK 8:
> "testUtf8DecodeErrorHandlingReplace(org.apache.flume.serialization.TestResettableFileInputStream)
> Time elapsed: 6 sec <<< FAILURE!
> org.junit.ComparisonFailure: expected:<...(���)
> NonUnicode: (�[])
> > but was:<...(���)
> NonUnicode: (�[����]) "
> CharsetDecoder.decode has changed its behavior in how it handles the
> CodingErrorAction.REPLACE policy.
> Will submit a patch today.