[
https://issues.apache.org/jira/browse/FLUME-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057370#comment-14057370
]
Santiago M. Mola commented on FLUME-2215:
-----------------------------------------
I would like to see something like [~adutra]'s second approach implemented. The
character decoding logic should be separated from the byte reading, just as it
is separated in the Java standard library. While the first approach works, it
forces us to solve the same problem again at every level. For example, I
have a (highly hacky) ResettableDecompressInputStream implementation that wraps
a ResettableFileInputStream and decompresses it. Even after applying the first
patch, I would still have to handle character decoding myself in
ResettableDecompressInputStream.
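
To illustrate the layering argued for above, here is a minimal sketch using only
the Java standard library. It is not Flume code: the class and method names are
hypothetical, and GZIPInputStream merely stands in for a decompressing stream
(it does not model reset/mark support). The point is that the decoding code in
readFirstLine() is written once and works on any byte source.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration: byte reading and character decoding as separate layers.
public class LayeredDecodingSketch {

    // Helper for the demo: gzip a string, standing in for a compressed file.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // The decoding layer: it only sees an InputStream and does not care
    // whether the bytes come from a plain file or a decompressor.
    static String readFirstLine(InputStream bytes) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(bytes, StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Includes a character outside the Basic Multilingual Plane (U+1F600).
        String text = "caf\u00e9 \uD83D\uDE00\nsecond line\n";

        InputStream plain =
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        InputStream compressed =
                new GZIPInputStream(new ByteArrayInputStream(gzip(text)));

        // Same decoding code, two different byte sources.
        System.out.println(readFirstLine(plain).equals(readFirstLine(compressed)));
    }
}
```

With this shape, a decompressing wrapper only has to deliver bytes; it never
needs its own character-decoding logic.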
> ResettableFileInputStream can't support ucs-4 character
> --------------------------------------------------------
>
> Key: FLUME-2215
> URL: https://issues.apache.org/jira/browse/FLUME-2215
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.5.0
> Reporter: syntony liu
> Priority: Critical
> Labels: patch
> Attachments: FLUME-2215-0-README.txt, FLUME-2215-0.patch,
> FLUME-2215-1-README.txt, FLUME-2215-1.patch
>
>
> ResettableFileInputStream.java:readChar() does not handle UCS-4 characters.
> Such a character needs 2 chars in charBuf, which causes an unexpected
> termination.
> A temporary solution:
> if (res.isOverflow() && !charBuf.hasRemaining()) {
>   logger.warn("decoder ucs-4 at position: {}", buf.position());
>   tmpBuf.clear();
>   res = decoder.decode(buf, tmpBuf, isEndOfInput);
>   incrPosition(buf.position() - start, false);
>   return '?';
> }
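
The overflow condition described in the report can be reproduced outside Flume.
The sketch below is a standalone illustration with hypothetical names, assuming
that "UCS-4 character" means a code point outside the Basic Multilingual Plane:
such a code point is 4 bytes in UTF-8 and needs a surrogate pair (two Java
chars), so a one-char output buffer cannot hold it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the decoder overflow, independent of Flume.
public class SurrogateOverflowSketch {

    // Decode the given UTF-8 bytes into 'out' with a fresh decoder and
    // return the decoder's result (underflow, overflow, or an error).
    static CoderResult decodeInto(byte[] utf8, CharBuffer out) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        return decoder.decode(ByteBuffer.wrap(utf8), out, true);
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP: 4 bytes in UTF-8, 2 chars in UTF-16.
        byte[] utf8 = new String(Character.toChars(0x1F600))
                .getBytes(StandardCharsets.UTF_8);

        // A single-char buffer has no room for the surrogate pair.
        CharBuffer one = CharBuffer.allocate(1);
        System.out.println("capacity 1 overflow: " + decodeInto(utf8, one).isOverflow());

        // A two-char buffer receives both surrogates.
        CharBuffer two = CharBuffer.allocate(2);
        CoderResult result = decodeInto(utf8, two);
        System.out.println("capacity 2 overflow: " + result.isOverflow()
                + ", chars written: " + two.position());
    }
}
```

Note that the temporary solution quoted above replaces the whole character with
'?', so the original code point is lost; a buffer with room for two chars avoids
the overflow in the first place.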
--
This message was sent by Atlassian JIRA
(v6.2#6252)