[ 
https://issues.apache.org/jira/browse/FLUME-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057370#comment-14057370
 ] 

Santiago M. Mola commented on FLUME-2215:
-----------------------------------------

I would like to see something like [~adutra]'s second approach implemented. The
character decoding logic should be separated from the byte reading, just as it
is in the Java standard library. While the first approach works, it forces us
to solve the same problem again at every layer. For example, I have a (highly
hacky) ResettableDecompressInputStream implementation that wraps a
ResettableFileInputStream and decompresses it. Even after applying the first
patch, I still have to handle character decoding in
ResettableDecompressInputStream.
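
A minimal sketch of that separation, using only standard library classes
rather than the Flume ones. The class and method names are hypothetical, and
the mark/reset position tracking that the Resettable* streams need is left out
entirely. The byte layers (source, decompression) know nothing about charsets,
and a single Reader on top does all the decoding:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration; not part of Flume or of either attached patch.
public class LayeredDecodingSketch {

    // Gzip a string as UTF-8 bytes; stands in for a compressed input file.
    public static byte[] gzipUtf8(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Works for any byte source: decoding lives only in the Reader,
    // so no byte-level stream has to deal with characters.
    public static String readAll(InputStream byteSource) throws IOException {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(byteSource, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // U+1F600 lies outside the BMP, so it decodes to a surrogate pair.
        String original = "a" + new String(Character.toChars(0x1F600)) + "b";
        InputStream decompressed =
                new GZIPInputStream(new ByteArrayInputStream(gzipUtf8(original)));
        System.out.println(readAll(decompressed).equals(original));
    }
}
```

Swapping the GZIPInputStream for a plain ByteArrayInputStream requires no
change to readAll, which is the point of keeping decoding in one layer.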

> ResettableFileInputStream can't support  ucs-4 character
> --------------------------------------------------------
>
>                 Key: FLUME-2215
>                 URL: https://issues.apache.org/jira/browse/FLUME-2215
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.5.0
>            Reporter: syntony liu
>            Priority: Critical
>              Labels: patch
>         Attachments: FLUME-2215-0-README.txt, FLUME-2215-0.patch, 
> FLUME-2215-1-README.txt, FLUME-2215-1.patch
>
>
> ResettableFileInputStream.java:readChar() does not handle UCS-4 characters.
> Decoding one needs 2 chars in charBuf, which causes an unexpected termination.
> A temporary solution:
>      if (res.isOverflow() && !charBuf.hasRemaining()) {
>          logger.warn("decoder ucs-4 at position: {}", buf.position());
>          tmpBuf.clear();
>          res = decoder.decode(buf, tmpBuf, isEndOfInput);
>          incrPosition(buf.position() - start, false);
>          return '?';
>      }
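
The overflow described in the quoted report can be reproduced with a plain
CharsetDecoder, independent of Flume. This is only a sketch under the
assumption that the char buffer has room for a single char; the class and
method names are hypothetical. A code point outside the BMP decodes to a
surrogate pair, so a one-char output buffer cannot hold it:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not taken from ResettableFileInputStream.
public class SurrogateOverflowSketch {

    // Decode the given UTF-8 bytes into a CharBuffer of the given capacity
    // and return the decoder's result.
    public static CoderResult decodeInto(byte[] utf8, int charCapacity) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer in = ByteBuffer.wrap(utf8);
        CharBuffer out = CharBuffer.allocate(charCapacity);
        return decoder.decode(in, out, true);
    }

    public static void main(String[] args) {
        // U+1F600 is encoded as 4 bytes in UTF-8 and as 2 chars in UTF-16.
        byte[] utf8 = new String(Character.toChars(0x1F600))
                .getBytes(StandardCharsets.UTF_8);
        // One char of room: the decoder reports overflow.
        System.out.println(decodeInto(utf8, 1).isOverflow());
        // Two chars of room: the whole input is consumed (underflow).
        System.out.println(decodeInto(utf8, 2).isUnderflow());
    }
}
```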



--
This message was sent by Atlassian JIRA
(v6.2#6252)