[ 
https://issues.apache.org/jira/browse/CAMEL-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568526#comment-14568526
 ] 

Sergey Sidashov commented on CAMEL-8356:
----------------------------------------

The encoding problem with IOConverter still seems to exist. I am trying to 
load a text file in cp1251 encoding using the file component (for example, 
uri=file:C:\addr\in\?charset=cp1251). I then wrote a bean with the following 
method:

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public static String convertStreamToString(InputStream inputStream)
        throws IOException {
    if (inputStream == null) {
        return null;
    }
    // Initial capacity; adjust if the expected size is known.
    StringBuilder sb = new StringBuilder(2048);
    char[] buffer = new char[128]; // Read buffer size.
    // Decode as cp1251; an IOException propagates instead of being swallowed.
    try (InputStreamReader reader =
            new InputStreamReader(inputStream, "cp1251")) {
        int n;
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n);
        }
    }
    return sb.toString();
}
This method tests the conversion from File to InputStream. For some files the 
stream reads the entire content, but for others the content is truncated. 
Reading appears to stop at certain characters (for example, in cp1251 
encoding, reading stops at the characters 'яя'). Camel version 2.15.2, Java 
version 1.8.0_45.
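
For comparison, here is a minimal sketch of a baseline check that bypasses 
Camel and reads a cp1251 file through a plain JDK reader chain. The class name 
Cp1251Baseline, the helper readAll, and the sample text are made up for 
illustration. It only shows what a complete read should return for content 
containing 'яя'; it does not show where the truncation happens.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of Camel.
public class Cp1251Baseline {

    // Reads the whole file with a plain JDK reader chain,
    // decoding the bytes with the given charset.
    public static String readAll(Path file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[128];
        try (Reader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file.toFile()), charset))) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Charset cp1251 = Charset.forName("windows-1251");
        // Sample text (an assumption): '\u044F' is the Cyrillic letter 'я'.
        String expected = "abc \u044F\u044F xyz";

        Path file = Files.createTempFile("cp1251-sample", ".txt");
        try {
            Files.write(file, expected.getBytes(cp1251));
            String actual = readAll(file, cp1251);
            // Report whether the full content survived the round trip.
            System.out.println("complete read: " + expected.equals(actual));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Running the same file through the Camel route and comparing the result with 
readAll should show whether the content is cut short inside the converter or 
somewhere else.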

> IOConverter.toInputStream(file, charset) returns strange behaving stream
> ------------------------------------------------------------------------
>
>                 Key: CAMEL-8356
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8356
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.14.1, 2.15.0
>            Reporter: Stefan Mandel
>            Assignee: Willem Jiang
>             Fix For: 2.14.2, 2.15.0
>
>         Attachments: 
> CAMEL8356-repaired-Test-and-adjusted-converter-imple.patch, 
> IOConverterCharsetTest.java, german.iso-8859-1.txt, german.utf-8.txt
>
>
> Calling IOConverter.toInputStream with either UTF-8 or ISO-8859-1 returns a 
> stream that behaves strangely on non-ASCII characters:
> - wrapping this stream in an InputStreamReader returns incorrectly encoded 
> characters
> - a naive new BufferedReader(new InputStreamReader(new FileInputStream(file), 
> charset)) returns the correctly encoded characters.
> I will attach some unit tests for this case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
