[ 
https://issues.apache.org/jira/browse/CAMEL-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291186#comment-14291186
 ] 

Volodymyr Sobotovych commented on CAMEL-8191:
---------------------------------------------

I also noticed an inaccuracy in the description of the "charset" option in the 
documentation (http://camel.apache.org/file2.html):

Camel 2.9.3: this option is used to specify the encoding of the file, _and 
camel will set the Exchange property with Exchange.CHARSET_NAME with the value 
of this option_. You can use this on the consumer, to specify the encodings of 
the files, which allow Camel to know the charset it should load the file 
content in case the file content is being accessed. Likewise when writing a 
file, you can use this option to specify which charset to write the file as 
well. See further below for a examples and more important details.

The inaccurate part is highlighted in _italic_ above. None of the endpoints 
(file, ftp, sftp) sets Exchange.CHARSET_NAME, as illustrated by the output of 
this test:
{code}
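import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class FileEncodingTest extends CamelTestSupport {
    @Test
    public void testFileEncoding() {
        template.sendBody("direct:in", "Hi there");
    }

    @Override
    protected RouteBuilder createRouteBuilder() throws Exception {
        return new RouteBuilder() {

            @Override
            public void configure() throws Exception {
                from("direct:in")
                        // (1) logged before the message reaches the file endpoint
                        .log("Charset name header (1): ${header.CamelCharsetName}")
                        .to("file://output.txt?charset=iso-8859-1")
                        // (2) logged after the file endpoint has written the file
                        .log("Charset name header (2): ${header.CamelCharsetName}")
                        // (3) logged after the header has been set explicitly
                        .setHeader(Exchange.CHARSET_NAME, constant("iso-8859-1"))
                        .log("Charset name header (3): ${header.CamelCharsetName}");
            }
        };
    }
}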
{code}

{code}
[                          main] route1                         INFO  Charset name header (1): 
[                          main] SendProcessor                  DEBUG >>>> Endpoint[file://output.txt?charset=iso-8859-1] Exchange[Message: Hi there]
[                          main] FileOperations                 DEBUG Using Reader to write file: output.txt/ID-wheleph-Lenovo-G570-42931-1422203242220-0-1 with charset: iso-8859-1
[                          main] GenericFileProducer            DEBUG Wrote [output.txt/ID-wheleph-Lenovo-G570-42931-1422203242220-0-1] to [Endpoint[file://output.txt?charset=iso-8859-1]]
[                          main] route1                         INFO  Charset name header (2): 
[                          main] route1                         INFO  Charset name header (3): iso-8859-1
{code}

> Charset is ignored for SFTP producer endpoints
> ----------------------------------------------
>
>                 Key: CAMEL-8191
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8191
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-ftp
>    Affects Versions: 2.12.3, 2.14.1
>         Environment: vso@vso-desktop:/tmp$ uname -a
> Linux vso-desktop 3.13.0-43-generic #72~precise1-Ubuntu SMP Tue Dec 9 
> 12:14:18 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> vso@vso-desktop:/tmp$ java -version
> java version "1.7.0_65"
> OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-0ubuntu0.12.04.1)
> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
> vso@vso-desktop:/tmp$ ssh -v
> OpenSSH_5.9p1 Debian-5ubuntu1.4, OpenSSL 1.0.1 14 Mar 2012
>            Reporter: Volodymyr Sobotovych
>              Labels: charset, sftp
>             Fix For: 2.14.2, 2.15.0
>
>         Attachments: CAMEL-8191.patch
>
>
> For SFTP producer endpoints, the "charset" option is ignored and the output 
> file is created using the platform-default charset (usually UTF-8). 
> The following simple Spring context illustrates the issue:
> {code}
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xsi:schemaLocation="
>        http://www.springframework.org/schema/beans 
>        http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
>        http://camel.apache.org/schema/spring 
>        http://camel.apache.org/schema/spring/camel-spring.xsd">
>   <camelContext xmlns="http://camel.apache.org/schema/spring">
>     <route>
>       <from uri="stream:in?promptMessage=Enter something:" />
>       <to uri="sftp://localhost:22/vso/sandbox?charset=ISO-8859-1&amp;username=fake_sftp_user&amp;password=qwerty"/>
>     </route>
>   </camelContext>
> </beans>
> {code}
> This context defines a route that transfers the string entered by the user 
> via SFTP. If the user enters "Müller", I see a 7-byte message in the output 
> directory (because "ü" is represented using 2 bytes in UTF-8), whereas it 
> should be a 6-byte message if the file were encoded in ISO-8859-1.
> This problem affects only SFTP endpoints. File and FTP endpoints treat the 
> "charset" option correctly.
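
As a quick sanity check of the byte counts quoted in the description, here is a minimal sketch that uses only the JDK (no Camel involved). The class and method names (CharsetByteCount, encodedLength) are made up for illustration and are not part of the issue or the attached patch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class CharsetByteCount {

    // Returns the number of bytes needed to encode the text in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "Muller" with u-umlaut (U+00FC), written as a Unicode escape so the
        // encoding of this source file does not affect the result.
        String text = "M\u00fcller";

        System.out.println("UTF-8:      " + encodedLength(text, StandardCharsets.UTF_8) + " bytes");
        System.out.println("ISO-8859-1: " + encodedLength(text, StandardCharsets.ISO_8859_1) + " bytes");
    }
}
```

Running it should report 7 bytes for UTF-8 and 6 bytes for ISO-8859-1. Comparing the size of the file written by the SFTP producer against these two numbers is a simple way to tell which charset was actually used.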



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
