PS 

It seems that even on Linux, the platform default
encoding differs by distribution. Red Hat distributions
use UTF-8 as the default encoding, but SuSE seems to use
ISO-8859-1 as default.
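
To make the difference concrete, here is a minimal sketch (plain
Java, no JSch involved; the class and method names are made up
for illustration) that prints the platform default and shows how
the same file name turns into different bytes under the two
encodings mentioned above:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingDemo {
    // Encode a name with the given charset and render the bytes as lowercase hex.
    public static String hex(String name, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : name.getBytes(cs)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "moechte" with o-umlaut, written as an escape so the
        // encoding of this source file does not matter.
        String name = "m\u00f6chte";

        // Depends on the machine and on the user's locale settings.
        System.out.println("platform default: " + Charset.defaultCharset());

        // o-umlaut is one byte (f6) in ISO-8859-1 ...
        System.out.println("ISO-8859-1: " + hex(name, StandardCharsets.ISO_8859_1)); // 6df663687465
        // ... and two bytes (c3 b6) in UTF-8.
        System.out.println("UTF-8:      " + hex(name, StandardCharsets.UTF_8));      // 6dc3b663687465
    }
}
```

A String.getBytes() call without an explicit charset uses whatever
the first line prints, so the bytes on the wire would depend on
where the client happens to run.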

And, as I said, users can change it in their shells.
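
Since the encoding can vary per user, a client-side setting
along the lines of what I proposed in the message quoted below
might look roughly like this. This is only a sketch under my own
assumptions: FilenameEncoder, setEncoding and toSshString are
hypothetical names, not existing JSch API. The one thing taken
from the SSH specs is that a "string" field is a 4-byte
big-endian length followed by the raw bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JSch.
public class FilenameEncoder {
    // Assumed default: UTF-8, as the filexfer draft asks for.
    private Charset encoding = StandardCharsets.UTF_8;

    // Lets the client override the encoding used for remote names.
    // Charset.forName throws an unchecked exception for unknown names.
    public void setEncoding(String charsetName) {
        this.encoding = Charset.forName(charsetName);
    }

    // Builds an SSH "string" field: uint32 big-endian length, then the bytes.
    public byte[] toSshString(String filename) {
        byte[] raw = filename.getBytes(encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((raw.length >>> 24) & 0xff);
        out.write((raw.length >>> 16) & 0xff);
        out.write((raw.length >>> 8) & 0xff);
        out.write(raw.length & 0xff);
        out.write(raw, 0, raw.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        FilenameEncoder enc = new FilenameEncoder();
        String name = "m\u00f6chte"; // o-umlaut as an escape

        // UTF-8: 7 payload bytes + 4 length bytes
        System.out.println(enc.toSshString(name).length); // 11

        // Server side known to use ISO-8859-1: 6 payload bytes + 4 length bytes
        enc.setEncoding("ISO-8859-1");
        System.out.println(enc.toSshString(name).length); // 10
    }
}
```

With something like this, UTF-8 stays the default, but a client
that knows the remote account uses a different encoding can say so.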

Cheers,
--
Martin Oberhuber
Wind River Systems, Inc.
Target Management Project Lead, DSDP PMC Member
http://www.eclipse.org/dsdp/tm 

> -----Original Message-----
> From: Oberhuber, Martin 
> Sent: Friday, September 21, 2007 11:48 AM
> To: 'Atsuhiko Yamanaka'
> Cc: jsch-users@lists.sourceforge.net
> Subject: RE: [JSch-users] Jsch ChannelSftp and character encodings
> 
> Hello Atsuhiko,
> 
> I'm not a big expert on encodings, but it seems to me
> that unconditionally forcing UTF-8 is not the right 
> thing to do, at least for the following reasons:
> 
> 1.) As I understand it, on a UNIX box every user is free 
>     to choose his own encoding. User A could be using UTF-8
>     but user B could be using ISO8859-1 or whatever he prefers.
>     In the shell, the change is made by setting an environment
>     variable.
>     But how would the SSHD know what encoding a user prefers?
>     It's running as root, isn't it? So how would it convert
>     from UTF-8 to the user's preferred encoding?
> 
> 2.) Although the RFC seems to recommend UTF-8, it looks like 
>     practical implementations do not actually recode.
> 
> 3.) Old versions of Jsch defaulted to the platform encoding,
>     so if files with extended chars were written with an old
>     Jsch, they cannot be read properly with a new Jsch that
>     forces UTF-8.
> 
> Because of all these reasons, I still think the better way
> is to allow the client to choose the default encoding, as I was
> proposing. If it turns out that UTF-8 is the correct default,
> the client can call
>    Channel.setDefaultEncoding("UTF-8");
> otherwise, whatever encoding the client prefers can be set.
> 
> But as I said, I'm not a big expert on encodings and I'm
> happy to discuss this.
> 
> Cheers,
> --
> Martin Oberhuber
> Wind River Systems, Inc.
> Target Management Project Lead, DSDP PMC Member
> http://www.eclipse.org/dsdp/tm 
> 
> > -----Original Message-----
> > From: Atsuhiko Yamanaka [mailto:[EMAIL PROTECTED] 
> > Sent: Thursday, September 20, 2007 4:58 AM
> > To: Oberhuber, Martin
> > Cc: jsch-users@lists.sourceforge.net
> > Subject: Re: [JSch-users] Jsch ChannelSftp and character encodings
> > 
> > Hi,
> > 
> >    +-From: "Oberhuber, Martin" <[EMAIL PROTECTED]> --
> >    |_Date: Wed, 19 Sep 2007 11:47:37 +0200 _______________________
> >    |
> >    |I'm wondering if anybody thought about the case yet where
> >    |I'd like to transfer files via Sftp, where the file names
> >    |use non-ASCII foreign language characters, and the character
> >    |encoding on the local system is different than the remote.
> >    |
> >    |Say, I want to transfer from a Windows box to a Linux box.
> >    |On Windows, my encoding is Cp1252
> >    |On remote Linux, my encoding is UTF-8
> >    |I want to transfer file "möchte"
> >    |
> >    |Currently, channel always seems to encode Java Unicode Strings
> >    |with the platform default encoding (Cp1252 in my case). On the 
> >    |remote, file names will not appear as expected.
> > 
> > Yes, it is a bug/incompleteness of jsch.
> > 
> > As far as I understand, we have to send filenames in UTF-8 over
> > the sftp protocol.  For example, its IETF draft[1] says the following:
> > 
> >   8.1.1.  Opening a File
> >      Files are opened and created using the SSH_FXP_OPEN message.
> >          byte   SSH_FXP_OPEN
> >          uint32 request-id
> >          string filename [UTF-8]
> >          uint32 desired-access
> >          uint32 flags
> >          ATTRS  attrs
> > 
> > On the other hand, in the current jsch implementation,
> > filenames are sent in the local default encoding.
> > 
> > I'll fix it in the next version, but it will cause
> > trouble for others.
> > It seems to me that OpenSSH's sftp-server (for example,
> > in openssh-4.7p1) has not implemented such encoding
> > conversion.  So, in the above case, if the remote host
> > does not use UTF-8, users will get unexpected results.
> > This is the reason I had not implemented it.
> > 
> >    |To fix this, I think there should be 
> >    |    Channel.setControlEncoding(String encoding)
> >    |so I can specify the encoding to use for file and
> >    |path names on the remote. At the time the Java Unicode
> >    |String for arguments is converted to a byte array, it
> >    |should do so with the default encoding specified by me.
> > 
> > Unfortunately, it is not up to the client to choose the
> > encoding; filenames must be sent in UTF-8 according to the RFC.
> > 
> > 
> > [1] http://tools.ietf.org/html/draft-ietf-secsh-filexfer-13
> > 
> > 
> > Sincerely,
> > --
> > Atsuhiko Yamanaka
> > JCraft,Inc.
> > 1-14-20 HONCHO AOBA-KU,
> > SENDAI, MIYAGI 980-0014 Japan.
> > Tel +81-22-723-2150
> >     +1-415-578-3454
> > Fax +81-22-224-8773
> > Skype callto://jcraft/
> > 

_______________________________________________
JSch-users mailing list
JSch-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jsch-users
