[ 
https://issues.apache.org/jira/browse/COMPRESS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543402#comment-13543402
 ] 

Woo Ju Shin commented on COMPRESS-212:
--------------------------------------

The archive was created on Red Hat Linux with the tar command. The files 
included in the archive were created with the cat command, using file names 
encoded in "UTF-8". The system default encoding is also set to "UTF-8".
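
To illustrate why the decoding charset matters, here is a minimal sketch 
using only the JDK, not Commons Compress. It assumes the ustar layout, in 
which the entry name occupies the first 100 bytes of the 512-byte header and 
is NUL-padded. The class TarNameDecoding and its helpers are hypothetical 
names made up for this example, and ISO-8859-1 is only a stand-in for "some 
other default encoding" such as CP949, because it is available in every JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TarNameDecoding {
    // Assumption: ustar layout, name field is bytes 0..99 of the header.
    static final int NAME_LENGTH = 100;

    // Hypothetical helper: builds a 512-byte header block that carries
    // only the name field, encoded with the given charset.
    static byte[] headerWithName(String name, Charset cs) {
        byte[] header = new byte[512];
        byte[] raw = name.getBytes(cs);
        if (raw.length > NAME_LENGTH) {
            throw new IllegalArgumentException("name too long for the name field");
        }
        System.arraycopy(raw, 0, header, 0, raw.length);
        return header;
    }

    // Decodes the NUL-terminated name field with an explicit charset
    // instead of relying on the platform default.
    static String decodeName(byte[] header, Charset cs) {
        int end = 0;
        while (end < NAME_LENGTH && header[end] != 0) {
            end++;
        }
        return new String(header, 0, end, cs);
    }

    public static void main(String[] args) {
        // Korean file name written with Unicode escapes so the source
        // file encoding does not matter.
        String original = "\uD55C\uAE00.txt";
        byte[] header = headerWithName(original, StandardCharsets.UTF_8);

        String right = decodeName(header, StandardCharsets.UTF_8);
        String wrong = decodeName(header, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip matches: " + right.equals(original));
        System.out.println("Other charset matches:    " + wrong.equals(original));
    }
}
```

The bytes in the archive are identical in both cases; only the charset used 
to turn them into a String differs, which is the behaviour the encoding 
argument of TarArchiveInputStream is expected to control.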
                
> TarArchiveEntry getName() returns wrongly encoded name even when you set 
> encoding to TarArchiveInputStream
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-212
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-212
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.4.1
>         Environment: Red Hat Enterprise Linux, MS Windows 7
>            Reporter: Woo Ju Shin
>            Priority: Minor
>
> I have two file systems. One is on Red Hat Linux, the other on MS Windows.
> I created a *.tgz file on Red Hat Linux and tried to decompress it on MS 
> Windows using Commons Compress.
> The default system encodings are different: UTF-8 on Red Hat Linux and CP949 
> on MS Windows.
> It seems that the file names are decoded with the default encoding even 
> though I pass the archive's encoding when I untar it, as follows:
> FileInputStream fis = new FileInputStream(new File(*.tgz));
> TarArchiveInputStream zis = new TarArchiveInputStream(
>         new BufferedInputStream(fis), encodingOfRedHatLinux);
> while ((entry = (TarArchiveEntry) zis.getNextEntry()) != null)
> {
>     // filename is not UTF-8; it is encoded in CP949, so the filename
>     // isn't consistent
>     entry.getName();
> }
> According to the Javadoc of this constructor:
>     /**
>      * Constructor for TarInputStream.
>      * @param is the input stream to use
>      * @param encoding name of the encoding to use for file names
>      * @since Commons Compress 1.4
>      */
>     public TarArchiveInputStream(InputStream is, String encoding) {
>         this(is, TarBuffer.DEFAULT_BLKSIZE, TarBuffer.DEFAULT_RCDSIZE, encoding);
>     }
> the encoding should be used for file names, but in practice this doesn't 
> seem to work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
