Woo Ju Shin created COMPRESS-212:
------------------------------------
Summary: TarArchiveEntry getName() returns wrongly encoded name
even when you set encoding to TarArchiveInputStream
Key: COMPRESS-212
URL: https://issues.apache.org/jira/browse/COMPRESS-212
Project: Commons Compress
Issue Type: Bug
Affects Versions: 1.4.1
Environment: Red Hat Enterprise Linux, MS Windows 7
Reporter: Woo Ju Shin
Priority: Minor
I have two systems: one runs Red Hat Linux, the other MS Windows.
I created a *.tgz file on Red Hat Linux and tried to decompress it on MS
Windows using Commons Compress.
The default system encodings differ: UTF-8 on Red Hat Linux and CP949 on
MS Windows.
It seems that file names are decoded with the platform default encoding even
when I pass the encoding explicitly, as in the following code used to untar it:
FileInputStream fis = new FileInputStream(new File(*.tgz));
TarArchiveInputStream zis = new TarArchiveInputStream(
    new BufferedInputStream(fis), encodingOfRedHatLinux);
TarArchiveEntry entry;
while ((entry = (TarArchiveEntry) zis.getNextEntry()) != null)
{
    // The returned name looks as if it was decoded as CP949 rather than UTF-8,
    // so it does not match the original file name.
    entry.getName();
}
According to the Javadoc of this constructor,
/**
 * Constructor for TarInputStream.
 * @param is the input stream to use
 * @param encoding name of the encoding to use for file names
 * @since Commons Compress 1.4
 */
public TarArchiveInputStream(InputStream is, String encoding) {
    this(is, TarBuffer.DEFAULT_BLKSIZE, TarBuffer.DEFAULT_RCDSIZE,
         encoding);
}
the encoding argument should be used for file names.
In practice, however, it does not appear to be applied.
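
To illustrate the symptom, here is a minimal, JDK-only sketch of what happens when the name bytes of a tar header are decoded with a charset other than the one used to write them. It is not Commons Compress code: the class TarNameDecodingSketch and its methods are hypothetical, the header it builds contains only the 100-byte name field of the classic tar layout, and ISO-8859-1 stands in for CP949 only because it is guaranteed to exist on every JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TarNameDecodingSketch {

    // In the classic tar header the entry name occupies the first 100 bytes, NUL-padded.
    static final int NAME_LENGTH = 100;

    /** Decodes the name field of a raw header with an explicitly chosen charset. */
    static String decodeName(byte[] header, Charset charset) {
        int end = 0;
        // The name ends at the first NUL byte or at the end of the field.
        while (end < NAME_LENGTH && end < header.length && header[end] != 0) {
            end++;
        }
        return new String(header, 0, end, charset);
    }

    /** Builds a hypothetical 512-byte header in which only the name field is filled. */
    static byte[] headerWithName(String name, Charset charset) {
        byte[] header = new byte[512];
        byte[] raw = name.getBytes(charset);
        System.arraycopy(raw, 0, header, 0, Math.min(raw.length, NAME_LENGTH));
        return header;
    }

    public static void main(String[] args) {
        // Two Hangul syllables followed by ".txt", written with escapes so the
        // source file encoding does not matter.
        String original = "\uD55C\uAE00.txt";

        // Archive written on the UTF-8 system.
        byte[] header = headerWithName(original, StandardCharsets.UTF_8);

        // Decoding with the writer's charset recovers the name.
        String matching = decodeName(header, StandardCharsets.UTF_8);
        // Decoding with a different charset (stand-in for the platform default) does not.
        String mismatched = decodeName(header, StandardCharsets.ISO_8859_1);

        System.out.println("matching charset recovers name:   " + original.equals(matching));
        System.out.println("mismatched charset recovers name: " + original.equals(mismatched));
    }
}
```

If the encoding passed to the constructor were honoured, getName() would be expected to behave like the first decode; the behaviour described in this report resembles the second.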