Support for de/encoding of tar entry names other than plain 8BIT conversion.
----------------------------------------------------------------------------
Key: COMPRESS-183
URL: https://issues.apache.org/jira/browse/COMPRESS-183
Project: Commons Compress
Issue Type: Improvement
Components: Archivers
Affects Versions: 1.3
Reporter: Joao Schim
Fix For: 1.4
Attachments: patch-tar-name-encoding.diff
The names of tar entries are currently encoded and decoded by plain 8-bit
conversion of byte to char and vice versa. This prevents the use of encodings
such as UTF-8 in file names. Whether using UTF-8 (or any other non-ASCII
encoding) in file names is sensible is a topic of its own. However, tar
archives containing files whose names have been encoded with UTF-8 do exist in
the wild. These archives currently cannot be read correctly by commons-compress
because the encoding is hardcoded to plain 8-bit conversion.
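
To illustrate the problem, here is a minimal sketch of what a plain 8-bit
conversion does to a UTF-8 encoded name. The class and method names
(EightBitNameDemo, decode8Bit, decodeUtf8) are made up for this example and are
not taken from commons-compress; the loop only approximates the byte-to-char
conversion described above.

```java
import java.nio.charset.StandardCharsets;

public class EightBitNameDemo {

    // Hypothetical helper: plain 8-bit decoding, one char per byte.
    static String decode8Bit(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length);
        for (byte b : raw) {
            // Mask to avoid sign extension of bytes >= 0x80.
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Hypothetical helper: decode the same bytes as UTF-8.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "h\u00e9llo.txt" contains one non-ASCII character (e with acute),
        // which UTF-8 stores as the two bytes 0xC3 0xA9.
        byte[] raw = "h\u00e9llo.txt".getBytes(StandardCharsets.UTF_8);

        // 8-bit decoding turns the two-byte sequence into two separate chars.
        System.out.println(decode8Bit(raw).length()); // 10
        // UTF-8 decoding restores the original nine-character name.
        System.out.println(decodeUtf8(raw).length()); // 9
    }
}
```

With 8-bit decoding the entry name comes back one character longer than the
original and no longer matches the name on disk.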
The supplied patch allows encodings other than 8-bit to be used by means of a
TarArchiveCodec structure. It does not change the default behavior; it only
adds the option of using a different encoding.
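
The general idea of a pluggable name encoding can be sketched as follows. This
is not the code from the attached patch: NameCodec, forCharset and EIGHT_BIT are
hypothetical names chosen for illustration, and ISO-8859-1 is used here as an
assumed stand-in for the existing 8-bit behavior (it maps each byte to the char
with the same value when decoding).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NameCodecSketch {

    // Hypothetical abstraction over how entry names are converted.
    interface NameCodec {
        String decode(byte[] raw);
        byte[] encode(String name);
    }

    // Build a codec backed by any java.nio Charset.
    static NameCodec forCharset(final Charset charset) {
        return new NameCodec() {
            @Override
            public String decode(byte[] raw) {
                return new String(raw, charset);
            }

            @Override
            public byte[] encode(String name) {
                return name.getBytes(charset);
            }
        };
    }

    // Assumed default: behaves like byte-to-char conversion when decoding.
    static final NameCodec EIGHT_BIT = forCharset(StandardCharsets.ISO_8859_1);

    public static void main(String[] args) {
        NameCodec utf8 = forCharset(StandardCharsets.UTF_8);
        byte[] raw = utf8.encode("h\u00e9llo.txt");

        // The caller picks the codec; the default stays untouched.
        System.out.println(utf8.decode(raw).equals("h\u00e9llo.txt"));      // true
        System.out.println(EIGHT_BIT.decode(raw).equals("h\u00e9llo.txt")); // false
    }
}
```

Keeping the 8-bit codec as the default means existing callers see no change,
while callers that know an archive uses UTF-8 names can opt in explicitly.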
A test method was added to the TarUtilsTest JUnit test to cover the added
functionality.