Jens Reimann created COMPRESS-459:
-------------------------------------
Summary: CPIO fails decoding multibyte name entries
Key: COMPRESS-459
URL: https://issues.apache.org/jira/browse/COMPRESS-459
Project: Commons Compress
Issue Type: Bug
Components: Compressors
Affects Versions: 1.17, 1.9
Reporter: Jens Reimann
When a CPIO archive is read in a multi-byte encoding (e.g. UTF-8) and an entry
name contains multi-byte characters, the decoder fails.
The problem IMHO is the "getHeaderPadCount" method, which assumes a single byte
per character:
{code:java}
public int getHeaderPadCount() {
    if (this.alignmentBoundary == 0) { return 0; }
    int size = this.headerSize + 1; // Name has terminating null
    if (name != null) {
        size += name.length();
    }
    final int remain = size % this.alignmentBoundary;
    if (remain > 0) {
        return this.alignmentBoundary - remain;
    }
    return 0;
}
{code}
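However, this may (or may not) be true for UTF-8: "name.length()" counts
characters, so it only matches the number of bytes when the name is pure ASCII.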
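To illustrate the mismatch, here is a minimal sketch using only the JDK. The class and method names (NameLengthDemo, charCount, utf8ByteCount) are made up for this example and are not part of Commons Compress.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Commons Compress.
final class NameLengthDemo {

    // Number of UTF-16 code units, which is what String#length() returns.
    static int charCount(String name) {
        return name.length();
    }

    // Number of bytes the name occupies once encoded as UTF-8.
    static int utf8ByteCount(String name) {
        return name.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "file.txt";
        String umlaut = "f\u00fcr.txt"; // "für.txt", the u-umlaut takes two bytes in UTF-8

        // ASCII only: both counts agree (8 and 8).
        System.out.println(charCount(ascii) + " chars, " + utf8ByteCount(ascii) + " bytes");
        // Multi-byte character: 7 chars but 8 bytes.
        System.out.println(charCount(umlaut) + " chars, " + utf8ByteCount(umlaut) + " bytes");
    }
}
```

With a name like the second one, a pad count derived from "name.length()" is computed from a size that is one byte too small.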
Also, it wouldn't be enough to call "String#getBytes(…)" on the decoded name, as
decoding might already have transformed the underlying bytes, so re-encoding
need not give back the original length.
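A small sketch of why re-encoding is unreliable, assuming plain JDK String decoding as a stand-in for whatever decoder the CPIO reader actually uses (that decoder may behave differently). The class name RoundTripDemo and its method are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Commons Compress.
final class RoundTripDemo {

    // Decodes raw name bytes as UTF-8, then encodes the String back to UTF-8.
    static byte[] roundTrip(byte[] rawName) {
        String decoded = new String(rawName, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Well-formed UTF-8 survives the round trip unchanged.
        byte[] valid = "f\u00fcr".getBytes(StandardCharsets.UTF_8);
        System.out.println(valid.length + " -> " + roundTrip(valid).length);

        // 0xFF is never valid in UTF-8. The JDK decoder replaces it with
        // U+FFFD, which re-encodes as three bytes, so the length changes.
        byte[] invalid = { (byte) 'a', (byte) 0xFF };
        System.out.println(invalid.length + " -> " + roundTrip(invalid).length);
    }
}
```

In the second case the two bytes stored in the archive come back as four, so a size derived from the decoded String would no longer describe what is in the stream.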
The proper solution would be to provide the name size, as read from the CPIO
stream, and pass it to the entry.
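A rough sketch of what that could look like, not a patch against the actual class. It assumes the caller passes in the number of name bytes as read from the stream, excluding the terminating null. The class PadCountSketch, its method names and the sample values (header size 110, alignment 4) are chosen for illustration only.

```java
// Hypothetical sketch, not the Commons Compress implementation.
final class PadCountSketch {

    // Pad count computed from the name size in bytes as read from the stream.
    // Assumption: nameSizeInBytes excludes the terminating null.
    static int headerPadCount(int headerSize, int alignmentBoundary, long nameSizeInBytes) {
        if (alignmentBoundary == 0) {
            return 0;
        }
        long size = headerSize + 1L + nameSizeInBytes; // +1 for the terminating null
        int remain = (int) (size % alignmentBoundary);
        return remain > 0 ? alignmentBoundary - remain : 0;
    }

    // The character-based calculation from the report, kept for comparison.
    static int headerPadCountFromChars(int headerSize, int alignmentBoundary, String name) {
        int chars = name == null ? 0 : name.length();
        return headerPadCount(headerSize, alignmentBoundary, chars);
    }

    public static void main(String[] args) {
        String name = "f\u00fcr.txt"; // 7 characters, 8 bytes in UTF-8
        // Character-based: 110 + 1 + 7 = 118, remainder 2, pad 2.
        System.out.println(headerPadCountFromChars(110, 4, name));
        // Byte-based: 110 + 1 + 8 = 119, remainder 3, pad 1.
        System.out.println(headerPadCount(110, 4, 8));
    }
}
```

The two results differ by one byte for this name, which would be enough to misalign everything the reader consumes after the header.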