ppkarwasz commented on code in PR #710:
URL: https://github.com/apache/commons-compress/pull/710#discussion_r2380241120
##########
src/main/java/org/apache/commons/compress/archivers/tar/TarUtils.java:
##########
@@ -767,15 +813,24 @@ protected static Map<String, String>
parsePaxHeaders(final InputStream inputStre
final String keyword =
coll.toString(StandardCharsets.UTF_8);
// Get rest of entry
final int restLen = len - read;
+
+ // Validate entry length
+ // 1. Ignore empty keywords
if (restLen <= 1) { // only NL
headers.remove(keyword);
+ // 2. Entry length exceeds header size
} else if (headerSize >= 0 && restLen > headerSize
- totalRead) {
throw new ArchiveException("PAX header value
size %,d exceeds size of header record.", restLen);
} else {
+ // 3. Entry length exceeds configurable file
and link name limits
+ if ("path".equals(keyword) ||
"linkpath".equals(keyword)) {
+ ArchiveUtils.checkEntryNameLength(restLen
- 1, maxEntryPathLength, "TAR");
+ }
final byte[] rest =
IOUtils.readRange(inputStream, restLen);
final int got = rest.length;
if (got != restLen) {
- throw new ArchiveException("Failed to read
PAX header: Expected %,d bytes, read %,d.", restLen, got);
+ throw new EOFException(String.format(
Review Comment:
Here I beg to differ: to me this **is** a textbook case for `EOFException`.
We know the length of a PAX header value and attempt to read it. If the
stream ends before delivering the expected number of bytes, that’s exactly the
scenario `EOFException` is meant to signal. It is very similar to how
`DataInputStream.readLong()` throws `EOFException` if only three bytes are
available.
However, in practice this branch is never hit:
* If the TAR entry itself is truncated, `TarArchiveInputStream` or
`TarFile.BoundedTarEntryInputStream` already throw as soon as EOF is reached
before the declared entry size (line 815)
* If a PAX header entry claims a length larger than what remains in the TAR
entry, the earlier length check (line 809) fails first.
So this code is effectively unreachable except in unit tests. Which means
the choice of exception here doesn’t affect real-world behavior, but I still
believe `EOFException` is semantically the most accurate signal for “expected N
bytes, got fewer.”
https://github.com/apache/commons-compress/blob/7a3d21cbf94d012f769cf1cab7ac4cd629e51685/src/main/java/org/apache/commons/compress/archivers/tar/TarUtils.java#L808-L819
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]