ppkarwasz commented on code in PR #710:
URL: https://github.com/apache/commons-compress/pull/710#discussion_r2380163633
##########
src/main/java/org/apache/commons/compress/archivers/tar/TarFile.java:
##########
@@ -65,7 +66,7 @@ private final class BoundedTarEntryInputStream extends
BoundedArchiveInputStream
BoundedTarEntryInputStream(final TarArchiveEntry entry, final
SeekableByteChannel channel) throws IOException {
super(entry.getDataOffset(), entry.getRealSize());
if (channel.size() - entry.getSize() < entry.getDataOffset()) {
- throw new ArchiveException("Entry size exceeds archive size");
+ throw new EOFException("Truncated TAR archive: entry size
exceeds archive size");
Review Comment:
I switched the exception to `EOFException` to make `TarFile` and
`TarArchiveInputStream` behave consistently, since they share the same test
suite in `MaxNameEntryLengthTest`.
However, you are right: this is not a “real” EOF. With a genuine EOF you can
usually still read the non-truncated part of the entry. Here the check is
performed eagerly in the constructor and prevents reading anything at all.
This raises the question: should we remove this upfront check and instead
let truncation errors surface naturally while reading downstream? I generally
favor eager validation, but only when we know the stream will be fully consumed
before handing control back to the caller. In this case the check applies
unconditionally: it triggers both when we consume entries internally (e.g. PAX
headers) and when users obtain the stream themselves.
One more caveat: `SeekableByteChannel#size()` is not stable by contract. It
may change if the underlying file is modified, so even if this check fails here
it might have succeeded if performed at a later stage.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]