Gren Elliot created COMPRESS-674:
------------------------------------
Summary: commons-compress-1.26.1 false positive on detecting archive
Key: COMPRESS-674
URL: https://issues.apache.org/jira/browse/COMPRESS-674
Project: Commons Compress
Issue Type: Bug
Components: Archivers
Affects Versions: 1.26.1
Environment: Intel running macOS Sonoma - but doubt this is significant
Reporter: Gren Elliot
I’m finding that commons-compress-1.26.1 recognises a UTF-16 text file as a
tar archive, unlike the previous version.
The detection differs because of this code, which changed in that release, in
ArchiveStreamFactory's *public static String detect(final InputStream in)
throws ArchiveException*:
    if (signatureLength >= TAR_HEADER_SIZE) {
        try (TarArchiveInputStream inputStream = new TarArchiveInputStream(new ByteArrayInputStream(tarHeader))) {
            // COMPRESS-191 - verify the header checksum
            // COMPRESS-644 - do not allow zero byte file entries
            TarArchiveEntry entry = inputStream.getNextEntry();
            // try to find the first non-directory entry within the first 10 entries.
            int count = 0;
            while (entry != null && entry.isDirectory() && count++ < TAR_TEST_ENTRY_COUNT) {
                entry = inputStream.getNextEntry();
            }
            if (entry != null && entry.isCheckSumOK() && !entry.isDirectory() && entry.getSize() > 0 || count > 0) {
                return TAR;
            }
        } catch (final Exception e) { // NOPMD NOSONAR
            // can generate IllegalArgumentException as well as IOException auto-detection, simply not a TAR ignored
        }
    }
I feel this is too lenient. For the test file, at the last “if” statement
entry is null and count=1. Because && binds more tightly than ||, count > 0 on
its own is enough to return TAR, so the null check, the checksum check and the
size check are all bypassed. The comment says the code is looking for the first
non-directory entry, but in our case it never finds one.
By contrast, the earlier code at least checked that the checksum was OK for
the one entry it examined (it isn’t for our test file…).
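To make the precedence point concrete, here is a minimal standalone sketch that models only the boolean shape of that final "if". It does not use commons-compress: the class name, method names and parameters are hypothetical stand-ins for the values the real code reads from the entry, and the "stricter" variant is just one possible tightening for illustration, not the project's actual fix.

```java
public class TarDetectCondition {

    // Same shape as the quoted condition: a && b && c && d || e.
    // Java parses this as (a && b && c && d) || e, so count > 0 alone wins.
    public static boolean asQuoted(boolean entryNotNull, boolean checksumOk,
                                   boolean isDirectory, long size, int count) {
        return entryNotNull && checksumOk && !isDirectory && size > 0 || count > 0;
    }

    // Hypothetical stricter variant (an assumption, not the library's code):
    // only a non-directory entry with a valid checksum and non-zero size counts.
    public static boolean stricter(boolean entryNotNull, boolean checksumOk,
                                   boolean isDirectory, long size) {
        return entryNotNull && checksumOk && !isDirectory && size > 0;
    }

    public static void main(String[] args) {
        // Values reported for the test file: entry == null, count == 1.
        // With a null entry the other checks never run, so they are modelled as false.
        System.out.println(asQuoted(false, false, false, 0L, 1)); // true
        System.out.println(stricter(false, false, false, 0L));    // false
    }
}
```

Note that the stricter variant would also reject a tar whose first entries are all directories, which may be what the "|| count > 0" clause was meant to allow; validating the checksum of each directory entry skipped in the loop would be another option.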
--
This message was sent by Atlassian Jira
(v8.20.10#820010)