[
https://issues.apache.org/jira/browse/COMPRESS-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17903842#comment-17903842
]
Gary D. Gregory edited comment on COMPRESS-689 at 12/7/24 10:39 PM:
--------------------------------------------------------------------
Hello [~dcokan]
Thank you for your report.
The issue here is that reading a ZIP with an input stream starts at the
beginning of the file (obviously), while the metadata you are testing is
stored at the end of the file, in the central directory entries.
Unfortunately, we use the same class for both kinds of entries, so it's a bit
confusing. Fortunately, you can use the ZipFile class to get this information.
{code:java}
try (ZipFile zipFile = ZipFile.builder().setFile("target/zipWithLinks.zip").get()) {
    assertTrue(zipFile.getEntry("link").isUnixSymlink(),
            "'link' detected but it's not sym link");
    assertFalse(zipFile.getEntry("original").isUnixSymlink(),
            "'original' detected but it's sym link and should be regular file");
}
{code}
See {{ZipArchiveInputStreamTest.testWriteZipWithLinks()}}.
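
To make the difference between the two kinds of headers concrete, here is a minimal sketch that uses only the JDK (java.util.zip, not Commons Compress) to build a small archive in memory and then looks at the raw bytes. The class name ZipHeaderPeek and its helper methods are made up for illustration; the field offsets are the ones given in the ZIP APPNOTE. The local file header at the start of the file only carries a "version needed to extract" field, while the central directory header near the end carries "version made by" (whose high byte is the platform) and the external file attributes (where Unix mode bits are stored).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Commons Compress.
final class ZipHeaderPeek {
    static final int LOCAL_HEADER_SIG = 0x04034b50;
    static final int CENTRAL_HEADER_SIG = 0x02014b50;
    static final int END_OF_CENTRAL_DIR_SIG = 0x06054b50;

    // Builds a one-entry archive in memory with the JDK's own ZIP writer.
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("original"));
            zip.write("original content".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Scans backwards for the end-of-central-directory record (at least 22 bytes long)
    // and returns the central directory offset stored 16 bytes into that record.
    static int centralDirectoryOffset(ByteBuffer zip) {
        for (int pos = zip.capacity() - 22; pos >= 0; pos--) {
            if (zip.getInt(pos) == END_OF_CENTRAL_DIR_SIG) {
                return zip.getInt(pos + 16);
            }
        }
        throw new IllegalArgumentException("No end of central directory record found");
    }

    public static void main(String[] args) {
        ByteBuffer zip = ByteBuffer.wrap(sampleZip()).order(ByteOrder.LITTLE_ENDIAN);

        // Local file header: this is all a streaming reader has seen when it returns the entry.
        System.out.printf("local signature:        %08x%n", zip.getInt(0));
        System.out.printf("version needed:         %04x%n", zip.getShort(4) & 0xFFFF);

        // Central directory header: only reachable by seeking to the end of the file.
        int cd = centralDirectoryOffset(zip);
        System.out.printf("central signature:      %08x%n", zip.getInt(cd));
        System.out.printf("version made by:        %04x%n", zip.getShort(cd + 4) & 0xFFFF);
        System.out.printf("external attributes:    %08x%n", zip.getInt(cd + 38));
    }
}
```

A streaming reader has to hand out each entry before it has seen the central directory, which is why the platform and external attributes are not available to it at that point.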
> Unable to detect symlinks in ZIP
> --------------------------------
>
> Key: COMPRESS-689
> URL: https://issues.apache.org/jira/browse/COMPRESS-689
> Project: Commons Compress
> Issue Type: Bug
> Affects Versions: 1.27.1
> Environment: MacOs M1 Sonoma 14.4
> Reporter: Dawid Iwo Cokan
> Priority: Major
>
> *Context:*
> In my project I need to prepare a ZIP with data under different paths. Some
> of the resources appear there multiple times, so I wanted to store each of
> them only once. I initially thought about using hard links and a tar archive
> (I did a POC and it works), but it has to be a ZIP. So I decided to use
> symlinks.
> *Problem:*
> Creating a ZIP with symlinks works (when I unzip it on my system the link
> is preserved), but when I parse the same ZIP with my code using the same
> version of commons-compress,
> {code:java}
> entry.isUnixSymlink()
> {code}
> always returns false.
>
> Here is a snippet to reproduce the problem:
>
> {code:java}
> @Test
> public void createZIPWithLinks() throws IOException {
>     try (OutputStream output = new FileOutputStream("zipWithLinks.zip");
>          ZipArchiveOutputStream zipOutputStream = new ZipArchiveOutputStream(output)) {
>         zipOutputStream.putArchiveEntry(new ZipArchiveEntry("original"));
>         zipOutputStream.write("original content".getBytes());
>         zipOutputStream.closeArchiveEntry();
>         // Mark the entry as a Unix symlink (file type 0120000, permissions 0444).
>         ZipArchiveEntry symlinkEntry = new ZipArchiveEntry("link");
>         symlinkEntry.setUnixMode(0120444);
>         zipOutputStream.putArchiveEntry(symlinkEntry);
>         // The content of a symlink entry is the path of its target.
>         zipOutputStream.write("original".getBytes());
>         zipOutputStream.closeArchiveEntry();
>     }
>     int entriesCount = 0;
>     try (ZipArchiveInputStream zipInputStream =
>             new ZipArchiveInputStream(new FileInputStream("zipWithLinks.zip"))) {
>         ZipArchiveEntry entry;
>         while ((entry = zipInputStream.getNextEntry()) != null) {
>             if ("link".equals(entry.getName())) {
>                 assertTrue(entry.isUnixSymlink(),
>                         "'link' detected but it's not sym link");
>             } else {
>                 assertFalse(entry.isUnixSymlink(),
>                         "'original' detected but it's sym link and should be regular file");
>             }
>             entriesCount++;
>         }
>     }
>     assertEquals(2, entriesCount);
> }
> {code}
>
>
> I dug a bit into the ZIP specification to understand what's wrong, and in
> the zipdetails output for the archive I can see this:
>
> {quote}0000 LOCAL HEADER #1 04034B50
> 0004 Extract Zip Spec 14 '2.0'
> *0005 Extract OS 00 'MS-DOS'*
> {quote}
>
> I can see that 'isUnixSymlink()' checks whether the platform field is UNIX;
> otherwise it ignores the rest of the information. The platform field in
> ZipArchiveEntry is set in ZipArchiveInputStream (line 687) based on the local
> file header:
>
> {code:java}
> int off = WORD; // = 4
> current = new CurrentEntry();
> final int versionMadeBy = ZipShort.getValue(lfhBuf, off);
> off += SHORT; // = 20, HEX: 14
> current.entry.setPlatform(versionMadeBy >> ZipFile.BYTE_SHIFT & ZipFile.NIBLET_MASK);
> {code}
>
>
> And here is something I don't understand. I see that this reads the 'Extract
> Zip Spec' field, which is correct. But then the operation:
>
> {code:java}
> versionMadeBy >> ZipFile.BYTE_SHIFT & ZipFile.NIBLET_MASK {code}
> produces 0, so later isUnixSymlink() always returns false because:
>
> {code:java}
> public boolean isUnixSymlink() {
>     return (getUnixMode() & UnixStat.FILE_TYPE_FLAG) == UnixStat.LINK_FLAG;
> }
> public int getUnixMode() {
>     return platform != PLATFORM_UNIX ? 0 : (int) (getExternalAttributes() >> SHORT_SHIFT & SHORT_MASK);
> }
> {code}
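
The bit arithmetic in the quoted description can be checked in isolation with plain Java. The sketch below re-declares the constants locally with the values I understand Commons Compress to use (BYTE_SHIFT = 8, NIBLET_MASK = 0x0f, SHORT_SHIFT = 16, SHORT_MASK = 0xFFFF, PLATFORM_UNIX = 3, FILE_TYPE_FLAG = 0170000, LINK_FLAG = 0120000). Those values, the class name UnixModeBits, and the sample field value 0x0314 are assumptions for illustration; this is not the library's code.

```java
// Hypothetical stand-alone mirror of the checks quoted in the issue description.
final class UnixModeBits {
    // Assumed to match the constants in Commons Compress; re-declared so the sketch is self-contained.
    static final int BYTE_SHIFT = 8;
    static final int NIBLET_MASK = 0x0f;
    static final int SHORT_SHIFT = 16;
    static final int SHORT_MASK = 0xFFFF;
    static final int PLATFORM_UNIX = 3;
    static final int FILE_TYPE_FLAG = 0170000;
    static final int LINK_FLAG = 0120000;

    // Extracts the platform from the high byte of a two-byte version field.
    static int platform(int versionField) {
        return (versionField >> BYTE_SHIFT) & NIBLET_MASK;
    }

    // Unix mode lives in the upper 16 bits of the external attributes, and only counts on Unix.
    static int unixMode(int platform, long externalAttributes) {
        return platform != PLATFORM_UNIX ? 0 : (int) ((externalAttributes >> SHORT_SHIFT) & SHORT_MASK);
    }

    static boolean isUnixSymlink(int platform, long externalAttributes) {
        return (unixMode(platform, externalAttributes) & FILE_TYPE_FLAG) == LINK_FLAG;
    }

    public static void main(String[] args) {
        // External attributes as they would look with the mode 0120444 in the upper 16 bits.
        long externalAttributes = 0120444L << SHORT_SHIFT;

        // 0x0014 is the local header value shown by zipdetails: version 2.0, high byte 00.
        int fromLocalHeader = platform(0x0014);
        // 0x0314 is a made-up field value whose high byte is 3 (Unix).
        int fromUnixField = platform(0x0314);

        System.out.println(fromLocalHeader);                                     // 0
        System.out.println(fromUnixField);                                       // 3
        System.out.println(isUnixSymlink(fromLocalHeader, externalAttributes));  // false
        System.out.println(isUnixSymlink(fromUnixField, externalAttributes));    // true
    }
}
```

With the platform derived from the local header's high byte (00), the mode is forced to 0 and the symlink check cannot succeed, whatever the external attributes contain.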