[ https://issues.apache.org/jira/browse/TIKA-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aeham Abushwashi updated TIKA-2300:
-----------------------------------
    Attachment: TIKA-2300.patch

Here's a first stab at a patch for discussion....

PackageParser can easily figure out if the zip is encrypted (albeit with an ugly cast!). I figured users may not always want the PackageParser to abandon processing encrypted zip files and opted for adding a metadata flag to indicate the file is encrypted. This maintains backwards compatibility with TIKA-1028, but is it consistent with how Tika reports _partial_ success/failure elsewhere?

Also... the change made me realise the rich metadata extracted by the PackageParser for the compressed/inner files never finds its way back up to users through the metadata object. Is this by design?

> Can't tell if a zip file is encrypted
> -------------------------------------
>
>                 Key: TIKA-2300
>                 URL: https://issues.apache.org/jira/browse/TIKA-2300
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.14
>            Reporter: Aeham Abushwashi
>            Assignee: Tim Allison
>         Attachments: encrypted_file.zip, TIKA-2300.patch
>
>
> When Tika processes a zip file that is protected with a password, it will
> return the list of file names within the zip but no indication (as an
> exception or in metadata) that the file is encrypted.
> From stepping through the code, I can see that the information needed to
> determine whether the archive is encrypted is available inside
> ZipArchiveEntry#getGeneralPurposeBit#usesEncryption, but needs to be relayed
> back to PackageParser somehow

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
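
For readers following along, here is a minimal sketch of the flag the description refers to. It is not the attached patch and does not use Tika or Commons Compress. It assumes the archive begins with a local file header, and it reads bit 0 of that header's general purpose bit flag, which the ZIP format uses to mark an encrypted entry. The class and method names (ZipEncryptionCheck, firstEntryEncrypted) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of Tika or the patch.
class ZipEncryptionCheck {
    // Little-endian signature that starts every ZIP local file header ("PK\3\4").
    private static final int LOCAL_FILE_HEADER_SIG = 0x04034b50;

    /**
     * Returns true if the data starts with a ZIP local file header whose
     * general purpose bit flag has bit 0 (the encryption bit) set.
     * Only the first entry is inspected; later entries and the central
     * directory are ignored in this sketch.
     */
    static boolean firstEntryEncrypted(byte[] zipBytes) {
        // Need the 4-byte signature, 2-byte version and 2-byte flag field.
        if (zipBytes == null || zipBytes.length < 8) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(zipBytes).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != LOCAL_FILE_HEADER_SIG) {
            return false;
        }
        // The flag field sits at offset 6, after signature and "version needed".
        int flags = buf.getShort(6) & 0xFFFF;
        return (flags & 0x0001) != 0;
    }
}
```

A parser that wants to keep processing, as proposed above, could record the result of such a check as a metadata flag instead of throwing.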