[ https://issues.apache.org/jira/browse/TIKA-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746490#comment-16746490 ]
Tim Allison commented on TIKA-2818:
-----------------------------------
Sorry, I should have started with: Thank you for raising this issue and sharing
an example file.
Is there any chance you could share a sample file where the first file is
encrypted but the second file is not?
Thank you, again!
> RarParser throws EncryptedDocumentException only when whole archive is
> encrypted
> --------------------------------------------------------------------------------
>
> Key: TIKA-2818
> URL: https://issues.apache.org/jira/browse/TIKA-2818
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.20
> Reporter: Pavel Arnošt
> Priority: Minor
> Attachments: rar4_encrypted_content_only.rar
>
>
> RarParser throws EncryptedDocumentException only if the whole archive is
> encrypted. If only individual files are encrypted, the parser fails with
> org.apache.tika.exception.TikaException: RarParser Exception:
> Caused by: org.apache.tika.exception.TikaException: RarParser Exception
> at org.apache.tika.parser.pkg.RarParser.parse(RarParser.java:99)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:159)
> at ... 43 more
> Caused by: com.github.junrar.exception.RarException: ioError
> at com.github.junrar.Archive.getInputStream(Archive.java:525)
> at org.apache.tika.parser.pkg.RarParser.parse(RarParser.java:81)
> ... 48 more
> Caused by: com.github.junrar.exception.RarException: crcError
> at com.github.junrar.Archive.doExtractFile(Archive.java:557)
> at com.github.junrar.Archive.extractFile(Archive.java:498)
> at com.github.junrar.Archive.getInputStream(Archive.java:523)
> ... 49 more
> File encryption should be checked before trying to extract content on line 79
> like this:
> FileHeader header = rar.nextFileHeader();
> // Guard against an empty archive before checking the first entry.
> if (header != null && header.isEncrypted()) {
>     throw new EncryptedDocumentException();
> }
> while (header != null && !Thread.currentThread().isInterrupted()) {
> Or maybe insert it into metadata with
> TikaCoreProperties.TIKA_META_EXCEPTION_EMBEDDED_STREAM key? I don't know, but
> current behaviour is not correct (parsing fails).
> Sample document is attached.
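
For discussion, here is a minimal Java sketch of how the check might look if it were done per entry inside the loop, so that an archive mixing encrypted and unencrypted files is handled entry by entry. It is not the RarParser code. `Header`, `HeaderSource`, `sourceOf` and `header` are hypothetical stand-ins for junrar's `FileHeader` and `Archive`, introduced only so the example runs without the library; the only calls taken from the report are `nextFileHeader()` and `isEncrypted()`. Recording the names of encrypted entries is an assumption about the desired behaviour; throwing EncryptedDocumentException at that point would be the other option raised in the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class EncryptedEntrySketch {

    /** Hypothetical stand-in for junrar's FileHeader (only what the sketch needs). */
    interface Header {
        String name();
        boolean isEncrypted();
    }

    /** Hypothetical stand-in for junrar's Archive: returns null when no entries remain. */
    interface HeaderSource {
        Header nextFileHeader();
    }

    /**
     * Walks all entries, checking each one for encryption before it would be extracted.
     * Names of encrypted entries are appended to encryptedOut; the names of the
     * remaining entries are returned in place of real content extraction.
     */
    static List<String> parseEntries(HeaderSource rar, List<String> encryptedOut) {
        List<String> parsed = new ArrayList<>();
        Header header = rar.nextFileHeader();
        while (header != null && !Thread.currentThread().isInterrupted()) {
            if (header.isEncrypted()) {
                // Assumption: record and skip; a parser could throw here instead.
                encryptedOut.add(header.name());
            } else {
                parsed.add(header.name());
            }
            header = rar.nextFileHeader();
        }
        return parsed;
    }

    /** Test helper: builds a HeaderSource over a fixed list of headers. */
    static HeaderSource sourceOf(Header... headers) {
        Iterator<Header> it = Arrays.asList(headers).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Test helper: builds a Header with the given name and encryption flag. */
    static Header header(String name, boolean encrypted) {
        return new Header() {
            @Override public String name() { return name; }
            @Override public boolean isEncrypted() { return encrypted; }
        };
    }

    public static void main(String[] args) {
        List<String> encrypted = new ArrayList<>();
        List<String> parsed = parseEntries(
                sourceOf(header("secret.txt", true), header("plain.txt", false)),
                encrypted);
        System.out.println("parsed=" + parsed + " encrypted=" + encrypted);
    }
}
```

Because the check sits inside the loop, it does not matter whether the first, the second, or every entry is encrypted, and an empty archive simply yields no entries.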