[
https://issues.apache.org/jira/browse/TIKA-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17475557#comment-17475557
]
Tim Allison edited comment on TIKA-3644 at 1/13/22, 5:26 PM:
-------------------------------------------------------------
It looks like the package-depth detector (the 5 in your config) is not
triggered if the embedded document extractor calls parse with
outputHTML=false. In the MSOffice parser, outputHTML=true; in the
OOXMLParser, however, outputHTML=false. I propose that we set outputHTML=true
throughout the parsers.
That said, I did get a zip bomb exception when I set maxDepth to 5 on both
MSOffice and OOXML files.
-Looking at the SecureContentHandler, I'm frankly not certain what the
difference between packageDepth and depth is.- :P
It looks like the difference is that maxPackageDepth should cover embedded
items, whereas maxDepth covers all HTML elements.
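To illustrate the distinction, here is a minimal sketch of the two counters using only the JDK's SAX classes. It is a simplified model, not Tika's SecureContentHandler: the class name DepthGuardSketch, the check helper, and the exact thresholds are hypothetical, and it assumes that embedded items show up in the output as <div class="package-entry"> markers only when outputHTML=true.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Simplified, hypothetical model of the two nesting limits (not Tika code). */
class DepthGuardSketch extends DefaultHandler {
    private final int maxDepth;         // limit on nesting of all elements
    private final int maxPackageDepth;  // limit on nesting of embedded-item markers
    private int depth = 0;
    // Element depths at which a package-entry marker was opened.
    private final Deque<Integer> packageEntryDepths = new ArrayDeque<>();

    DepthGuardSketch(int maxDepth, int maxPackageDepth) {
        this.maxDepth = maxDepth;
        this.maxPackageDepth = maxPackageDepth;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        depth++;
        if (depth > maxDepth) {
            throw new SAXException("element depth " + depth + " exceeds " + maxDepth);
        }
        // Assumption: an embedded item is only visible through this marker.
        if ("div".equals(qName) && "package-entry".equals(atts.getValue("class"))) {
            packageEntryDepths.push(depth);
            if (packageEntryDepths.size() > maxPackageDepth) {
                throw new SAXException("package depth " + packageEntryDepths.size()
                        + " exceeds " + maxPackageDepth);
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Leave the package entry when its opening element is closed.
        if (!packageEntryDepths.isEmpty() && packageEntryDepths.peek() == depth) {
            packageEntryDepths.pop();
        }
        depth--;
    }

    /** Returns "ok", or "rejected: ..." if either limit is exceeded. */
    static String check(String xhtml, int maxDepth, int maxPackageDepth) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xhtml)),
                    new DepthGuardSketch(maxDepth, maxPackageDepth));
            return "ok";
        } catch (Exception e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Three nested embedded items, with markers (models outputHTML=true).
        String marked = "<body><div class=\"package-entry\"><div class=\"package-entry\">"
                + "<div class=\"package-entry\"><p>x</p></div></div></div></body>";
        // Same nesting without markers (models outputHTML=false).
        String unmarked = "<body><div><div><div><p>x</p></div></div></div></body>";

        System.out.println(check(marked, 100, 2));   // package limit applies
        System.out.println(check(unmarked, 100, 2)); // package limit never sees the nesting
        System.out.println(check(unmarked, 3, 2));   // plain element limit still applies
    }
}
```

In this model the package counter only advances on marked elements, so unmarked nesting can only be caught by the plain element-depth limit, which would match what I saw when lowering maxDepth.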
> OfficeParser can not detect embedded zip bomb in the office documents
> ---------------------------------------------------------------------
>
> Key: TIKA-3644
> URL: https://issues.apache.org/jira/browse/TIKA-3644
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 2.2.1
> Reporter: Sergen Bağ
> Priority: Minor
> Attachments: 10_2_2_2_2.zip, tika_exception.PNG, zipbomb.doc,
> zipbomb.docx, zipbomb.ppt, zipbomb.pptx, zipbomb.xls, zipbomb.xlsx
>
>
> Hi, I am trying to get the "zip bomb detection" exception but I can't. Using
> the attached files, I observed the following:
> When I send "zipbomb.xls" and "zipbomb.doc" to Tika, Tika throws an exception.
> When I send "zipbomb.xlsx", "zipbomb.docx", "zipbomb.ppt" and "zipbomb.pptx" to
> Tika, Tika doesn't throw an exception.
> Thanks.