Hello!

I'm sending an HTML document to Solr, and Tika is throwing a "Zip bomb 
detected!" exception back. It looks like Tika has an arbitrary limit of 100 
levels of XML element nesting 
(https://github.com/apache/tika/blob/9130bbc1fa6d69419b2ad294917260d6b1cced08/tika-core/src/main/java/org/apache/tika/sax/SecureContentHandler.java#L72-L75).
Luckily, the variable (maxDepth) does have a public setter, but I am not sure 
whether it can be set from Solr. Is it possible? If so, how would I set 
maxDepth to a higher number?
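
For context, the kind of check behind the "levels of XML element nesting" message can be sketched with a plain SAX handler that counts open elements. This is a minimal illustration using only the JDK. It is not Tika's SecureContentHandler, the class and method names (DepthLimitSketch, DepthLimitHandler, parsesWithin) are made up for the example, and the exact boundary Tika uses may differ.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: hypothetical names, not part of Tika or Solr.
final class DepthLimitSketch {

    // Rejects any document whose elements nest deeper than maxDepth.
    static final class DepthLimitHandler extends DefaultHandler {
        private final int maxDepth;
        private int depth = 0;

        DepthLimitHandler(int maxDepth) {
            this.maxDepth = maxDepth;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            depth++;
            if (depth > maxDepth) {
                throw new SAXException("Suspected zip bomb: " + depth
                        + " levels of XML element nesting");
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            depth--;
        }
    }

    // Returns true if the XML parses without exceeding maxDepth, and false
    // on a depth violation (or on malformed XML, which also raises SAXException).
    static boolean parsesWithin(String xml, int maxDepth) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DepthLimitHandler(maxDepth));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, a document nested three levels deep is rejected at a limit of 2 and accepted at a limit of 3, which is the behaviour I would like to get from Tika by raising maxDepth.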

Thanks!

Here is the full stack trace:
2018-04-05 16:47:48.034 ERROR (qtp1654589030-15) [   x:aconn] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Zip bomb detected!
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:234)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
        at ca.calgary.csc.wds.solr.GsaAconnRequestHandler.handleRequestBody(GsaAconnRequestHandler.java:84)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.tika.exception.TikaException: Zip bomb detected!
        at org.apache.tika.sax.SecureContentHandler.throwIfCauseOf(SecureContentHandler.java:192)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:138)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
        ... 35 more
Caused by: org.apache.tika.sax.SecureContentHandler$SecureSAXException: Suspected zip bomb: 100 levels of XML element nesting
        at org.apache.tika.sax.SecureContentHandler.startElement(SecureContentHandler.java:234)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.SafeContentHandler.startElement(SafeContentHandler.java:264)
        at org.apache.tika.sax.XHTMLContentHandler.startElement(XHTMLContentHandler.java:255)
        at org.apache.tika.sax.XHTMLContentHandler.startElement(XHTMLContentHandler.java:297)
        at org.apache.tika.parser.html.HtmlHandler.startElementWithSafeAttributes(HtmlHandler.java:251)
        at org.apache.tika.parser.html.HtmlHandler.startElement(HtmlHandler.java:167)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.parser.html.XHTMLDowngradeHandler.startElement(XHTMLDowngradeHandler.java:60)
        at org.ccil.cowan.tagsoup.Parser.push(Parser.java:794)
        at org.ccil.cowan.tagsoup.Parser.rectify(Parser.java:1061)
        at org.ccil.cowan.tagsoup.Parser.stagc(Parser.java:1016)
        at org.ccil.cowan.tagsoup.HTMLScanner.scan(HTMLScanner.java:625)
        at org.ccil.cowan.tagsoup.Parser.parse(Parser.java:449)
        at org.apache.tika.parser.html.HtmlParser.parse(HtmlParser.java:135)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
        ... 36 more


