Hi, this is my first message on this list.
Is it possible to disable Zip bomb detection in the Tika handler?
I've also described the problem here:
http://stackoverflow.com/questions/39628519/how-to-disable-or-increase-limit-zip-bomb-detection-in-tika-with-solr-config?noredirect=1#comment66575342_39628519
Basically, I get the following error when trying to process some large,
valid HTML documents:
RSolr::Error::Http - 500 Internal Server Error
Error:
{'responseHeader'=>{'status'=>500,'QTime'=>76},'error'=>{'metadata'=>['error-class','org.apache.solr.common.SolrException','root-error-class','org.apache.tika.sax.SecureContentHandler$SecureSAXException'],'msg'=>'org.apache.tika.exception.TikaException: Zip bomb detected!','trace'=>'org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Zip bomb detected!
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:234)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2089)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:652)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:459)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
I need to index those documents. Is it possible to disable Zip bomb
detection, or to raise its limit, through configuration files? I noticed
it's possible to add a tika.config file, but I don't know how to express
either of those settings in a Tika configuration file.
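
For context, this is roughly how I understand a Tika configuration file
gets hooked into the extracting handler in solrconfig.xml. It is only a
sketch: the handler name and the file path are placeholder assumptions,
and I don't know what would have to go inside the referenced file (if
anything can) to change the Zip bomb limit.

```xml
<!-- Sketch only: the handler name and path are placeholders (assumptions). -->
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <!-- Optional: path to a Tika configuration file. -->
  <str name="tika.config">/my/path/to/tika.config</str>
</requestHandler>
```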
Any help is appreciated!
Thanks in advance,
Rodrigo.