Tika was upgraded from 1.7 to 1.13 in Solr 6.2, so this is likely caused by
a change in Tika.

You could _try_ downgrading Tika, but that's chancy and I have no guarantee
that it'll work.

Or write a SolrJ client that runs an older version of Tika on the client
side and sends the extracted content to Solr. Here's an example:

https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
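
If it helps with finding a sample file for the Tika JIRA: as far as I know,
Tika's SecureContentHandler raises the zip bomb error not only on a
suspicious output/input ratio but also when element nesting gets too deep
(I believe the default limit is 100 levels), which would fit Tim's point
below about mismatched tags. Here is a rough, stand-alone sketch (plain Java,
no Tika or Solr dependencies) that estimates tag nesting depth with a regex.
The class name TagDepthCheck, the ASSUMED_LIMIT constant and the counting
rule are my own illustration, not Tika's actual code. Real HTML parsers
auto-close many tags, and this ignores comments and scripts, so treat the
number only as a hint:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagDepthCheck {
    // Assumption: believed to match Tika's default nesting limit; adjust as needed.
    static final int ASSUMED_LIMIT = 100;

    // HTML elements that never take a closing tag.
    private static final Set<String> VOID = new HashSet<>(Arrays.asList(
        "area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"));

    // Matches <name ...>, </name> and <name ... />.
    // Comments and doctypes start with "<!" and are not matched,
    // but tags inside comments or scripts are still counted.
    private static final Pattern TAG =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>");

    // Returns the deepest nesting level reached when every opening tag
    // is assumed to stay open until its explicit closing tag.
    static int maxDepth(String html) {
        int depth = 0;
        int max = 0;
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (selfClosing || VOID.contains(name)) {
                continue; // does not change the nesting level
            }
            if (closing) {
                if (depth > 0) {
                    depth--; // ignore stray closing tags at depth 0
                }
            } else {
                depth++;
                max = Math.max(max, depth);
            }
        }
        return max;
    }

    // Prints the estimated depth for each file given on the command line.
    public static void main(String[] args) throws IOException {
        for (String path : args) {
            String html = new String(
                Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
            int depth = maxDepth(html);
            String flag = depth > ASSUMED_LIMIT ? "  <-- over assumed limit" : "";
            System.out.println(path + ": estimated depth " + depth + flag);
        }
    }
}
```

Run it over the documents that fail to index (java TagDepthCheck file1.html
file2.html) and the ones flagged as over the assumed limit are probably the
best candidates to attach to the JIRA issue.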

Best,
Erick

On Thu, Sep 22, 2016 at 8:01 AM, Rodrigo Rosenfeld Rosas
<rr_ro...@yahoo.com.br.invalid> wrote:
> I forgot to mention that this problem only started after I upgraded to a
> recent version of Solr and tried to reindex all documents. Some documents
> that had previously been indexed successfully now fail with this error.
>
> On 22-09-2016 11:58, Rodrigo Rosenfeld Rosas wrote:
>>
>> Hi, thanks. I was talking to @elyograg over freenode#solr, and he (or she; I
>> can't tell from the nickname) recommended that I create a Java app integrating
>> SolrJ and Tika to perform the indexing. Is this the only way to achieve that
>> with Solr? Since I'm not usually a Java developer, I'd prefer another kind
>> of solution, but if there isn't one, I'll have to look at the Java API and
>> examples for SolrJ and Tika to achieve that...
>>
>> Just wanted to confirm. I'll try to get a sample HTML document that triggers
>> this problem and attach it to Jira.
>>
>> Thanks,
>> Rodrigo.
>>
>> On 22-09-2016 11:48, Allison, Timothy B. wrote:
>>>
>>> Y, looks like Nick (gagravarr) has answered on SO -- can't do it in Tika
>>> currently.
>>>
>>> -----Original Message-----
>>> From: Allison, Timothy B. [mailto:talli...@mitre.org]
>>> Sent: Thursday, September 22, 2016 10:42 AM
>>> To: solr-user@lucene.apache.org
>>> Cc: 'u...@tika.apache.org' <u...@tika.apache.org>
>>> Subject: RE: Disabling Zip bomb detection in Tika
>>>
>>> I don't think that's configurable at the moment.
>>>
>>> Tika-colleagues, any recommendations?
>>>
>>> If you're able to share the file on Tika's jira, we'd be happy to take a
>>> look.  You shouldn't be getting the zip bomb unless there is a mismatch
>>> between opening and closing tags (which could point to a bug in Tika).
>>>
>>> -----Original Message-----
>>> From: Rodrigo Rosenfeld Rosas [mailto:rr_ro...@yahoo.com.br.INVALID]
>>> Sent: Thursday, September 22, 2016 10:06 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Disabling Zip bomb detection in Tika
>>>
>>> Hi, this is my first message on this list.
>>>
>>> Is it possible to disable Zip bomb detection in the Tika handler?
>>>
>>> I've also described the problem here:
>>>
>>>
>>> http://stackoverflow.com/questions/39628519/how-to-disable-or-increase-limit-zip-bomb-detection-in-tika-with-solr-config?noredirect=1#comment66575342_39628519
>>>
>>> Basically, I get this error when trying to process some big valid HTML
>>> documents:
>>>
>>> RSolr::Error::Http - 500 Internal Server Error
>>> Error:
>>>
>>> {'responseHeader'=>{'status'=>500,'QTime'=>76},'error'=>{'metadata'=>['error-class','org.apache.solr.common.SolrException','root-error-class','org.apache.tika.sax.SecureContentHandler$SecureSAXException'],'msg'=>'org.apache.tika.exception.TikaException: Zip bomb detected!','trace'=>'org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Zip bomb detected!
>>>           at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:234)
>>>           at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>>>           at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)
>>>           at org.apache.solr.core.SolrCore.execute(SolrCore.java:2089)
>>>           at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:652)
>>>           at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:459)
>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
>>>
>>> I need to index those documents. Is it possible to disable Zip bomb
>>> detection or to increase the limit using configuration files? I noticed it's
>>> possible to add a tika.config file, but I have no idea how to specify what
>>> I want in such a Tika configuration file.
>>>
>>> Any help is appreciated!
>>>
>>> Thanks in advance,
>>> Rodrigo.
>>
>>
>>
>>
>
