[ 
https://issues.apache.org/jira/browse/TIKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428933#comment-16428933
 ] 

Harinder commented on TIKA-2091:
--------------------------------

Hello [~talli...@mitre.org], you mentioned above that the zip bomb issue when 
extracting HTML files does not occur if you don't use Solr's custom 
MostlyPassthroughHtmlMapper.  
How would I go about configuring Solr to use Tika's default extractor? 

I have a thread open on Stack Overflow with full details, [see 
here|https://stackoverflow.com/questions/49699256/zip-bomb-exception-while-sending-html-document-to-solr].

Thanks!

> regression: Zip bomb detected! for HTML file
> --------------------------------------------
>
>                 Key: TIKA-2091
>                 URL: https://issues.apache.org/jira/browse/TIKA-2091
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.13
>         Environment: Debian jessie Linux, Oracle Java 8
>            Reporter: Rodrigo Rosenfeld Rosas
>            Priority: Major
>
> Hi, while discussing an issue on Solr's mailing list it was suggested to me 
> to open a ticket here. Please let me know if this is not the proper place for 
> such a ticket.
> After upgrading to the latest Solr, this document is no longer indexing properly 
> in Solr. They told me they upgraded Tika from 1.7 to 1.13 in Solr 6.2. Before 
> the upgrade this document was indexed as expected:
> https://www.sec.gov/Archives/edgar/data/1472033/000119380513001310/e611133_f6ef-eutelsat.htm
> I hope a fix can make it in time for 1.14 ;)
> Cheers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
