Rodrigo Rosenfeld Rosas commented on TIKA-2091:

Great, good job :) Anyway, there's no hurry on my side, as I tried sending the 
text directly from Ruby with "Nokogiri::HTML(html_content).text" and it worked 
fine. I'm still sending it to the extract request handler because that 
is already configured, but I'm looking into simplifying this, since I guess it 
could add some unnecessary overhead...

> regression: Zip bomb detected! for HTML file
> --------------------------------------------
>                 Key: TIKA-2091
>                 URL: https://issues.apache.org/jira/browse/TIKA-2091
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.13
>         Environment: Debian jessie Linux, Oracle Java 8
>            Reporter: Rodrigo Rosenfeld Rosas
>             Fix For: 1.14
> Hi, while I was discussing an issue on Solr's mailing list, it was suggested 
> that I open a ticket here. Please let me know if this is not the proper place 
> for such a ticket.
> After upgrading to the latest Solr, this document is no longer indexed 
> properly. They told me they upgraded Tika from 1.7 to 1.13 in Solr 6.2. Before 
> the upgrade this document was indexed as expected:
> https://www.sec.gov/Archives/edgar/data/1472033/000119380513001310/e611133_f6ef-eutelsat.htm
> I hope a fix can make it in time for 1.14 ;)
> Cheers.

This message was sent by Atlassian JIRA
