[ 
https://issues.apache.org/jira/browse/TIKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514142#comment-15514142
 ] 

Rodrigo Rosenfeld Rosas commented on TIKA-2091:
-----------------------------------------------

Great, good job :) Anyway, there's no hurry on my side, as I tried sending the 
text directly from Ruby with "Nokogiri::HTML(html_content).text" and it worked 
fine. I'm still sending it to the extract request handler because that 
is already configured, but I'm looking into simplifying this since I guess it 
could add some unnecessary overhead...

> regression: Zip bomb detected! for HTML file
> --------------------------------------------
>
>                 Key: TIKA-2091
>                 URL: https://issues.apache.org/jira/browse/TIKA-2091
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.13
>         Environment: Debian jessie Linux, Oracle Java 8
>            Reporter: Rodrigo Rosenfeld Rosas
>             Fix For: 1.7
>
>
> Hi, while discussing an issue on Solr's mailing list it was suggested to me 
> to open a ticket here. Please let me know if this is not the proper place for 
> such a ticket.
> After upgrading to the latest Solr, this document is no longer indexing properly 
> in Solr. They told me they upgraded Tika from 1.7 to 1.13 in Solr 6.2. Before 
> the upgrade this document was indexed as expected:
> https://www.sec.gov/Archives/edgar/data/1472033/000119380513001310/e611133_f6ef-eutelsat.htm
> I hope a fix can make it in time for 1.14 ;)
> Cheers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
