[ https://issues.apache.org/jira/browse/TIKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514142#comment-15514142 ]
Rodrigo Rosenfeld Rosas commented on TIKA-2091:
-----------------------------------------------
Great, good job :) Anyway, there's no hurry on my side: I tried sending the
text directly from Ruby with "Nokogiri::HTML(html_content).text" and it worked
fine. I'm still sending it to the extract request handler because that
is already configured, but I'm looking into simplifying this, since I guess it
could add some unnecessary overhead...
> regression: Zip bomb detected! for HTML file
> --------------------------------------------
>
> Key: TIKA-2091
> URL: https://issues.apache.org/jira/browse/TIKA-2091
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.13
> Environment: Debian jessie Linux, Oracle Java 8
> Reporter: Rodrigo Rosenfeld Rosas
> Fix For: 1.14
>
>
> Hi, while I was discussing an issue on Solr's mailing list, it was suggested
> that I open a ticket here. Please let me know if this is not the proper place
> for such a ticket.
> After upgrading to the latest Solr, this document is no longer indexed
> properly. I was told that Tika was upgraded from 1.7 to 1.13 in Solr 6.2.
> Before the upgrade this document was indexed as expected:
> https://www.sec.gov/Archives/edgar/data/1472033/000119380513001310/e611133_f6ef-eutelsat.htm
> I hope a fix can make it in time for 1.14 ;)
> Cheers.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)