[ 
https://issues.apache.org/jira/browse/TIKA-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883301#comment-16883301
 ] 

Tim Allison edited comment on TIKA-2899 at 7/11/19 8:16 PM:
------------------------------------------------------------

I added a stack that tracks p, li, ol and ul elements written to the xml 
handler.  It ensures alignment of elements in the output even if the RTF is 
corrupt.
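
To illustrate the idea, here is a minimal sketch of stack-based element 
balancing. It is not the actual patch: the class and method names are 
hypothetical, and it writes to a StringBuilder rather than to the real xml 
handler. It only shows how a stack can keep start and end tags matched 
when end requests arrive out of order or for elements that were never opened.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not Tika's actual class): keeps start/end tags balanced. */
final class BalancedElementWriter {
    // Elements currently open, innermost first.
    private final Deque<String> open = new ArrayDeque<>();
    // Stand-in for the real output handler (an assumption for this sketch).
    private final StringBuilder out = new StringBuilder();

    /** Records the element on the stack and emits its start tag. */
    void start(String name) {
        open.push(name);
        out.append('<').append(name).append('>');
    }

    /**
     * Ends the named element. Any elements opened after it are closed first.
     * If the element was never opened, the request is ignored.
     */
    void end(String name) {
        if (!open.contains(name)) {
            return; // stray end request, e.g. from corrupt input
        }
        String top;
        do {
            top = open.pop();
            out.append("</").append(top).append('>');
        } while (!top.equals(name));
    }

    /** Closes everything still open, innermost first. */
    void closeAll() {
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
    }

    String result() {
        return out.toString();
    }

    public static void main(String[] args) {
        BalancedElementWriter w = new BalancedElementWriter();
        w.start("ul");
        w.start("li");
        w.start("p");
        w.end("li");  // p is still open: it is closed first, then li
        w.end("ol");  // never opened: ignored
        w.closeAll(); // closes ul
        System.out.println(w.result()); // prints <ul><li><p></p></li></ul>
    }
}
```

Ignoring an unmatched end request and force-closing inner elements are 
design choices made for this sketch. Either way the output stays 
well-formed, though the resulting structure may not reflect what the 
document intended.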

I am not convinced that the attached file has any problems -- I may have just 
covered over an error in our list handling -- but the change will ensure 
matched elements in the output.

If there are any objections to this fix, please let me know, and I can revert.


was (Author: [email protected]):
I added a stack that tracks p, li, ol and ul elements written to the xml 
handler.  It ensures alignment of elements in the output even if the RTF is 
corrupt.

I am not convinced that the attached file has any problems, but the change will 
ensure matched elements in the output.

If there are any objections to this fix, please let me know, and I can revert.

> org.apache.tika.exception.TikaException: Unexpected RuntimeException from 
> org.apache.tika.parser.rtf.RTFParser@375a26af
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2899
>                 URL: https://issues.apache.org/jira/browse/TIKA-2899
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.19
>            Reporter: Pandurang
>            Assignee: Tim Allison
>            Priority: Critical
>             Fix For: 1.22
>
>         Attachments: ABC_PL_WI.rtf
>
>
> I am using Solr 8.0 with the SolrNet library to extract text from binary 
> data. While doing so we get the error below.
> It works fine for 99% of documents but fails for the remaining 1%.
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.rtf.RTFParser@375a26af
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>  at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>  at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
>  ... 41 more



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
