[ 
https://issues.apache.org/jira/browse/SOLR-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620734#comment-14620734
 ] 

Erick Erickson commented on SOLR-7764:
--------------------------------------

Yet another option: run Tika on the files outside of DIH, in a SolrJ client. 
It's actually not hard at all. Here's a sample that has some DB manipulations 
mixed in, but those bits could easily be removed.

https://lucidworks.com/blog/indexing-with-solrj/
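
To illustrate why client-side parsing helps with this issue, here is a minimal, JDK-only sketch of the control flow: each file is parsed in the client, a parse failure is caught per document, and only successfully parsed documents are handed to the indexing step. It is not the code from the linked post. The `Extractor` and `Indexer` interfaces, the class name, and the `id` and `text` field names are hypothetical. In a real setup the extractor would presumably wrap a Tika parser and the indexer a SolrJ client; here the JDK's own XML parser stands in for Tika so the sketch runs without extra jars.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

public class ClientSideIndexer {

    /** Hypothetical extraction step; a real one would wrap a Tika parser. */
    interface Extractor {
        String extractText(InputStream in) throws Exception;
    }

    /** Hypothetical indexing step; a real one would send the fields with SolrJ. */
    interface Indexer {
        void index(Map<String, String> fields) throws Exception;
    }

    /** Stand-in extractor (assumption): JDK XML parser instead of Tika. */
    static String extractXmlText(InputStream in) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        return builder.parse(in).getDocumentElement().getTextContent();
    }

    /**
     * Parses and indexes each file independently. A failure in one file is
     * recorded and skipped, so it cannot stall the rest of the run.
     * Returns the names of the files that failed.
     */
    static List<String> indexAll(Map<String, byte[]> files,
                                 Extractor extractor, Indexer indexer) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : files.entrySet()) {
            try (InputStream in = new ByteArrayInputStream(entry.getValue())) {
                Map<String, String> fields = new LinkedHashMap<>();
                fields.put("id", entry.getKey());               // assumed field name
                fields.put("text", extractor.extractText(in));  // assumed field name
                indexer.index(fields);
            } catch (Exception e) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("good.xml", "<doc>hello</doc>".getBytes(StandardCharsets.UTF_8));
        files.put("bad.xml", "<doc>unclosed".getBytes(StandardCharsets.UTF_8));

        // In-memory list standing in for the Solr index.
        List<Map<String, String>> indexed = new ArrayList<>();
        List<String> failed =
                indexAll(files, ClientSideIndexer::extractXmlText, indexed::add);

        // prints: indexed=1 failed=[bad.xml]
        System.out.println("indexed=" + indexed.size() + " failed=" + failed);
    }
}
```

The point of the design is that the parse error surfaces in the client, where it can be logged and the offending file skipped or retried, rather than inside the Solr request handler.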


> Solr indexing hangs if encounters an certain XML parse error
> ------------------------------------------------------------
>
>                 Key: SOLR-7764
>                 URL: https://issues.apache.org/jira/browse/SOLR-7764
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>    Affects Versions: 4.7.2
>         Environment: Ubuntu 12.04.5 LTS
>            Reporter: Sorin Gheorghiu
>              Labels: indexing
>         Attachments: Solr_XML_parse_error_080715.txt
>
>
> BlueSpice (http://bluespice.com/) uses Solr to index documents for the 
> 'Extended search' feature.
> Solr hangs if during indexing certain error occurs:
> 8.7.2015 15:34:26
> ERROR
> SolrCore
> org.apache.solr.common.SolrException: 
> org.apache.tika.exception.TikaException: XML parse error
> 8.7.2015 15:34:26
> ERROR
> SolrDispatchFilter
> null:org.apache.solr.common.SolrException: 
> org.apache.tika.exception.TikaException: XML parse error



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)