Re: detailed Error reporting in Solr

Walter Underwood Fri, 05 Apr 2013 10:27:33 -0700

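It is not a bug. XML parsers are required to reject documents that reference
undefined entities. XML itself predefines only &amp;, &lt;, &gt;, &apos;, and
&quot;; &nbsp; and &rsaquo; are HTML entities, so without a DTD that declares
them the document is not well-formed.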


Try parsing it as HTML or XHTML.
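
A minimal sketch of the difference, using only the Python standard library
rather than Solr or Tika. The sample markup and the helper names
(parses_as_xml, text_via_html, TextCollector) are made up for illustration;
only the two entities come from the report below.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical list item containing the two entities from the report.
SNIPPET = "<ul><li>Home&nbsp;&rsaquo;</li></ul>"


def parses_as_xml(text):
    """Return True if a plain XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Undefined entities are a fatal well-formedness error in XML.
        return False


class TextCollector(HTMLParser):
    """Collects character data; the HTML parser knows the HTML entities."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_via_html(text):
    """Return the text content of the markup, parsed as HTML."""
    collector = TextCollector()
    collector.feed(text)
    collector.close()
    return "".join(collector.chunks)


if __name__ == "__main__":
    print(parses_as_xml(SNIPPET))        # rejected as XML
    print(repr(text_via_html(SNIPPET)))  # entities resolved as HTML
```

Numeric character references (&#160; and &#8250;) are another way around it,
since XML parsers accept those without any DTD.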

wunder

On Apr 4, 2013, at 11:14 AM, eShard wrote:

> Yes, that's it exactly.
> I crawled a link with these (&nbsp;&rsaquo;) in each list item, and Solr
> couldn't handle it: it threw the XML parse error, and the crawler
> terminated the job.
> 
> Is this fixable? Or do I have to submit a bug to the Tika folks?
> 
> Thanks,
>
