Re: SolrException: An invalid XML character (Unicode: 0xffffffff) was found in the element content of the document.

neeraj Sun, 17 Mar 2013 19:31:45 -0700

Amuseme,

   Thanks for the reply. I reviewed the exceptions listed at that link, and I
am not seeing any of them. I have crawled more than 5 million documents and
was able to index 120K of them into Solr before the invalid XML character
exception occurred.


While investigating this issue, I found earlier posts on the same topic in
which a patch was applied to stripNonCharCodepoints(). That patch is already
part of Nutch 1.6, though, and I am still getting the same exception.
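
For reference, here is a minimal sketch of the kind of filtering such a patch
is meant to perform: dropping every code point that XML 1.0 does not allow in
character data. It is not the Nutch implementation. The class and method names
(XmlCharFilter, isValidXmlChar, stripInvalidXmlChars) are hypothetical, and
only the allowed ranges are taken from the XML 1.0 specification.

```java
// Hypothetical helper, NOT Nutch's own code: keeps only the code points
// that the XML 1.0 specification allows in character data.
final class XmlCharFilter {

    // XML 1.0 "Char" production:
    //   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every disallowed code point removed.
    // Unpaired surrogates are read as single code points in the
    // 0xD800-0xDFFF range, so they are dropped as well.
    static String stripInvalidXmlChars(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // advance 1 or 2 chars
        }
        return out.toString();
    }
}
```

Running each field value through a filter like this before it is sent to Solr
would be one way to check whether the remaining failures come from control
characters or unpaired surrogates that the existing stripping does not cover.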

My "parser.character.encoding.default" was set to windows-1252 when crawling
all these documents. Could that have led to this exception when indexing?

Any insight on this will be helpful.

Thanks,
Neeraj.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Nutch-1-6-Need-help-with-Indexing-tp4048290p4048391.html
Sent from the Nutch - User mailing list archive at Nabble.com.
