Hello,

On 27.06.2011 at 12:40, Markus Jelsma wrote:

> Hi,
> 
> I came across the indexing error below. It happened in a huge batch update 
> from Nutch with SolrJ 3.1. Since the crawl was huge it is very hard to trace 
> the error back to a specific document. So I'll try my luck here: has anyone 
> seen this before with SolrJ 3.1? Is there anything else on the Nutch side I 
> should have taken care of?
> 
> Thanks!
> 
> 
> Jun 27, 2011 10:24:28 AM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/solr path=/update params={wt=javabin&version=2} status=500 
> QTime=423 
> Jun 27, 2011 10:24:28 AM org.apache.solr.common.SolrException log
> SEVERE: java.lang.RuntimeException: [was class 
> java.io.CharConversionException] Invalid UTF-8 character 0xffff at char 
> #1142033, byte #1155068)
>       at 
> com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)

and loads of other rubbish and 

>       ... 26 more


I see this as a problem with Solr's error reporting. It is not only obnoxiously 
"loud" (white on grey with oversized fonts), but also less useful than it should be.
Instead of telling the user where the error occurred (i.e. while reading which 
file, at which line and column), it unravels the stack. That is useless if the 
program just choked on some unexpected input, like a typo in a schema or config 
file or an invalid character in a file to be indexed.
I don't know whether this is due to Tomcat, the logging system, or Solr itself, 
but it is annoying.

And yes, I've seen something like this before. I found the error not by 
inspecting Solr but by opening the suspected files in a suitable browser 
(e.g. Firefox), which tells me exactly where something goes wrong.
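
If opening files one by one in a browser is not practical, the same check can 
be scripted. The following is only a minimal sketch in Python, not anything 
that ships with Solr or Nutch. It assumes the suspect content is available as 
text and that the parser rejected a character outside the XML 1.0 "Char" 
range, which would be consistent with the 0xffff in the log above. The 
function names are made up for illustration.

```python
def is_xml_char(cp):
    """Return True if code point cp is allowed by the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def find_invalid_xml_chars(text):
    """Return a list of (line, column, code point) for every disallowed character.

    Lines and columns are 1-based and counted in characters, not bytes.
    """
    problems = []
    line, col = 1, 1
    for ch in text:
        if not is_xml_char(ord(ch)):
            problems.append((line, col, "U+%04X" % ord(ch)))
        if ch == "\n":
            line += 1
            col = 1
        else:
            col += 1
    return problems


def scan_file(path):
    """Scan one file (hypothetical helper); undecodable bytes are reported too.

    With errors="surrogateescape", bytes that are not valid UTF-8 become lone
    surrogates, which fall outside the XML Char range and so get flagged.
    """
    with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
        return find_invalid_xml_chars(f.read())
```

Calling find_invalid_xml_chars("ab\ncd\uffffe") reports one problem at line 2, 
column 3. This gives the position within the input, which is the piece of 
information the stack trace does not provide.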

All the best
Thomas
