James,

I ran a crawl using your db modifications, but I'm still receiving the
exceptions that break the crawl. It's hard to believe I'm the only one
receiving these errors.

java.io.IOException: java.sql.BatchUpdateException: Incorrect string value: '\xE2\x80\x8Butu...' for column 'id' at row 1
        at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:340)
        at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)

Oddly, this time it seems to be a 3-byte character, \xE2\x80\x8B.
Playing with these characters in OS X is as easy as 1-2-3, but not so much
in Windows. :(
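
For reference, those three bytes are the UTF-8 encoding of U+200B (ZERO
WIDTH SPACE), an invisible character. A minimal sketch to confirm this on
any platform; the class and method names here are just for illustration:

```java
import java.nio.charset.StandardCharsets;

public class DecodeBytes {
    // Decode raw bytes as UTF-8 and list the resulting code points as U+XXXX.
    static String describe(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The bytes reported in the exception message: \xE2\x80\x8B
        byte[] raw = {(byte) 0xE2, (byte) 0x80, (byte) 0x8B};
        System.out.println(describe(raw)); // prints "U+200B"
    }
}
```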

My next thought is to tear open Gora and do something on the code side to
deal with these special characters. I looked at the SqlStore.flush()
function last night, but couldn't get gora_sqlstore to compile because of
some unresolved dependencies.
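
A rough sketch of the kind of code-side fix I have in mind: strip invisible
"format" characters from the key before it is written. KeySanitizer and
stripInvisible are hypothetical names, not part of Gora or Nutch, and where
this would be hooked in (SqlStore or somewhere earlier in Nutch) is still an
open question. Note that \p{Cf} also matches characters such as zero width
joiners, so this may be too aggressive for some URLs.

```java
public class KeySanitizer {
    // Hypothetical helper: remove Unicode "format" characters (category Cf),
    // which includes U+200B ZERO WIDTH SPACE and U+FEFF (byte order mark).
    static String stripInvisible(String key) {
        if (key == null) {
            return null;
        }
        return key.replaceAll("\\p{Cf}", "");
    }

    public static void main(String[] args) {
        // A key starting with a zero width space, like the 'id' value in the exception.
        String dirty = "\u200Butu";
        System.out.println(stripInvisible(dirty).length()); // prints 3
    }
}
```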

Any additional thoughts?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/nutch-hadoop-solr-tp4014761p4015606.html
Sent from the Nutch - User mailing list archive at Nabble.com.
