James,
I ran a crawl using your db modifications, but I'm still receiving the
exceptions that break the crawl. It's hard to believe I'm the only one
receiving these errors.
java.io.IOException: java.sql.BatchUpdateException: Incorrect string value: '\xE2\x80\x8Butu...' for column 'id' at row 1
    at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:340)
    at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
Oddly, this time it seems to be a 3-byte character, \xE2\x80\x8B ...
Playing with these characters in OS X is as easy as 1-2-3, but not so much in
Windows. :(
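For what it's worth, the three bytes can be identified without typing the character at all. Below is a minimal sketch that decodes them as UTF-8 and prints the code point; the class and method names (DecodeBytes, describe) are made up for illustration. Assuming the bytes in the error really are UTF-8, they come out as U+200B, the zero-width space.

```java
import java.nio.charset.StandardCharsets;

public class DecodeBytes {
    // Decode raw bytes as UTF-8 and list each resulting code point as U+XXXX.
    static String describe(byte[] utf8) {
        String s = new String(utf8, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The byte sequence reported in the "Incorrect string value" message.
        byte[] bytes = {(byte) 0xE2, (byte) 0x80, (byte) 0x8B};
        System.out.println(describe(bytes));
    }
}
```

This runs the same way on Windows and OS X, since nothing depends on the console being able to display the character.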
My next thought is to tear open Gora and do something on the code side to deal
with these special characters. I looked at the SqlStore.flush() method last
night, but couldn't get gora_sqlstore to compile: some unresolved
dependencies.
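The kind of code-side handling I have in mind is roughly the following. This is only a sketch: KeySanitizer and clean() are hypothetical names, not part of Gora or Nutch, and I have not worked out where in the write path such a call would belong. It assumes that dropping invisible Unicode format characters (category Cf, which includes the zero-width space U+200B) from the key is acceptable, which changes the key and may not be what everyone wants.

```java
public class KeySanitizer {
    // Hypothetical helper (not a Gora or Nutch API): removes Unicode
    // format characters (general category Cf), such as U+200B, from a key.
    static String clean(String key) {
        if (key == null) {
            return null;
        }
        return key.replaceAll("\\p{Cf}", "");
    }

    public static void main(String[] args) {
        // Made-up key that starts with a zero-width space, like the one in the error.
        String raw = "\u200Butu.example";
        String cleaned = clean(raw);
        System.out.println(raw.length() + " -> " + cleaned.length());
        System.out.println(cleaned);
    }
}
```

Ordinary non-ASCII letters are left alone by this filter, so it would not help if the column simply cannot store multi-byte characters at all.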
Any additional thoughts?
--
View this message in context:
http://lucene.472066.n3.nabble.com/nutch-hadoop-solr-tp4014761p4015606.html
Sent from the Nutch - User mailing list archive at Nabble.com.