Running a "deeper" crawl revealed another problem. Research has led me to
believe I need to upgrade to the latest MySQL Server version and use the
utf8mb4 encoding.
Does this sound 'Etch-a-Sketchy' to you?
java.io.IOException: java.sql.BatchUpdateException: Incorrect string value: '\xF0\x9F\x92\x83Li...' for column 'text' at row 1
    at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:340)
    at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
    at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:540)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
    at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: java.sql.BatchUpdateException: Incorrect string value: '\xF0\x9F\x92\x83Li...' for column 'text' at row 1
    at com.mysql.jdbc.PreparedStatement.exe
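
For what it's worth, the bytes in the error message (\xF0\x9F\x92\x83) form a
single four-byte UTF-8 sequence, which is consistent with the utf8mb4 theory:
MySQL's older "utf8" character set stores at most three bytes per character.
The sketch below decodes those bytes and checks whether a string contains any
such character. The class name Utf8mb4Check and the helper hasFourByteChars
are made up for illustration and are not part of Nutch or Gora; it also
assumes Java 8 or later, which is newer than the JVM in the trace above.

```java
import java.nio.charset.StandardCharsets;

public class Utf8mb4Check {

    // Hypothetical helper: true if the string holds any code point above
    // U+FFFF, i.e. one that UTF-8 encodes as four bytes.
    static boolean hasFourByteChars(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // The leading bytes quoted in the error message, followed by "Li".
        byte[] raw = {(byte) 0xF0, (byte) 0x9F, (byte) 0x92, (byte) 0x83, 'L', 'i'};
        String text = new String(raw, StandardCharsets.UTF_8);

        // Prints the first code point in hex: U+1F483
        System.out.printf("first code point: U+%X%n", text.codePointAt(0));

        // Prints: needs four-byte storage: true
        System.out.println("needs four-byte storage: " + hasFourByteChars(text));
    }
}
```

A check like this could be used to log which pages carry such characters
before deciding whether to convert the column or to filter the text.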
--
View this message in context:
http://lucene.472066.n3.nabble.com/nutch-hadoop-solr-tp4014761p4015415.html
Sent from the Nutch - User mailing list archive at Nabble.com.