Hi Sumarlidason,
The need to use utf8mb4 for web crawling should be fairly rare. If you are using MySQL 5.5 or later and have a setup like this one, http://nlp.solutions.asia/?p=180, you should be fine.

James

-----Original Message-----
From: sumarlidason [mailto:[email protected]]
Sent: Wednesday, October 24, 2012 3:22 AM
To: [email protected]
Subject: Re: nutch/hadoop/solr

Running a "deeper" crawl revealed another problem. Research has led me to believe I need to upgrade to the latest MySQL Server version and use utf8mb4 encoding. Does this sound 'Etch-a-Sketchy' to you?

java.io.IOException: java.sql.BatchUpdateException: Incorrect string value: '\xF0\x9F\x92\x83Li...' for column 'text' at row 1
    at org.apache.gora.sql.store.SqlStore.flush(SqlStore.java:340)
    at org.apache.gora.sql.store.SqlStore.close(SqlStore.java:185)
    at org.apache.gora.mapreduce.GoraRecordWriter.close(GoraRecordWriter.java:55)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:540)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
    at org.apache.hadoop.mapred.Child.main(Child.java:260)
Caused by: java.sql.BatchUpdateException: Incorrect string value: '\xF0\x9F\x92\x83Li...' for column 'text' at row 1
    at com.mysql.jdbc.PreparedStatement.exe

--
View this message in context: http://lucene.472066.n3.nabble.com/nutch-hadoop-solr-tp4014761p4015415.html
Sent from the Nutch - User mailing list archive at Nabble.com.
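
For anyone hitting the same exception: the bytes \xF0\x9F\x92\x83 in the error are a four-byte UTF-8 sequence, which decodes to a code point above U+FFFF. MySQL's legacy utf8 character set stores at most three bytes per character, so a value like this is rejected unless the column uses utf8mb4. The following is only a rough sketch, not part of Nutch or Gora, of how one might detect such characters in crawled text, or replace them if changing the schema is not an option. The class name Utf8mb4Check and both method names are made up for illustration, and it assumes Java 8 or later for String.codePoints().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Nutch, Gora, or the MySQL driver.
public class Utf8mb4Check {

    // True if the string contains any code point above U+FFFF,
    // i.e. one that UTF-8 encodes as four bytes.
    static boolean needsFourByteUtf8(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    // Replaces every code point above U+FFFF with U+FFFD (the Unicode
    // replacement character), which fits in three UTF-8 bytes.
    static String replaceFourByteChars(String s) {
        StringBuilder out = new StringBuilder(s.length());
        s.codePoints().forEach(cp ->
            out.appendCodePoint(Character.isSupplementaryCodePoint(cp) ? 0xFFFD : cp));
        return out.toString();
    }

    public static void main(String[] args) {
        // The leading bytes quoted in the exception message, followed by "Li".
        byte[] raw = {(byte) 0xF0, (byte) 0x9F, (byte) 0x92, (byte) 0x83, 'L', 'i'};
        String text = new String(raw, StandardCharsets.UTF_8);

        System.out.printf("first code point: U+%X%n", text.codePointAt(0));
        System.out.println("needs 4-byte UTF-8: " + needsFourByteUtf8(text));
        System.out.println("after replacement: " + needsFourByteUtf8(replaceFourByteChars(text)));
    }
}
```

Replacing characters loses information, so this only makes sense if the affected content (typically emoji) does not matter for the crawl. Otherwise, moving the column to utf8mb4 as discussed above keeps the text intact.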

