Actually, that is the tutorial I followed, and I'm still getting these errors. The string \xF0\x9F\x92\x83 is actually this character: 💃. I assume that's where the issue is. However, I am unable to reproduce the error when inserting manually via /usr/bin/mysql.
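For what it's worth, here is a quick way to check that byte sequence, sketched in Python 3. It shows that the bytes decode to a single code point outside the Basic Multilingual Plane, so the character takes four bytes in UTF-8, one more than MySQL's legacy utf8 charset stores per character. The helper name needs_utf8mb4 is made up for illustration and is not part of Nutch or MySQL.

```python
# Bytes from the error message, written as a Python bytes literal.
raw = b"\xF0\x9F\x92\x83"

# Decode as UTF-8: the four bytes form a single character.
char = raw.decode("utf-8")
print(len(raw))        # bytes in the UTF-8 encoding
print(len(char))       # number of code points
print(hex(ord(char)))  # code point value


def needs_utf8mb4(text: str) -> bool:
    """Hypothetical helper: True if any character lies above U+FFFF.

    Such characters take four bytes in UTF-8, which is more than the
    three bytes per character that MySQL's legacy utf8 charset stores.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


print(needs_utf8mb4(char))
```

Running something like this over crawled text before the insert would at least show which rows contain 4-byte characters.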
I read this article, http://mzsanford.wordpress.com/2010/12/28/mysql-and-unicode/, in which the author suggests that utf8_bin might resolve the issue. Other forums suggest that even though the default charset is set, the column charset has to be set explicitly as well. I can't get past the fact that MySQL before 5.5 stores only 1-3 byte UTF-8 sequences instead of 1-4 bytes.

j.sullivan wrote
> Sumarlidason
>
> Hi
>
> The need to use utf8mb4 for web crawling should be fairly rare. If you are
> using MySQL 5.5 or later and have a set up like this
> http://nlp.solutions.asia/?p=180 you should be fine.
>
> James

--
View this message in context: http://lucene.472066.n3.nabble.com/nutch-hadoop-solr-tp4014761p4015480.html
Sent from the Nutch - User mailing list archive at Nabble.com.

