Actually, that is the tutorial I followed.

I'm still getting these errors. The string \xF0\x9F\x92\x83 is actually
this character: 💃
I assume that's where the issue is. However, I am unable to reproduce the
error when inserting manually via /usr/bin/mysql.
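
As a sanity check on what those bytes are, here is a small Python 3 sketch (not part of the tutorial or of Nutch; the variable names are just for illustration). It decodes the byte sequence from the error message and shows that it is a single code point above U+FFFF, which takes 4 bytes in UTF-8:

```python
import unicodedata

# The byte sequence quoted in the error message.
raw = b"\xF0\x9F\x92\x83"

# Decode it as UTF-8; this yields exactly one character.
char = raw.decode("utf-8")

print(len(raw))                # 4 (bytes used by the UTF-8 encoding)
print(hex(ord(char)))          # 0x1f483
print(unicodedata.name(char))  # DANCER
print(ord(char) > 0xFFFF)      # True: outside the Basic Multilingual Plane
```

Any character for which the last check is True needs the full 4-byte form of UTF-8.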

I read this article,
http://mzsanford.wordpress.com/2010/12/28/mysql-and-unicode/, which suggests
that utf8_bin might resolve the issue. Other forums suggest that even though
the default charset is set, the column charset has to be set explicitly as
well.

I can't get past the fact that MySQL pre-5.5 only stores 1-3 byte UTF-8
characters instead of 1-4 bytes.
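
If the database cannot be moved to a 4-byte charset, one possible workaround is to drop such characters before the insert. The following Python 3 sketch is only an illustration under that assumption: `needs_four_bytes` and `strip_supplementary` are hypothetical helper names, not anything provided by Nutch or MySQL, and silently removing characters may not be acceptable for every crawl.

```python
def needs_four_bytes(ch: str) -> bool:
    """Return True if the character takes 4 bytes in UTF-8 (code point above U+FFFF)."""
    return ord(ch) > 0xFFFF


def strip_supplementary(text: str, replacement: str = "") -> str:
    """Replace every character that needs 4 bytes in UTF-8 with `replacement`."""
    return "".join(replacement if needs_four_bytes(ch) else ch for ch in text)


# Hypothetical sample text containing the character from the error message.
sample = "dance \U0001F483 floor"
cleaned = strip_supplementary(sample)

print(any(needs_four_bytes(ch) for ch in sample))   # True
print(any(needs_four_bytes(ch) for ch in cleaned))  # False
```

Passing something like replacement="?" instead of the empty default would keep a visible marker where a character was removed.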


j.sullivan wrote
> Sumarlidason
> 
> Hi
> 
> The need to use utf8mb4 for web crawling should be fairly rare. If you are
> using MySQL 5.5 or later and have a set up like this
> http://nlp.solutions.asia/?p=180 you should be fine. 
> 
> James




