Hi all, 
I am trying to dump content with the segment reader (bin/nutch -dump). The
output text contains two encodings: UTF-8 and a multi-byte character encoding.
When I open the dumped page, the multi-byte text is broken; even after I
convert it to the correct encoding, the text still displays incorrectly. How
can I fix the text?

Thank you.
-- 
View this message in context: 
http://www.nabble.com/Broken-crawled-content--tp16246942p16246942.html
Sent from the Nutch - User mailing list archive at Nabble.com.
