Hi, when dumping segments with "bin/nutch readseg -dump ...", special characters from pages that are not UTF-8 encoded are lost. For example, "ö" is replaced by "?".
I really need the dumped files to represent special characters correctly. How can I deal with this problem? Thanks, Felix.