Hello,
I'm using Hadoop 0.17.0 to analyze a large number of CSV files.

I need to read files that use a character encoding other than UTF-8,
but as far as I can tell TextInputFormat only supports UTF-8.

I think the LineRecordReader class or the Text class should support an
encoding setting, for example something like this:
 conf.set("io.file.defaultEncoding", "MS932");
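
In the meantime, I suppose I could decode the raw line bytes myself inside
the mapper. Below is a rough sketch of that idea, not a tested Hadoop job.
LineDecoder and decodeLine are names I made up for illustration. I am
assuming that Text.getBytes() and Text.getLength() hand back the undecoded
bytes of each line, so the call in map() would look like
decodeLine(value.getBytes(), value.getLength(), "MS932"). I am also assuming
the JVM has the MS932 charset installed, which the sketch checks before use.

```java
import java.nio.charset.Charset;

// Hypothetical helper (not part of Hadoop): decodes raw line bytes
// using an explicitly named charset instead of assuming UTF-8.
class LineDecoder {

    // Decode only the first `length` bytes; the backing array of a
    // reused buffer may hold stale bytes past the valid length.
    static String decodeLine(byte[] bytes, int length, String charsetName) {
        return new String(bytes, 0, length, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Bytes of the katakana word "テスト" as encoded in Shift_JIS/MS932.
        byte[] raw = {(byte) 0x83, (byte) 0x65, (byte) 0x83, (byte) 0x58,
                      (byte) 0x83, (byte) 0x67};

        // MS932 is an extended charset, so check that this JVM provides it.
        if (Charset.isSupported("MS932")) {
            System.out.println(decodeLine(raw, raw.length, "MS932"));
        } else {
            System.out.println("MS932 is not available on this JVM");
        }
    }
}
```

This only seems safe for encodings in which the newline byte 0x0A never
appears inside a multi-byte character, as I believe is the case for MS932.
It would not help for something like UTF-16, where the line splitting itself
would have to know the encoding.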

Is there any plan to support other character encodings in
TextInputFormat?

Regards,
-- 
NOMURA Yoshihide:
    Software Innovation Laboratory, Fujitsu Labs. Ltd., Japan
    Tel: 044-754-2675 (Ext: 7112-6358)
    Fax: 044-754-2570 (Ext: 7112-3834)
    E-Mail: [EMAIL PROTECTED]
