Hello,
I'm using Hadoop 0.17.0 to analyze a large number of CSV files.
I need to read files in character encodings other than UTF-8,
but as far as I can tell TextInputFormat does not support them.
I think the LineRecordReader class or the Text class should support an
encoding setting such as:
conf.set("io.file.defaultEncoding", "MS932");
Is there any plan to support other character encodings in
TextInputFormat?
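
In the meantime, a possible workaround is to decode each line by hand
inside the mapper. The sketch below is only an illustration, not an
existing Hadoop feature: LineDecoder and decode() are hypothetical names,
and it assumes the raw bytes of a line are available (in a mapper, from
Text.getBytes() and Text.getLength()) and that the JVM provides the
requested charset. It also relies on the newline byte never occurring
inside a multi-byte character, which should hold for MS932 but not for
encodings such as UTF-16.

```java
import java.nio.charset.Charset;

public class LineDecoder {
    /**
     * Hypothetical helper: decodes the first `length` bytes of `buffer`
     * using the named charset instead of assuming UTF-8.
     * Only the first `length` bytes are used because a reused buffer
     * may contain stale data past the end of the current line.
     */
    public static String decode(byte[] buffer, int length, String charsetName) {
        // Charset.forName throws UnsupportedCharsetException if the JVM
        // does not ship the requested charset.
        Charset charset = Charset.forName(charsetName);
        return new String(buffer, 0, length, charset);
    }
}
```

In a mapper this might be called as
LineDecoder.decode(value.getBytes(), value.getLength(), "MS932"),
where value is the Text passed to map() and the charset name could be
read from the job configuration under a key like the one proposed above.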
Regards,
--
NOMURA Yoshihide:
Software Innovation Laboratory, Fujitsu Labs. Ltd., Japan
Tel: 044-754-2675 (Ext: 7112-6358)
Fax: 044-754-2570 (Ext: 7112-3834)
E-Mail: [EMAIL PROTECTED]