Hi,

Has anyone addressed the compatibility of org.apache.hadoop.mapreduce.lib.input.TextInputFormat with Hadoop streaming?

The new API generates the following exception when launching pipes jobs with org.apache.hadoop.mapreduce.lib.input.TextInputFormat as the input format instead of org.apache.hadoop.mapred.TextInputFormat:

    Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.mapreduce.lib.input.TextInputFormat not org.apache.hadoop.mapred.InputFormat

My problem with the deprecated classes lies in mapred.min.split.size and the number of map tasks. I need to generate N maps on splits of approximately the same size. However, after setting mapred.min.split.size to 20 MB, I get splits ranging from 6 MB to 64 MB.

Any suggestions?

Trad Mohamed Riadh, M.Sc, Ing.
PhD. student
INRIA-TELECOM PARISTECH
Office: 11-15
Phone: (33)-1 39 63 59 33
Fax: (33)-1 39 63 56 74
Email: Riadh.Trad(a)inria.fr
Home page: http://www-rocq.inria.fr/who/Mohamed.Trad/
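
For reference, the uneven split sizes described above can be reproduced with a small stand-alone sketch. It assumes the old-API org.apache.hadoop.mapred.FileInputFormat picks a per-file split size of max(minSize, min(goalSize, blockSize)), where goalSize is the total input size divided by the requested number of map tasks, and that it lets the last split run up to 10% over before cutting a remainder. The class SplitSizeSketch and its method names are hypothetical and are not part of Hadoop; the 200 MB file and 64 MB block size are made-up illustration values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; this is not a Hadoop class. */
final class SplitSizeSketch {
    static final long MB = 1024L * 1024L;
    // Assumed slack factor: the final split may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    /** Assumed old-API rule: never below minSize, otherwise the smaller of goal and block size. */
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    /** Cuts one file into split lengths (in bytes) for a given split size. */
    static List<Long> splitsForFile(long fileLength, long splitSize) {
        List<Long> splits = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits.add(remaining); // leftover tail, possibly much smaller than splitSize
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileLength = 200 * MB; // illustration value
        long blockSize = 64 * MB;   // illustration value
        long minSize = 20 * MB;     // mapred.min.split.size from the post

        // Requesting 2 and then 10 map tasks changes goalSize, and with it the split size.
        for (int numMaps : new int[] {2, 10}) {
            long goalSize = fileLength / numMaps;
            long splitSize = computeSplitSize(goalSize, minSize, blockSize);
            StringBuilder line = new StringBuilder("maps=" + numMaps + " splits(MB):");
            for (long s : splitsForFile(fileLength, splitSize)) {
                line.append(' ').append(s / MB);
            }
            System.out.println(line);
        }
    }
}
```

Under these assumptions, a request for 2 maps yields splits of 64, 64, 64 and 8 MB, because the block size caps the split size and the minimum only acts as a lower bound. A request for 10 maps brings goalSize down to 20 MB and yields ten 20 MB splits. If the real implementation behaves this way, the requested number of map tasks, rather than mapred.min.split.size alone, is what evens out the splits.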
