Hi, I am a newbie to Hadoop. I am trying to run a sample Hadoop Streaming job to test a custom InputFormat with streaming.
I use the following command:

    hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
        -input "/user/ahanda/input/images" \
        -output "/user/ahanda/output" \
        -mapper "org.apache.hadoop.mapred.lib.IdentityMapper" \
        -reducer NONE \
        -inputformat "MyFileInputFormat" \
        -outputformat "org.apache.hadoop.mapred.SequenceFileAsBinaryOutputFormat" \
        -archives hdfs://localhost:9000/user/ahanda/lib/myfile.jar#wholefile

I get the following output:

    09/06/07 22:26:35 ERROR streaming.StreamJob: Unexpected -archives while processing -input|-output|-mapper|-combiner|-reducer|-file|-dfs|-jt|-additionalconfspec|-inputformat|-outputformat|-partitioner|-numReduceTasks|-inputreader|-mapdebug|-reducedebug|||-cacheFile|-cacheArchive|-verbose|-info|-debug|-inputtagged|-help
    Usage: $HADOOP_HOME/bin/hadoop jar \
              $HADOOP_HOME/hadoop-streaming.jar [options]
    <rest of the output about specifying options>

MyFileInputFormat is my own custom InputFormat, which I have packaged into myfile.jar and copied to the /user/ahanda/lib directory on HDFS. I am using Hadoop 0.20.0.

Can somebody help? Am I missing something?

Thanks,
Amit
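P.S. For reference, MyFileInputFormat is roughly along the lines of the simplified sketch below. The package name, the whole-file record reader, and the Text/BytesWritable key/value types are just illustrative placeholders, not the exact code in myfile.jar; the class is written against the old org.apache.hadoop.mapred API, which is what the streaming -inputformat option works with in 0.20.0.

    // Simplified, illustrative sketch of MyFileInputFormat -- not the exact
    // code in myfile.jar. Package name, record reader, and key/value types
    // (Text path, BytesWritable contents) are assumptions for illustration.
    package example;

    import java.io.IOException;

    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileSplit;
    import org.apache.hadoop.mapred.InputSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordReader;
    import org.apache.hadoop.mapred.Reporter;

    // Emits one record per image file: key = file path, value = raw file bytes.
    public class MyFileInputFormat extends FileInputFormat<Text, BytesWritable> {

        protected boolean isSplitable(FileSystem fs, Path file) {
            return false;  // never split an image across map tasks
        }

        public RecordReader<Text, BytesWritable> getRecordReader(
                InputSplit split, JobConf conf, Reporter reporter) throws IOException {
            return new WholeFileRecordReader((FileSplit) split, conf);
        }

        static class WholeFileRecordReader implements RecordReader<Text, BytesWritable> {
            private final FileSplit split;
            private final JobConf conf;
            private boolean done = false;

            WholeFileRecordReader(FileSplit split, JobConf conf) {
                this.split = split;
                this.conf = conf;
            }

            public boolean next(Text key, BytesWritable value) throws IOException {
                if (done) {
                    return false;
                }
                Path path = split.getPath();
                FileSystem fs = path.getFileSystem(conf);
                FSDataInputStream in = fs.open(path);
                try {
                    // Read the entire file into a single value.
                    byte[] contents = new byte[(int) split.getLength()];
                    IOUtils.readFully(in, contents, 0, contents.length);
                    key.set(path.toString());
                    value.set(contents, 0, contents.length);
                } finally {
                    in.close();
                }
                done = true;
                return true;
            }

            public Text createKey() { return new Text(); }
            public BytesWritable createValue() { return new BytesWritable(); }
            public long getPos() { return done ? split.getLength() : 0; }
            public float getProgress() { return done ? 1.0f : 0.0f; }
            public void close() { }
        }
    }

This class is compiled into myfile.jar, which is what I am trying to ship to the task nodes with -archives.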