According to the Hadoop Streaming docs <http://hadoop.apache.org/common/docs/r0.20.0/streaming.html#Working+with+the+Hadoop+Aggregate+Package+%28the+-reduce+aggregate+option%29>, there is a built-in Aggregate Java class that can work as both a mapper and a reducer.
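
For context, the aggregate package reads mapper output lines of the form function:key<TAB>value, as in the word-count example in those docs. A minimal sketch of such a mapper, assuming a simple word count (this is illustrative only, not my actual mapper.py, and the helper name to_aggregate_records is made up):

```python
#!/usr/bin/env python
import sys


def to_aggregate_records(line):
    """Turn one input line into records the aggregate package can sum.

    Each record is "LongValueSum:<word>\t1", so the aggregate reducer
    adds up the 1s for every distinct word. (Hypothetical helper.)
    """
    return ["LongValueSum:%s\t1" % word for word in line.split()]


def main():
    # Streaming feeds input records on stdin and collects stdout.
    for line in sys.stdin:
        for record in to_aggregate_records(line):
            print(record)


if __name__ == "__main__":
    main()
```

The LongValueSum prefix tells the aggregate class which aggregator to apply to the values for that key.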
This is the command I am running:

*shell> hadoop jar hadoop-streaming.jar -file mapper.py -mapper mapper.py -combiner aggregate -reducer NONE -input input_files -output output_path*

Running it causes the mapper to fail with this error:

*java.io.IOException: Cannot run program "aggregate": java.io.IOException: error=2, No such file or directory*

However, if I use aggregate as the reducer instead of the combiner, the job works fine:

*shell> hadoop jar hadoop-streaming.jar -file mapper.py -mapper mapper.py -reduce aggregate -input input_files -output output_path*

What am I doing wrong? Is aggregate being treated as a command rather than a JavaClassName when it is passed to -combiner? If so, how do I specify the JavaClassName instead?

--
Regards,
Premal Shah.
