I want to try the Hadoop streaming example, so I ran the following command:

$ hadoop jar /home/software/hadoop-2.2.0/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar -file shapetimemapper.rb -mapper shapetimemapper.rb -file shapetimereducer.rb -reducer shapetimereducer.rb -input ufo.tsv -output shapetime
but it raised the following error information:

14/02/12 18:26:45 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
14/02/12 18:26:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [shapetimemapper.rb, shapetimereducer.rb, /home/hadoop/file:/home/software/temp/hadoop-unjar3584519780174586397/] [] /tmp/streamjob1442248093994422662.jar tmpDir=null
14/02/12 18:26:47 INFO client.RMProxy: Connecting to ResourceManager at master/172.11.12.6:8993
14/02/12 18:26:47 INFO client.RMProxy: Connecting to ResourceManager at master/172.11.12.6:8993
14/02/12 18:26:51 INFO mapred.FileInputFormat: Total input paths to process : 1
14/02/12 18:26:51 INFO mapreduce.JobSubmitter: number of splits:2
14/02/12 18:26:51 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/02/12 18:26:51 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/02/12 18:26:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1392258126040_0001
14/02/12 18:26:54 INFO impl.YarnClientImpl: Submitted application application_1392258126040_0001 to ResourceManager at master/172.11.12.6:8993
14/02/12 18:26:54 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1392258126040_0001/
14/02/12 18:26:54 INFO mapreduce.Job: Running job: job_1392258126040_0001
14/02/12 18:27:15 INFO mapreduce.Job: Job job_1392258126040_0001 running in uber mode : false
14/02/12 18:27:15 INFO mapreduce.Job: map 0% reduce 0%
14/02/12 18:28:38 INFO mapreduce.Job: map 100% reduce 0%
14/02/12 18:28:55 INFO mapreduce.Job: map 50% reduce 0%
14/02/12 18:29:02 INFO mapreduce.Job: Task Id : attempt_1392258126040_0001_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
14/02/12 18:29:04 INFO mapreduce.Job: Task Id : attempt_1392258126040_0001_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
14/02/12 18:29:05 INFO mapreduce.Job: map 0% reduce 0%
14/02/12 18:32:18 INFO mapreduce.Job: map 100% reduce 0%
14/02/12 18:32:25 INFO mapreduce.Job: map 0% reduce 0%
14/02/12 18:32:25 INFO mapreduce.Job: Task Id : attempt_1392258126040_0001_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
14/02/12 18:32:26 INFO mapreduce.Job: Task Id : attempt_1392258126040_0001_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
14/02/12 18:34:55 INFO mapreduce.Job: map 50% reduce 0%
14/02/12 18:34:56 INFO mapreduce.Job: map 100% reduce 0%
14/02/12 18:34:58 INFO mapreduce.Job: Task Id : attempt_1392258126040_0001_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
14/02/12 18:34:58 INFO mapreduce.Job: Task Id : attempt_1392258126040_0001_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
14/02/12 18:34:59 INFO mapreduce.Job: map 0% reduce 0%
14/02/12 18:39:56 INFO mapreduce.Job: map 50% reduce 0%
14/02/12 18:40:12 INFO mapreduce.Job: map 100% reduce 100%
14/02/12 18:40:15 INFO mapreduce.Job: Job job_1392258126040_0001 failed with state FAILED due to: Task failed task_1392258126040_0001_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0
14/02/12 18:40:19 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
14/02/12 18:40:20 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:21 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:22 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:23 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:24 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:25 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:26 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:27 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:28 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:29 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:29 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
14/02/12 18:40:30 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:31 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:32 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:33 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:34 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:35 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:36 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:37 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:38 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:39 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:40 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
14/02/12 18:40:41 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:42 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:43 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:44 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:45 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:46 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020.
Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:47 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:48 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:49 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:50 INFO ipc.Client: Retrying connect to server: master/172.11.12.6:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/12 18:40:50 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: java.net.ConnectException: Call From master/172.11.12.6 to master:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
14/02/12 18:40:50 ERROR streaming.StreamJob: Error Launching job : java.net.ConnectException: Call From master/172.11.12.6 to master:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Streaming Command Failed!

shapetimemapper.rb is as follows:

#!/usr/bin/env ruby
# Matches an optional number followed by a "min" or "sec" unit.
pattern=Regexp.new /\d* ?((min)|(sec))/
while line=gets
  parts=line.split("\t")
  # Only rows with exactly 6 tab-separated fields are processed.
  if parts.size==6
    shape=parts[3].strip
    duration=parts[4].strip.downcase
    if !shape.empty? && !duration.empty?
      match=pattern.match(duration)
      time=/\d*/.match(match[0])[0]
      unit=match[1]
      time=Integer(time)
      # Convert minutes to seconds.
      time=time*60 if unit=="min"
      puts shape+"\t"+time.to_s
    end
  end
end

shapetimereducer.rb is as follows:

#!/usr/bin/env ruby
current=nil
min=0
max=0
mean=0
total=0
count=0
while line=gets
  word,time=line.split("\t")
  time=Integer(time)
  if word==current
    # Same key as the previous line: update the running statistics.
    count=count+1
    total=total+time
    min=time if time<min
    max=time if time>max
  else
    # New key: print min, max and mean for the previous key, then reset.
    puts current+"\t"+min.to_s+" "+max.to_s+" "+(total/count).to_s if current
    current=word
    count=1
    total=time
    min=time
    max=time
  end
end
puts current+"\t"+min.to_s+" "+max.to_s+" "+(total/count).to_s

$ tail ufo.tsv | shapetimemapper.rb
bash: $: command not found
bash: shapetimemapper.rb: command not found
$ tail ufo.tsv | ruby shapetimemapper.rb
unknown 60
light 120
light 15
disk 1800
disk 40
oval 600
fireball 1200
circle 2700
other 300
triangle 15
$ tail ufo.tsv | ruby shapetimereducer.rb
20100807 20100810 20100810 20100810
20100826 20100826 20100826 20100826
20100701 20100828 20100828 20100828
20100828 20100827 20100828 20100827
20090424 20100820 20100820 20100820
20100821 20100826 20100826 20100826
20100827 20100827 20100827 20100827
20100818 20100821 20100821 20100821
20050502 20100824 20100824 20100824

I am puzzled by the Hadoop error above, because when I run shapetimemapper.rb and shapetimereducer.rb on their own with ruby, both run without errors. Why do they fail when I run them through Hadoop streaming? Could anyone give me some advice on how to make the job succeed?

Thanks
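
One possible local check, offered only as a sketch: tail covers just the last ten rows of ufo.tsv, so it may miss duration values that the mapper cannot parse. The snippet below reuses the mapper's pattern inside a hypothetical helper, duration_seconds (not part of the original scripts), and runs it on made-up sample strings rather than real rows from ufo.tsv. Whether such values are what actually makes the Hadoop tasks fail is an assumption; only the task logs could confirm it.

```ruby
#!/usr/bin/env ruby
# Same pattern as in shapetimemapper.rb.
PATTERN = /\d* ?((min)|(sec))/

# Hypothetical helper, not part of the original scripts: returns the duration
# in seconds, or nil when the text cannot be parsed.
def duration_seconds(text)
  match = PATTERN.match(text.strip.downcase)
  # No "min"/"sec" unit at all, e.g. "2 hours". The original mapper would
  # call match[0] on nil at this point.
  return nil if match.nil?
  digits = match[0][/\d+/]
  # A unit without a number, e.g. "few min". The original mapper would call
  # Integer("") at this point, which raises ArgumentError.
  return nil if digits.nil?
  seconds = Integer(digits, 10)
  match[1] == "min" ? seconds * 60 : seconds
end

# Made-up sample values (assumptions, not taken from ufo.tsv).
["5 min", "30 sec", "2 hours", "few min"].each do |sample|
  puts "#{sample.inspect} => #{duration_seconds(sample).inspect}"
end
```

Returning nil instead of raising would let a mapper skip a row it cannot parse. Passing base 10 to Integer also accepts a value with a leading zero such as "08", which Integer("08") rejects.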