Hello, I've been trying to run the WordCount example from the website on my Windows 10 machine. I have successfully built the latest Hadoop release (2.7.3) and I want to run the code in Local (Standalone) mode. Accordingly, I have not specified any configuration, apart from setting the JAVA_HOME path in the "hadoop-env.cmd" file. When I run the WordCount job, the map tasks complete but the reduce task fails. I get the following output:
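For reference, the only change I made is the JAVA_HOME line in etc\hadoop\hadoop-env.cmd (the JDK path below is an example, not necessarily my exact one):

```bat
@rem Only edit in hadoop-env.cmd: point JAVA_HOME at the JDK.
@rem The 8.3 short name (Progra~1) avoids the space in "Program Files".
set JAVA_HOME=C:\Progra~1\Java\jdk1.8.0_121
```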
D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount>hadoop jar wc.jar WordCount D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\input D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\output
17/02/22 18:40:43 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
17/02/22 18:40:43 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
17/02/22 18:40:44 INFO input.FileInputFormat: Total input paths to process : 2
17/02/22 18:40:44 INFO mapreduce.JobSubmitter: number of splits:2
17/02/22 18:40:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local334410887_0001
17/02/22 18:40:45 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/02/22 18:40:45 INFO mapreduce.Job: Running job: job_local334410887_0001
17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Waiting for map tasks
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000000_0
17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:45 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/02/22 18:40:45 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@3019d00f
17/02/22 18:40:45 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file02:0+27
17/02/22 18:40:45 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/02/22 18:40:45 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/02/22 18:40:45 INFO mapred.MapTask: soft limit at 83886080
17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/02/22 18:40:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/02/22 18:40:45 INFO mapred.LocalJobRunner:
17/02/22 18:40:45 INFO mapred.MapTask: Starting flush of map output
17/02/22 18:40:45 INFO mapred.MapTask: Spilling map output
17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600
17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
17/02/22 18:40:45 INFO mapred.MapTask: Finished spill 0
17/02/22 18:40:45 INFO mapred.Task: Task:attempt_local334410887_0001_m_000000_0 is done. And is in the process of committing
17/02/22 18:40:45 INFO mapred.LocalJobRunner: map
17/02/22 18:40:45 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000000_0' done.
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000000_0
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000001_0
17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@39ef3a7
17/02/22 18:40:46 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file01:0+25
17/02/22 18:40:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/02/22 18:40:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/02/22 18:40:46 INFO mapred.MapTask: soft limit at 83886080
17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/02/22 18:40:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/02/22 18:40:46 INFO mapred.LocalJobRunner:
17/02/22 18:40:46 INFO mapred.MapTask: Starting flush of map output
17/02/22 18:40:46 INFO mapred.MapTask: Spilling map output
17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufend = 42; bufvoid = 104857600
17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
17/02/22 18:40:46 INFO mapred.MapTask: Finished spill 0
17/02/22 18:40:46 INFO mapred.Task: Task:attempt_local334410887_0001_m_000001_0 is done. And is in the process of committing
17/02/22 18:40:46 INFO mapred.LocalJobRunner: map
17/02/22 18:40:46 INFO mapreduce.Job: Job job_local334410887_0001 running in uber mode : false
17/02/22 18:40:46 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000001_0' done.
17/02/22 18:40:46 INFO mapreduce.Job:  map 100% reduce 0%
17/02/22 18:40:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000001_0
17/02/22 18:40:46 INFO mapred.LocalJobRunner: map task executor complete.
17/02/22 18:40:46 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/02/22 18:40:46 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_r_000000_0
17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@13ac822f
17/02/22 18:40:46 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6c4d20c4
17/02/22 18:40:46 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
17/02/22 18:40:46 INFO reduce.EventFetcher: attempt_local334410887_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
17/02/22 18:40:46 INFO mapred.LocalJobRunner: reduce task executor complete.
17/02/22 18:40:46 WARN mapred.LocalJobRunner: job_local334410887_0001
java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: D:/tmp/hadoop-Vasil%20Grigorov/mapred/local/localRunner/Vasil%20Grigorov/jobcache/job_local334410887_0001/attempt_local334410887_0001_m_000000_0/output/file.out.index
    at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:200)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
    at org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:156)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
    at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:124)
    at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:102)
    at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:85)
17/02/22 18:40:47 INFO mapreduce.Job: Job job_local334410887_0001 failed with state FAILED due to: NA
17/02/22 18:40:47 INFO mapreduce.Job: Counters: 18
File System Counters
FILE: Number of bytes read=1158
FILE: Number of bytes written=591978
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=2
Map output records=8
Map output bytes=86
Map output materialized bytes=89
Input split bytes=308
Combine input records=8
Combine output records=6
Spilled Records=6
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
Total committed heap usage (bytes)=574095360
File Input Format Counters
Bytes Read=52
I have followed every tutorial I could find and searched for a potential solution to this error, but without success. As I mentioned above, I have not added any further configuration to any of the files because I want to run in Standalone mode rather than pseudo-distributed or fully distributed mode. I've spent a lot of time and effort getting this far and have hit a brick wall with this error, so any help would be GREATLY appreciated.
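In case it matters: the other config files under etc\hadoop (core-site.xml, mapred-site.xml, etc.) are still the untouched defaults, i.e. effectively empty, which to my understanding is exactly what Standalone mode expects:

```xml
<?xml version="1.0"?>
<!-- core-site.xml / mapred-site.xml: left at the defaults, no properties set -->
<configuration>
</configuration>
```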
Thank you in advance!