Hello, I've been trying to run the WordCount example from the website on my Windows 10 machine. I have successfully built the latest Hadoop release (2.7.3) and I want to run the code in Local (Standalone) mode. Accordingly, I have not specified any configuration, apart from setting the JAVA_HOME path in the "hadoop-env.cmd" file. When I run the WordCount job, the map tasks complete but the reduce task fails. I get the following output:
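For reference, the only change I made is the JAVA_HOME line in etc\hadoop\hadoop-env.cmd (the JDK path below is an example, not necessarily my exact one):

```bat
@rem Only edit in hadoop-env.cmd: point JAVA_HOME at the JDK.
@rem The 8.3 short name (Progra~1) avoids the space in "Program Files".
set JAVA_HOME=C:\Progra~1\Java\jdk1.8.0_121
```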
D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount>hadoop jar wc.jar WordCount D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\input D:\Programs\hadoop-2.7.3-src\hadoop-dist\target\hadoop-2.7.3\WordCount\output
17/02/22 18:40:43 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
17/02/22 18:40:43 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/02/22 18:40:43 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
17/02/22 18:40:44 INFO input.FileInputFormat: Total input paths to process : 2
17/02/22 18:40:44 INFO mapreduce.JobSubmitter: number of splits:2
17/02/22 18:40:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local334410887_0001
17/02/22 18:40:45 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/02/22 18:40:45 INFO mapreduce.Job: Running job: job_local334410887_0001
17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:45 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Waiting for map tasks
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000000_0
17/02/22 18:40:45 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:45 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/02/22 18:40:45 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@3019d00f
17/02/22 18:40:45 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file02:0+27
17/02/22 18:40:45 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/02/22 18:40:45 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/02/22 18:40:45 INFO mapred.MapTask: soft limit at 83886080
17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/02/22 18:40:45 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/02/22 18:40:45 INFO mapred.LocalJobRunner:
17/02/22 18:40:45 INFO mapred.MapTask: Starting flush of map output
17/02/22 18:40:45 INFO mapred.MapTask: Spilling map output
17/02/22 18:40:45 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600
17/02/22 18:40:45 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
17/02/22 18:40:45 INFO mapred.MapTask: Finished spill 0
17/02/22 18:40:45 INFO mapred.Task: Task:attempt_local334410887_0001_m_000000_0 is done. And is in the process of committing
17/02/22 18:40:45 INFO mapred.LocalJobRunner: map
17/02/22 18:40:45 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000000_0' done.
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000000_0
17/02/22 18:40:45 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_m_000001_0
17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@39ef3a7
17/02/22 18:40:46 INFO mapred.MapTask: Processing split: file:/D:/Programs/hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/WordCount/input/file01:0+25
17/02/22 18:40:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/02/22 18:40:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/02/22 18:40:46 INFO mapred.MapTask: soft limit at 83886080
17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/02/22 18:40:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/02/22 18:40:46 INFO mapred.LocalJobRunner:
17/02/22 18:40:46 INFO mapred.MapTask: Starting flush of map output
17/02/22 18:40:46 INFO mapred.MapTask: Spilling map output
17/02/22 18:40:46 INFO mapred.MapTask: bufstart = 0; bufend = 42; bufvoid = 104857600
17/02/22 18:40:46 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
17/02/22 18:40:46 INFO mapred.MapTask: Finished spill 0
17/02/22 18:40:46 INFO mapred.Task: Task:attempt_local334410887_0001_m_000001_0 is done. And is in the process of committing
17/02/22 18:40:46 INFO mapred.LocalJobRunner: map
17/02/22 18:40:46 INFO mapreduce.Job: Job job_local334410887_0001 running in uber mode : false
17/02/22 18:40:46 INFO mapred.Task: Task 'attempt_local334410887_0001_m_000001_0' done.
17/02/22 18:40:46 INFO mapreduce.Job:  map 100% reduce 0%
17/02/22 18:40:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local334410887_0001_m_000001_0
17/02/22 18:40:46 INFO mapred.LocalJobRunner: map task executor complete.
17/02/22 18:40:46 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/02/22 18:40:46 INFO mapred.LocalJobRunner: Starting task: attempt_local334410887_0001_r_000000_0
17/02/22 18:40:46 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/02/22 18:40:46 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
17/02/22 18:40:46 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@13ac822f
17/02/22 18:40:46 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6c4d20c4
17/02/22 18:40:46 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=334338464, maxSingleShuffleLimit=83584616, mergeThreshold=220663392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
17/02/22 18:40:46 INFO reduce.EventFetcher: attempt_local334410887_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
17/02/22 18:40:46 INFO mapred.LocalJobRunner: reduce task executor complete.
17/02/22 18:40:46 WARN mapred.LocalJobRunner: job_local334410887_0001
java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#1
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: D:/tmp/hadoop-Vasil%20Grigorov/mapred/local/localRunner/Vasil%20Grigorov/jobcache/job_local334410887_0001/attempt_local334410887_0001_m_000000_0/output/file.out.index
    at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:200)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
    at org.apache.hadoop.io.SecureIOUtils.openFSDataInputStream(SecureIOUtils.java:156)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:71)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:62)
    at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:57)
    at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:124)
    at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:102)
    at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:85)
17/02/22 18:40:47 INFO mapreduce.Job: Job job_local334410887_0001 failed with state FAILED due to: NA
17/02/22 18:40:47 INFO mapreduce.Job: Counters: 18
File System Counters
FILE: Number of bytes read=1158
FILE: Number of bytes written=591978
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=2
Map output records=8
Map output bytes=86
Map output materialized bytes=89
Input split bytes=308
Combine input records=8
Combine output records=6
Spilled Records=6
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
Total committed heap usage (bytes)=574095360
File Input Format Counters
Bytes Read=52
I have followed every tutorial I could find and searched for a potential solution to this error, but without success. As I mentioned above, I have not added any further configuration to any of the files because I want to run in Standalone mode rather than pseudo-distributed or fully distributed mode. I've spent a lot of time and effort getting this far and have hit a brick wall with this error, so any help would be GREATLY appreciated.
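In case it matters: the other config files under etc\hadoop (core-site.xml, mapred-site.xml, etc.) are still the untouched defaults, i.e. effectively empty, which to my understanding is exactly what Standalone mode expects:

```xml
<?xml version="1.0"?>
<!-- core-site.xml / mapred-site.xml: left at the defaults, no properties set -->
<configuration>
</configuration>
```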
Thank you in advance!