Hi,
1 -
If I set the property mapred.reduce.tasks to 1 in mapred-site.xml, how
many reduce tasks will actually run? I think 2 will run, and I don't know
why. In a log that I've added, both constructors of the ReduceTask.java
class are called ( ReduceTask() and ReduceTask(with parameters) ).
I don't understand why ReduceTask() [with no parameters] runs. Here's the
stack trace that I captured to see the path of execution that leads to
this constructor.
[code]
java.lang.Exception:
    at org.apache.hadoop.mapred.ReduceTask.<init>(ReduceTask.java:164)
    at org.apache.hadoop.mapred.LaunchTaskAction.readFields(LaunchTaskAction.java:62)
    at org.apache.hadoop.mapred.HeartbeatResponse.readFields(HeartbeatResponse.java:137)
    at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:237)
    at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:510)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:445)
[/code]
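For anyone who wants to reproduce this kind of log, here is a minimal sketch of how a trace like the one above can be captured from inside a constructor. It is not Hadoop code: the class names ConstructorTrace and Task are hypothetical stand-ins, and I'm assuming the trace is produced by creating an Exception in the constructor and printing its stack trace.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class ConstructorTrace {

    // Hypothetical stand-in for a class such as ReduceTask.
    static class Task {
        final String trace;

        // No-argument constructor: records who called it.
        Task() {
            this.trace = captureTrace();
        }
    }

    // Creates an Exception (never thrown) only to read the current call stack,
    // and returns the printed stack trace as a String.
    static String captureTrace() {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer, true);
        new Exception().printStackTrace(out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        // The printed frames include the constructor, shown as <init>,
        // followed by its callers.
        System.out.print(new Task().trace);
    }
}
```

Constructor frames appear as <init> in the output, the same way ReduceTask.<init> appears in the trace above, and the frames below it are the callers.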
As you can see, the ReduceTask() call comes from the Connection class
(Client$Connection). Can anyone explain to me the purpose of this thread
of execution, and the purpose of the Client class?
2 -
A ReduceTask is launched by a TaskTracker in a new child JVM, right?
3 -
A TaskTracker is a thread that can run several map and reduce tasks at the
same time, right?
Thanks,
--
Pedro