My namenode and jobtracker are both on a machine that is a datanode and has
a tasktracker as well.  It is also less well outfitted than yours.

I have no problems, but my data is encrypted, which might make the CPU/disk
trade-offs very different.


On 12/26/07 12:11 PM, "Jason Venner" <[EMAIL PROTECTED]> wrote:

> This seems to be more a function of input file size than anything else. I had
> a single (uncompressed) 35 GB input file Text,Value.
> 
> Jason Venner wrote:
>> I have been experimenting with that, and when I do, the master
>> saturates well before the slave nodes, and the jobs start experiencing
>> timeouts.
>> 
>> The map task in question is the IdentityMapper; this job is a simple
>> merge sort, combining data by key where there are duplicate keys in
>> the input stream.
>> There is no swapping going on in my cluster. The machines in
>> question are all 8-processor boxes, and tasks.maximum was set to 6.
>> 
>> task_200712261033_0002_m_000078_0: Exception in thread "main" java.net.SocketTimeoutException: timed out waiting for rpc response
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.Client.call(Client.java:484)
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.$Proxy0.getTask(Unknown Source)
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1747)
>> 07/12/26 10:48:03 INFO mapred.JobClient: Task Id : task_200712261033_0002_m_000081_1, Status : FAILED
>> 
