Re: Maps running - how to increase?

Zeev Milin Thu, 06 Aug 2009 11:42:01 -0700

Thanks Aaron,

I changed the settings in the hadoop-site.xml file on all the machines. BTW,
some settings are only reflected at the job level when I change the
hadoop-default file; I am not sure why hadoop-site is being ignored (e.g.
mapred.tasktracker.map.tasks.maximum).
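
One way to confirm what a given hadoop-site.xml actually contains on each
machine is to parse it and print the property. The following is only a minimal
sketch: the sample XML and its value of 4 are illustrative assumptions, and
read_property is a hypothetical helper, not part of Hadoop. It shows what is
written in the file, not what a running daemon has loaded.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the contents of a hadoop-site.xml file.
# The value 4 is illustrative only, not a recommendation.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
"""

def read_property(xml_text, name):
    """Return the value of the named property as a string, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

print(read_property(SAMPLE_SITE_XML, "mapred.tasktracker.map.tasks.maximum"))
```

To check a real file, the text could be read from disk and passed in place of
SAMPLE_SITE_XML.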


The files I am trying to load are fairly small (~4MB on average). The
configuration of each machine is: 2 dual-core CPUs (Xeon, 2.33GHz), 8GB RAM
and a local SCSI hard drive (6 nodes in total).

I will look into the article you mentioned. I understand that loading the
files is going to be slow; I was just wondering why the machines are mostly
idle and not being utilized when more maps could be run in parallel. Maps
running is always 6.

Another option is to load one 20GB file, but currently the speed is fairly
slow in my opinion: 1GB in 1.5min. What kind of tuning can be done to
speed up the load into HDFS? If you have any recommendations for specific
parameters that might help, that would be great.
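
To put the 1GB in 1.5min figure in perspective, here is a rough calculation of
the implied throughput and of how long the 20GB file would take at that rate.
It is only a sketch: it assumes 1GB = 1024MB and that the rate stays constant
for the whole load, and the function names are made up for illustration.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Average load rate in MB/s, assuming 1GB = 1024MB."""
    return size_gb * 1024 / (minutes * 60)

def load_minutes(size_gb, rate_mb_per_s):
    """Minutes needed to load size_gb at a constant rate in MB/s."""
    return size_gb * 1024 / rate_mb_per_s / 60

rate = throughput_mb_per_s(1, 1.5)      # observed: 1GB in 1.5min
print(round(rate, 1))                   # about 11.4 MB/s
print(round(load_minutes(20, rate)))    # about 30 minutes for 20GB
```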

Thanks,
Zeev
