False alarm, guys; thanks for the replies.
I do have 2 set as the task maximum, and it is utilizing 2 cores according to top. I must have caught it between tasks or during the reduce phase, since I had only 1 reducer per node running at the time.

hadoop-default.xml:
<property>
 <name>mapred.tasktracker.map.tasks.maximum</name>
 <value>2</value>
</property>

output from top:

top - 12:54:50 up 48 days, 16:19,  1 user,  load average: 2.60, 1.55, 0.66
Tasks:  80 total,   3 running,  77 sleeping,   0 stopped,   0 zombie
Cpu0  : 98.1%us,  1.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu1  : 95.8%us,  2.9%sy,  0.0%ni,  0.0%id,  1.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1035160k total,  1019608k used,    15552k free,     1808k buffers
Swap:  2031608k total,      372k used,  2031236k free,   293612k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2469 root      25   0  410m 161m  10m R 44.5 15.9   0:40.40 java
 2446 root      25   0  411m 161m  11m R 43.2 16.0   0:45.88 java

Alex Loddengaard wrote:
Elia, perhaps you can try changing "mapred.tasktracker.map.tasks.maximum"
and "mapred.tasktracker.reduce.tasks.maximum" to "4" in hadoop-site.xml in
hopes of getting better utilization.  It's strange to me that having these
both set to 2 only utilizes a single core, because I would imagine that any
modern OS scheduler would do a good job of core utilization.

Just a thought.

Alex

On Wed, Oct 8, 2008 at 12:52 AM, Taeho Kang <[EMAIL PROTECTED]> wrote:

First of all, "mapred.tasktracker.map.tasks.maximum" and
"mapred.tasktracker.reduce.tasks.maximum" are both set to 2 in the
hadoop-default.xml file; this file is read before hadoop-site.xml, so
any properties that aren't set in hadoop-site.xml will follow the values
set in hadoop-default.xml.

As for the question of why only one core is utilized...
I think it really depends on the process scheduling of the underlying OS.
It's not like two tasks (two JVM subprocesses spawned by the tasktracker)
will always run on independent cores, as there are other processes that
need one or more cores to run.

By the way, what tools did you use to find out which tasks (or processes)
use which cores?

/Taeho


On Wed, Oct 8, 2008 at 1:01 PM, Alex Loddengaard
<[EMAIL PROTECTED]> wrote:

Taeho, I was going to suggest this change as well, but it's documented that
"mapred.tasktracker.map.tasks.maximum" defaults to 2.  Can you explain why
Elia is only having one core utilized when this config option is set to 2?
Here is the documentation I'm referring to:
<http://hadoop.apache.org/core/docs/r0.18.1/cluster_setup.html>

Alex

On Tue, Oct 7, 2008 at 8:27 PM, Taeho Kang <[EMAIL PROTECTED]> wrote:

You can have your node (tasktracker) run more than 1 task simultaneously.
You may set the "mapred.tasktracker.map.tasks.maximum" and
"mapred.tasktracker.reduce.tasks.maximum" properties in the
hadoop-site.xml file. You should change the hadoop-site.xml file on all your
slave nodes depending on how many cores each slave has. For example, you
don't really want to have 8 tasks running at once on a 2-core machine.
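
As a rough sketch of what such an override could look like in hadoop-site.xml on a dual-core slave (the value 2 is only an assumption for illustration; pick values to match each node's core count):

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml: site-specific overrides of hadoop-default.xml -->
<configuration>
  <!-- Maximum number of map tasks this tasktracker runs at once (assumed value) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <!-- Maximum number of reduce tasks this tasktracker runs at once (assumed value) -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```

As far as I know, the tasktracker reads these values when it starts, so it would likely need a restart to pick up a change.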

/Taeho

On Wed, Oct 8, 2008 at 5:53 AM, Elia Mazzawi
<[EMAIL PROTECTED]> wrote:

Hello,

I have some dual-core nodes, and I've noticed Hadoop is only running 1
instance, and so is only using 1 of the CPUs on each node.
Is there a configuration to tell it to run more than one?
Or do I need to turn each machine into 2 nodes?

Thanks.


