Thanks for the quick responses.  I raised the two parameters to 14 (figuring 
there might be other apps running - like ZooKeeper - that might want some cores 
of their own).  This has made a qualitative difference - the System Monitor now 
shows much busier squiggly lines, indicating better distribution of the job 
across the various cores.  However, the quantitative difference is 
insignificant - my job runs only about 4% faster.  I hope I don't have to 
migrate from Java to the C++ API.
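
For reference, the entries I changed in mapred-site.xml on each node look 
roughly like this (with the tasktrackers restarted afterwards so the new slot 
counts take effect):

  <property>
    <!-- max concurrent map tasks per tasktracker -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>14</value>
  </property>
  <property>
    <!-- max concurrent reduce tasks per tasktracker -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>14</value>
  </property>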

Alan


-----Original Message-----
From: Mohamed Riadh Trad [mailto:[email protected]] 
Sent: Wednesday, September 15, 2010 10:24 AM
To: [email protected]; [email protected]
Subject: EXTERNAL:Re: Making optimum use of cores

Hi Christopher,

I used to work at SunGard (Global Trading) - I left 2 years ago... Hope you 
enjoy working there...

When it comes to performance, you should use the C++ API instead. By fixing the 
map slots per node to the number of virtual CPUs per node, you can fully 
parallelize jobs and use 1600% of the Nehalem CPU (all 16 virtual cores).

Regards,


On 15 Sept 2010, at 16:00, <[email protected]> 
<[email protected]> wrote:

> It seems likely that you are only running one (single-threaded) map or reduce 
> operation per worker node. Do you know whether you are in fact running 
> multiple operations?
> 
> This also sounds like it may be a manifestation of a question that I have 
> seen a lot on the mailing lists lately, which is that people do not know how 
> to increase the number of task slots in their tasktracker configuration.  
> This setting is normally controlled via the setting 
> mapred.tasktracker.{map|reduce}.tasks.maximum in mapred-site.xml.  The 
> default of 2 each is probably too low for your servers.
> 
> 
> ----- Original Message -----
> From: Ratner, Alan S (IS) <[email protected]>
> To: [email protected] <[email protected]>
> Sent: Wed Sep 15 09:47:47 2010
> Subject: Making optimum use of cores
> 
> I'm running Hadoop 0.20.2 on a cluster of servers running Ubuntu 10.04.
> Each server has 2 quad-core Nehalem CPUs for a total of 8 physical cores
> running as 16 virtual cores.  Ubuntu's System Monitor displays 16
> squiggly lines showing usage of the 16 virtual cores.  We only seem to
> be making use of one of the 16 virtual cores on any slave node and even
> on the master node only one virtual core is significantly busy at a
> time.  Is there a way to make better use of the cores?  Presumably I
> could run Hadoop in a VM assigned to each virtual core but I would think
> there must be a more elegant solution.
> 
> Alan Ratner
> 
> 
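
Coming back to the original question: besides raising the per-tasktracker slot 
counts, another way to keep more cores busy from within a single map slot - at 
least with the old org.apache.hadoop.mapred API in 0.20.2 - is 
MultithreadedMapRunner, which drives one mapper with a pool of threads. This 
only helps if the map logic is thread-safe and CPU-bound. A minimal sketch; the 
class name, job name, and thread count below are made up for illustration:

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.MultithreadedMapRunner;

public class MultithreadedDemo {

  // Hypothetical mapper: it must be thread-safe, because several
  // threads will call map() concurrently inside one task JVM.
  public static class MyMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      output.collect(value, new IntWritable(1));
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(MultithreadedDemo.class);
    conf.setJobName("multithreaded-demo");

    conf.setMapperClass(MyMapper.class);
    // Run the mapper through a thread pool instead of a single thread.
    conf.setMapRunnerClass(MultithreadedMapRunner.class);
    // Threads per map task (the default is 10).
    conf.setInt("mapred.map.multithreadedrunner.threads", 8);

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}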
