> > How much RAM do you have?
> >
> > A good rule of thumb is to use 1-1.5G for maps and 2G per reduce
> > (vmem). Ensure your OS has at least 2G of memory.
> >
> > Thus, with 24G and dual quad cores you should be at 8-10m/2r. Scale up
> > if you have more memory.
> 
> Would you say RAM was the main factor? We currently have 1G heap per
> mapper.
> We had heard multiples of 1 disk / 2 core / 4G were good with slightly
> more slots for (mappers + reducers) than cores. Would you agree?
> Can you speak to how we should use hyperthreading? Can I treat the
> hardware threads as separate cores? (I know in virtualisation the
> recommendation is to disable it, but for some other workloads you get
> a 2x performance improvement)
> 
> 
> Tom

Tom,

I can't speak for other virtualization vendors, but VMware does not recommend
disabling HT. Do you have a source that says otherwise (so we can fix it)?
The benefit from HT on vSphere is much the same as what you get on a native
OS. I've never seen a workload on any platform get 2X from HT, but I've seen
as high as 1.5X. I'm getting very good results running about one task per
logical processor (2 per core). Recent virtualized Hadoop performance results
are here:
http://www.vmware.com/resources/techresources/10222

Jeff
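
For anyone wanting to sanity-check a slot layout against the numbers in this
thread, here is a rough sketch of the arithmetic. It assumes the figures quoted
above (1-1.5G per map, 2G per reduce, 2G kept for the OS) and two logical
processors per core with HT on. The function names are made up for
illustration and are not part of Hadoop or vSphere, and the sketch ignores
memory used by the DataNode and TaskTracker daemons.

```python
# Rough slot-sizing arithmetic; all names here are hypothetical helpers.

def memory_needed_gb(maps, reduces, map_gb=1.5, reduce_gb=2.0, os_gb=2.0):
    """Total RAM (in GB) a slot layout would claim, including the OS reserve."""
    return maps * map_gb + reduces * reduce_gb + os_gb


def fits(total_gb, maps, reduces, **sizes):
    """True if the slot layout stays within the node's physical RAM."""
    return memory_needed_gb(maps, reduces, **sizes) <= total_gb


def logical_processors(sockets, cores_per_socket, hyperthreading=True):
    """Logical processors on the node, assuming 2 hardware threads per core with HT."""
    threads_per_core = 2 if hyperthreading else 1
    return sockets * cores_per_socket * threads_per_core


if __name__ == "__main__":
    # 24G node with dual quad cores, 10 map slots and 2 reduce slots.
    print(memory_needed_gb(10, 2))   # RAM claimed by 10m/2r at 1.5G per map
    print(fits(24, 10, 2))           # does that layout fit in 24G?
    print(logical_processors(2, 4))  # upper bound if running one task per logical processor
```

On a 24G dual quad-core box, 10m/2r at 1.5G per map comes to 21G, so it fits.
Going to one task per logical processor (16 tasks, say 14m/2r) would need 27G
at 1.5G per map, so you would have to drop to about 1G per map or add memory.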
