Jeff:

Thanks for the corroboration and advice. I can't retreat to 0.20, and
must forge ahead with 2.0, so I'll share any progress.

Arun:

I haven't set the minimum container size. Do you know the default? Is
there an easy way to find defaults (more complete/reliable than the
docs)?

Thanks, I'll give that a try. How does the minimum container size
relate to settings like mapreduce.map.memory.mb? Would it essentially
raise my 768M map memory allocation to 1G? Does the CapacityScheduler
need any additional configuration to perform well? BTW, what is the
default scheduler?
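
For reference, here's roughly what I'm planning to try to bump the
minimum container size and switch to the CapacityScheduler, going by
my reading of the docs (please correct me if the property names or
behavior are off):

yarn-site.xml:
yarn.scheduler.minimum-allocation-mb=1024
yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

If I understand correctly, the scheduler rounds each container request
up to a multiple of the minimum allocation, so the 768M map requests
would become 1G containers and the 2304M reduce requests would round
up to 3G; is that right?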

I should have MAPREDUCE-3641. I'm using 0.23.1 with CDH4b2 patches
(and a few Java 7/Ubuntu 12.04 build patches). How does 2.0.0-alpha
compare to 0.23.1?

If there's anything I can do to assist with the issue of spreading out
map tasks, please let me know. Is there a JIRA issue for it (or if
not, should there be)?

Incidentally, my current benchmarking work on x86 is only a training
ground and baseline before moving on to ARM-based systems, which have
4GB RAM and generally fewer, smaller (2.5" form factor) disks per
node. It sounds like the smaller RAM will force better distribution,
but the disk capacity/utilization situation will be more severe.

Thanks,
Trevor

On Tue, May 29, 2012 at 6:21 PM, Arun C Murthy <a...@hortonworks.com> wrote:
> What is the minimum container size? i.e.
> yarn.scheduler.minimum-allocation-mb.
>
> I'd bump it up to at least 1G and use the CapacityScheduler for performance
> tests:
> http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
>
> In case of teragen, the job has no locality at all (since it's just
> generating data from 'random' input-splits) and hence you are getting them
> stuck on fewer nodes since you have so many containers on each node.
>
> The reduces should be better spread if you are using CapacityScheduler and
> have https://issues.apache.org/jira/browse/MAPREDUCE-3641 in your build i.e.
> hadoop-0.23.1 or hadoop-2.0.0-alpha (I'd use the latter).
>
> Also, FYI, currently the CS makes the tradeoff that node-locality is almost
> the same as rack-locality and hence you might see maps not spread out for
> terasort. I'll fix that one soon.
>
> hth,
> Arun
>
> On May 29, 2012, at 2:33 PM, Trevor Robinson wrote:
>
> Hello,
>
> I'm trying to tune terasort on a small cluster (4 identical slave
> nodes w/ 4 disks and 16GB RAM each), but I'm having problems with very
> uneven load.
>
> For teragen, I specify 24 mappers, but for some reason, only 2 nodes
> out of 4 run them all, even though the web UI (for both YARN and HDFS)
> shows all 4 nodes available. Similarly, I specify 16 reducers for
> terasort, but the reducers seem to run on 3 nodes out of 4. Do I have
> something configured wrong, or does the scheduler not attempt to
> spread out the load? In addition to performing sub-optimally, this
> also causes me to run out of disk space for large jobs, since the data
> is not being spread out evenly.
>
> Currently, I'm using these settings (not shown as XML for brevity):
>
> yarn-site.xml:
> yarn.nodemanager.resource.memory-mb=13824
>
> mapred-site.xml:
> mapreduce.map.memory.mb=768
> mapreduce.map.java.opts=-Xmx512M
> mapreduce.reduce.memory.mb=2304
> mapreduce.reduce.java.opts=-Xmx2048M
> mapreduce.task.io.sort.mb=512
>
> In case it's significant, I've scripted the cluster setup and terasort
> jobs, so everything runs back-to-back instantly, except that I poll to
> ensure that HDFS is up and has active data nodes before running
> teragen. I've also tried adding delays, but they didn't seem to have
> any effect, so I don't *think* it's a start-up race issue.
>
> Thanks for any advice,
> Trevor
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
