[ https://issues.apache.org/jira/browse/HBASE-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Robertson resolved HBASE-6111.
----------------------------------
Resolution: Won't Fix
Closing this, as it is an issue in our environment that we're still
investigating, and we are also running a version of the code that differs
considerably from current trunk (e.g. trunk TableInputFormatBase does a
reverseDNS(...), whereas the old one simply uses the host address held in the
InetSocketAddress).
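
To illustrate why that difference could matter for locality, here is a minimal
sketch. It is not the HBase code: the class SplitLocationSketch, its methods,
the port number and the host names are all hypothetical, and it assumes the
scheduler decides locality by comparing the split's location string with the
TaskTracker's host name. Under that assumption, a split that reports a raw IP
never matches a tracker registered under a DNS name.

```java
import java.net.InetSocketAddress;

// Illustrative sketch only; not taken from TableInputFormatBase.
class SplitLocationSketch {

    // "Old" style as described above: use the address string already held
    // in the InetSocketAddress, with no DNS lookup.
    static String fromHostAddress(InetSocketAddress regionServer) {
        return regionServer.getAddress().getHostAddress();
    }

    // Reverse-lookup style: ask the resolver for a name for the IP.
    // getCanonicalHostName() falls back to the textual IP if no name is found.
    static String fromReverseLookup(InetSocketAddress regionServer) {
        return regionServer.getAddress().getCanonicalHostName();
    }

    // Assumed (simplified) locality check: plain host-string comparison.
    static boolean isLocal(String splitLocation, String trackerHost) {
        return splitLocation.equalsIgnoreCase(trackerHost);
    }

    public static void main(String[] args) {
        // Loopback address and an arbitrary port, purely for demonstration.
        InetSocketAddress rs = new InetSocketAddress("127.0.0.1", 60020);
        String tracker = "node1.example.org"; // hypothetical TaskTracker name

        String byAddress = fromHostAddress(rs);
        String byLookup = fromReverseLookup(rs);

        System.out.println("host address   : " + byAddress
                + " local=" + isLocal(byAddress, tracker));
        System.out.println("reverse lookup : " + byLookup
                + " local=" + isLocal(byLookup, tracker));
    }
}
```

Whichever form is used, the point of the sketch is only that the split
location and the tracker name have to be produced the same way for the
comparison to succeed.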
> CLONE - Map tasks not local to RS
> ---------------------------------
>
> Key: HBASE-6111
> URL: https://issues.apache.org/jira/browse/HBASE-6111
> Project: HBase
> Issue Type: Bug
> Components: mapred, master, regionserver
> Affects Versions: 0.20.2, 0.90.4
> Environment: DN, TT and RS running on the same nodes, all using CDH3.
> Ganglia monitoring everything.
> Reporter: Tim Robertson
>
> I have started seeing this issue in our environment. HBASE-1672 was closed
> as not reproducible, so I cloned it here.
> I have a 367M-record table, compressed with Snappy. A vanilla MR scan with
> no filters spawns 441 mappers. The cluster currently has 216 mapper slots,
> and the mappers in the first wave are all reported as 100% data-local. As
> the second wave of mappers comes up, they are not run local to the RS and
> data locality drops.
> This kills our environment, as it saturates the network at 120M, which is
> very clear on Ganglia.
> I am really happy to help diagnose this, but need some guidance on what to
> do. I don't know enough yet about how task assignment works in MR to
> determine why the machines are picking up random tasks for their second
> task rather than one for the local RS.
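
For reference, the wave arithmetic behind the 441 mappers and 216 slots quoted
above can be sketched as follows. This is an idealized model, assuming every
slot runs exactly one task per wave and all tasks in a wave finish together;
real tasks finish at different times, so waves overlap. The class
MapperWaveSketch and its methods are hypothetical names for illustration.

```java
// Illustrative sketch only: idealized wave arithmetic, not scheduler code.
class MapperWaveSketch {

    // Number of waves needed if each slot runs one task per wave
    // (ceiling of tasks / slots).
    static int waves(int tasks, int slots) {
        return (tasks + slots - 1) / slots;
    }

    // Number of tasks that run in the given 1-based wave.
    static int tasksInWave(int tasks, int slots, int wave) {
        int alreadyStarted = (wave - 1) * slots;
        return Math.max(0, Math.min(slots, tasks - alreadyStarted));
    }

    public static void main(String[] args) {
        int tasks = 441; // mappers reported in the issue
        int slots = 216; // mapper slots reported in the issue

        int n = waves(tasks, slots);
        for (int w = 1; w <= n; w++) {
            System.out.println("wave " + w + ": "
                    + tasksInWave(tasks, slots, w) + " tasks");
        }
    }
}
```

Under this model the job needs three waves of 216, 216 and 9 tasks, so more
than half of the mappers start after the first wave.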