[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5259:
-------------------------------

    Attachment: D1413.1.patch

Liyin requested code review of "[jira][HBASE-5259] Normalize the RegionLocation 
in TableInputFormat by the reverse DNS lookup.".
Reviewers: Kannan, Karthik, mbautin

  Assuming HBase and MapReduce run in the same cluster, TableInputFormat 
overrides the split function, which divides the regions of a table into a 
series of mapper tasks, so that each mapper task processes one region or part 
of a region. Ideally, a mapper task should run on the same machine as the 
region server hosting the corresponding region. That is why TableInputFormat 
sets the RegionLocation: so that the MapReduce framework can respect node 
locality.

  The code simply sets the host name of the region server as the 
HRegionLocation. However, the region server's host name may be in a different 
format from the host name of the task tracker (mapper task). The task tracker 
always gets its host name by reverse DNS lookup, and the DNS service may 
return a different host name format. For example, the region server's host 
name is correctly set as a.b.c.d while the reverse DNS lookup may return 
a.b.c.d. (with an additional dot at the end).

  So the solution is to set the RegionLocation by reverse DNS lookup as well. 
Whatever host name format the DNS system uses, TableInputFormat is responsible 
for keeping its host name format consistent with the MapReduce framework.
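
  As a rough illustration of the idea (not the code in D1413), the sketch 
below normalizes a region server host name through a reverse lookup and caches 
the result so each host is resolved only once. The class name 
HostNameNormalizer, its methods, and the caching are assumptions made for this 
example. It uses the JDK's InetAddress.getCanonicalHostName(), which may not 
return exactly the same string as the lookup the task tracker performs.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

/** Hypothetical helper for illustration; not part of HBase or of D1413. */
final class HostNameNormalizer {
    // Cache of reported host name -> normalized host name.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Pluggable lookup so the DNS call can be replaced in tests.
    private final UnaryOperator<String> resolver;

    HostNameNormalizer(UnaryOperator<String> resolver) {
        this.resolver = resolver;
    }

    /** Builds a normalizer backed by the JDK's name service. */
    static HostNameNormalizer usingSystemDns() {
        return new HostNameNormalizer(HostNameNormalizer::reverseLookup);
    }

    /** Returns the normalized name, resolving each distinct host at most once. */
    String normalize(String regionServerHost) {
        return cache.computeIfAbsent(regionServerHost, resolver);
    }

    private static String reverseLookup(String host) {
        try {
            // Forward lookup to get the address, then a reverse lookup of it.
            InetAddress address = InetAddress.getByName(host);
            return address.getCanonicalHostName();
        } catch (UnknownHostException e) {
            // Assumption: fall back to the name the region server reported.
            return host;
        }
    }
}
```

  A split would then carry normalizer.normalize(regionServerHost) as its 
location instead of the raw region server host name.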

TEST PLAN
  Run all the unit tests.

REVISION DETAIL
  https://reviews.facebook.net/D1413

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java

> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5259
>                 URL: https://issues.apache.org/jira/browse/HBASE-5259
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
> D1413.1.patch
>
>
> Assuming HBase and MapReduce run in the same cluster, TableInputFormat 
> overrides the split function, which divides the regions of a table into a 
> series of mapper tasks, so that each mapper task processes one region or 
> part of a region. Ideally, a mapper task should run on the same machine as 
> the region server hosting the corresponding region. That is why 
> TableInputFormat sets the RegionLocation: so that the MapReduce framework 
> can respect node locality. 
> The code simply sets the host name of the region server as the 
> HRegionLocation. However, the region server's host name may be in a 
> different format from the host name of the task tracker (mapper task). The 
> task tracker always gets its host name by reverse DNS lookup, and the DNS 
> service may return a different host name format. For example, the region 
> server's host name is correctly set as a.b.c.d while the reverse DNS lookup 
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as well. 
> Whatever host name format the DNS system uses, TableInputFormat is 
> responsible for keeping its host name format consistent with the MapReduce 
> framework.


        
