[ 
https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563847#action_12563847
 ] 

dhruba borthakur commented on HADOOP-1985:
------------------------------------------

A few comments related to HDFS changes:

0. This implementation could add delay to the Namenode startup time.
Reducing the namenode startup time has been a key concern recently.
Maybe the ResolutionThread can invoke the resolution script with a batch of
datanodes (rather than one datanode at a time).
1. DatanodeProtocol.java: typo "registraction" instead of registration
2. The ResolutionThread should be renamed "ResolutionMonitor" for
consistency with the other static threads in FSNamesystem. The thread should
also be created in FSNamesystem.initialize(), for consistency with the
remainder of the code. The other threads (e.g. ReplicationMonitor) are created
as daemons, so this new one should be too.
3. Typo in FSNamesystem: "NSToSwitchMapping reolution Thread"
4. If my memory serves me right, the current implementation depends on the fact
that all the datanodes that are in FSNamesystem.datanodeMap *should* exist in
FSNamesystem.host2DataNodeMap as well as in FSNamesystem.clusterMap. If we want
to keep this invariant, then FSNamesystem.registerDatanode should insert the
datanode in clusterMap (with a default rack setting). It should also insert the
datanode in the queue for the ResolutionThread. When the ResolutionThread
operates on a datanode, it updates the networkLocation field of the datanode.
This means that if a datanode is used before the resolution thread gets to it,
it might return a default location, but this is acceptable. If we adopt this
approach, then the resolution could be done much more lazily, in fact, even
towards the end of the SafeMode period, thereby reducing namenode restart times.
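
To make points 0, 2 and 4 concrete, here is a rough sketch of the idea rather
than code from the patch. It assumes a hypothetical class LazyRackResolver, a
placeholder DEFAULT_RACK value, and a caller-supplied batch resolver standing
in for the resolution script; none of these names come from FSNamesystem. It
uses current Java syntax for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical sketch; class, field and method names are illustrative only. */
class LazyRackResolver {
    // Placeholder default location (an assumption, not taken from the patch).
    static final String DEFAULT_RACK = "/default-rack";

    // host -> network location; stands in for the datanode's networkLocation.
    private final Map<String, String> locations = new ConcurrentHashMap<>();
    // Hosts registered but not yet resolved.
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    // Maps a batch of hosts to their racks, in the same order (assumed contract).
    private final Function<List<String>, List<String>> batchResolver;

    LazyRackResolver(Function<List<String>, List<String>> batchResolver) {
        this.batchResolver = batchResolver;
    }

    /** Registration never blocks: default rack now, real rack later. */
    void register(String host) {
        locations.put(host, DEFAULT_RACK);
        pending.add(host);
    }

    String locationOf(String host) {
        return locations.get(host);
    }

    /** Resolves up to maxBatch queued hosts with one resolver call. */
    int resolvePending(int maxBatch) {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch, maxBatch);
        if (batch.isEmpty()) {
            return 0;
        }
        List<String> racks = batchResolver.apply(batch);
        for (int i = 0; i < batch.size(); i++) {
            locations.put(batch.get(i), racks.get(i));
        }
        return batch.size();
    }

    /** Starts the background monitor as a daemon thread. */
    Thread startMonitor(int maxBatch, long idleSleepMillis) {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    if (resolvePending(maxBatch) == 0) {
                        Thread.sleep(idleSleepMillis);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "ResolutionMonitor");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

In this sketch a datanode read before the monitor reaches it simply reports
the default rack, and the monitor could be started late (for example near the
end of SafeMode) without holding up registration.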


> Abstract node to switch mapping into a topology service class used by 
> namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs, mapred
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.17.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch, 
> 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch, 
> 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch 
> location in both the namenode and job tracker.  Currently the namenode asks 
> the data nodes for this info and they run a local script to answer this 
> question.  In our environment and others that I know of there is no reason to 
> push this to each node.  It is easier to maintain a centralized script that 
> maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch 
> mappings and invokes a loadable class or a configurable system call to 
> resolve unknown DNS to switch mappings.  We can then add this to the namenode 
> to support the current block to switch mapping needs and simplify the data 
> nodes.  We can also add this same callout to the job tracker and then 
> implement rack locality logic there without needing to change the filesystem 
> API or the split planning API.
> Not only is this the least intrusive path to building rack-local MR I can
> identify, it is also forward compatible with future infrastructures that may
> derive topology on the fly, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
