[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3934:
---------------------------------------

    Attachment: HDFS-3934.010.patch

bq. It'd be nice to continue to use the HostsFileReader and post-process the 
result. Otherwise it's a consistency/maintenance burden to copy-n-paste any new 
parsing functionality.

OK, I'll use the {{HostsFileReader}} parsing code.

bq. Why does the reader need to instantiate dummy DatanodeID?

You're right.  Re-using {{DatanodeID}} for this purpose doesn't really make 
sense.  I created a new type called {{HostFileManager#Entry}} to represent host 
file entries.

bq. It appears to be for repeatedly making the somewhat fragile assumption that 
xferAddr is ipAddr+port? If that relationship changes, we've got a problem...

Fixed to use {{getIpAddr() + ":" + getXferPort()}} in all cases.
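
As a rough illustration of the idea (not the code in the patch), the sketch below builds the transfer address in one accessor instead of assuming the relationship at each call site. {{NodeAddrSketch}} and {{getXferAddr}} are hypothetical names used only for this example; it is not the real {{DatanodeID}} class.

```java
// Hypothetical stand-in for a datanode identity; this is NOT the real DatanodeID class.
final class NodeAddrSketch {
    private final String ipAddr;
    private final int xferPort;

    NodeAddrSketch(String ipAddr, int xferPort) {
        this.ipAddr = ipAddr;
        this.xferPort = xferPort;
    }

    String getIpAddr() { return ipAddr; }

    int getXferPort() { return xferPort; }

    // The only place that knows how a transfer address is spelled.
    // Callers never concatenate the pieces themselves.
    String getXferAddr() {
        return getIpAddr() + ":" + getXferPort();
    }

    public static void main(String[] args) {
        System.out.println(new NodeAddrSketch("172.29.97.216", 50010).getXferAddr());
    }
}
```

If the relationship between the IP address and the transfer address ever changes, only the one accessor would need to be updated.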

bq. Patch appears to have dropped support for the node's registration name. 
Eli Collins wanted me to maintain that feature in HDFS-3990. If we need to keep 
it, doing a lookup and a canonical lookup (can trigger another dns lookup) 
isn't compatible with supporting the reg name.

Thanks for pointing this out.  I talked to Eli and he explained the distinction 
between registration names and hostnames to me.  I added back support for 
"registration names" and added a unit test to ensure this works properly.

bq. Doing a lookup followed by getCanonicalName is a bad idea. It does 2 more 
lookups: hostname -> PTR -> A so it can resolve CNAMES to IP to hostname. With 
this change I think it will cause 3 lookups per host.

One key feature of this change is that all the lookups happen when the include 
and exclude files are read.  *No* lookups happen during 
{{DatanodeManager#getDatanodeListForReport}}, or any of the other cases where 
we check the host file entries.
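
A minimal sketch of this "resolve once at load time" pattern follows. It is an illustration under assumptions, not the patch itself: {{HostsFileSketch}}, its nested {{Entry}}, {{load}}, and {{contains}} are hypothetical names, and matching on either the resolved IP or the name as written is a simplification of how registration names are handled.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and structure are assumptions, not the patch's code.
final class HostsFileSketch {
    // One hosts-file line, resolved exactly once when the file is read.
    static final class Entry {
        final String name;   // the name exactly as the user wrote it
        final String ipAddr; // resolved at load time; empty if resolution failed

        Entry(String name, String ipAddr) {
            this.name = name;
            this.ipAddr = ipAddr;
        }
    }

    // All DNS lookups happen here, when the include/exclude file is (re)read.
    static List<Entry> load(List<String> names) {
        List<Entry> entries = new ArrayList<>();
        for (String name : names) {
            String ip;
            try {
                ip = InetAddress.getByName(name).getHostAddress();
            } catch (UnknownHostException e) {
                ip = ""; // keep the entry; it can still match by name
            }
            entries.add(new Entry(name, ip));
        }
        return entries;
    }

    // Pure in-memory check: no lookups on the query path.
    static boolean contains(List<Entry> entries, String ipAddr, String regName) {
        for (Entry e : entries) {
            boolean ipMatch = !e.ipAddr.isEmpty() && e.ipAddr.equals(ipAddr);
            if (ipMatch || e.name.equals(regName)) {
                return true;
            }
        }
        return false;
    }
}
```

Because each entry carries its resolved address, an IP in one file and a hostname in another can be compared without any further resolution.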

On the advice of Eli, I removed the call to {{getCanonicalName}}.  We can just 
use the name the user specified in the hosts file; that should be fine.

bq. Question about "// If no transfer port was specified, we take a guess". Why 
needed, and what are the ramifications for getting this wrong? Just a display 
issue?

We just don't have the information.  If the datanode is dead, we only know what 
the entry says in the hosts file(s).  If the entries don't have the port, we 
have to guess.  I don't see any way around this.  It might be more elegant if 
the web UI could understand the concept of "port is unknown," but adding that 
seems out of scope.
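
To make the guess concrete, here is a hedged sketch of parsing a {{host[:port]}} entry with a fallback port. {{HostPortSketch}} and its methods are hypothetical, the fallback is passed in by the caller rather than taken from any real configuration key, and IPv6 literals are not handled.

```java
// Hypothetical sketch of "host[:port]" parsing; not the patch's parser.
final class HostPortSketch {
    // Returns the host part of an entry such as "dn1.example.com:50010".
    static String hostOf(String entry) {
        int colon = entry.lastIndexOf(':');
        return colon < 0 ? entry : entry.substring(0, colon);
    }

    // Returns the port if the entry names one, otherwise the caller's guess.
    // Throws NumberFormatException if the text after ':' is not a number.
    static int portOf(String entry, int guessedPort) {
        int colon = entry.lastIndexOf(':');
        if (colon < 0) {
            return guessedPort; // no port in the hosts file: we can only guess
        }
        return Integer.parseInt(entry.substring(colon + 1));
    }

    public static void main(String[] args) {
        int guess = 50010; // assumed fallback for this example only
        System.out.println(hostOf("172.29.97.216") + ":" + portOf("172.29.97.216", guess));
        System.out.println(hostOf("dn1.example.com:1234") + ":" + portOf("dn1.example.com:1234", guess));
    }
}
```

If the guess is wrong for a dead node, the displayed port is wrong, which is the limitation described above.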

In addition to the unit tests, I did some manual testing on this and verified 
that it got rid of the double-counting of nodes in the web UI for me.
                
> duplicative dfs_hosts entries handled wrong
> -------------------------------------------
>
>                 Key: HDFS-3934
>                 URL: https://issues.apache.org/jira/browse/HDFS-3934
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.1-alpha
>            Reporter: Andy Isaacson
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
> HDFS-3934.003.patch, HDFS-3934.004.patch, HDFS-3934.005.patch, 
> HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch, 
> HDFS-3934.010.patch
>
>
> A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
> hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
> after the NN restarts because {{getDatanodeListForReport}} does not handle 
> such a "pseudo-duplicate" correctly:
> # the "Remove any nodes we know about from the map" loop no longer has the 
> knowledge to remove the spurious entries
> # the "The remaining nodes are ones that are referenced by the hosts files" 
> loop does not do hostname lookups, so does not know that the IP and hostname 
> refer to the same host.
> Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
> the JSP output:  The *Node* column shows ":50010" as the nodename, with HTML 
> markup {{<a 
> href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&amp;dir=%2F&amp;nnaddr=172.29.97.196:8020"
>  title="172.29.97.216:50010">:50010</a>}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
