[jira] [Commented] (HDFS-6808) Add command line option to ask DataNode reload configuration.

Colin Patrick McCabe (JIRA) Thu, 18 Sep 2014 19:54:52 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139909#comment-14139909
 ]


Colin Patrick McCabe commented on HDFS-6808:
--------------------------------------------

bq. i.e. -reconfig datanode start would trigger a reconfig on all datanodes

The client doesn't have the list of all datanodes.  I suppose it could ask the 
NameNode, but the NameNode's list might be incomplete as well.  And what 
happens if we can't contact some of them?  This is too complex.

bq. (we can figure out the IPC port of the host based upon either DN registry 
or hdfs-site.xml, right? ) in the case of two on the same host, we'd fail.

How about assuming the default DataNode IPC port if the port is left off?  It's 
simpler, works for most cases, and doesn't require an RPC to the NameNode.  
Plus it avoids a lot of confusion when a single host has multiple DataNodes.  
The admin may not even be aware that this is happening (I've seen cases like 
this).  Perhaps the datanode that was "supposed" to be there died and now 
there's another one someone else started.  The client reading its XML file is 
even worse.  There's absolutely no guarantee that its XML file is the same as 
the one that the datanode is using.  Anything we can do to reduce confusion is 
good, and requiring port numbers for non-default ports is one of those things.
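
A minimal sketch of the default-port fallback described above, not the code 
from the patch.  The class name {{DataNodeTarget}}, the {{parse}} helper, and 
the constant are hypothetical, and 50020 is assumed to be the stock 
{{dfs.datanode.ipc.address}} port for the 2.x line.  IPv6 literals are not 
handled.

```java
import java.net.InetSocketAddress;

public class DataNodeTarget {
    // Assumption: 50020 is the default DataNode IPC port in Hadoop 2.x.
    static final int DEFAULT_DN_IPC_PORT = 50020;

    /**
     * Parses "host" or "host:port". A missing port falls back to the
     * default, so no NameNode RPC and no client-side XML lookup is needed.
     */
    static InetSocketAddress parse(String target) {
        if (target == null || target.isEmpty()) {
            throw new IllegalArgumentException("empty DataNode address");
        }
        int colon = target.lastIndexOf(':');
        if (colon < 0) {
            // Port left off: assume the default IPC port.
            return InetSocketAddress.createUnresolved(target, DEFAULT_DN_IPC_PORT);
        }
        String host = target.substring(0, colon);
        String portText = target.substring(colon + 1);
        if (host.isEmpty()) {
            throw new IllegalArgumentException("missing host in: " + target);
        }
        int port;
        try {
            port = Integer.parseInt(portText);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("bad port in: " + target, e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range in: " + target);
        }
        // Non-default ports must be spelled out explicitly by the admin.
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        for (String t : new String[] {"dn1.example.com", "dn1.example.com:50021"}) {
            InetSocketAddress a = parse(t);
            System.out.println(a.getHostString() + ":" + a.getPort());
        }
    }
}
```

With this shape, a second DataNode on the same host is only reachable when 
the admin names its port, which keeps the ambiguity visible instead of 
guessing.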

bq. This leaves the door open for: \-reconfig namenode host start

Right.  We should support NN reconfig eventually.

bq. Here's a fun experiment: go down to your support folks and ask them if 
there is a difference between 'refresh' and 'reconfig' with no prompting. 
What's the first thing they say? It'd be interesting to hear the results.

I think you are underestimating our support folks.  They're pretty aware of the 
various kinds of refresh operations that we support and what problems each 
command had / still has, sometimes more so than I am.  Reconfig is a new 
operation.  It is its own thing, not related to anything else.  That's why I 
asked Eddy to rename references to "decommissioning drives" in the original 
patch.  This isn't a decom operation, nor is it a refresh operation.

> Add command line option to ask DataNode reload configuration.
> -------------------------------------------------------------
>
>                 Key: HDFS-6808
>                 URL: https://issues.apache.org/jira/browse/HDFS-6808
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-6808.000.combo.patch, HDFS-6808.000.patch, 
> HDFS-6808.001.combo.patch, HDFS-6808.001.patch, HDFS-6808.002.combo.patch, 
> HDFS-6808.002.patch, HDFS-6808.003.combo.txt, HDFS-6808.003.patch, 
> HDFS-6808.004.combo.patch, HDFS-6808.004.patch, HDFS-6808.005.combo.patch, 
> HDFS-6808.005.patch, HDFS-6808.006.combo.patch, HDFS-6808.006.patch, 
> HDFS-6808.007.combo.patch, HDFS-6808.007.patch, HDFS-6808.008.combo.patch, 
> HDFS-6808.008.patch, HDFS-6808.009.combo.patch, HDFS-6808.009.patch
>
>
> The workflow of dynamically changing data volumes on a DataNode is:
> # Users manually change {{dfs.datanode.data.dir}} in the configuration file.
> # Users use the command line to notify the DN to reload its configuration 
> and update its volumes. 
> This work adds command line support to notify DN to reload configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
