[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250854#comment-14250854 ]
Colin Patrick McCabe commented on HDFS-7411:
--------------------------------------------

* Can you rebase this on trunk? ChunkedArrayList has moved, and this caused a patch application failure.
* I would really prefer that size stay an O(1) operation for ChunkedArrayList. We should be able to do this by hooking into the iterator's remove() method, creating a custom iterator if needed. If that's too complex to do in this jira, then let's at least file a follow-on.

{code}
<property>
  <name>dfs.namenode.decommission.blocks.per.node</name>
  <value>400000</value>
  <description>The approximate number of blocks per node. This affects the
  number of blocks processed per decommission interval, as defined in
  dfs.namenode.decommission.interval. This is multiplied by
  dfs.namenode.decommission.nodes.per.interval to define the actual
  processing rate.</description>
</property>
{code}

* Why do we need this parameter? The NameNode already tracks how many blocks each DataNode has in each storage. That information is in DatanodeStorageInfo#size.

{code}
<property>
  <name>dfs.namenode.decommission.max.concurrent.tracked.nodes</name>
  <value>100</value>
  <description>
    The maximum number of decommission-in-progress datanodes that will be
    tracked at one time by the namenode. Tracking a decommission-in-progress
    datanode consumes additional NN memory proportional to the number of
    blocks on the datanode. Having a conservative limit reduces the potential
    impact of decommissioning a large number of nodes at once.
  </description>
</property>
{code}

* Should this be called something like dfs.namenode.decommission.max.concurrent.nodes? I'm confused by the mention of "tracking" here. It seems to imply that if this is set too low, more nodes than the limit could still be decommissioned, but we'd stop tracking the decommissioning?
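To illustrate the iterator-hook idea for keeping size O(1): here's a minimal standalone sketch, not the actual Hadoop ChunkedArrayList (the class name, fixed chunk capacity, and method set are all illustrative). The point is that a wrapping iterator's remove() can decrement a cached count, so size() never has to walk the chunks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Illustrative stand-in for a chunked list; NOT the real ChunkedArrayList.
class CountedChunkedList<T> implements Iterable<T> {
  private static final int CHUNK_CAPACITY = 4;  // small, for demonstration
  private final List<List<T>> chunks = new ArrayList<>();
  private int size = 0;  // cached element count, maintained incrementally

  public void add(T item) {
    if (chunks.isEmpty()
        || chunks.get(chunks.size() - 1).size() >= CHUNK_CAPACITY) {
      chunks.add(new ArrayList<T>());
    }
    chunks.get(chunks.size() - 1).add(item);
    size++;
  }

  public int size() {
    return size;  // O(1): no walk over the chunks
  }

  @Override
  public Iterator<T> iterator() {
    final Iterator<List<T>> chunkIt = chunks.iterator();
    return new Iterator<T>() {
      private Iterator<T> cur = Collections.emptyIterator();

      @Override
      public boolean hasNext() {
        while (!cur.hasNext() && chunkIt.hasNext()) {
          cur = chunkIt.next().iterator();
        }
        return cur.hasNext();
      }

      @Override
      public T next() {
        hasNext();  // advance to the next non-empty chunk if needed
        return cur.next();
      }

      @Override
      public void remove() {
        cur.remove();
        size--;  // the hook: removal via the iterator keeps size() O(1)
      }
    };
  }
}
```

Callers that prune during iteration (as the decommission scan does) then pay nothing extra for size().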
{code}
- static final Log LOG = LogFactory.getLog(BlockManager.class);
+ static final Logger LOG = LoggerFactory.getLogger(BlockManager.class);
{code}

If you're going to change this, you need to change all the unit tests that are changing the log level, so that they use the correct function to do so.

{code}
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java: ((Log4JLogger)LogFactory.getLog(BlockManager.class)).getLogger().setLevel(Level.ALL);
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java: ((Log4JLogger)LogFactory.getLog(BlockManager.class)).getLogger().setLevel(Level.ALL);
and many more...
{code}

I can see that you fixed it in TestPendingInvalidateBlock.java, but there are a lot more locations. You probably need something like the GenericTestUtils#disableLog function I created in the HDFS-7430 patch. I guess we could split that off into a separate patch if it's important enough. Or perhaps we could just put off changing this until a follow-on JIRA?

{code}
if (node.isAlive) {
  return true;
} else {
  ... long block ...
}
{code}

We can reduce the indentation by getting rid of the else block here. Similarly with the other nested 'else'.

{code}
- LOG.fatal("ReplicationMonitor thread received Runtime exception. ", t);
+ LOG.error("ReplicationMonitor thread received Runtime exception. ",
+     t);
{code}

What's the rationale for changing the log level here?

{code}
 /**
- * Decommission the node if it is in exclude list.
+ * Decommission the node if it is in the host exclude list.
+ *
+ * @param nodeReg datanode
  */
-private void checkDecommissioning(DatanodeDescriptor nodeReg) {
+void checkDecommissioning(DatanodeDescriptor nodeReg) {
{code}

I realize this isn't introduced by this patch, but this function seems misleadingly named. Perhaps it should be named something like "startDecommissioningIfExcluded"? It's definitely not just a "check."
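On the log-level point above: I don't have the HDFS-7430 helper's exact signature in front of me, so here's a hypothetical java.util.logging analogue just to show the shape of the idea, which is to put the level change behind one helper instead of repeating the Log4JLogger cast in every test (the class name and method names below are assumptions, not the real GenericTestUtils API).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical analogue of a GenericTestUtils#disableLog-style helper; the
// real Hadoop helper targets commons-logging/log4j, not java.util.logging.
final class TestLogLevels {
  private TestLogLevels() {}

  // Silence a logger for the duration of a test.
  static void disableLog(Logger logger) {
    logger.setLevel(Level.OFF);
  }

  // The pattern the scattered test code repeats today, behind one call.
  static void enableAll(Logger logger) {
    logger.setLevel(Level.ALL);
  }
}
```

Each test site then becomes a one-liner, and switching logging frameworks later only touches the helper.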
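And to be concrete about the earlier else-block comment: this before/after pair shows the un-indenting I mean (the method names and the boolean placeholder body are illustrative, not the actual BlockManager code).

```java
// Illustrative only: stand-in for the isAlive check discussed above.
class EarlyReturnExample {
  static boolean withElse(boolean alive, int pendingBlocks) {
    if (alive) {
      return true;
    } else {
      // long block, one indentation level deeper than it needs to be
      return pendingBlocks > 0;
    }
  }

  static boolean withoutElse(boolean alive, int pendingBlocks) {
    if (alive) {
      return true;
    }
    // same long block, now at the top level of the method
    return pendingBlocks > 0;
  }
}
```

Behavior is identical; the early return just removes a nesting level from the long branch.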
more comments coming...

> Refactor and improve decommissioning logic into DecommissionManager
> -------------------------------------------------------------------
>
>                 Key: HDFS-7411
>                 URL: https://issues.apache.org/jira/browse/HDFS-7411
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.5.1
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>        Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch
>
>
> Would be nice to split out decommission logic from DatanodeManager to DecommissionManager.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)