[
https://issues.apache.org/jira/browse/HADOOP-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hairong Kuang updated HADOOP-5124:
----------------------------------
Attachment: optimizeInvalidate2.patch
This patch incorporates Konstantin's first comment.
As for (3), I am not clear on how to evaluate the optimization. The goal of the
new strategy is not to improve performance; instead it aims to give fairness to
all nodes when computing invalidation work. The current strategy always favors
the nodes at the beginning of the map, since recentInvalidateSets is a TreeMap
and is therefore sorted. Another flaw is that after the first node is scheduled,
it is reinserted into the map if it still has remaining blocks. Since it becomes
the first node again, the next call of invalidateWorkForOneNode works on the
same node again. The current strategy would work fine if recentInvalidateSets
were a FIFO queue.
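To illustrate the point, here is a minimal, illustrative-only sketch of the two
selection strategies. The names recentInvalidateSets and invalidateWorkForOneNode
follow the issue text, but the String keys, the limit constant, and the method
bodies are assumptions made for the example, not the actual FSNamesystem code or
this patch.

{code:java}
import java.util.*;

// Illustrative-only sketch of the fairness problem described above.
class InvalidateScheduler {
  // A TreeMap keeps keys sorted, so iteration always starts at the same node.
  private final TreeMap<String, Collection<String>> recentInvalidateSets =
      new TreeMap<String, Collection<String>>();
  private final Random rand = new Random();
  private static final int BLOCK_INVALIDATE_LIMIT = 100;

  // Current behavior (simplified): always take the first entry; if blocks
  // remain, the node is re-inserted and, being first again, is picked next
  // time as well.
  List<String> invalidateWorkForOneNode() {
    Map.Entry<String, Collection<String>> first = recentInvalidateSets.pollFirstEntry();
    if (first == null) {
      return Collections.emptyList();
    }
    List<String> work = take(first.getValue(), BLOCK_INVALIDATE_LIMIT);
    if (!first.getValue().isEmpty()) {
      recentInvalidateSets.put(first.getKey(), first.getValue()); // first again
    }
    return work;
  }

  // Fairer alternative along the lines of the new strategy: pick a node at
  // random so every node with pending invalidations gets a chance.
  List<String> invalidateWorkForRandomNode() {
    if (recentInvalidateSets.isEmpty()) {
      return Collections.emptyList();
    }
    List<String> nodes = new ArrayList<String>(recentInvalidateSets.keySet());
    String node = nodes.get(rand.nextInt(nodes.size()));
    Collection<String> blocks = recentInvalidateSets.get(node);
    List<String> work = take(blocks, BLOCK_INVALIDATE_LIMIT);
    if (blocks.isEmpty()) {
      recentInvalidateSets.remove(node);
    }
    return work;
  }

  // Move up to 'limit' blocks out of the pending set.
  private static List<String> take(Collection<String> blocks, int limit) {
    List<String> out = new ArrayList<String>();
    for (Iterator<String> it = blocks.iterator(); it.hasNext() && out.size() < limit; ) {
      out.add(it.next());
      it.remove();
    }
    return out;
  }
}
{code}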
> A few optimizations to FsNamesystem#RecentInvalidateSets
> --------------------------------------------------------
>
> Key: HADOOP-5124
> URL: https://issues.apache.org/jira/browse/HADOOP-5124
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.21.0
>
> Attachments: optimizeInvalidate.patch, optimizeInvalidate1.patch,
> optimizeInvalidate2.patch
>
>
> This jira proposes a few optimizations to FsNamesystem#RecentInvalidateSets:
> 1. When removing all replicas of a block, it does not traverse all nodes in
> the map. Instead it traverses only the nodes on which the block is located
> (see the sketch below).
> 2. When dispatching blocks to datanodes in ReplicationMonitor, it randomly
> chooses a predefined number of datanodes and dispatches blocks to those
> datanodes. This strategy provides fairness to all datanodes, whereas the
> current strategy always starts from the first datanode.
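For optimization (1) in the quoted description, the intent can be sketched as
follows. This is an illustrative outline only: the String block and storage IDs
and the blockLocations map (standing in for the namenode's real blocks map) are
assumptions for the example, not the actual patch.

{code:java}
import java.util.*;

// Illustrative-only sketch of optimization (1): when every replica of a block
// is being invalidated, look up the block's locations and remove it from just
// those per-datanode sets instead of scanning every node in the map.
class RecentInvalidateSets {
  // storage ID -> blocks pending invalidation on that datanode
  private final Map<String, Set<String>> invalidateSets =
      new TreeMap<String, Set<String>>();
  // block -> storage IDs of the datanodes holding a replica (assumed lookup)
  private final Map<String, List<String>> blockLocations =
      new HashMap<String, List<String>>();

  // Old approach: traverse all nodes in the map looking for the block.
  void removeFromAllNodes(String block) {
    for (Iterator<Set<String>> it = invalidateSets.values().iterator(); it.hasNext(); ) {
      Set<String> blocks = it.next();
      blocks.remove(block);
      if (blocks.isEmpty()) {
        it.remove();
      }
    }
  }

  // Optimized approach: visit only the nodes where the block is located.
  void removeFromLocations(String block) {
    List<String> locations = blockLocations.get(block);
    if (locations == null) {
      return;
    }
    for (String storageID : locations) {
      Set<String> blocks = invalidateSets.get(storageID);
      if (blocks != null) {
        blocks.remove(block);
        if (blocks.isEmpty()) {
          invalidateSets.remove(storageID);
        }
      }
    }
  }
}
{code}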
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.