[ https://issues.apache.org/jira/browse/HADOOP-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580526#action_12580526 ]

Konstantin Shvachko commented on HADOOP-2148:
---------------------------------------------

I verified both test failures. In my opinion they are not related to the patch.
In the first case TestDFSStorageStateRecovery took too long: it had finished only
47 of 71 cases in 13 minutes when the JUnit framework terminated it.
In the second case the cluster fell into an infinite loop trying to replicate a
block. Filed HADOOP-3050 to investigate it.

> Inefficient FSDataset.getBlockFile()
> ------------------------------------
>
>                 Key: HADOOP-2148
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2148
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>             Fix For: 0.17.0
>
>         Attachments: getBlockFile.patch, getBlockFile1.patch
>
>
> FSDataset.getBlockFile() first verifies that the block is valid and then
> returns the file name corresponding to the block.
> In doing so it performs the data-node blockMap lookup twice, when only one
> lookup is needed.
> This is important since the data-node blockMap is big.
> Another observation is that data-nodes do not need the blockMap at all: file
> names can be derived from the block IDs, so there is no need to hold the
> block-to-file mapping in memory.
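
A minimal sketch of the two ideas above, assuming a HashMap-backed blockMap
(the class shape and names below are illustrative, not the actual FSDataset
code from the patch; the "blk_<id>" file name mirrors the datanode's on-disk
naming convention):

import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class GetBlockFileSketch {

  static final class Block {
    final long id;
    Block(long id) { this.id = id; }
    @Override public int hashCode() { return (int) (id ^ (id >>> 32)); }
    @Override public boolean equals(Object o) {
      return o instanceof Block && ((Block) o).id == id;
    }
  }

  private final Map<Block, File> blockMap = new HashMap<Block, File>();

  // Before: the validity check and the file lookup each hit the big blockMap.
  File getBlockFileOld(Block b) throws IOException {
    if (!blockMap.containsKey(b)) {            // lookup #1 (validity)
      throw new IOException("Block " + b.id + " is not valid.");
    }
    return blockMap.get(b);                    // lookup #2 (file name)
  }

  // After: a single lookup; a null result means the block is invalid.
  File getBlockFileNew(Block b) throws IOException {
    File f = blockMap.get(b);                  // the only lookup
    if (f == null) {
      throw new IOException("Block " + b.id + " is not valid.");
    }
    return f;
  }

  // Further observation: with a fixed naming scheme the file name can be
  // computed from the block ID alone, so no in-memory map is needed at all.
  static File blockFileFromId(File volumeDir, long blockId) {
    return new File(volumeDir, "blk_" + blockId);
  }
}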

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
