[ https://issues.apache.org/jira/browse/HADOOP-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486537 ]
Raghu Angadi commented on HADOOP-620:
-------------------------------------

Test. Please ignore:
<pre>
 ----
|    |
 ----
</pre>

> replication factor should be calculated based on actual dfs block sizes at
> the NameNode.
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-620
>                 URL: https://issues.apache.org/jira/browse/HADOOP-620
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Raghu Angadi
>         Assigned To: Raghu Angadi
>            Priority: Minor
>
> Currently 'dfs -report' calculates the replication factor as:
> (totalCapacity - totalDiskRemaining) / (total size of dfs files in the namespace).
> The problem with this is that it includes disk space used by non-dfs files
> (e.g. map-reduce jobs) on the datanode. On my single-node test, I get a
> replication factor of 100, since I have a 1 GB dfs file without replication
> and there is 99 GB of unrelated data on the same volume.
> Ideally the namenode should calculate it as: (total size of all the blocks
> known to it) / (total size of files in the namespace).
> The initial proposal for keeping 'total size of all the blocks' updated is to
> track it in the datanode descriptor and update it when the namenode receives
> block reports from the datanode (and subtract it when the datanode is removed).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
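The quoted proposal amounts to a running sum maintained from block reports. Below is a minimal sketch of that accounting under stated assumptions: the class and method names (BlockSizeTracker, onBlockReport, onDatanodeRemoved) are hypothetical and do not reflect the actual DatanodeDescriptor or FSNamesystem API; the real change would live in the namenode's block-report handling.

<pre>
// Hypothetical sketch: track the total size of blocks each datanode
// reports, so the namenode can compute an accurate replication factor.
import java.util.HashMap;
import java.util.Map;

class BlockSizeTracker {
  // Bytes of dfs blocks last reported by each datanode (names illustrative).
  private final Map<String, Long> reportedBytes = new HashMap<String, Long>();
  private long totalBlockBytes = 0;

  // Called when the namenode processes a block report: replace this
  // datanode's previous contribution with the newly reported total.
  synchronized void onBlockReport(String datanodeId, long blockBytesInReport) {
    Long previous = reportedBytes.put(datanodeId, blockBytesInReport);
    totalBlockBytes += blockBytesInReport - (previous == null ? 0 : previous);
  }

  // Called when a datanode is removed: subtract its contribution.
  synchronized void onDatanodeRemoved(String datanodeId) {
    Long previous = reportedBytes.remove(datanodeId);
    if (previous != null) {
      totalBlockBytes -= previous;
    }
  }

  // Replication factor = (bytes of all blocks known to the namenode)
  //                    / (logical bytes of files in the namespace),
  // rather than (capacity - remaining) / namespaceBytes, which wrongly
  // counts non-dfs data sharing the datanode's volumes.
  synchronized double replicationFactor(long namespaceBytes) {
    return namespaceBytes == 0 ? 0.0 : (double) totalBlockBytes / namespaceBytes;
  }
}
</pre>

The design point matches the proposal: each datanode's contribution is replaced wholesale on a block report and subtracted on removal, so the running total never includes unrelated data on the same volume.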