[ https://issues.apache.org/jira/browse/HDFS-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HDFS-684:
----------------------------------

    Status: Open  (was: Patch Available)

I am looking into the hadoop-tools issue. Will comment soon.

Please help me understand this patch a little bit more. Suppose a directory 
/dhruba has 10 files in it. All files initially have a replication factor of 3. 
Then the RaidNode creates an xxx_raid.har that replaces all the individual 
parity files. Now suppose a user deletes the first file in /dhruba, so /dhruba 
has only 9 files. This patch will then delete the har file associated with 
/dhruba. At this point, the 9 remaining files in /dhruba are left with a 
replication factor of only 2! Am I understanding this right?
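
For my own understanding, here is a minimal sketch of the behavior I am 
describing (method names and path layout are illustrative, not the patch's 
actual code):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StaleHarCheck {
  /**
   * Hypothetical: delete the har parity archive for srcDir when any source
   * file it covers has been removed, as in the 10-files-to-9 scenario above.
   */
  public static void deleteHarIfStale(FileSystem fs, FileSystem harFs,
      Path harPath, Path srcDir) throws IOException {
    for (FileStatus parity : harFs.listStatus(harPath)) {
      // Map the parity entry back to its source file, e.g.
      // /raid/dhruba_raid.har/file1 -> /dhruba/file1 (made-up layout).
      Path src = new Path(srcDir, parity.getPath().getName());
      if (!fs.exists(src)) {
        // One missing source file invalidates the whole archive; the nine
        // remaining files stay at replication 2 until the next RaidNode
        // pass regenerates their parity.
        fs.delete(harPath, true);
        return;
      }
    }
  }
}
{code}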

The next iteration of the RaidNode will recreate the parity files (and then 
har them). This is good.
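
For context, once the parity files are merged, reading one back would go 
through the har filesystem, roughly like this (the URI and paths are made up 
for illustration):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadParityViaHar {
  public static void main(String[] args) throws Exception {
    // One archive holds all parity data for the directory, so the namenode
    // tracks a few har part files instead of one parity file per source file.
    Configuration conf = new Configuration();
    Path parity = new Path("har://hdfs-namenode/raid/dhruba_raid.har/file1");
    FileSystem harFs = parity.getFileSystem(conf);
    try (FSDataInputStream in = harFs.open(parity)) {
      // read parity bytes as needed...
    }
  }
}
{code}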

> Use HAR filesystem to merge parity files 
> -----------------------------------------
>
>                 Key: HDFS-684
>                 URL: https://issues.apache.org/jira/browse/HDFS-684
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: contrib/raid
>            Reporter: dhruba borthakur
>            Assignee: Rodrigo Schmidt
>         Attachments: HDFS-684.0.patch, HDFS-684.1.patch
>
>
> The HDFS raid implementation (HDFS-503) creates a parity file for every file 
> that is RAIDed. This puts an additional burden on the memory requirements of 
> the namenode. It would be nice if the parity files were combined together 
> using the HadoopArchive (har) format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
