[ https://issues.apache.org/jira/browse/HDFS-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HDFS-684:
----------------------------------
Status: Open (was: Patch Available)
I am looking into the hadoop-tools issue and will comment soon.
Please help me understand this patch a little better. Suppose a directory
/dhruba has 10 files in it. All files initially have a replication factor of 3.
Then the RaidNode creates an xxx_raid.har that replaces all the parity files.
Now suppose a user deletes the first file in /dhruba, so /dhruba has only 9
files. This patch will now delete the har file associated with /dhruba. At this
point, all 9 remaining files in /dhruba are left with a replication factor of
only 2 and no parity! Am I understanding this right?
The next iteration of the RaidNode will recreate the parity files (and then har
them). That is good.
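
To make the scenario above concrete, here is a small toy model of it. This is
only a sketch under my reading of the patch: the class RaidedDir and its
methods are hypothetical names made up for illustration, not code from
HDFS-684 or from the RaidNode, and the model tracks only whether the har
exists, not real replication or block placement.

```python
# Toy model of the delete-then-re-raid window discussed in this comment.
# All names here are hypothetical; this is not the RaidNode implementation.

class RaidedDir:
    def __init__(self, files):
        self.files = set(files)
        self.har_present = False  # does a parity har exist for this directory?

    def raid(self):
        """One RaidNode iteration: recreate parity for the current files and har it."""
        self.har_present = True

    def delete(self, name):
        """Delete a source file. Assumption: the patch then removes the stale har."""
        self.files.discard(name)
        self.har_present = False

    def unprotected(self):
        """Files that currently have no parity data backing them."""
        return set() if self.har_present else set(self.files)


d = RaidedDir(f"file{i}" for i in range(10))
d.raid()                      # parity created and harred for all 10 files
d.delete("file0")             # user deletes one file; the har goes away
print(len(d.unprotected()))   # 9
d.raid()                      # next RaidNode iteration rebuilds the har
print(len(d.unprotected()))   # 0
```

The window of interest is between delete() and the following raid(): in this
model every remaining file in the directory is without parity during that
time, which is the concern raised above.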
> Use HAR filesystem to merge parity files
> -----------------------------------------
>
> Key: HDFS-684
> URL: https://issues.apache.org/jira/browse/HDFS-684
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: contrib/raid
> Reporter: dhruba borthakur
> Assignee: Rodrigo Schmidt
> Attachments: HDFS-684.0.patch, HDFS-684.1.patch
>
>
> The HDFS raid implementation (HDFS-503) creates a parity file for every file
> that is RAIDed. This puts additional burden on the memory requirements of the
> namenode. It will be nice if the parity files are combined together using
> the HadoopArchive (har) format.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.