[
https://issues.apache.org/jira/browse/HDFS-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780996#action_12780996
]
dhruba borthakur edited comment on HDFS-684 at 11/21/09 3:23 PM:
-----------------------------------------------------------------
Is there a downside to creating a HAR for every directory in the /raid
directory? That would imply one HAR for every data-set, because a single
data-set usually resides in a single directory.
If you instead create one HAR file for all files covered by a certain policy,
and we RAID new files every hour based on that policy, then each of these
iterations will have to delete and recreate the entire HAR, won't it? (I am
assuming that you cannot add or delete items in a previously created HAR file.)
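To make the per-directory option concrete, here is a minimal sketch of building
one HAR per data-set directory by driving the HadoopArchives tool through
ToolRunner. This is not the RaidNode code; the paths /raid/datasetA and
/raid-archives and the archive name are made up for illustration, and the exact
tool arguments differ between Hadoop versions.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.tools.HadoopArchives;
import org.apache.hadoop.util.ToolRunner;

public class ParityHarSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical layout: all parity files for data-set "datasetA" live
    // under /raid/datasetA, and the archive is written to /raid-archives.
    String[] harArgs = {
        "-archiveName", "datasetA_parity.har", // name of the archive
        "-p", "/raid",                         // parent path (flag varies by version)
        "datasetA",                            // source directory: one data-set
        "/raid-archives"                       // destination directory for the HAR
    };
    // HadoopArchives runs a MapReduce job that packs the source files into a
    // few part files plus an index, so the namenode tracks far fewer objects.
    int ret = ToolRunner.run(conf, new HadoopArchives(conf), harArgs);
    System.exit(ret);
  }
}
{code}
Scoping each archive to a single data-set directory at least limits the rebuild
to that data-set when new parity files show up, instead of rewriting a
policy-wide HAR on every hourly iteration.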
> Use HAR filesystem to merge parity files
> -----------------------------------------
>
> Key: HDFS-684
> URL: https://issues.apache.org/jira/browse/HDFS-684
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: contrib/raid
> Reporter: dhruba borthakur
> Assignee: Rodrigo Schmidt
>
> The HDFS raid implementation (HDFS-503) creates a parity file for every file
> that is RAIDed. This puts an additional burden on the namenode's memory
> requirements. It would be nice if the parity files were combined using the
> HadoopArchive (har) format.
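For reference, parity files packed this way would still be readable through the
har:// scheme, which is served by HarFileSystem. A minimal sketch, assuming a
hypothetical archive /raid-archives/datasetA_parity.har and a hypothetical
parity file name inside it:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadParityFromHar {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // "har:///" resolves against the default filesystem; HarFileSystem reads
    // the archive's index and then pulls the bytes out of the HAR part files.
    Path parity = new Path(
        "har:///raid-archives/datasetA_parity.har/datasetA/file1.parity");
    FileSystem fs = parity.getFileSystem(conf);
    FSDataInputStream in = fs.open(parity);
    byte[] buf = new byte[4096];
    int n = in.read(buf); // read a little of the parity data
    System.out.println("read " + n + " bytes through HarFileSystem");
    in.close();
  }
}
{code}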