[jira] Commented: (HADOOP-1700) Append to files in HDFS

dhruba borthakur (JIRA) Thu, 03 Jan 2008 16:05:03 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555728#action_12555728
 ]


dhruba borthakur commented on HADOOP-1700:
------------------------------------------

Hi Ruyue,

I like all of the preceding review comments by stack. I have two basic 
questions.

1.  Suppose a client has opened a file for "append" and is writing to the end 
of the file, and the last block of the file has three replicas. The system may 
fail to write the data to some of those replicas, leaving the replicas with 
different sizes. What is your error recovery policy? Can you please point me 
to the code that handles this inconsistency? I cannot find it in your patch.

2. The DataNode, while finalizing a block, sends the block size to the 
NameNode. In the case of "appends", the size of the block changes with 
every write. Can you please point me to the code that updates the NameNode's 
data structures to reflect the new size after the append is finished? I could 
not find this code in your patch.
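
To make both questions concrete, here is a minimal sketch in Java of one 
possible policy. It is not taken from the patch and is not how HDFS is known 
to behave: it assumes that recovery truncates every surviving replica to the 
shortest length among them, and that this agreed length is then recorded 
against the block on the NameNode side. The class ReplicaLengthSketch, its 
methods, and the blockLengths map are all hypothetical names used only for 
illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class ReplicaLengthSketch {
    // Hypothetical stand-in for the NameNode's record of block id -> length.
    static final Map<Long, Long> blockLengths = new HashMap<>();

    /** Assumed policy: the agreed length is the shortest surviving replica. */
    static long agreedLength(long[] replicaLengths) {
        if (replicaLengths.length == 0) {
            throw new IllegalArgumentException("no surviving replicas");
        }
        return Arrays.stream(replicaLengths).min().getAsLong();
    }

    /** Returns the replica lengths after every replica is cut to the agreed length. */
    static long[] reconcile(long[] replicaLengths) {
        long[] result = new long[replicaLengths.length];
        Arrays.fill(result, agreedLength(replicaLengths));
        return result;
    }

    /** Records the post-append length for a block (question 2). */
    static void reportLength(long blockId, long length) {
        blockLengths.put(blockId, length);
    }

    public static void main(String[] args) {
        // Three replicas of the last block; one write only partially succeeded.
        long[] lengths = {4096L, 3584L, 4096L};
        long agreed = agreedLength(lengths);
        reportLength(1L, agreed);
        System.out.println(agreed); // prints 3584
    }
}
```

Other policies are possible, for example discarding the short replica and 
re-replicating from a longer one; the sketch only shows the kind of code I was 
looking for in the patch.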

thanks,
dhruba

> Append to files in HDFS
> -----------------------
>
>                 Key: HADOOP-1700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.15.1
>            Reporter: stack
>         Attachments: append.patch, Appends.doc, Appends.doc, Appends.html
>
>
> Request for being able to append to files in HDFS has been raised a couple of 
> times on the list of late.   For one example, see 
> http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
>   Other mail describes folks' workarounds because this feature is lacking: 
> e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 
> (Later on this thread, Jim Kellerman re-raises the HBase need of this 
> feature).  HADOOP-337 'DFS files should be appendable' makes mention of file 
> append but it was opened early in the life of HDFS when the focus was more on 
> implementing the basics rather than adding new features.  Interest fizzled.  
> Because HADOOP-337 is also a bit of a grab-bag -- it includes truncation and 
> being able to concurrently read/write -- rather than try and breathe new life 
> into HADOOP-337, instead, here is a new issue focused on file append.  
> Ultimately, being able to do as the Google GFS paper describes -- having 
> multiple concurrent clients making 'Atomic Record Append' to a single file 
> would be sweet but at least for a first cut at this feature, IMO, a single 
> client appending to a single HDFS file letting the application manage the 
> access would be sufficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
