[jira] Commented: (HADOOP-1700) Append to files in HDFS

Sameer Paranjpye (JIRA) Thu, 30 Aug 2007 20:28:59 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523987 ]


Sameer Paranjpye commented on HADOOP-1700:
------------------------------------------

> But isn't <blockid+generation> really tantamount to a new block id?

Not really; the implications for implementation are quite different. If a new 
block id is used, the Namenode has to allocate a new block and delete the 
old one. Scheduling the old block's replicas for deletion, dispatching the 
deletion requests, and journaling the new block add up to a non-trivial amount 
of Namenode activity. A revision number update can simply be recorded in 
memory. In the event of a conflict, the Namenode would treat the replicas with 
the highest revision number as valid and discard out-of-date replicas.
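
As a rough illustration of the conflict rule above (not actual Namenode code), 
the following Python sketch picks the valid replicas of one block by revision 
number. The Replica type, its fields, and resolve_replicas are hypothetical 
names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    datanode: str   # hypothetical datanode identifier
    block_id: int   # block id stays the same across revisions
    revision: int   # revision number reported for this replica


def resolve_replicas(replicas):
    """Split the replicas of one block into (valid, stale) lists.

    Replicas carrying the highest revision number are treated as valid;
    anything older is out of date and can be scheduled for deletion.
    """
    if not replicas:
        return [], []
    latest = max(r.revision for r in replicas)
    valid = [r for r in replicas if r.revision == latest]
    stale = [r for r in replicas if r.revision < latest]
    return valid, stale


# Three datanodes report block 42; dn3 missed the last update.
reports = [
    Replica("dn1", 42, 3),
    Replica("dn2", 42, 3),
    Replica("dn3", 42, 2),
]
valid, stale = resolve_replicas(reports)
```

Here the block id never changes, so the only per-update state in this sketch 
is the revision number itself.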

> Copying could prevent such issues [ ... ]

Copying does make error handling somewhat easier. But it seems to me that it 
does so only when changes to a file are exposed in the Namenode at block 
granularity. If we want to make changes visible at a finer grain, both 
approaches have similar complexity in the corner cases of datanodes and writers 
crashing in the middle of updates.









> Append to files in HDFS
> -----------------------
>
>                 Key: HADOOP-1700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: stack
>
> Request for being able to append to files in HDFS has been raised a couple of 
> times on the list of late.   For one example, see 
> http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
>   Other mail describes folks' workarounds because this feature is lacking: 
> e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 
> (Later on this thread, Jim Kellerman re-raises the HBase need of this 
> feature).  HADOOP-337 'DFS files should be appendable' makes mention of file 
> append but it was opened early in the life of HDFS when the focus was more on 
> implementing the basics rather than adding new features.  Interest fizzled.  
> Because HADOOP-337 is also a bit of a grab-bag -- it includes truncation and 
> being able to concurrently read/write -- rather than try and breathe new life 
> into HADOOP-337, instead, here is a new issue focused on file append.  
> Ultimately, being able to do as the Google GFS paper describes -- having 
> multiple concurrent clients making 'Atomic Record Append' to a single file -- 
> would be sweet, but at least for a first cut at this feature, IMO, a single 
> client appending to a single HDFS file, letting the application manage the 
> access, would be sufficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
