[jira] Commented: (HADOOP-1700) Append to files in HDFS

Konstantin Shvachko (JIRA) Thu, 17 Jul 2008 10:50:03 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614464#action_12614464
 ]


Konstantin Shvachko commented on HADOOP-1700:
---------------------------------------------

This is a large patch, and we have neither a test plan nor any evidence that
the feature was thoroughly tested.
The unit test file includes two test cases:
# Create one file, close it, reopen it, and append.
# Run many appends in parallel to many files.

What really needs to be tested are the different failure scenarios: for
example, when the client or one of the data-nodes in the pipeline fails at
different stages of the transaction, and how the name-node reacts to this; or
what happens if the name-node fails in the middle of appends and is restarted.
As we have seen with the lease recovery feature, all of these things can cause
serious problems and should be tested in detail.
So, as stated in HADOOP-2658, a test plan should be both *designed* and
*implemented* before such a big feature can be committed.
IMO it should include a lot more unit tests.
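
As a starting point for such a test plan, the failure scenarios above can be
enumerated as a matrix of "which component fails" by "at which stage". The
sketch below is only an illustration: the class name AppendFailureMatrix and
the stage names are hypothetical and are not taken from the patch or from HDFS;
the components are the ones named above. A real plan would attach an expected
name-node reaction to each cell.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: enumerates failure scenarios for an append test plan. */
class AppendFailureMatrix {

    /** Components whose failure should be tested (from the comment above). */
    enum Component { CLIENT, DATANODE_IN_PIPELINE, NAMENODE }

    /** Illustrative stages of an append; these names are assumptions, not HDFS terms. */
    enum Stage { BEFORE_FIRST_WRITE, MID_WRITE, DURING_CLOSE }

    /** One cell of the matrix: a component failing at a given stage. */
    static final class Scenario {
        final Component component;
        final Stage stage;

        Scenario(Component component, Stage stage) {
            this.component = component;
            this.stage = stage;
        }

        @Override
        public String toString() {
            return component + " fails at " + stage;
        }
    }

    /** Builds the full cross product of components and stages. */
    static List<Scenario> scenarios() {
        List<Scenario> out = new ArrayList<>();
        for (Component c : Component.values()) {
            for (Stage s : Stage.values()) {
                out.add(new Scenario(c, s));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Print one line per scenario; each line is a candidate test case.
        for (Scenario s : scenarios()) {
            System.out.println(s);
        }
    }
}
```

Each printed line is a candidate test case; the matrix makes it easy to see
which combinations still have no test.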

I briefly looked at the code and noticed that you forgot to increment
ClientDatanodeProtocol.versionID, although the comment is there.
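
To illustrate why the number itself matters and not just the comment, here is
a minimal stand-alone sketch. ExampleProtocol and VersionGuard are hypothetical
names, the version numbers are made up, and the equality check is an
assumption about how a version guard typically behaves, not the actual Hadoop
RPC code.

```java
/** Hypothetical stand-in for an RPC protocol interface; not the real ClientDatanodeProtocol. */
interface ExampleProtocol {
    /**
     * Version history (illustrative):
     * 1: initial version
     * 2: added a method needed for append
     */
    long versionID = 2L; // must be bumped together with the history comment above
}

/** Hypothetical guard: two sides may talk only if their version numbers match. */
class VersionGuard {
    static boolean compatible(long clientVersion, long serverVersion) {
        return clientVersion == serverVersion;
    }

    public static void main(String[] args) {
        // An old client (version 1) against a server that correctly bumped to 2.
        System.out.println(compatible(1L, ExampleProtocol.versionID));
        // If the bump is forgotten, both sides report 1 and the mismatch goes undetected.
        System.out.println(compatible(1L, 1L));
    }
}
```

If only the comment is updated, a client built against the old interface and a
server with the new one report the same number, so a check of this kind cannot
catch the incompatibility.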

> Append to files in HDFS
> -----------------------
>
>                 Key: HADOOP-1700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.15.1
>            Reporter: stack
>            Assignee: dhruba borthakur
>         Attachments: 1700_20080606.patch, append.patch, append3.patch, 
> Appends.doc, Appends.doc, Appends.html, appendtrunk10.patch, 
> appendtrunk11.patch, appendtrunk12.patch, appendtrunk13.patch, 
> appendtrunk13.patch, appendtrunk13.patch, appendtrunk14.patch, 
> appendtrunk14.patch, appendtrunk6.patch, appendtrunk7.patch, 
> appendtrunk8.patch, appendtrunk9.patch, Grid_HadoopRenumberBlocks.pdf
>
>
> Request for being able to append to files in HDFS has been raised a couple of 
> times on the list of late.   For one example, see 
> http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
>   Other mail describes folks' workarounds because this feature is lacking: 
> e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 
> (Later on this thread, Jim Kellerman re-raises the HBase need of this 
> feature).  HADOOP-337 'DFS files should be appendable' makes mention of file 
> append but it was opened early in the life of HDFS when the focus was more on 
> implementing the basics rather than adding new features.  Interest fizzled.  
> Because HADOOP-337 is also a bit of a grab-bag -- it includes truncation and 
> being able to concurrently read/write -- rather than try and breathe new life 
> into HADOOP-337, instead, here is a new issue focused on file append.  
> Ultimately, being able to do as the Google GFS paper describes -- having 
> multiple concurrent clients making 'Atomic Record Append' to a single file -- 
> would be sweet, but at least for a first cut at this feature, IMO, a single 
> client appending to a single HDFS file, letting the application manage the 
> access, would be sufficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
