[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114859#comment-15114859
 ] 

Dinesh S. Atreya commented on HDFS-9607:
----------------------------------------

This comment continues the discussion of semantics from the parent (umbrella) 
JIRA before revisiting the API.

The proposed enhancements to core Hadoop include the capability to do 
“updates-in-place” in HDFS:
• Support seeks for writes (in addition to reads).
• After a seek, if the new byte length is the same as the old byte length (as 
read), an in-place update is allowed.
• A delete is an update that writes the appropriate delete marker.
• If the byte length differs, the old entry can be marked as deleted (as 
defined by the higher-level API of the calling application, such as Hive or 
ORC), and the new entry is appended as before.
• It is at the client’s discretion to perform an update, an append, or both; 
the API changes in the different Hadoop components should provide these 
capabilities.
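
The semantics above can be sketched roughly as follows. This is only an 
illustration on an in-memory byte stream, not an HDFS API: the function name 
update_record, the DELETE_MARKER byte, and the choice to overwrite the whole 
old entry with the marker are all assumptions made for the example. In practice 
the delete marker would be defined by the calling application.

```python
import io

# Hypothetical tombstone byte; a real marker would be defined by the
# higher-level application (an assumption for this sketch).
DELETE_MARKER = b"\x00"


def update_record(stream, offset, old_len, new_bytes):
    """Apply the proposed update semantics to a seekable byte stream.

    Returns the offset at which new_bytes ends up being stored.
    """
    if len(new_bytes) == old_len:
        # Same length: seek to the old entry and overwrite it in place.
        stream.seek(offset)
        stream.write(new_bytes)
        return offset

    # Different length: mark the old entry as deleted. The marker write is
    # itself a same-length in-place update, so the file length is unchanged.
    stream.seek(offset)
    stream.write(DELETE_MARKER * old_len)

    # Then append the new entry at the end, as with a normal append.
    stream.seek(0, io.SEEK_END)
    new_offset = stream.tell()
    stream.write(new_bytes)
    return new_offset


# Same-length replacement: "Hello World" and "Hello HDFS!" are both 11 bytes.
buf = io.BytesIO(b"Hello World")
update_record(buf, 0, 11, b"Hello HDFS!")

# Different-length replacement: the old entry is tombstoned, the new one appended.
update_record(buf, 0, 11, b"Hi")
```

Whether to update, append, or do both stays with the caller; the sketch only 
shows how the length check decides between the two paths.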

An expanded set of APIs is being advocated to ensure data integrity, starting 
at the HDFS layer itself.

This is echoed by other comments, such as that of [[email protected]]:
{quote}
HDFS is the most critical part of the Hadoop stack; data integrity is the one 
thing the team cares about more than anything else. Something at the YARN layer 
could impact availability or performance —but it shouldn't lose or corrupt 
data. Things at the HDFS layer do, and every time something has gone in there 
have been surprises downstream.
{quote}

> Advance Hadoop Architecture (AHA) - HDFS Update (write-in-place)
> ----------------------------------------------------------------
>
>                 Key: HDFS-9607
>                 URL: https://issues.apache.org/jira/browse/HDFS-9607
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide the capability to carry out in-place writes/updates. Only in-place 
> writes that leave the existing length unchanged are supported.
> For example, "Hello World" can be replaced by "Hello HDFS!".
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
