[ https://issues.apache.org/jira/browse/HDFS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667807#comment-13667807 ]
Steve Loughran commented on HDFS-4849:
--------------------------------------

For delete, there's always the option of requiring a suitably random requestID (problem #1), having the server store those IDs somewhere efficient for lookups and storage at scale and under load (problem #2), and replicating that structure to all failover nodes (problem #3). You could always abuse the edit log for this.

I don't know what idempotency guarantees other filesystems could offer -anything built on LocalFS

# As Suresh said, this is not a bug. It is a new feature with long-term implications.
# Implementing this on HDFS in a way that doesn't break existing consistency guarantees ("the NN implement a strict happens-before ordering where each operation happens exactly once") is non-trivial. This is not "a slight change in semantics".
# Implementing idempotency in other filesystems could be hard to achieve: without knowledge of those filesystems we can't even begin to assess the impact. All I know is {{LocalFS}}, which doesn't make any guarantees.

Because of point #3 in the list above, I'd be hard pressed to support this even if you had a solution that was efficient and scalable on HDFS.

A better use of engineering effort could go into making the key FS client apps robust against failures of these operations, giving them better retry logic. We added a lot to handle Hadoop 1.x NN HA, and JT failover -with Job submission being the hard one. If there are ways you could get the MR layer to handle failures of these ops and retry under programmatic control, then you'd have something that would work with more filesystems (assuming you can differentiate failure causes).

> Idempotent create, append and delete operations.
> ------------------------------------------------
>
> Key: HDFS-4849
> URL: https://issues.apache.org/jira/browse/HDFS-4849
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.0.4-alpha
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
>
> create, append and delete operations can be made idempotent. This will reduce
> chances for a job or other app failures when NN fails over.
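
To make the requestID idea from the comment concrete, here is a minimal sketch in Java. It is not HDFS code: {{DedupingNamespace}} and its methods are hypothetical names, and the namespace is an in-memory set. It only illustrates the lookup step, where the server records the outcome of each requestID and replays it on a retry instead of running the delete again. It ignores the hard parts the comment lists: storing the IDs efficiently at scale, expiring them, and replicating them to failover nodes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical in-memory stand-in for a namespace; not an HDFS class. */
class DedupingNamespace {
    private final Set<String> paths = new HashSet<>();
    // requestId -> outcome recorded when the request first executed.
    // Assumption: entries are never expired or replicated in this sketch.
    private final Map<String, Boolean> completed = new HashMap<>();

    synchronized void create(String path) {
        paths.add(path);
    }

    synchronized boolean exists(String path) {
        return paths.contains(path);
    }

    /** Runs the delete at most once per requestId; a retry gets the recorded result. */
    synchronized boolean delete(String requestId, String path) {
        Boolean previous = completed.get(requestId);
        if (previous != null) {
            return previous; // retry: replay the outcome, do not touch the namespace
        }
        boolean removed = paths.remove(path);
        completed.put(requestId, removed);
        return removed;
    }
}

class DedupDemo {
    public static void main(String[] args) {
        DedupingNamespace ns = new DedupingNamespace();
        ns.create("/tmp/a");

        // The client picks a random ID once and reuses it for every retry.
        String requestId = UUID.randomUUID().toString();
        boolean first = ns.delete(requestId, "/tmp/a");

        // Another client recreates the path before the retry arrives.
        ns.create("/tmp/a");
        boolean retry = ns.delete(requestId, "/tmp/a");

        System.out.println("first=" + first + " retry=" + retry
                + " stillExists=" + ns.exists("/tmp/a"));
    }
}
```

In this sketch the retried delete reports the original result and leaves the recreated path alone, which is the "each operation happens exactly once" behaviour a retry would otherwise break. Everything needed to make that table survive a failover is left out.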