[
https://issues.apache.org/jira/browse/HDFS-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715427#comment-13715427
]
Suresh Srinivas edited comment on HDFS-4974 at 7/22/13 5:48 PM:
----------------------------------------------------------------
Here is an analysis of how frequently the retry cache is likely to be hit by
non-idempotent operations. This is useful for configuring the cache size in HDFS-4979.
*ClientProtocol*
|create() with no overWrite|Frequently used - overwrite is enabled by default.
Let's assume 50% of the creates are without overWrite|
|append()|Rarely used|
|rename()|Frequently used|
|concat()|Rarely used|
|rename2()|Rarely used (needs FileContext)|
|delete()|Frequently used|
|saveNamespace()|Rarely used|
|createSymlink()|Rarely used|
|updatePipeline()|Moderate usage|
|createSnapshot()|Rarely used|
|deleteSnapshot()|Rarely used|
|renameSnapshot()|Rarely used|
*NamenodeProtocol*
|startCheckpoint()|Rarely used|
|endCheckpoint()|Rarely used|
|commitBlockSynchronization()|Could be used heavily when a job writing to a
large number of files fails without closing/deleting the files and lease expiry
kicks in.|
*DatanodeProtocol*
|blockReceivedAndDeleted()|Frequently used|
In summary, the following will use the retry cache frequently:
ClientProtocol#create()
ClientProtocol#delete()
ClientProtocol#rename()
NamenodeProtocol#commitBlockSynchronization()
DatanodeProtocol#blockReceivedAndDeleted()
We could make blockReceivedAndDeleted() idempotent. We should also explore
making commitBlockSynchronization() idempotent. That leaves us with the
remaining three ClientProtocol operations: create(), delete() and rename().
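To illustrate why these operations need a retry cache, here is a minimal sketch of the idea. It is not the HDFS implementation: the names RetryCacheSketch, CallKey and execute() are hypothetical, and keying entries by a client id plus a call id is an assumption made for the example. The namespace is modelled as a plain set of paths.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch only; not the actual HDFS retry cache.
final class RetryCacheSketch {

    // Assumption: a retry reuses the same client id and call id as the original call.
    static final class CallKey {
        final String clientId;
        final int callId;

        CallKey(String clientId, int callId) {
            this.clientId = clientId;
            this.callId = callId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CallKey)) {
                return false;
            }
            CallKey other = (CallKey) o;
            return callId == other.callId && clientId.equals(other.clientId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(clientId, callId);
        }
    }

    private final Map<CallKey, Boolean> responses;

    RetryCacheSketch(final int maxEntries) {
        // Insertion-ordered map that drops the oldest entry once the bound is exceeded.
        this.responses = new LinkedHashMap<CallKey, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<CallKey, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Runs op only if no response is cached for (clientId, callId);
    // a retry whose entry is still cached gets the recorded response instead.
    synchronized boolean execute(String clientId, int callId, Supplier<Boolean> op) {
        CallKey key = new CallKey(clientId, callId);
        Boolean cached = responses.get(key);
        if (cached != null) {
            return cached;
        }
        boolean result = op.get();
        responses.put(key, result);
        return result;
    }

    public static void main(String[] args) {
        Set<String> namespace = new HashSet<>(Arrays.asList("/user/a"));
        RetryCacheSketch cache = new RetryCacheSketch(1024);

        // Original delete() call: the path exists, so it succeeds.
        System.out.println(cache.execute("client-1", 7, () -> namespace.remove("/user/a")));
        // Retry of the same call: answered from the cache, still reports success.
        System.out.println(cache.execute("client-1", 7, () -> namespace.remove("/user/a")));
        // A new call for the same path is executed and reports that nothing was deleted.
        System.out.println(cache.execute("client-1", 8, () -> namespace.remove("/user/a")));
    }
}
```

Without the cache, the retried delete() would report failure even though the original call succeeded. The bound passed to the constructor is what the frequency analysis above would feed into: the more often these operations are called, the more entries have to be kept to cover the retry window.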
was (Author: sureshms):
Here is some analysis of how frequently a cache will hit with
non-idempotent operations. This is useful for configuring the cache size in
HDFS-4979.
*ClientProtocol*
|create() with no overWrite|Frequently used - overwrite is enabled by default.
Let's assume 50% of the creates are without overWrite|
|append()|Rarely used|
|rename()|Frequently used|
|concat()|Rarely used|
|rename2()|Rarely used (needs FileContext)|
|delete()|Frequently used|
|saveNamespace()|Rarely used|
|createSymlink()|Rarely used|
|updatePipeline()|Moderate usage|
|createSnapshot()|Rarely used|
|deleteSnapshot()|Rarely used|
|renameSnapshot()|Rarely used|
*NamenodeProtocol*
|startCheckpoint()|Rarely used|
|endCheckpoint()|Rarely used|
|commitBlockSynchronized()|Rarely used|
*DatanodeProtocol*
|blockReceivedAndDeleted()|Frequently used|
In summary the following will use the retry cache frequently:
ClientProtocol#create()
ClientProtocol#delete()
ClientProtocol#rename()
DatanodeProtocol#blockReceivedAndDeleted()
We could make blockReceivedAndDeleted() idempotent. That leaves us with the
three operations.
> Analyze and add annotations to Namenode protocol methods and enable retry
> -------------------------------------------------------------------------
>
> Key: HDFS-4974
> URL: https://issues.apache.org/jira/browse/HDFS-4974
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha, namenode
> Reporter: Suresh Srinivas
> Assignee: Suresh Srinivas
> Attachments: HDFS-4974.1.patch, HDFS-4974.2.patch, HDFS-4974.patch
>
>
> This jira is intended for:
> # Discussing current @Idempotent annotations in HDFS protocols and adding
> that annotation where it is missing.
> # Discuss how retry should be enabled for non-idempotent requests.
> I will post the analysis of current methods in a subsequent comment.