[ 
https://issues.apache.org/jira/browse/HDFS-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715427#comment-13715427
 ] 

Suresh Srinivas edited comment on HDFS-4974 at 7/22/13 5:48 PM:
----------------------------------------------------------------

Here is some analysis of how frequently the retry cache will be used by 
non-idempotent operations. This is useful for configuring the cache size in 
HDFS-4979.
*ClientProtocol*
|create() with no overWrite|Frequently used - overwrite is enabled by default. 
Let's assume 50% of the creates are without overWrite|
|append()|Rarely used|
|rename()|Frequently used|
|concat()|Rarely used|
|rename2()|Rarely used (needs FileContext)|
|delete()|Frequently used|
|saveNamespace()|Rarely used|
|createSymlink()|Rarely used|
|updatePipeline()|Moderate usage|
|createSnapshot()|Rarely used|
|deleteSnapshot()|Rarely used|
|renameSnapshot()|Rarely used|
*NamenodeProtocol*
|startCheckpoint()|Rarely used|
|endCheckpoint()|Rarely used|
|commitBlockSynchronization()|Could be used heavily when a job writing to a 
large number of files fails without closing/deleting the files and lease expiry 
kicks in.|
*DatanodeProtocol*
|blockReceivedAndDeleted()|Frequently used|

In summary, the following will use the retry cache frequently:
ClientProtocol#create()
ClientProtocol#delete()
ClientProtocol#rename()
NamenodeProtocol#commitBlockSynchronization()
DatanodeProtocol#blockReceivedAndDeleted()
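
As a rough illustration of how these usage estimates could feed into a cache 
size, the sketch below multiplies an assumed request rate by the fraction of 
calls that are non-idempotent and by the time an entry is retained. The class 
and method names, the formula, and every number are hypothetical; none of them 
come from HDFS-4979.

```java
/** Back-of-the-envelope sizing sketch; all inputs are assumptions. */
class RetryCacheSizing {

    /**
     * Estimates how many entries are live at once if every non-idempotent
     * call adds one entry that is kept for expirySeconds.
     */
    static long estimateEntries(double opsPerSecond,
                                double nonIdempotentFraction,
                                long expirySeconds) {
        return (long) Math.ceil(opsPerSecond * nonIdempotentFraction * expirySeconds);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 1000 ops/s, 25% non-idempotent, 10 minute retention.
        long entries = estimateEntries(1000.0, 0.25, 600);
        System.out.println("estimated live entries: " + entries);
    }
}
```

Making a frequently used operation idempotent lowers the non-idempotent 
fraction, which shrinks the estimate proportionally.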

We could make blockReceivedAndDeleted() idempotent, and we should also explore 
making commitBlockSynchronization() idempotent. That leaves us with the 
remaining three operations: create(), delete(), and rename().
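
To show why these operations need a retry cache at all, here is a minimal 
sketch of the general idea, not the HDFS-4979 implementation. It assumes each 
request carries a (clientId, callId) pair, stores the result of the first 
execution, and replays that result when the same call is retried. The class 
name, the key format, and the LRU eviction policy are all assumptions made for 
illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Illustrative retry cache for non-idempotent calls (hypothetical design). */
class RetryCacheSketch {
    private final Map<String, Boolean> results;

    RetryCacheSketch(final int maxEntries) {
        // Access-ordered LinkedHashMap: evicts the least recently used
        // entry once the map grows past maxEntries.
        this.results = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Runs op once per (clientId, callId); a retry gets the stored result. */
    synchronized boolean execute(String clientId, int callId, Supplier<Boolean> op) {
        String key = clientId + "#" + callId;
        Boolean cached = results.get(key);
        if (cached != null) {
            return cached; // retried call: do not re-execute
        }
        boolean result = op.get();
        results.put(key, result);
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the namespace: delete succeeds only if the path exists.
        Set<String> namespace = new HashSet<>(Arrays.asList("/tmp/file"));
        RetryCacheSketch cache = new RetryCacheSketch(1024);

        boolean first = cache.execute("client-1", 7, () -> namespace.remove("/tmp/file"));
        // The client never saw the response and retries the same call.
        boolean retry = cache.execute("client-1", 7, () -> namespace.remove("/tmp/file"));

        System.out.println("first=" + first + ", retry=" + retry);
    }
}
```

Without the cache, the retried delete would run against a namespace where the 
path is already gone and report failure, even though the original call 
succeeded. The entry cap is where the usage frequencies above matter: an entry 
evicted too early turns a retry back into a fresh execution.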
                
      was (Author: sureshms):
    Here is some analysis of how frequently a cache will hit with 
non-idempotent operations. This is useful for configuring the cache size in 
HDFS-4979.
*ClientProtocol*
|create() with no overWrite|Frequently used - overwrite is enabled by default. 
Let's assume 50% of the creates are without overWrite|
|append()|Rarely used|
|rename()|Frequently used|
|concat()|Rarely used|
|rename2()|Rarely used (needs FileContext)|
|delete()|Frequently used|
|saveNamespace()|Rarely used|
|createSymlink()|Rarely used|
|updatePipeline()|Moderate usage|
|createSnapshot()|Rarely used|
|deleteSnapshot()|Rarely used|
|renameSnapshot()|Rarely used|
*NamenodeProtocol*
|startCheckpoint()|Rarely used|
|endCheckpoint()|Rarely used|
|commitBlockSynchronized()|Rarely used|
*DatanodeProtocol*
|blockReceivedAndDeleted()|Frequently used|

In summary the following will use the retry cache frequently:
ClientProtocol#create()
ClientProtocol#delete()
ClientProtocol#rename()
DatanodeProtocol#blockReceivedAndDeleted()

We could make blockReceivedAndDeleted() idempotent. That leaves us with the 
three operations.

                  
> Analyze and add annotations to Namenode protocol methods and enable retry
> -------------------------------------------------------------------------
>
>                 Key: HDFS-4974
>                 URL: https://issues.apache.org/jira/browse/HDFS-4974
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, namenode
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: HDFS-4974.1.patch, HDFS-4974.2.patch, HDFS-4974.patch
>
>
> This jira is intended for:
> # Discussing current @Idempotent annotations in HDFS protocols and adding 
> that annotation where it is missing.
> # Discuss how retry should be enabled for non-idempotent requests.
> I will post the analysis of current methods in a subsequent comment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
