[ 
https://issues.apache.org/jira/browse/HDFS-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846119#comment-13846119
 ] 

Chris Nauroth commented on HDFS-5596:
-------------------------------------

# {{AclEntryScope}} and {{AclEntryType}}: Let's move the comment about 
consistency with protobuf to an internal comment instead of user-facing 
JavaDoc.  It might confuse a Hadoop user, but it's important for a Hadoop 
developer.
# {{FsAction}}: Can we skip making the internal values array public?  This is a 
public/stable part of the client-facing API, so increasing its visibility opens 
a risk that clients will start depending on it, and then we're locked into 
keeping that contract.  Also, a caller could change the array elements, which 
would break the logic of the permission merging methods.  If you want, you 
could cache a private copy inside {{PBHelper}}.
# {{ClientProtocol}}: Despite the scenario you described, I believe all ACL 
operations are idempotent.
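
The private-copy suggestion in item 2 can be sketched as follows.  This is a 
minimal, standalone stand-in (the enum and class names here are illustrative, 
not the real {{FsAction}} or {{PBHelper}}): {{Enum.values()}} allocates a fresh 
array on every call, and a publicly exposed array could be mutated by callers, 
so a private cached copy gives cheap ordinal lookups without widening the 
public API:

```java
public class PrivateValuesCache {
    // Stand-in for FsAction: the 3-bit rwx ordering, NONE(0) .. ALL(7).
    enum Action { NONE, EXECUTE, WRITE, WRITE_EXECUTE, READ, READ_EXECUTE,
                  READ_WRITE, ALL }

    // Private cached copy: values() clones a new array on every call, and a
    // public array would let callers overwrite its elements.
    private static final Action[] VALUES = Action.values();

    // Translate a protobuf ordinal back to the enum without re-allocating.
    static Action fromOrdinal(int ordinal) {
        return VALUES[ordinal];
    }

    public static void main(String[] args) {
        // values() returns a fresh array each time; the cache avoids that.
        assert Action.values() != Action.values();
        assert fromOrdinal(4) == Action.READ;
    }
}
```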

bq. If Alice calls removeAcl() while Bob is changing the ACLs, re-executing 
removeAcl() could cause data loss.

Idempotence guarantees that repeated execution of the same operation yields the 
same end result, but it makes no guarantee about transaction isolation when 
multiple independent operations concurrently mutate the same state.  All ACL 
modifications are idempotent, because repeated execution reaches the same end 
state for that operation with no visible side effects for the requester.  
Additionally, nothing in the RPC response can differ between repeated 
executions, because the return type is void.  It's true that concurrent 
modification of the same file's ACL will cause data loss for either Alice or 
Bob, but this is beyond the scope of idempotence.  There are many existing 
examples of methods annotated idempotent that could cause perceived data loss 
for concurrent clients if a retry is interleaved with a different client's 
request.  Examples include {{setOwner}}, {{setReplication}}, and 
{{setPermission}}.
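
The distinction can be sketched with a toy model (illustrative names, not 
Hadoop code): re-executing {{removeAcl}} reaches the same end state and the 
void response cannot differ between attempts, so it is idempotent, even though 
an interleaved writer can still lose an update, which is an isolation question, 
not an idempotence question:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy namespace mapping path -> ACL spec string.
public class IdempotentAclModel {
    private final Map<String, String> aclByPath = new HashMap<>();

    void setAcl(String path, String aclSpec) { aclByPath.put(path, aclSpec); }

    // Void return; running it twice leaves the same end state.
    void removeAcl(String path) { aclByPath.remove(path); }

    String getAcl(String path) { return aclByPath.get(path); }

    public static void main(String[] args) {
        IdempotentAclModel ns = new IdempotentAclModel();
        ns.setAcl("/f", "user:alice:rwx");

        // First execution and a retry of the same removeAcl: identical state.
        ns.removeAcl("/f");
        String afterFirst = ns.getAcl("/f");
        ns.removeAcl("/f");  // retried execution
        assert Objects.equals(afterFirst, ns.getAcl("/f"));

        // Bob's setAcl interleaved before Alice's retry is still lost, but
        // that is a concurrency-isolation issue, not an idempotence issue.
        ns.setAcl("/f", "user:bob:r--");
        ns.removeAcl("/f");  // Alice's retry lands after Bob's change
        assert ns.getAcl("/f") == null;
    }
}
```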

Another way of thinking about this: what benefit would we gain from 
at-most-once semantics, and thus from involving the retry cache?  If the NN 
receives a retry of Alice's {{removeAcl}}, then it would immediately return 
without executing it.  However, Bob still has no way of knowing whether Alice's 
{{removeAcl}} deleted his changes before he received his RPC response over the 
network.  No matter what we do, if Bob issues a subsequent {{getAcl}}, the 
result he receives is non-deterministic.

bq. The reason is similar to the reason that the deletion RPCs are 
at-most-once, but not idempotent.

If this refers to calls like {{delete}} and {{deleteSnapshot}}, these are not 
at-most-once due to data loss concerns or concurrency handling.  The reason 
these are at-most-once is that the logic of the operations is not idempotent.  
If {{delete}} were annotated idempotent, then an RPC client retry would always 
cause the NN to re-execute the delete.  If it turns out that the first request 
actually did make it to the NN, then the file no longer exists, and the NN 
returns an exception on the retry.  Since the response differs between the 
first attempt and the retry, the operation is not idempotent, and we have to 
involve the retry cache to provide at-most-once semantics by caching the first 
response.  Using at-most-once still doesn't protect against data loss scenarios 
on concurrent operations, though.  If Alice calls {{delete}} concurrently while 
Bob is creating-writing-closing a file at the same path, then it is 
non-deterministic whether or not Bob's file exists at the end.  If Alice's call 
lands on the NN before Bob's, then Bob's file will exist at the end.  If 
Alice's call lands last, then the file won't exist.  Despite the at-most-once 
annotation, the concurrency behavior is still non-deterministic, 
last-one-in-wins.
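
A toy model of the delete case (not the actual NameNode code; keying the cache 
by a client call id is an illustrative simplification): if the first request 
did reach the server, a naively re-executed retry would observe "file no 
longer exists" and get a different response, so at-most-once keeps a retry 
cache that replays the first response instead of re-executing:

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

public class DeleteRetryModel {
    private final Map<String, byte[]> files = new HashMap<>();
    // Retry cache: call id -> first successful response.
    private final Map<Long, Boolean> retryCache = new HashMap<>();

    void create(String path) {
        files.put(path, new byte[0]);
    }

    boolean delete(long callId, String path) throws FileNotFoundException {
        Boolean cached = retryCache.get(callId);
        if (cached != null) {
            return cached;  // retry detected: replay, don't re-execute
        }
        if (!files.containsKey(path)) {
            throw new FileNotFoundException(path);
        }
        files.remove(path);
        retryCache.put(callId, true);
        return true;
    }

    public static void main(String[] args) throws Exception {
        DeleteRetryModel nn = new DeleteRetryModel();
        nn.create("/f");

        // First attempt executes; a retry with the same call id is served
        // from the cache rather than failing with FileNotFoundException.
        assert nn.delete(42, "/f");
        assert nn.delete(42, "/f");

        // A *different* call (no cache entry) sees the changed state.
        boolean threw = false;
        try {
            nn.delete(43, "/f");
        } catch (FileNotFoundException e) {
            threw = true;
        }
        assert threw;
    }
}
```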

bq. I'm not sure whether SnapshotAccessControlException and 
NSQuotaExceededException should be unwrapped, so I left these two out of this 
patch.

If you want to wait until we're testing in combination with snapshots and 
quotas, that's OK, but I do think we'll want to unwrap those two.  Unwrapping 
{{SnapshotAccessControlException}} will give the client a friendlier error 
message if they try to change the ACL on a read-only snapshot path like 
/dir/.snapshot/snapshotName.  Unwrapping {{NSQuotaExceededException}} will give 
a friendlier error message if a client's modification of the ACL on a 
snapshotted inode requires us to split off a new inode to record the change, 
possibly exceeding the user's namespace quota.
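
For context, a toy sketch of what "unwrapping" means on the client side (a 
standalone stand-in mimicking the shape of Hadoop's 
{{RemoteException#unwrapRemoteException}}, not the real class): the transport 
reports server-side errors as a generic wrapper carrying the original 
exception's class name, and the client rebuilds a typed exception for known 
names so callers see a friendlier, catchable error:

```java
public class UnwrapModel {
    // Generic wrapper the RPC layer hands back, carrying the server-side
    // exception's class name and message.
    public static class RemoteishException extends Exception {
        public final String className;
        public RemoteishException(String className, String message) {
            super(message);
            this.className = className;
        }
    }

    // The "friendly" client-side type we want callers to be able to catch.
    public static class SnapshotAccessControlException extends Exception {
        public SnapshotAccessControlException(String message) {
            super(message);
        }
    }

    // Rebuild a typed exception when the wrapped class name matches one of
    // the requested lookup types; otherwise return the wrapper unchanged.
    public static Exception unwrap(RemoteishException re, Class<?>... lookupTypes)
            throws ReflectiveOperationException {
        for (Class<?> t : lookupTypes) {
            if (t.getName().equals(re.className)) {
                return (Exception) t.getConstructor(String.class)
                    .newInstance(re.getMessage());
            }
        }
        return re;
    }

    public static void main(String[] args) throws Exception {
        RemoteishException wire = new RemoteishException(
            SnapshotAccessControlException.class.getName(),
            "Modification on a read-only snapshot is disallowed");
        assert unwrap(wire, SnapshotAccessControlException.class)
            instanceof SnapshotAccessControlException;

        // Unlisted server-side classes stay wrapped.
        RemoteishException other = new RemoteishException("some.OtherError", "boom");
        assert unwrap(other, SnapshotAccessControlException.class) == other;
    }
}
```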


> Implement RPC stubs
> -------------------
>
>                 Key: HDFS-5596
>                 URL: https://issues.apache.org/jira/browse/HDFS-5596
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client, namenode
>    Affects Versions: HDFS ACLs (HDFS-4685)
>            Reporter: Chris Nauroth
>            Assignee: Haohui Mai
>         Attachments: HDFS-5596.000.patch, HDFS-5596.001.patch, 
> HDFS-5596.002.patch
>
>
> Implement RPC stubs for both {{DistributedFileSystem}} and 
> {{NameNodeRpcServer}}.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
