[
https://issues.apache.org/jira/browse/HDFS-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846119#comment-13846119
]
Chris Nauroth commented on HDFS-5596:
-------------------------------------
# {{AclEntryScope}} and {{AclEntryType}}: Let's move the comment about
consistency with protobuf to an internal comment instead of user-facing
JavaDoc. It might cause confusion for a Hadoop user, but it's an important
comment for a Hadoop developer.
# {{FsAction}}: Can we skip making the internal values array public? This is a
public/stable part of the client-facing API, so increasing its visibility opens
a risk that clients will start depending on it, and then we're locked in to
keeping that contract. Also, a caller could change the array elements, which
would break the logic of the permission merging methods. If you want, you
could cache a private copy inside {{PBHelper}}.
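To illustrate the caching suggestion, here is a minimal sketch (hypothetical names, a stand-in enum instead of the real {{FsAction}}): Java's {{values()}} allocates a fresh clone on every call, so {{PBHelper}} can hold one private copy for ordinal-based conversion without widening the visibility of the enum's internal array.

```java
// Illustrative sketch only, not the actual patch: a stand-in for FsAction
// showing how PBHelper could cache a private copy of values(). The field
// name FSACTION_VALUES is hypothetical.
public class PBHelperSketch {

    // Stand-in for org.apache.hadoop.fs.permission.FsAction.
    enum FsAction { NONE, EXECUTE, WRITE, WRITE_EXECUTE, READ,
                    READ_EXECUTE, READ_WRITE, ALL }

    // Private cached copy: values() returns a fresh clone on every call,
    // so caching avoids a per-conversion allocation, and keeping the field
    // private means no caller can mutate the array and break lookups.
    private static final FsAction[] FSACTION_VALUES = FsAction.values();

    // Convert a protobuf ordinal back to the enum without re-cloning.
    static FsAction convert(int protoOrdinal) {
        return FSACTION_VALUES[protoOrdinal];
    }

    public static void main(String[] args) {
        System.out.println(convert(FsAction.READ_EXECUTE.ordinal())); // READ_EXECUTE
    }
}
```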
# {{ClientProtocol}}: Despite the scenario you described, I believe all ACL
operations are idempotent.
bq. If Alice calls removeAcl() while Bob is changing the ACLs, re-executing
removeAcl() could cause data loss.
Idempotence guarantees that repeated execution of the same operation yields
the same end result; it makes no guarantees about transaction isolation when
multiple independent operations concurrently mutate the same state. All ACL
modifications are idempotent, because repeated execution reaches the same end
state for that operation with no visible side effects for the requester.
Additionally, nothing in the RPC response can differ between repeated
executions, because the return type is void.
It's true that concurrent modification of the same file's ACL will cause data
loss for either Alice or Bob, but this is beyond the scope of idempotence.
There are many existing examples of methods annotated idempotent that could
cause perceived data loss for concurrent clients if a retry is interleaved with
a different client's request. Examples include {{setOwner}},
{{setReplication}} and {{setPermission}}.
Another way of thinking about this is to ask what benefit we would gain from
at-most-once semantics, and thus from involving the retry cache. If the NN
receives a retry of Alice's {{removeAcl}}, then it would immediately return
without executing it. However, Bob still has no way of knowing whether
Alice's {{removeAcl}} deleted his changes before he received his RPC response
over the network. No matter what we do, if Bob issues a subsequent
{{getAcl}}, the result he receives is non-deterministic.
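The argument above can be modeled in a few lines (a toy namespace with hypothetical names, no Hadoop dependencies): re-executing the same {{removeAcl}} reaches the same end state, and since the return type is void there is nothing in the response that can differ between the first attempt and a retry.

```java
import java.util.*;

// Toy model of why removeAcl is idempotent. The names here are
// illustrative; this is not HDFS code.
public class IdempotenceSketch {

    // Stand-in for the namespace: path -> ACL entries.
    static Map<String, List<String>> acls = new HashMap<>();

    // Idempotent: repeated execution leaves the same end state, and
    // there is no return value that could differ on a retry.
    static void removeAcl(String path) {
        acls.remove(path);
    }

    public static void main(String[] args) {
        acls.put("/file", new ArrayList<>(List.of("user:alice:rw-")));

        removeAcl("/file");   // first attempt
        removeAcl("/file");   // client retry: same end state, no error
        System.out.println(acls.containsKey("/file")); // false either way
    }
}
```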
bq. The reason is similar to the one that the RPC about deletion are
at-most-once, but not idempotent.
If this refers to calls like {{delete}} and {{deleteSnapshot}}, these are not
at-most-once due to data loss concerns or concurrency handling. The reason
these are at-most-once is that the logic of the operations is not idempotent.
If {{delete}} were annotated idempotent, then an RPC client retry would always
cause the NN to re-execute the delete. If it turns out that the first request
actually did make it to the NN, then the file no longer exists, and the NN
returns an exception on the retry. Since the response differs between first
attempt and retry, the operation is not idempotent, and we have to involve the
retry cache to provide at-most-once semantics by caching the first response.
Using at-most-once still doesn't protect against data loss scenarios on
concurrent operations though. If Alice calls {{delete}} concurrently while Bob
is creating-writing-closing a file at the same path, then it is
non-deterministic whether or not Bob's file exists at the end. If Alice's call
lands on the NN before Bob's, then Bob's file will exist at the end. If
Alice's call lands last, then the file won't exist. Despite the at-most-once
annotation, the concurrency behavior is still non-deterministic: last one in
wins.
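The contrast with {{delete}} can be sketched the same way (again a toy model with hypothetical names, much simpler than the real NN retry cache, which also has to handle failed and in-progress calls): a blind re-execution of {{delete}} throws because the file is already gone, so the responses differ between attempts; caching the first response by call ID restores at-most-once behavior.

```java
import java.io.FileNotFoundException;
import java.util.*;

// Toy model of why delete needs the retry cache; not HDFS code.
public class RetryCacheSketch {

    static Set<String> namespace = new HashSet<>(Set.of("/file"));
    // callId -> cached outcome of the first execution.
    static Map<Long, Boolean> retryCache = new HashMap<>();

    // Not idempotent: the first execution succeeds, but a blind
    // re-execution finds the file gone and fails.
    static boolean deleteOnce(String path) throws FileNotFoundException {
        if (!namespace.remove(path)) {
            throw new FileNotFoundException(path);
        }
        return true;
    }

    // At-most-once: a retry carrying the same callId replays the cached
    // response instead of re-executing the operation.
    static boolean delete(long callId, String path)
            throws FileNotFoundException {
        Boolean cached = retryCache.get(callId);
        if (cached != null) {
            return cached;
        }
        boolean result = deleteOnce(path);
        retryCache.put(callId, result);
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(delete(42L, "/file")); // true
        System.out.println(delete(42L, "/file")); // retry: true again, no exception
    }
}
```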
bq. I'm not sure that whether SnapshotAccessControlException and
NSQuotaExceededException should be unwrapped, therefore I left these two out of
this patch.
If you want to wait until we're testing in combination with snapshots and
quotas, that's OK, but I do think we'll want to unwrap those two. Unwrapping
{{SnapshotAccessControlException}} will give the client a friendlier error
message if they try to change the ACL on a read-only snapshot path like
/dir/.snapshot/snapshotName. Unwrapping {{NSQuotaExceededException}} will give
a friendlier error message if a client's modification of the ACL on a
snapshotted inode requires us to split off a new inode to record the change,
possibly exceeding the user's namespace quota.
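For illustration, the unwrapping pattern looks roughly like this (a minimal model with stand-in classes, not the actual Hadoop {{RemoteException}} machinery): the server side serializes the exception class name into a generic remote exception, and the client-side stub re-creates the specific type so callers see a friendly, catchable error.

```java
// Minimal model of client-side exception unwrapping; the classes here are
// stand-ins, not the real org.apache.hadoop.ipc types.
public class UnwrapSketch {

    static class SnapshotAccessControlException extends Exception {
        SnapshotAccessControlException(String msg) { super(msg); }
    }

    // Stand-in for a generic wrapper carrying the server-side class name.
    static class RemoteException extends Exception {
        final String className;
        RemoteException(String className, String msg) {
            super(msg);
            this.className = className;
        }
        // If the wrapped class is one we recognize, surface it as the
        // specific type; otherwise keep the opaque wrapper.
        Exception unwrap() {
            if (className.equals(
                    SnapshotAccessControlException.class.getName())) {
                return new SnapshotAccessControlException(getMessage());
            }
            return this;
        }
    }

    public static void main(String[] args) {
        RemoteException re = new RemoteException(
            SnapshotAccessControlException.class.getName(),
            "Modification on a read-only snapshot is disallowed");
        Exception unwrapped = re.unwrap();
        System.out.println(unwrapped instanceof SnapshotAccessControlException); // true
    }
}
```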
> Implement RPC stubs
> -------------------
>
> Key: HDFS-5596
> URL: https://issues.apache.org/jira/browse/HDFS-5596
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client, namenode
> Affects Versions: HDFS ACLs (HDFS-4685)
> Reporter: Chris Nauroth
> Assignee: Haohui Mai
> Attachments: HDFS-5596.000.patch, HDFS-5596.001.patch,
> HDFS-5596.002.patch
>
>
> Implement RPC stubs for both {{DistributedFileSystem}} and
> {{NameNodeRpcServer}}.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)