smengcl commented on code in PR #4188:
URL: https://github.com/apache/ozone/pull/4188#discussion_r1550044937
##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java:
##########
@@ -1195,7 +1196,8 @@ public OzoneOutputStream createKey(
.setReplicationConfig(replicationConfig)
.addAllMetadata(metadata)
.setAcls(getAclList())
- .setLatestVersionLocation(getLatestVersionLocation);
+ .setLatestVersionLocation(getLatestVersionLocation)
+ .setOwnerName(ownerName);
Review Comment:
Actually I wonder if we really need to add this owner field to every single
new key created in Ozone.
Arguably we could just leave this as unset (empty) to indicate the default
behavior, which is to inherit parent's owner. This way we don't bloat every
single key with this extra owner string in its proto message.
Cons of setting `owner` for every single new key:
1. Slight perf hit for every new key/dir creation.
2. This also implies `chown` on a bucket/dir would be recursive by default.
Makes `chown` much slower. (Where in the case of not setting `owner` for every
key, we don't need to recurse by default because the sub-keys and sub-dirs
should just inherit the parent owner)
3. Because `owner` is a proto string, it can be wasting quite a lot of space
repeating the same name string. (numeric user ID mapping can solve that)
4. When a cluster is upgraded to a new Ozone version with key ownership
support, all existing keys would natually have empty owners as well. Unless we
add a lengthy step during OM upgrade to add owner to all existing keys, which I
don't think is the best idea).
Pro:
1. On the other hand I get that the the owner field in key info is required
for "sticky bit" behavior to work properly with Ranger authorizer (previously
this requires a workaround, HDDS-9701).
2. Writer user is not necessarily the same as the parent owner.
Discussions:
1. Is there some other reasons we must set this owner field for new keys and
dirs?
3. How much perf hit would there actually be to add this owner field for
every key? Try this in a micro benchmark with and without S3?
4. Is it reasonable to add an config to allow not setting owner field for
new keys and dirs by default? Or, only add owner field if the parent dir have
the "sticky bit".
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]