fapifta commented on code in PR #4478:
URL: https://github.com/apache/ozone/pull/4478#discussion_r1157027665
##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystemWithFSO.java:
##########
@@ -137,6 +137,19 @@ public void testRenameDestinationParentDoesntExist()
throws Exception {
Assert.assertFalse(getFs().rename(dir2SourcePath, newDestinPath));
}
+ @Test
+ public void testKeyRenameToBucketLevel() throws IOException {
+ final String dir = "dir1";
+ final String key = dir + "/key1";
+ final Path source = new Path(getBucketPath(), key);
+ final Path dest = new Path(String.valueOf(getBucketPath()));
+ LOG.info("Will move {} to {}", source, dest);
+ getFs().rename(source, dest);
+ assertTrue("Key rename failed", getFs().exists(dest));
Review Comment:
Here we are asserting if dest exists, but dest is the bucket path, which I
believe should always yield to true.
The intent I guess would be to check whether dest/key1 exists, or do I
misunderstand something?
##########
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OFSPath.java:
##########
@@ -163,6 +163,14 @@ public String getKeyName() {
return keyName;
}
+ /**
+ * Returns the last element of the KeyName, which can either be a file name
+ * or the leaf directory/bucket.
+ */
+ public String getFileName() {
Review Comment:
Maybe getLastElement()?
##########
hadoop-ozone/ozonefs-common/src/main/java/org/apache/hadoop/fs/ozone/BasicRootedOzoneFileSystem.java:
##########
@@ -430,8 +430,14 @@ private boolean renameFSO(OzoneBucket bucket,
// construct src and dst key paths
String srcKeyPath = srcPath.getNonKeyPathNoPrefixDelim() +
OZONE_URI_DELIMITER + srcPath.getKeyName();
- String dstKeyPath = dstPath.getNonKeyPathNoPrefixDelim() +
- OZONE_URI_DELIMITER + dstPath.getKeyName();
+ String dstKeyPath;
+ if (dstPath.isBucket()) {
+ dstKeyPath = dstPath.getNonKeyPathNoPrefixDelim() +
+ OZONE_URI_DELIMITER + srcPath.getFileName();
+ } else {
+ dstKeyPath = dstPath.getNonKeyPathNoPrefixDelim() +
+ OZONE_URI_DELIMITER + dstPath.getKeyName();
Review Comment:
@sumitagrawl, @dombizita
Let me chime in a bit into this discussion and think about this loud.
The basic problem is this as I understand:
During a rename operation, where we need to resolve the target's final path
and name in case the target of the rename is a directory, and especially when
the target is a bucket.
Clearly the expectation is coming from the semantics of how a file system
rename operation behaves in the local filesystem world, as we are not different
from any local filesystem when it comes to operations with files, but we are
mainly an object store that is designed to be accessible as a file system as
well.
HDFS solves this clearly just on the NameNode side, while GNU mv for example
solves this internally within the command and does not push this responsibility
to the filesystem implementation. So both schools exists.
We have 3 bucket types, legacy OBS and FSO, but some of the codebase is
different for these on the client and the server side, let's see what we have
internally:
- preExecute code is common for all three bucket types.
- legacy and OBS request is handled by the OmKeyRenameRequest class, FSO is
handled by OmKeyRenameRequestWithFSO class
- in preExecute we have a check via the validateAndNormalizeKey(boolean,
String, BucketLayout) method, this method operates only if the bucket type is
FSO, or if the bucket type is legacy and "ozone.om.enable.filesystem.paths" is
enabled (default is disabled). Normalization on empty keyName does nothing with
the keyName, and then validation throws an exception as an empty key name is
invalid.
- the actual request handling in validateAndUpdateCache later on for legacy
bucket type checks if the target keyName's length is greater than zero, if not,
it throws an exception and denies the request, similarly the FSO bucket type's
request handling denies the request with an exception being thrown if the
target keyName's lenght is not greater than zero.
What needs to be done on the client side to fix this problem is showcased in
this patch by @dombizita. It is fairly simple, and definitely simpler then what
I can imagine on the server side, but, Let me play around with the thought what
would it take to do the fix there.
Considering the current class structure, as OBS and legacy are handled
together with the same code, and FSO is the only one that has a specialization,
our solution either separates OBS and legacy from each other, or the two
solution is tied together. I think it is not feasible to separate OBS and
legacy for this, as the problem does not manifests with these bucket types at
all. But why? Because this case is solved for legacy buckets on the client
side. How? In BasicOzoneFileSystem and BasicRootedOzoneFilesystem it is
handled, and for the RenameKeyHandler in the object store CLI client it is not
allowed to have an empty target key name.
Ok, so legacy and OBS are handled on the client side, and we do not need to
care... What we can do with FSO then on the server side?
We can specialize the preExecute's validation on the target key, and allow
an empty target keyName to go through. What would that imply? In the preexecute
it is not a problem, we can introduce a method that abstracts the validation
away for the target path, and allow empty key name only in FSO. Fairly simple
so far.
Let's see what it takes to allow empty target in validateAndUpdateCache...
Only for FSO after preExecute logic is changed.
In the check for keyName lengths we can omit the check for toKeyName, easy.
After that ACL checks are the next step with an empty key name I think we
can safely say it is undefined what we will get back, so we are forced to form
the proper target key before ACL checks are happening.
How we can do that? We can put an if check there, and see if the target key
name length is zero, and if it is, then we need to check the ACLs on the
bucket, and not on the key for the target. One additional branching add for
complexity.
verifyToDirIsASubDirOfFromDirectory should work with empty target key name,
easy
getOmKeyInfoIfExists should return null for the target file status
It seems that after this the renameKey call handles the case when the target
parent is a bucket, and does the rename correctly by looking at the code.
Based on this I think even if it causes some more hurdle to handle this on
the server side, it is not that big of a difference (if I am right about what
is necessary) considering confusion that the mixed solution may cause when we
have parts of the solution on the client side while other parts on the server
side. It already causes confusion that legacy and OBS is solved on the client
side, while FSO is on the server side, but we can try to keep consistency with
this approach.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]