Thanks for starting the thread and for the work on the implementation, @Siyao.

I agree: it's in the final state before the merge. It's good to see the extensive unit / acceptance tests, and I especially like the detailed description in the vote mail (one of the best I have read so far).

I tried to understand the code and have a few comments:

1. There seems to be a large amount of duplicated code, and I think it would be great to remove it. It looks like all the classes on this branch were copied from the original FileSystem. But one benefit of a feature branch is that we can make bigger changes: for example, I don't think we need a dedicated client adapter for the new file system. The adapter seems to be the interface between the file systems and the Ozone API (that was its original goal) and seems easy to share. I would add all the required methods to the existing interface (see the rough sketch after these points for what I mean).

2. The duplicated code also makes the new logic harder to understand.

3. The BasicRootedOzoneFileSystem class (which is the main class) uses the implementation (instead of the interface) and casting. If we use an interface, then all the important methods should be there. This seems to be an open question regarding the abstractions. (Or we don't need an interface at all; that's also possible...)

4. I am not sure what the subject of the VOTE is, as currently there is no proposed commit to merge (the branch has conflicts, unfortunately due to my latest commits; happy to help clean up/rebase).
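
To make points 1 and 3 a bit more concrete, here is a rough sketch of what
I have in mind -- the interface name and method signatures are only
illustrative, not the code that is currently on the branch:

import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;

// Illustrative sketch only: a single shared adapter interface that both
// o3fs and ofs code against, instead of a copied, near-identical adapter.
public interface OzoneClientAdapter {

  // existing bucket-scoped (o3fs-style) operation
  InputStream readFile(String key) throws IOException;

  // rooted (ofs) operations added to the same interface
  void createVolume(String volumeName) throws IOException;

  Iterator<String> listVolumes(String prefix) throws IOException;
}

BasicRootedOzoneFileSystem would then only hold a reference of the interface
type, so the casting mentioned in point 3 would not be needed.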

If it's a generic question about the merge, I am absolutely +1. It seems to be very good work, with good test coverage, and it shouldn't break anything in master.

From a technical point of view, I think we may need 2-3 more commits before the merge (at least one, to resolve the merge conflicts).

I would also rerun the CI checks of the proposed commit 3-4 times to see if any new tests are intermittent...

Thanks again for the work,
Marton

On 6/4/20 6:14 PM, Siyao Meng wrote:
Hi Ozone developers,

   I'd like to propose merging the feature branch HDDS-2665 into the master
branch for the new Ozone FileSystem scheme *ofs://*.

   This new FileSystem scheme intends to improve the Ozone user experience.
OFS is a client-side FileSystem implementation. It can co-exist with o3fs
and be used interchangeably on the same client if needed.


OFS scheme in a nutshell:

ofs://<Host name[:Port] or OM Service
ID>/[<volumeName>/<bucketName>/path/to/key]


And here's a simple list of valid OFS URIs -- this should cover all expected
daily usage:

ofs://om1/
ofs://om3:9862/
ofs://omservice/
ofs://omservice/volume1/
ofs://omservice/volume1/bucket1/
ofs://omservice/volume1/bucket1/dir1
ofs://omservice/volume1/bucket1/dir1/key1

ofs://omservice/tmp/

ofs://omservice/tmp/key1


   Located at the root of an OFS FileSystem are volumes and mounts. Inside
a volume lie all the buckets. Inside buckets are keys and directories.
   For mounts, only the temp mount */tmp/* is supported at the moment -- more
on this later.

   So naturally, OFS *doesn't allow creating keys (files) directly under the
root or under volumes*. Users will receive an error message if they try to
do that:

$ ozone fs -mkdir /volume1
2020-06-04 00:00:00,000 [main] INFO rpc.RpcClient: Creating Volume:
volume1, with hadoop as owner.
$ ozone fs -touch /volume1/key1
touch: Cannot create file under root or volume.


   A short note: `ozone fs`, `hadoop fs` and `hdfs dfs` can be used
interchangeably, as long as the jars and client config for OFS are in place.


1. With OFS, fs.defaultFS (in core-site.xml) no longer needs to have a
specific volume and bucket in its path like o3fs did. Simply put the OM
host or service ID:

<property>
   <name>fs.defaultFS</name>
   <value>ofs://omservice</value>
</property>


Then the client should be able to access every volume and bucket in that
cluster without specifying the host name or service ID.

ozone fs -mkdir -p /volume1/bucket1
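
For programmatic access, a plain Hadoop FileSystem client should then work
against OFS paths directly. A minimal sketch (the paths are just examples):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OfsQuickStart {
  public static void main(String[] args) throws Exception {
    // Picks up fs.defaultFS=ofs://omservice from core-site.xml on the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // The volume and bucket are simply the first two path components.
    fs.mkdirs(new Path("/volume1/bucket1/dir1"));

    // Fully qualified ofs:// URIs work as well.
    for (FileStatus status : fs.listStatus(new Path("ofs://omservice/volume1/bucket1"))) {
      System.out.println(status.getPath());
    }
  }
}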


2. Admins can create and delete volumes and buckets easily with the Hadoop FS
shell. Volumes and buckets are treated similarly to directories, so with *-p*
they will be created if they don't exist:

ozone fs -mkdir -p ofs://omservice/volume1/bucket1/dir1/


Note that the supported volume and bucket name character set rules still
apply. For example, bucket and volume names don't allow *underscores* (_):

$ ozone fs -mkdir -p /volume_1
mkdir: Bucket or Volume name has an unsupported character : _


3. To be compatible with legacy Hadoop applications that use /tmp/, we have
a special temp mount located at the root of the FS.

In order to use it, first an admin needs to create the volume *tmp* (the
volume name is hardcoded at the moment) and set its ACL to world ALL
access. This only needs to be done *once per cluster*:

$ ozone sh volume create tmp
$ ozone sh volume setacl tmp -al world::a


Then *each user* needs to mkdir once to initialize their own temp bucket.
After that, they can write to it just like they would to a regular
directory:

$ ozone fs -mkdir /tmp
2020-06-04 00:00:00,050 [main] INFO rpc.RpcClient: Creating Bucket:
tmp/0238775c7bd96e2eab98038afe0c4279, with Versioning false and Storage
Type set to DISK and Encryption set to false

$ ozone fs -touch /tmp/key1
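
The per-user bucket name in the output above looks like a hash of the user
name. Purely as an illustration of how a user name could map to a stable
bucket name (the actual OFS derivation may differ), a hedged guess:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class TempBucketNameGuess {
  // Illustrative guess only: derive a stable, bucket-name-safe string from
  // the user name. The real OFS implementation may derive it differently.
  static String tempBucketFor(String userName) throws Exception {
    byte[] digest = MessageDigest.getInstance("MD5")
        .digest(userName.getBytes(StandardCharsets.UTF_8));
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString(); // 32 hex characters, like the bucket name above
  }
}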


4. When keys are deleted to trash, they are moved to a trash directory
under each *bucket*, because keys can't be moved (renamed) between buckets.

$ ozone fs -rm /volume1/bucket1/key1
2020-06-04 00:00:00,100 [main] INFO fs.TrashPolicyDefault: Moved:
'ofs://id1/volume1/bucket1/key1' to trash at:
ofs://id1/volume1/bucket1/.Trash/hadoop/Current/volume1/bucket1/key1


This is similar to how HDFS encryption zones handle the trash location.
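
A hedged sketch of the idea (not necessarily the exact OFS code): the trash
root is computed per bucket, e.g. by overriding FileSystem#getTrashRoot, so
a deleted key is only ever renamed within its own bucket:

import org.apache.hadoop.fs.Path;

public final class PerBucketTrashSketch {
  // Illustrative only: keep the trash directory inside the same bucket as
  // the deleted key, because keys cannot be renamed across buckets.
  static Path trashRootFor(Path keyPath, String userName) {
    // keyPath is expected to look like /<volume>/<bucket>/...
    String[] parts = keyPath.toUri().getPath().split("/", 4);
    return new Path("/" + parts[1] + "/" + parts[2] + "/.Trash/" + userName);
  }
}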

5. OFS supports recursive volume, bucket and key listing.

That is, `ozone fs -ls -R ofs://omservice/` will recursively list all
volumes, buckets and keys the user has LIST permission on (if ACL is
enabled). Note that this shouldn't degrade server performance, as this
logic is completely client-side; it is as if the client were issuing
multiple requests to the server to get all the information.
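
Conceptually, the recursion is similar to the following simplified sketch
on top of the standard FileSystem API (the actual OFS implementation goes
through the Ozone client, but the request pattern is the same idea):

import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class RecursiveListSketch {
  // Simplified sketch of client-driven recursive listing: each level
  // (root -> volumes -> buckets -> dirs/keys) is one more listStatus call,
  // so the server only ever answers ordinary list requests.
  static void listRecursive(FileSystem fs, Path path) throws IOException {
    for (FileStatus status : fs.listStatus(path)) {
      System.out.println(status.getPath());
      if (status.isDirectory()) {
        listRecursive(fs, status.getPath());
      }
    }
  }
}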


   So far the OFS feature set is complete; see the sub-tasks of HDDS-2665. The
OFS feature branch was rebased 1 day ago to include Ozone master branch
commits. It passes the existing checks in my latest rebase PR.

   FileSystem contract tests and basic integration tests are also in place.
   I ran basic shell commands, i.e. the examples above. They work fine.
   I ran WordCount in the docker compose environment (w/o YARN). It
succeeded, and I manually confirmed the correctness of the result.
   I also ran the TeraSort suite (only 1000 rows, in docker compose). The
result looks fine.
   We have tested compatibility with MapReduce and Hive.

   I think it is time to merge HDDS-2665 into master. We can continue future
work (refactoring, improvements, performance analysis) from there.
   I have submitted a PR for easier review of all code changes for OFS here:
https://github.com/apache/hadoop-ozone/pull/1021

   Please vote on this thread. The vote will run for 7 days through
Thursday, June 11 11:59 PM GMT.


Thanks,
Siyao Meng


---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org
