+1 for merging. I tried out OFS, and all the Hadoop shell commands I ran
worked seamlessly.

There's good feedback from others who voted; let's address it soon.


> 
>> On Jun 4, 2020, at 9:17 AM, Siyao Meng <sm...@cloudera.com.INVALID> wrote:
>> 
>> Also forwarding to hdfs-dev@. I misspelled the address.
>> --Siyao
>> 
>> ---------- Forwarded message ---------
>> From: Siyao Meng <sm...@cloudera.com>
>> Date: Thu, Jun 4, 2020 at 9:14 AM
>> Subject: [Ozone] [VOTE] Merge HDDS-2665 branch (OFS) to master
>> To: <ozone-dev@hadoop.apache.org>, <hadoop-...@hadoop.apache.org>
>> 
>> 
>> Hi Ozone developers,
>> 
>> I'd like to propose merging feature branch HDDS-2665 into master branch
>> for new Ozone Filesystem scheme *ofs://*.
>> 
>> This new FileSystem scheme aims to improve the Ozone user experience. OFS
>> is a client-side FileSystem implementation. It can co-exist with o3fs, and
>> the two can be used interchangeably on the same client if needed.
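>> 
>> For illustration (the service ID, volume, bucket and key names below are
>> made up), the same key can be addressed through either scheme on the same
>> client:
>> 
>> o3fs://bucket1.volume1.omservice/key1
>> ofs://omservice/volume1/bucket1/key1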
>> 
>> 
>> OFS scheme in a nutshell:
>> 
>> ofs://<Host name[:Port] or OM Service ID>/[<volumeName>/<bucketName>/path/to/key]
>> 
>> 
>> And here's a simple list of valid OFS URIs -- this should cover all
>> expected daily usage:
>> 
>> ofs://om1/
>> ofs://om3:9862/
>> ofs://omservice/
>> ofs://omservice/volume1/
>> ofs://omservice/volume1/bucket1/
>> ofs://omservice/volume1/bucket1/dir1
>> ofs://omservice/volume1/bucket1/dir1/key1
>> ofs://omservice/tmp/
>> ofs://omservice/tmp/key1
>> 
>> 
>> Located at the root of an OFS FileSystem are volumes and mounts. Inside
>> a volume lie all its buckets. Inside buckets are keys and directories.
>> For mounts, only the temp mount */tmp/* is supported at the moment -- more
>> on this later.
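>> 
>> For example, listing the root of the FS shows the volumes and the temp
>> mount (the output below is illustrative):
>> 
>> $ ozone fs -ls ofs://omservice/
>> drwxrwxrwx   - hadoop hadoop          0 2020-06-04 00:00 ofs://omservice/tmp
>> drwxrwxrwx   - hadoop hadoop          0 2020-06-04 00:00 ofs://omservice/volume1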
>> 
>> So naturally, OFS *doesn't allow creating keys (files) directly under the
>> root or under volumes*. Users will receive an error message if they try to
>> do that:
>> 
>> $ ozone fs -mkdir /volume1
>>> 2020-06-04 00:00:00,000 [main] INFO rpc.RpcClient: Creating Volume:
>>> volume1, with hadoop as owner.
>>> $ ozone fs -touch /volume1/key1
>>> touch: Cannot create file under root or volume.
>> 
>> 
>> A short note: `ozone fs`, `hadoop fs` and `hdfs dfs` can be used
>> interchangeably, as long as the jars and client config for OFS are in place.
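>> 
>> For instance, a minimal sketch for a vanilla Hadoop client (the jar name
>> and path below are placeholders -- adjust them to your install):
>> 
>> # Put the Ozone filesystem client jar on the Hadoop classpath:
>> export HADOOP_CLASSPATH=/path/to/hadoop-ozone-filesystem-lib-current.jar
>> # Then the stock Hadoop shell can talk to Ozone through OFS:
>> hadoop fs -mkdir -p ofs://omservice/volume1/bucket1/dir1
>> hdfs dfs -ls ofs://omservice/volume1/bucket1/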
>> 
>> 
>> 1. With OFS, fs.defaultFS (in core-site.xml) no longer needs to have a
>> specific volume and bucket in its path like o3fs did. Simply put the OM
>> host or service ID:
>> 
>> <property>
>>> <name>fs.defaultFS</name>
>>> <value>ofs://omservice</value>
>>> </property>
>> 
>> 
>> Then the client should be able to access every volume and bucket in that
>> cluster without specifying the host name or service ID.
>> 
>> ozone fs -mkdir -p /volume1/bucket1
>> 
>> 
>> 2. Admins can create and delete volumes and buckets easily with the Hadoop
>> FS shell. Volumes and buckets are treated similarly to directories, so with
>> *-p* they will be created if they don't exist:
>> 
>> ozone fs -mkdir -p ofs://omservice/volume1/bucket1/dir1/
>> 
>> 
>> Note that the supported volume and bucket name character set rules still
>> apply. For example, volume and bucket names can't contain an *underscore* (_):
>> 
>> $ ozone fs -mkdir -p /volume_1
>>> mkdir: Bucket or Volume name has an unsupported character : _
>> 
>> 
>> 3. To be compatible with legacy Hadoop applications that use /tmp/, we have
>> a special temp mount located at the root of the FS.
>> 
>> In order to use it, first an admin needs to create the volume *tmp* (the
>> volume name is hardcoded at the moment) and set its ACL to world ALL
>> access. This only needs to be done *once per cluster*:
>> 
>> $ ozone sh volume create tmp
>>> $ ozone sh volume setacl tmp -al world::a
>> 
>> 
>> Then *each user* needs to run mkdir once to initialize their own temp
>> bucket. After that, they can write to it just as they would to a regular
>> directory:
>> 
>> $ ozone fs -mkdir /tmp
>>> 2020-06-04 00:00:00,050 [main] INFO rpc.RpcClient: Creating Bucket:
>>> tmp/0238775c7bd96e2eab98038afe0c4279, with Versioning false and Storage
>>> Type set to DISK and Encryption set to false
>> 
>> $ ozone fs -touch /tmp/key1
>> 
>> 
>> 4. When keys are deleted to trash, they are moved to a trash directory
>> under each *bucket*, because keys can't be moved (renamed) between buckets.
>> 
>> $ ozone fs -rm /volume1/bucket1/key1
>>> 2020-06-04 00:00:00,100 [main] INFO fs.TrashPolicyDefault: Moved:
>>> 'ofs://id1/volume1/bucket1/key1' to trash at:
>>> ofs://id1/volume1/bucket1/.Trash/hadoop/Current/volume1/bucket1/key1
>> 
>> 
>> This is similar to how the HDFS encryption zone handles trash location.
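>> 
>> For example, the per-bucket trash is a regular directory and can be
>> inspected with normal FS shell commands (the path reuses the example above):
>> 
>> $ ozone fs -ls ofs://id1/volume1/bucket1/.Trash/hadoop/Current/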
>> 
>> 5. OFS supports recursive volume, bucket and key listing.
>> 
>> For example, `ozone fs -ls -R ofs://omservice/` will recursively list all
>> volumes, buckets and keys the user has LIST permission on (if ACL is
>> enabled). Note this shouldn't degrade server performance, as the recursion
>> logic lives entirely on the client side -- the client simply issues
>> multiple requests to the server to gather all the information.
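>> 
>> For example (the output is illustrative):
>> 
>> $ ozone fs -ls -R ofs://omservice/
>> drwxrwxrwx   - hadoop hadoop          0 2020-06-04 00:00 ofs://omservice/volume1
>> drwxrwxrwx   - hadoop hadoop          0 2020-06-04 00:00 ofs://omservice/volume1/bucket1
>> -rw-rw-rw-   3 hadoop hadoop         42 2020-06-04 00:00 ofs://omservice/volume1/bucket1/key1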
>> 
>> 
>> So far the OFS feature set is complete; see the sub-tasks of HDDS-2665.
>> The OFS feature branch was rebased one day ago to include the latest Ozone
>> master branch commits, and it passes the existing checks in my latest
>> rebase PR.
>> 
>> FileSystem contract tests and basic integration tests are also in place.
>> - I ran basic shell commands, i.e. the examples above. They work fine.
>> - I ran WordCount in the docker-compose environment (without YARN). It
>>   succeeded, and I manually confirmed the correctness of the result.
>> - I also ran the TeraSort suite (only 1000 rows, in docker-compose). The
>>   result looks fine.
>> - We have tested compatibility with MapReduce and Hive.
>> 
>> I think it is time to merge HDDS-2665 into master. We can continue future
>> work (refactoring, improvements, performance analysis) from there.
>> I have submitted a PR for easier review of all code changes for OFS here:
>> https://github.com/apache/hadoop-ozone/pull/1021
>> 
>> Please vote on this thread. The vote will run for 7 days, through
>> Thursday, June 11, 11:59 PM GMT.
>> 
>> 
>> Thanks,
>> Siyao Meng
> 

