+1 for merging. I tried out ofs, and all the Hadoop shell commands I tested worked seamlessly.

There is good feedback from the others who voted; let's address it soon.
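For reference, these are the kinds of commands I exercised (the service ID and the volume/bucket names below reflect my test setup and are only illustrative):

    # round-trip a file through the new ofs:// scheme
    ozone fs -mkdir -p ofs://omservice/volume1/bucket1/dir1
    ozone fs -put ./README.md ofs://omservice/volume1/bucket1/dir1/
    ozone fs -cat ofs://omservice/volume1/bucket1/dir1/README.md
    ozone fs -ls -R ofs://omservice/volume1/
    ozone fs -rm ofs://omservice/volume1/bucket1/dir1/README.md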
>> On Jun 4, 2020, at 9:17 AM, Siyao Meng <sm...@cloudera.com.INVALID> wrote:
>>
>> Also forwarding to hdfs-dev@. I misspelled the address.
>> --Siyao
>>
>> ---------- Forwarded message ---------
>> From: Siyao Meng <sm...@cloudera.com>
>> Date: Thu, Jun 4, 2020 at 9:14 AM
>> Subject: [Ozone] [VOTE] Merge HDDS-2665 branch (OFS) to master
>> To: <ozone-dev@hadoop.apache.org>, <hadoop-...@hadoop.apache.org>
>>
>> Hi Ozone developers,
>>
>> I'd like to propose merging feature branch HDDS-2665 into the master
>> branch for the new Ozone Filesystem scheme *ofs://*.
>>
>> This new Filesystem scheme aims to improve the Ozone user experience. OFS
>> is a client-side FileSystem implementation. It can co-exist with o3fs on
>> the same client, and the two can be used interchangeably if needed.
>>
>> OFS scheme in a nutshell:
>>
>>> ofs://<Host name[:Port] or OM Service ID>/[<volumeName>/<bucketName>/path/to/key]
>>
>> And here's a simple list of valid OFS URIs -- this should cover all
>> expected daily usage:
>>
>>> ofs://om1/
>>> ofs://om3:9862/
>>> ofs://omservice/
>>> ofs://omservice/volume1/
>>> ofs://omservice/volume1/bucket1/
>>> ofs://omservice/volume1/bucket1/dir1
>>> ofs://omservice/volume1/bucket1/dir1/key1
>>> ofs://omservice/tmp/
>>> ofs://omservice/tmp/key1
>>
>> Located at the root of an OFS filesystem are volumes and mounts. Inside a
>> volume lie all of its buckets; inside buckets are keys and directories.
>> For mounts, only the temp mount */tmp/* is supported at the moment -- more
>> on this later.
>>
>> So naturally, OFS *doesn't allow creating keys (files) directly under the
>> root or under volumes*. Users will receive an error message if they try
>> to do that:
>>
>>> $ ozone fs -mkdir /volume1
>>> 2020-06-04 00:00:00,000 [main] INFO rpc.RpcClient: Creating Volume: volume1, with hadoop as owner.
>>> $ ozone fs -touch /volume1/key1
>>> touch: Cannot create file under root or volume.
>>
>> A short note: `ozone fs`, `hadoop fs` and `hdfs dfs` can be used
>> interchangeably, as long as the jars and client config for OFS are in
>> place.
>>
>> 1. With OFS, fs.defaultFS (in core-site.xml) no longer needs to have a
>> specific volume and bucket in its path like o3fs did. Simply put the OM
>> host or service ID:
>>
>>> <property>
>>>   <name>fs.defaultFS</name>
>>>   <value>ofs://omservice</value>
>>> </property>
>>
>> Then the client should be able to access every volume and bucket in that
>> cluster without specifying the host name or service ID:
>>
>>> ozone fs -mkdir -p /volume1/bucket1
>>
>> 2. Admins can create and delete volumes and buckets easily with the
>> Hadoop FS shell. Volumes and buckets are treated similarly to
>> directories, so with *-p* they will be created if they don't exist:
>>
>>> ozone fs -mkdir -p ofs://omservice/volume1/bucket1/dir1/
>>
>> Note that the volume and bucket name character set rules still apply. For
>> example, volume and bucket names cannot contain an *underscore* (_):
>>
>>> $ ozone fs -mkdir -p /volume_1
>>> mkdir: Bucket or Volume name has an unsupported character : _
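[Inline note on item 2: the deletion side isn't shown above, so here is a minimal sketch. It assumes the usual Hadoop shell -rm -R and -rmdir semantics extend to buckets and volumes as the paragraph states; names are illustrative.]

    # remove a bucket and all of its keys, then the now-empty volume
    ozone fs -rm -R ofs://omservice/volume1/bucket1
    ozone fs -rmdir ofs://omservice/volume1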
>> 3. To be compatible with legacy Hadoop applications that use /tmp/, we
>> have a special temp mount located at the root of the FS.
>>
>> In order to use it, an admin first needs to create the volume *tmp* (the
>> volume name is hardcoded at the moment) and set its ACL to world ALL
>> access. This only needs to be done *once per cluster*:
>>
>>> $ ozone sh volume create tmp
>>> $ ozone sh volume setacl tmp -al world::a
>>
>> Then *each user* needs to run mkdir once to initialize their own temp
>> bucket. After that, they can write to it just as they would to a regular
>> directory:
>>
>>> $ ozone fs -mkdir /tmp
>>> 2020-06-04 00:00:00,050 [main] INFO rpc.RpcClient: Creating Bucket: tmp/0238775c7bd96e2eab98038afe0c4279, with Versioning false and Storage Type set to DISK and Encryption set to false
>>> $ ozone fs -touch /tmp/key1
>>
>> 4. When keys are deleted to trash, they are moved to a trash directory
>> under each *bucket*, because keys can't be moved (renamed) across
>> buckets:
>>
>>> $ ozone fs -rm /volume1/bucket1/key1
>>> 2020-06-04 00:00:00,100 [main] INFO fs.TrashPolicyDefault: Moved: 'ofs://id1/volume1/bucket1/key1' to trash at: ofs://id1/volume1/bucket1/.Trash/hadoop/Current/volume1/bucket1/key1
>>
>> This is similar to how HDFS encryption zones handle trash locations.
>>
>> 5. OFS supports recursive volume, bucket and key listing. For example,
>> `ozone fs -ls -R ofs://omservice/` will recursively list all the volumes,
>> buckets and keys the user has LIST permission on (if ACLs are enabled).
>> Note this shouldn't degrade server performance, as this logic is
>> completely client-side: it is as if the client were issuing multiple
>> requests to the server to gather all the information.
>>
>> The OFS feature set is now complete; see the sub-tasks of HDDS-2665. The
>> OFS feature branch was rebased 1 day ago to include the latest Ozone
>> master branch commits. It passes the existing checks in my latest rebase
>> PR.
>>
>> FileSystem contract tests and basic integration tests are also in place.
>> I ran basic shell commands, i.e. the examples above. They work fine.
>> I ran WordCount in the docker compose environment (w/o YARN). It
>> succeeded, and I manually confirmed the correctness of the result.
>> I also ran the TeraSort suite (only 1000 rows, in docker compose). The
>> result looks fine.
>> We have tested compatibility with MapReduce and Hive.
>>
>> I think it is time to merge HDDS-2665 into master. We can continue future
>> work (refactoring, improvements, performance analysis) from there.
>> I have submitted a PR for easier review of all the OFS code changes here:
>> https://github.com/apache/hadoop-ozone/pull/1021
>>
>> Please vote on this thread. The vote will run for 7 days, through
>> Thursday, June 11, 11:59 PM GMT.
>>
>> Thanks,
>> Siyao Meng
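A small addition on item 4: because the trash directory lives inside the same bucket, restoring a trashed key should be a plain in-bucket rename, and the standard -skipTrash flag applies. A sketch I have not exercised myself -- the paths follow the trash example above, with the user name (hadoop) and the Current checkpoint directory assumed:

    # bypass trash entirely when deleting
    ozone fs -rm -skipTrash /volume1/bucket1/key2

    # restore a trashed key by renaming it back within the same bucket
    ozone fs -mv /volume1/bucket1/.Trash/hadoop/Current/volume1/bucket1/key1 \
        /volume1/bucket1/key1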
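For anyone who wants to reproduce the WordCount check, the invocation would look roughly like this. The examples jar path and the input/output locations are placeholders for my local setup, and it assumes fs.defaultFS is set to ofs://omservice as in item 1:

    # run the stock MapReduce WordCount example against OFS paths
    hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount \
        /volume1/bucket1/input /volume1/bucket1/output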