[jira] [Updated] (HDDS-4174) Add current HDDS layout version to Datanode heartbeat and registration.
[ https://issues.apache.org/jira/browse/HDDS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-4174: - Labels: pull-request-available (was: ) > Add current HDDS layout version to Datanode heartbeat and registration. > --- > > Key: HDDS-4174 > URL: https://issues.apache.org/jira/browse/HDDS-4174 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Datanode >Reporter: Aravindan Vijayan >Assignee: Prashant Pogde >Priority: Major > Labels: pull-request-available > Fix For: 1.1.0 > > > Add the layout version as a field to proto. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] prashantpogde opened a new pull request #1421: HDDS-4174. Add current HDDS layout version to Datanode heartbeat/registration
prashantpogde opened a new pull request #1421: URL: https://github.com/apache/hadoop-ozone/pull/1421 ## What changes were proposed in this pull request? Add current HDDS layout version to DataNode heartbeat/registration ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-4174 ## How was this patch tested? Successful Build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
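A minimal sketch of what the proto change described in HDDS-4174 might look like (a config fragment; the message and field names are illustrative assumptions, and the actual definitions live in the HDDS proto files and may differ):

```proto
// Hypothetical sketch: carry the datanode's current layout version in
// registration and heartbeat messages, so SCM can track (or reject)
// datanodes running an unexpected layout version.
message LayoutVersionProto {
  optional uint32 metadataLayoutVersion = 1;  // field numbers illustrative
  optional uint32 softwareLayoutVersion = 2;
}
```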
[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1418: HDDS-4209. S3A Filesystem does not work with Ozone S3.
bharatviswa504 edited a comment on pull request #1418: URL: https://github.com/apache/hadoop-ozone/pull/1418#issuecomment-691794808 >In this specific case, intermediate directories will be created even if OZONE_OM_ENABLE_FILESYSTEM_PATHS is not >enabled. I created HDDS-4238 to make it more visible: Good point. Maybe we need a flag/param in createDirectory that says whether to create intermediate directories. That way, the client can send this flag in proto along with the normalization flag (currently a config in code) to control whether intermediate directories are created. Example: boolean zerobyteFile default=false. It will be set to true only from S3G, and when the create-directory request reaches OM, it uses the normalization flag to create intermediate directories; otherwise it just creates an entry without any intermediate directories. >I think it's a safer approach to fix the normalization (in case of OZONE_OM_ENABLE_FILESYSTEM_PATHS enabled), to >avoid the removal of / from the end if the file size is zero. My reasoning for taking this approach is that once HDDS-2939 comes in, Ozone `directory` and `key` entries are no longer distinguished by a trailing "/". So using putObject when the length is zero might not be a correct solution in OM, as the entries would still be created in the keyTable. If we want to go that route, then when the key ends with "/" and the size is zero, putObject should create an entry in the directory table instead. So, instead of making these changes in OM, I thought it would be safer to do this in S3G: when someone tries to create a zero-byte file with a trailing "/", they probably want to simulate a directory in the object store. Let me know your thoughts on how to proceed?
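The S3 Gateway heuristic discussed above can be sketched as follows (illustrative Python, not the actual Ozone code; the function name and return values are hypothetical):

```python
def handle_put_object(key: str, content_length: int) -> str:
    """Decide how S3G should route a putObject request (sketch).

    A zero-byte object whose key ends with "/" is the common S3
    convention for "create a folder", so it is routed to the OM
    createDirectory call instead of the normal createKey path.
    """
    if key.endswith("/") and content_length == 0:
        return "createDirectory"  # simulate a directory in the object store
    return "createKey"            # normal object upload
```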
[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1398: HDDS-4210. ResolveBucket during checkAcls fails.
bharatviswa504 commented on a change in pull request #1398: URL: https://github.com/apache/hadoop-ozone/pull/1398#discussion_r487634517 ## File path: hadoop-ozone/dist/src/main/compose/ozonesecure-om-ha/test.sh ## @@ -30,6 +30,8 @@ execute_robot_test scm kinit.robot execute_robot_test scm freon +execute_robot_test scm basic/links.robot Review comment: Good question. Maybe it is enough to run this only on a security-enabled cluster, but it has some tests for the bucket link feature, so I thought it might be good to run it in both secure and non-secure clusters.
[jira] [Updated] (HDDS-4237) Testing Infrastructure for network partitioning
[ https://issues.apache.org/jira/browse/HDDS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated HDDS-4237: --- Description: Network partitioning can cause a split-brain case where two leaders exist. We need some sort of testing infrastructure/framework to simulate such a case and verify whether our SCM HA implementation can achieve strong consistency under a partitioned network. There might be two ways, suggested by Mukul Kumar Singh: a) Blockade tests; Blockade is a Docker-based framework where the network for one DN can be isolated from the others. b) MiniOzoneChaosCluster; this is a unit-test based test where a random datanode is killed, which has helped in finding consistency issues. We might need a similar solution for SCM: block the SCM leader's network and also increase the timeout so the old leader does not turn into a candidate. was: Network partitioning can cause a split-brain case where two leaders exist. We need some sort of testing infrastructure/framework to simulate such a case and verify whether our SCM HA implementation can achieve strong consistency. There might be two ways, suggested by Mukul Kumar Singh: a) Blockade tests; Blockade is a Docker-based framework where the network for one DN can be isolated from the others. b) MiniOzoneChaosCluster; this is a unit-test based test where a random datanode is killed, which has helped in finding consistency issues. We might need a similar solution for SCM: block the SCM leader's network and also increase the timeout so the old leader does not turn into a candidate. > Testing Infrastructure for network partitioning > --- > > Key: HDDS-4237 > URL: https://issues.apache.org/jira/browse/HDDS-4237 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Rui Wang >Priority: Major > > Network partitioning can cause a split-brain case where two leaders > exist. 
We need some sort of testing infrastructure/framework to simulate such > a case and verify whether our SCM HA implementation can achieve strong > consistency under a partitioned network. > There might be two ways, suggested by Mukul Kumar Singh: > a) Blockade tests; Blockade is a Docker-based framework where the > network for one DN can be isolated from the others. > b) MiniOzoneChaosCluster; this is a unit-test based test where a > random datanode is killed, which has helped in finding consistency issues. > We might need a similar solution for SCM: block the SCM leader's network and also > increase the timeout so the old leader does not turn into a candidate.
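The split-brain condition such a test framework would need to catch can be expressed as a simple invariant check (illustrative Python for a hypothetical test harness, not part of Ozone):

```python
from collections import Counter

def has_split_brain(leader_claims):
    """Check a list of (term, node_id) leader claims for split-brain.

    Raft-style consensus guarantees at most one leader per term; if two
    different nodes claim leadership in the same term, the partition
    test has exposed a split-brain (sketch; the real check would read
    leadership state from the SCM nodes under test).
    """
    leaders_per_term = Counter(term for term, _node in set(leader_claims))
    return any(count > 1 for count in leaders_per_term.values())
```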
[jira] [Created] (HDDS-4240) Ozone support append operation
runzhiwang created HDDS-4240: Summary: Ozone support append operation Key: HDDS-4240 URL: https://issues.apache.org/jira/browse/HDDS-4240 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: runzhiwang Assignee: runzhiwang
[jira] [Updated] (HDDS-4239) Ozone support truncate operation
[ https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4239: - Summary: Ozone support truncate operation (was: Ozone support truncate) > Ozone support truncate operation > > > Key: HDDS-4239 > URL: https://issues.apache.org/jira/browse/HDDS-4239 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: Ozone Truncate Design-v1.pdf > > > Design: > https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit#
[jira] [Updated] (HDDS-4239) Ozone support truncate
[ https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4239: - Attachment: Ozone Truncate Design-v1.pdf > Ozone support truncate > -- > > Key: HDDS-4239 > URL: https://issues.apache.org/jira/browse/HDDS-4239 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > Attachments: Ozone Truncate Design-v1.pdf > > > Design: > https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit#
[jira] [Updated] (HDDS-4239) Ozone support truncate
[ https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4239: - Description: Design: https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit# > Ozone support truncate > -- > > Key: HDDS-4239 > URL: https://issues.apache.org/jira/browse/HDDS-4239 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > > Design: > https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit#
[jira] [Updated] (HDDS-4239) Ozone support truncate
[ https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-4239: - Parent: HDDS-3714 Issue Type: Sub-task (was: New Feature) > Ozone support truncate > -- > > Key: HDDS-4239 > URL: https://issues.apache.org/jira/browse/HDDS-4239 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > 
[jira] [Created] (HDDS-4239) Ozone support truncate
runzhiwang created HDDS-4239: Summary: Ozone support truncate Key: HDDS-4239 URL: https://issues.apache.org/jira/browse/HDDS-4239 Project: Hadoop Distributed Data Store Issue Type: New Feature Reporter: runzhiwang Assignee: runzhiwang
[jira] [Updated] (HDDS-3927) Rename Ozone OM,DN,SCM runtime options to conform to naming conventions
[ https://issues.apache.org/jira/browse/HDDS-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-3927: -- Fix Version/s: 1.1.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Rename Ozone OM,DN,SCM runtime options to conform to naming conventions > --- > > Key: HDDS-3927 > URL: https://issues.apache.org/jira/browse/HDDS-3927 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Minor > Labels: pull-request-available > Fix For: 1.1.0 > > > Similar to {{HDFS_NAMENODE_OPTS}}, {{HDFS_DATANODE_OPTS}}, etc., we should > have {{OZONE_MANAGER_OPTS}}, {{OZONE_DATANODE_OPTS}} to allow adding JVM args > for GC tuning and debugging. > Update 1: > [~bharat] mentioned we already have some equivalents for OM and Ozone DNs: > - > [HDFS_OM_OPTS|https://github.com/apache/hadoop-ozone/blob/bc7786a2fafb2d36923506f8de6c25fcfd26d55b/hadoop-ozone/dist/src/shell/ozone/ozone#L157] > for Ozone OM. This looks like a typo; it should begin with HDDS > - > [HDDS_DN_OPTS|https://github.com/apache/hadoop-ozone/blob/bc7786a2fafb2d36923506f8de6c25fcfd26d55b/hadoop-ozone/dist/src/shell/ozone/ozone#L108] > for Ozone DNs > Update 2: > - HDFS_OM_OPTS -> OZONE_OM_OPTS > - HDDS_DN_OPTS -> OZONE_DATANODE_OPTS > - HDFS_STORAGECONTAINERMANAGER_OPTS -> OZONE_SCM_OPTS > The new names conform to {{hadoop_subcommand_opts}}. Thanks [~elek] for > pointing this out. > Objective: > Rename the environment variables to be in accordance with the convention, and > keep compatibility by deprecating the old variable names.
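The deprecation pattern described in the Jira (prefer the new variable name, fall back to the old one with a warning) might look roughly like this (illustrative Python sketch, not the actual ozone launcher script; only the variable names come from the Jira):

```python
import os
import sys

def resolve_opts(new_name, deprecated_name, env=os.environ):
    """Prefer the new env variable; fall back to the deprecated one.

    Emits a deprecation warning on stderr when only the old name is set,
    mirroring how shell launchers typically keep compatibility.
    """
    if env.get(new_name):
        return env[new_name]
    if env.get(deprecated_name):
        print(f"WARNING: {deprecated_name} is deprecated; use {new_name}",
              file=sys.stderr)
        return env[deprecated_name]
    return ""
```

For example, `resolve_opts("OZONE_OM_OPTS", "HDFS_OM_OPTS")` would honor an existing `HDFS_OM_OPTS` setting while nudging users toward the new name.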
[GitHub] [hadoop-ozone] elek merged pull request #1401: HDDS-3927. Rename Ozone OM,DN,SCM runtime options to conform to naming conventions
elek merged pull request #1401: URL: https://github.com/apache/hadoop-ozone/pull/1401
[GitHub] [hadoop-ozone] codecov-commenter commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
codecov-commenter commented on pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-691705833 # [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=h1) Report > Merging [#1411](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hadoop-ozone/commit/9a4cb9e385c9fc95331ff7a0d2dd731e0a74a21c?el=desc) will **increase** coverage by `0.07%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/graphs/tree.svg?width=650&height=150&src=pr&token=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##           master    #1411      +/-   ##
============================================
+ Coverage     75.11%   75.19%   +0.07%
- Complexity    10488    10497       +9
============================================
  Files           990      990
  Lines         50885    50885
  Branches       4960     4960
============================================
+ Hits          38221    38261      +40
+ Misses        10280    10238      -42
- Partials       2384     2386       +2
```

| [Impacted Files](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=tree) | Coverage Δ | Complexity Δ | | |---|---|---|---| | [...er/common/transport/server/GrpcXceiverService.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3RyYW5zcG9ydC9zZXJ2ZXIvR3JwY1hjZWl2ZXJTZXJ2aWNlLmphdmE=) | `70.00% <0.00%> (-10.00%)` | `3.00% <0.00%> (ø%)` | | | [...ache/hadoop/ozone/om/codec/S3SecretValueCodec.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9jb2RlYy9TM1NlY3JldFZhbHVlQ29kZWMuamF2YQ==) | `90.90% <0.00%> (-9.10%)` | `3.00% <0.00%> (-1.00%)` | | | 
[...hdds/scm/container/common/helpers/ExcludeList.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy9zY20vY29udGFpbmVyL2NvbW1vbi9oZWxwZXJzL0V4Y2x1ZGVMaXN0LmphdmE=) | `78.26% <0.00%> (-8.70%)` | `17.00% <0.00%> (-2.00%)` | | | [...doop/hdds/scm/container/ContainerStateManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL2NvbnRhaW5lci9Db250YWluZXJTdGF0ZU1hbmFnZXIuamF2YQ==) | `81.67% <0.00%> (-6.88%)` | `32.00% <0.00%> (-3.00%)` | | | [...apache/hadoop/hdds/server/events/EventWatcher.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvZnJhbWV3b3JrL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy9zZXJ2ZXIvZXZlbnRzL0V2ZW50V2F0Y2hlci5qYXZh) | `77.77% <0.00%> (-4.17%)` | `14.00% <0.00%> (ø%)` | | | [...doop/hdds/scm/pipeline/SimplePipelineProvider.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL3BpcGVsaW5lL1NpbXBsZVBpcGVsaW5lUHJvdmlkZXIuamF2YQ==) | `76.00% <0.00%> (-4.00%)` | `4.00% <0.00%> (-1.00%)` | | | [...ent/algorithms/SCMContainerPlacementRackAware.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL2NvbnRhaW5lci9wbGFjZW1lbnQvYWxnb3JpdGhtcy9TQ01Db250YWluZXJQbGFjZW1lbnRSYWNrQXdhcmUuamF2YQ==) | `76.69% <0.00%> (-3.01%)` | `31.00% <0.00%> (-2.00%)` | | | [...va/org/apache/hadoop/ozone/lease/LeaseManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3Avb3pvbmUvbGVhc2UvTGVhc2VNYW5hZ2VyLmphdmE=) | `90.80% <0.00%> (-2.30%)` | `15.00% <0.00%> (-1.00%)` | | | 
[...apache/hadoop/ozone/client/io/KeyOutputStream.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL2NsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL2NsaWVudC9pby9LZXlPdXRwdXRTdHJlYW0uamF2YQ==) | `78.75% <0.00%> (-2.09%)` | `45.00% <0.00%> (-3.00%)` | | | [...hadoop/hdds/scm/container/SCMContainerManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL2NvbnRhaW5lci9TQ01Db250YWluZXJNYW5hZ2VyLmphdmE=) | `73.68% <0.00%> (-1.92%)` | `39.00% <0.00%> (-1.00%)` | | | ... and [20 more](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree-more) | |
[jira] [Updated] (HDDS-3689) Add various profiles to MiniOzoneChaosCluster to run different modes
[ https://issues.apache.org/jira/browse/HDDS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3689: - Labels: pull-request-available (was: ) > Add various profiles to MiniOzoneChaosCluster to run different modes > > > Key: HDDS-3689 > URL: https://issues.apache.org/jira/browse/HDDS-3689 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > > Add various profiles to MiniOzoneChaosCluster to run different modes. This > will help in running different modes easily from MiniOzoneChaosCluster shell > script
[GitHub] [hadoop-ozone] mukul1987 opened a new pull request #1420: HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes.
mukul1987 opened a new pull request #1420: URL: https://github.com/apache/hadoop-ozone/pull/1420 ## What changes were proposed in this pull request? This change adds datanode, OM, and a mix of both as chaos modes. The chaos tests have helped in finding multiple issues in Ozone. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3689 ## How was this patch tested? Ran MiniOzoneChaosCluster.
[GitHub] [hadoop-ozone] elek opened a new pull request #1419: HDDS-3755 [DESIGN] Storage-class for Ozone
elek opened a new pull request #1419: URL: https://github.com/apache/hadoop-ozone/pull/1419 ## What changes were proposed in this pull request? This is a design doc, which is moved from hackmd to make it easier to track the progress.
[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
elek commented on a change in pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487497090 ## File path: hadoop-hdds/docs/content/design/s3_hcfs.md ## @@ -0,0 +1,280 @@ +--- +title: S3/Ozone Filesystem inter-op +summary: How to support both S3 and HCFS at the same time +date: 2020-09-09 +jira: HDDS-4097 +status: draft +author: Marton Elek, +--- + + +# Ozone S3 vs file-system semantics + +Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: + + 1. From Hadoop Compatible File Systems (called *HCFS* in the remainder of this document) (RPC) + 2. From S3 compatible applications (REST) + 3. From a container orchestrator as a mounted volume (CSI, alpha feature) + +As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File System (and CSI), Ozone should simulate a file system hierarchy. + +There are multiple challenges when a file system hierarchy is simulated by a flat namespace: + + 1. Some key patterns can't easily be transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`) + 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example, if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface) + 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) + 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix) + +See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used (*Warnings* section). + +# Current status + +As of today, *Ozone Manager* has two different interfaces (both are defined in `OmClientProtocol.proto`): + + 1. object store related functions (like *CreateKey*, *LookupKey*, *DeleteKey*, ...) + 2. file system related functions (like *CreateFile*, *LookupFile*, ...) + +File system related functions use the same flat hierarchy under the hood but include additional functionality. For example, the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space). + +Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS: + + +```shell +$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d + +$ ozone fs -ls o3fs://bucket1.s3v/a/ +ls: `o3fs://bucket1.s3v/a/': No such file or directory +``` + +This problem is reported in [HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this is enabled, intermediate directories are created even if the object store interface is used. + +This configuration is turned off by default, which means that S3 and HCFS can't be used together. + +To solve the performance problems of directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) was created, which proposes to use a new prefix table to store the "directory" entries (=prefixes). + +[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) was created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used. It means that if both S3 and HCFS are used, normalization is forced, and the S3 interface is not fully AWS S3 compatible. There is no option to use HCFS and S3 with full AWS compatibility (and reduced HCFS compatibility). + +# Goals + + * Out of the box, Ozone should support both S3 and HCFS interfaces without any settings. (This is possible only for regular, fs-compatible key names.) + * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names + * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred) + +# Possible cases to support + +There are two main aspects of supporting both `ofs/o3fs` and `s3` together: + + 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`) + 2. Special file-system incompatible key names require special attention + +The second couldn't be done without compromise. + + 1. We either support all key names (including non-fs-compatible key names), which means `ofs
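The key-name normalization discussed in the design doc (HDDS-4097) can be sketched roughly as follows (illustrative Python, not the actual OM code; the real implementation is in Java inside Ozone Manager):

```python
import posixpath

def normalize_key(key: str) -> str:
    """Normalize an object-store key to a file-system style path (sketch).

    Mirrors the idea behind ozone.om.enable.filesystem.paths: collapse
    duplicate slashes and ".." components and strip the leading slash,
    so "/a/b/../c" and "a/b/../c" map to the same key "a/c". Note this
    sketch also drops a trailing "/", which is exactly the behavior the
    zero-byte-directory discussion above is concerned with.
    """
    normalized = posixpath.normpath("/" + key)  # leading "/" anchors ".."
    return normalized.lstrip("/")
```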
[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
elek commented on a change in pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487496850

## File path: hadoop-hdds/docs/content/design/s3_hcfs.md

---
title: S3/Ozone Filesystem inter-op
summary: How to support both S3 and HCFS at the same time
date: 2020-09-09
jira: HDDS-4097
status: draft
author: Marton Elek
---

# Ozone S3 vs file-system semantics

Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces:

 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
 2. From S3 compatible applications (REST)
 3. From a container orchestrator as a mounted volume (CSI, alpha feature)

As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File System (and CSI), Ozone should simulate a file system hierarchy.

There are multiple challenges when a file system hierarchy is simulated by a flat namespace:

 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix)

See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used (*Warnings* section).

# Current status

As of today *Ozone Manager* has two different interfaces (both are defined in `OmClientProtocol.proto`):

 1. object store related functions (like *CreateKey*, *LookupKey*, *DeleteKey*, ...)
 2. file system related functions (like *CreateFile*, *LookupFile*, ...)

File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space).

Today, a key created from the S3 interface can cause exceptions when the intermediate directories are checked from HCFS:

```shell
$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d

$ ozone fs -ls o3fs://bucket1.s3v/a/
ls: `o3fs://bucket1.s3v/a/': No such file or directory
```

This problem is reported in [HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this is enabled, intermediate directories are created even if the object store interface is used.

This configuration is turned off by default, which means that S3 and HCFS couldn't be used together.

To solve the performance problems of directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) was created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
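The intermediate-directory behaviour of `createFile` described above can be sketched as follows (illustrative Python only, not the Ozone Manager implementation; the function name is made up for this example):

```python
import posixpath

def intermediate_dirs(key):
    """Yield the implicit directory entries a file-system view needs
    for a flat object-store key, e.g. /a/b/c -> /a, /a/b."""
    parent = posixpath.dirname(key.strip("/"))
    parts = [p for p in parent.split("/") if p]
    for i in range(1, len(parts) + 1):
        yield "/" + "/".join(parts[:i])

# list(intermediate_dirs("/a/b/c")) -> ["/a", "/a/b"]
```

A key directly under the bucket root (e.g. `/a`) yields no intermediate entries, which is why a flat object store can skip this step entirely while an HCFS view cannot.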
[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) was created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used. It means that if both S3 and HCFS are used, normalization is forced, and the S3 interface is not fully AWS S3 compatible. There is no option to use HCFS and S3 with full AWS compatibility (and reduced HCFS compatibility).

# Goals

 * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs-compatible key names.)
 * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names.
 * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when s3 compatibility is preferred).

# Possible cases to support

There are two main aspects of supporting both `ofs/o3fs` and `s3` together:

 1. `ofs/o3fs` require the creation of intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
 2. Special file-system incompatible key names require special attention

The second couldn't be done without compromise:

 1. We either support all key names (including non fs-compatible key names), which means `ofs/o3fs` can provide only a partial view
 2. Or we can normalize the key names to be fs compatible (which makes it possible to create inconsistent S3 keys)
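The normalization discussed for HDDS-4097 can be approximated with POSIX path rules; this is a sketch only (the real implementation lives in Ozone Manager and handles more cases):

```python
import posixpath

def normalize_key(key):
    # Approximate fs-style normalization: resolve ".." segments and
    # collapse duplicate slashes, then drop the leading "/".
    return posixpath.normpath("/" + key).lstrip("/")

# normalize_key("/a/b/../c") -> "a/c"
# normalize_key("a/b//d")    -> "a/b/d"
```

Note that `posixpath.normpath` also strips a trailing `/`, so a directory-style key like `b/d/` loses its directory marker; that is exactly the zero-byte trailing-slash edge case debated in the HDDS-4209 thread above.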
[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
elek commented on a change in pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487496485

## File path: hadoop-hdds/docs/content/design/s3_hcfs.md

> 1. `ofs/o3fs` require to create intermediate directory entries (for example `/a/b` for the key `/b/c/c`)

Review comment: Good catch, thanks, fixed.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
[GitHub] [hadoop-ozone] elek commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
elek commented on pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-691630554

> > > Thank You @elek for the design document. My understanding from this is the draft is as below. Let me know if I am missing something here. https://user-images.githubusercontent.com/8586345/92635994-8856fe80-f28b-11ea-95bf-8864d48e488f.png
> >
> > Correct. But this is not a matrix anymore. You should turn on either the first or the second of the configs, but not both.
>
> Not sure what is meant here, because we have 2 configs, now we can have 4 combinations; according to the proposal 3 are valid, the 4th one is not.

Agree, but there are two ways to define these 3 options:

1st approach:

 * KEY1=true, KEY2=true --> option1
 * KEY1=false, KEY2=false --> option2
 * KEY1=true, KEY2=false --> option3
 * KEY1=false, KEY2=true --> invalid

2nd approach:

 * KEY1=true --> option1
 * KEY2=true --> option2
 * else --> option3

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
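The two enumeration styles above differ in how the invalid combination surfaces; a minimal sketch (`KEY1`/`KEY2` and the option names are placeholders taken from the comment, not real Ozone configuration keys):

```python
def resolve_first_approach(key1: bool, key2: bool) -> str:
    # 1st approach: every combination is spelled out; one is rejected.
    if key1 and key2:
        return "option1"
    if not key1 and not key2:
        return "option2"
    if key1:
        return "option3"
    raise ValueError("KEY1=false, KEY2=true is invalid")

def resolve_second_approach(key1: bool, key2: bool) -> str:
    # 2nd approach: precedence order; no combination is invalid.
    if key1:
        return "option1"
    if key2:
        return "option2"
    return "option3"
```

The first approach fails fast on misconfiguration, while the second silently accepts every combination; which is preferable depends on whether the invalid state is dangerous or merely redundant.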
[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
elek commented on a change in pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487496013

## File path: hadoop-hdds/docs/content/design/s3_hcfs.md
[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
elek commented on a change in pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487495900

## File path: hadoop-hdds/docs/content/design/s3_hcfs.md
[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
xiaoyuyao commented on a change in pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486761072

## File path: hadoop-hdds/docs/content/design/s3_hcfs.md
[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op
elek commented on a change in pull request #1411: URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487495511

## File path: hadoop-hdds/docs/content/design/s3_hcfs.md
[jira] [Created] (HDDS-4238) Test AWS S3 client compatibility with fs incompatible keys
Marton Elek created HDDS-4238:
Summary: Test AWS S3 client compatibility with fs incompatible keys
Key: HDDS-4238
URL: https://issues.apache.org/jira/browse/HDDS-4238
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Reporter: Marton Elek
Assignee: Marton Elek

There is a discussion in HDDS-4097 to define the ofs and s3 behavior (how to normalize / store keys). Keys which have FS compatible names (like a/b/v/d) can be handled easily, but there are corner cases with fs incompatible paths (like a/b//d or a/b/../c). This patch creates a new robot test suite to test these cases (based on the behavior of AWS S3). Note: based on the discussion in HDDS-4097 there are cases where the new test fails (with specific settings we can prefer ofs/o3fs compatibility/full view instead of 100% S3 compatibility).

-- This message was sent by Atlassian Jira (v8.3.4#803005)
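A test suite like the one proposed needs some way to classify which key names are fs-compatible; one rough heuristic, sketched here for illustration (not the actual robot test code), is to check whether a key survives POSIX normalization unchanged:

```python
import posixpath

def is_fs_compatible(key):
    """Rough heuristic: a key is fs-compatible if it carries no
    directory marker and normalization leaves it unchanged."""
    if not key or key.startswith("/") or key.endswith("/"):
        return False
    return posixpath.normpath("/" + key).lstrip("/") == key

# a/b/c -> True; a/b/../c, a/b//d, b/d/ -> False
```

Keys rejected by this check are exactly the corner cases the new suite would exercise against real AWS S3 behavior.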