[jira] [Updated] (HDDS-4174) Add current HDDS layout version to Datanode heartbeat and registration.

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4174:
-
Labels: pull-request-available  (was: )

> Add current HDDS layout version to Datanode heartbeat and registration.
> ---
>
> Key: HDDS-4174
> URL: https://issues.apache.org/jira/browse/HDDS-4174
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Aravindan Vijayan
>Assignee: Prashant Pogde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> Add the layout version as a field to proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] prashantpogde opened a new pull request #1421: HDDS-4174. Add current HDDS layout version to Datanode heartbeat/registration

2020-09-13 Thread GitBox


prashantpogde opened a new pull request #1421:
URL: https://github.com/apache/hadoop-ozone/pull/1421


   ## What changes were proposed in this pull request?
   
   Add current HDDS layout version to DataNode heartbeat/registration
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4174
   
   ## How was this patch tested?
   Successful Build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1418: HDDS-4209. S3A Filesystem does not work with Ozone S3.

2020-09-13 Thread GitBox


bharatviswa504 edited a comment on pull request #1418:
URL: https://github.com/apache/hadoop-ozone/pull/1418#issuecomment-691794808


   > In this specific case, intermediate directories will be created even if OZONE_OM_ENABLE_FILESYSTEM_PATHS is not enabled. I created HDDS-4238 to make it more visible:
   
   Good point. We may need a flag/param in createDirectory that says whether to create intermediate directories. The client could then send this flag in the proto, alongside the normalization flag (the current config in code), to control whether intermediate directories are created.
   
   Example:
   boolean zerobyteFile default=false
   
   It will be set to true only from S3G. When a create-directory request reaches OM, OM uses the normalization flag to decide whether to create intermediate directories; otherwise it just creates an entry without any intermediate directories.
   
   
   
   > I think it's a safer approach to fix the normalization (in case of OZONE_OM_ENABLE_FILESYSTEM_PATHS enabled), to avoid the removal of / from the end if the file size is zero.
   
   My reasoning for this approach is that once HDDS-2939 comes in, Ozone will no longer distinguish `directory` and `key` by a trailing "/". So handling a zero-length putObject in OM might not be a correct solution, as the entries would still be created in keyTable. If we want to go this route, then when the key ends with "/" and the size is zero, putObject should probably create an entry in the directory table. So, instead of making these changes in OM, I thought it could be safer to do this in S3G: when someone creates a zero-byte file with a trailing "/" through S3G, they probably want to simulate a directory in the object store.
   
   Let me know your thoughts on how to proceed.
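
A minimal Python sketch of one reading of this proposal may help the discussion. It is an illustration only: `PutRequest`, `is_directory_marker`, and `entries_to_create` are hypothetical names, not Ozone or S3G APIs, and the branch taken when the flag is unset (always create intermediate directories, as the file system calls do today) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PutRequest:
    """Hypothetical stand-in for an S3 putObject request."""
    key: str
    length: int


def is_directory_marker(req: PutRequest) -> bool:
    """S3G side: a zero-byte object whose key ends in "/" is treated
    as an attempt to simulate a directory."""
    return req.length == 0 and req.key.endswith("/")


def entries_to_create(key: str, zero_byte_file: bool,
                      fs_paths_enabled: bool) -> List[str]:
    """OM side: which directory entries a create-directory call would add.

    zero_byte_file   -- the proposed flag, set to True only by S3G
    fs_paths_enabled -- the normalization flag (current config)
    """
    parts = [p for p in key.split("/") if p]
    all_dirs = ["/".join(parts[:i]) + "/" for i in range(1, len(parts) + 1)]
    if zero_byte_file and not fs_paths_enabled:
        # Only the requested entry, no intermediate directories.
        return all_dirs[-1:]
    # Requested entry plus every intermediate directory.
    return all_dirs


req = PutRequest(key="a/b/c/", length=0)
if is_directory_marker(req):
    print(entries_to_create(req.key, zero_byte_file=True, fs_paths_enabled=False))
    print(entries_to_create(req.key, zero_byte_file=True, fs_paths_enabled=True))
```

With normalization disabled, the sketch creates only `a/b/c/`; with it enabled, it also creates `a/` and `a/b/`.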
   
   









[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1398: HDDS-4210. ResolveBucket during checkAcls fails.

2020-09-13 Thread GitBox


bharatviswa504 commented on a change in pull request #1398:
URL: https://github.com/apache/hadoop-ozone/pull/1398#discussion_r487634517



##
File path: hadoop-ozone/dist/src/main/compose/ozonesecure-om-ha/test.sh
##
@@ -30,6 +30,8 @@ execute_robot_test scm kinit.robot
 
 execute_robot_test scm freon
 
+execute_robot_test scm basic/links.robot

Review comment:
   Good question. It might make sense to run this only on a security-enabled cluster.
   But it has some tests that cover bucket link features, so I thought it would be good to run it on both secure and non-secure clusters.








[jira] [Updated] (HDDS-4237) Testing Infrastructure for network partitioning

2020-09-13 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated HDDS-4237:
---
Description: 
Network partitioning can cause a split-brain case in which two leaders 
exist. We need some sort of testing infrastructure/framework to simulate such a 
case and verify whether our SCM HA implementation can achieve strong 
consistency under a partitioned network.

Mukul Kumar Singh suggested two possible approaches:

a) Blockade tests: Blockade is a Docker-based framework in which the
network for one DN can be isolated from the others.

b) MiniOzoneChaosCluster: a unit-test-based test in which a
random datanode is killed; this has helped in finding
consistency issues.


We might need a similar solution for SCM: block the SCM leader's network and also 
increase the timeout so that the old leader does not turn into a candidate.

  was:
Network partitioning can cause a split-brain case in which two leaders 
exist. We need some sort of testing infrastructure/framework to simulate such a 
case and verify whether our SCM HA implementation can achieve strong 
consistency.

Mukul Kumar Singh suggested two possible approaches:

a) Blockade tests: Blockade is a Docker-based framework in which the
network for one DN can be isolated from the others.

b) MiniOzoneChaosCluster: a unit-test-based test in which a
random datanode is killed; this has helped in finding
consistency issues.


We might need a similar solution for SCM: block the SCM leader's network and also 
increase the timeout so that the old leader does not turn into a candidate.


> Testing Infrastructure for network partitioning
> ---
>
> Key: HDDS-4237
> URL: https://issues.apache.org/jira/browse/HDDS-4237
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Priority: Major
>
> Network partitioning can cause a split-brain case in which two leaders 
> exist. We need some sort of testing infrastructure/framework to simulate such a 
> case and verify whether our SCM HA implementation can achieve strong 
> consistency under a partitioned network.
> Mukul Kumar Singh suggested two possible approaches:
> a) Blockade tests: Blockade is a Docker-based framework in which the
> network for one DN can be isolated from the others.
> b) MiniOzoneChaosCluster: a unit-test-based test in which a
> random datanode is killed; this has helped in finding
> consistency issues.
> We might need a similar solution for SCM: block the SCM leader's network and also 
> increase the timeout so that the old leader does not turn into a candidate.






[jira] [Updated] (HDDS-4237) Testing Infrastructure for network partitioning

2020-09-13 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated HDDS-4237:
---
Description: 
Network partitioning can cause a split-brain case in which two leaders 
exist. We need some sort of testing infrastructure/framework to simulate such a 
case and verify whether our SCM HA implementation can achieve strong 
consistency.

Mukul Kumar Singh suggested two possible approaches:

a) Blockade tests: Blockade is a Docker-based framework in which the
network for one DN can be isolated from the others.

b) MiniOzoneChaosCluster: a unit-test-based test in which a
random datanode is killed; this has helped in finding
consistency issues.


We might need a similar solution for SCM: block the SCM leader's network and also 
increase the timeout so that the old leader does not turn into a candidate.

  was:
Network partitioning can cause a split-brain case in which two leaders 
exist. We need some sort of testing infrastructure/framework to simulate such a 
case and verify whether our SCM HA implementation can achieve strong 
consistency.

Mukul Kumar Singh suggested two possible approaches:

a) Blockade tests: Blockade is a Docker-based framework in which the
network for one DN can be isolated from the others.

b) MiniOzoneChaosCluster: a unit-test-based test in which a
random datanode is killed; this has helped in finding
consistency issues.


> Testing Infrastructure for network partitioning
> ---
>
> Key: HDDS-4237
> URL: https://issues.apache.org/jira/browse/HDDS-4237
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Priority: Major
>
> Network partitioning can cause a split-brain case in which two leaders 
> exist. We need some sort of testing infrastructure/framework to simulate such a 
> case and verify whether our SCM HA implementation can achieve strong 
> consistency.
> Mukul Kumar Singh suggested two possible approaches:
> a) Blockade tests: Blockade is a Docker-based framework in which the
> network for one DN can be isolated from the others.
> b) MiniOzoneChaosCluster: a unit-test-based test in which a
> random datanode is killed; this has helped in finding
> consistency issues.
> We might need a similar solution for SCM: block the SCM leader's network and also 
> increase the timeout so that the old leader does not turn into a candidate.






[jira] [Created] (HDDS-4240) Ozone support append operation

2020-09-13 Thread runzhiwang (Jira)
runzhiwang created HDDS-4240:


 Summary: Ozone support append operation
 Key: HDDS-4240
 URL: https://issues.apache.org/jira/browse/HDDS-4240
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: runzhiwang
Assignee: runzhiwang









[jira] [Updated] (HDDS-4239) Ozone support truncate operation

2020-09-13 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-4239:
-
Summary: Ozone support truncate operation  (was: Ozone support truncate)

> Ozone support truncate operation
> 
>
> Key: HDDS-4239
> URL: https://issues.apache.org/jira/browse/HDDS-4239
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: Ozone Truncate Design-v1.pdf
>
>
> Design: 
> https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit#






[jira] [Updated] (HDDS-4239) Ozone support truncate

2020-09-13 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-4239:
-
Attachment: Ozone Truncate Design-v1.pdf

> Ozone support truncate
> --
>
> Key: HDDS-4239
> URL: https://issues.apache.org/jira/browse/HDDS-4239
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
> Attachments: Ozone Truncate Design-v1.pdf
>
>
> Design: 
> https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit#






[jira] [Updated] (HDDS-4239) Ozone support truncate

2020-09-13 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-4239:
-
Description: Design: 
https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit#

> Ozone support truncate
> --
>
> Key: HDDS-4239
> URL: https://issues.apache.org/jira/browse/HDDS-4239
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>
> Design: 
> https://docs.google.com/document/d/1Ju9WeuFuf_D8gElRCJH1-as0OyC6TOtHPHErycL43XQ/edit#






[jira] [Updated] (HDDS-4239) Ozone support truncate

2020-09-13 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-4239:
-
Parent: HDDS-3714
Issue Type: Sub-task  (was: New Feature)

> Ozone support truncate
> --
>
> Key: HDDS-4239
> URL: https://issues.apache.org/jira/browse/HDDS-4239
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>







[jira] [Created] (HDDS-4239) Ozone support truncate

2020-09-13 Thread runzhiwang (Jira)
runzhiwang created HDDS-4239:


 Summary: Ozone support truncate
 Key: HDDS-4239
 URL: https://issues.apache.org/jira/browse/HDDS-4239
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
Reporter: runzhiwang
Assignee: runzhiwang









[jira] [Updated] (HDDS-3927) Rename Ozone OM,DN,SCM runtime options to conform to naming conventions

2020-09-13 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-3927:
--
Fix Version/s: 1.1.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Rename Ozone OM,DN,SCM runtime options to conform to naming conventions
> ---
>
> Key: HDDS-3927
> URL: https://issues.apache.org/jira/browse/HDDS-3927
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> Similar to {{HDFS_NAMENODE_OPTS}}, {{HDFS_DATANODE_OPTS}}, etc., we should 
> have {{OZONE_MANAGER_OPTS}}, {{OZONE_DATANODE_OPTS}} to allow adding JVM args 
> for GC tuning and debugging.
> Update 1:
> [~bharat] mentioned we already have some equivalents for OM and Ozone DNs:
> - 
> [HDFS_OM_OPTS|https://github.com/apache/hadoop-ozone/blob/bc7786a2fafb2d36923506f8de6c25fcfd26d55b/hadoop-ozone/dist/src/shell/ozone/ozone#L157]
>  for Ozone OM. This looks like a typo, should begin with HDDS
> - 
> [HDDS_DN_OPTS|https://github.com/apache/hadoop-ozone/blob/bc7786a2fafb2d36923506f8de6c25fcfd26d55b/hadoop-ozone/dist/src/shell/ozone/ozone#L108]
>  for Ozone DNs
> Update 2:
> - HDFS_OM_OPTS -> OZONE_OM_OPTS
> - HDDS_DN_OPTS -> OZONE_DATANODE_OPTS
> - HDFS_STORAGECONTAINERMANAGER_OPTS -> OZONE_SCM_OPTS
> The new names conform to {{hadoop_subcommand_opts}}. Thanks [~elek] for 
> pointing this out.
> Objective:
> Rename the environment variables to be in accordance with the convention, and 
> keep the compatibility by deprecating the old variable names.
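
The rename-with-deprecation policy described in the objective can be sketched as follows. This is a Python illustration only: the real change lives in the `ozone` shell script, `resolve_opts` is a hypothetical helper, and the precedence (the new name wins when both are set) is an assumption rather than something the issue states.

```python
import os
import warnings
from typing import Mapping, Optional

# New name -> deprecated name, as listed under "Update 2" in the issue.
RENAMED_OPTS = {
    "OZONE_OM_OPTS": "HDFS_OM_OPTS",
    "OZONE_DATANODE_OPTS": "HDDS_DN_OPTS",
    "OZONE_SCM_OPTS": "HDFS_STORAGECONTAINERMANAGER_OPTS",
}


def resolve_opts(new_name: str,
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Return the JVM options for new_name, falling back to the old name.

    Assumption: the new variable takes precedence; the deprecated one is
    honoured with a warning so existing deployments keep working.
    """
    env = os.environ if env is None else env
    if new_name in env:
        return env[new_name]
    old_name = RENAMED_OPTS.get(new_name)
    if old_name is not None and old_name in env:
        warnings.warn(
            f"{old_name} is deprecated; use {new_name} instead",
            DeprecationWarning,
        )
        return env[old_name]
    return ""


# A deployment that still sets only the old variable keeps working.
print(resolve_opts("OZONE_OM_OPTS", {"HDFS_OM_OPTS": "-Xmx4g"}))
```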






[GitHub] [hadoop-ozone] elek merged pull request #1401: HDDS-3927. Rename Ozone OM,DN,SCM runtime options to conform to naming conventions

2020-09-13 Thread GitBox


elek merged pull request #1401:
URL: https://github.com/apache/hadoop-ozone/pull/1401


   






[GitHub] [hadoop-ozone] codecov-commenter commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


codecov-commenter commented on pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-691705833


   # [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=h1) Report
   > Merging [#1411](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hadoop-ozone/commit/9a4cb9e385c9fc95331ff7a0d2dd731e0a74a21c?el=desc) will **increase** coverage by `0.07%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/graphs/tree.svg?width=650&height=150&src=pr&token=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=tree)
   
   ```diff
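   @@             Coverage Diff              @@
   ##             master    #1411      +/-   ##
   ============================================
   + Coverage     75.11%   75.19%   +0.07%     
   - Complexity    10488    10497       +9     
   ============================================
     Files           990      990              
     Lines         50885    50885              
     Branches       4960     4960              
   ============================================
   + Hits          38221    38261      +40     
   + Misses        10280    10238      -42     
   - Partials       2384     2386       +2     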
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hadoop-ozone/pull/1411?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...er/common/transport/server/GrpcXceiverService.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3RyYW5zcG9ydC9zZXJ2ZXIvR3JwY1hjZWl2ZXJTZXJ2aWNlLmphdmE=) | `70.00% <0.00%> (-10.00%)` | `3.00% <0.00%> (ø%)` | |
   | [...ache/hadoop/ozone/om/codec/S3SecretValueCodec.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9jb2RlYy9TM1NlY3JldFZhbHVlQ29kZWMuamF2YQ==) | `90.90% <0.00%> (-9.10%)` | `3.00% <0.00%> (-1.00%)` | |
   | [...hdds/scm/container/common/helpers/ExcludeList.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy9zY20vY29udGFpbmVyL2NvbW1vbi9oZWxwZXJzL0V4Y2x1ZGVMaXN0LmphdmE=) | `78.26% <0.00%> (-8.70%)` | `17.00% <0.00%> (-2.00%)` | |
   | [...doop/hdds/scm/container/ContainerStateManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL2NvbnRhaW5lci9Db250YWluZXJTdGF0ZU1hbmFnZXIuamF2YQ==) | `81.67% <0.00%> (-6.88%)` | `32.00% <0.00%> (-3.00%)` | |
   | [...apache/hadoop/hdds/server/events/EventWatcher.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvZnJhbWV3b3JrL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy9zZXJ2ZXIvZXZlbnRzL0V2ZW50V2F0Y2hlci5qYXZh) | `77.77% <0.00%> (-4.17%)` | `14.00% <0.00%> (ø%)` | |
   | [...doop/hdds/scm/pipeline/SimplePipelineProvider.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL3BpcGVsaW5lL1NpbXBsZVBpcGVsaW5lUHJvdmlkZXIuamF2YQ==) | `76.00% <0.00%> (-4.00%)` | `4.00% <0.00%> (-1.00%)` | |
   | [...ent/algorithms/SCMContainerPlacementRackAware.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL2NvbnRhaW5lci9wbGFjZW1lbnQvYWxnb3JpdGhtcy9TQ01Db250YWluZXJQbGFjZW1lbnRSYWNrQXdhcmUuamF2YQ==) | `76.69% <0.00%> (-3.01%)` | `31.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hadoop/ozone/lease/LeaseManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3Avb3pvbmUvbGVhc2UvTGVhc2VNYW5hZ2VyLmphdmE=) | `90.80% <0.00%> (-2.30%)` | `15.00% <0.00%> (-1.00%)` | |
   | [...apache/hadoop/ozone/client/io/KeyOutputStream.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLW96b25lL2NsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL2NsaWVudC9pby9LZXlPdXRwdXRTdHJlYW0uamF2YQ==) | `78.75% <0.00%> (-2.09%)` | `45.00% <0.00%> (-3.00%)` | |
   | [...hadoop/hdds/scm/container/SCMContainerManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree#diff-aGFkb29wLWhkZHMvc2VydmVyLXNjbS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL2hkZHMvc2NtL2NvbnRhaW5lci9TQ01Db250YWluZXJNYW5hZ2VyLmphdmE=) | `73.68% <0.00%> (-1.92%)` | `39.00% <0.00%> (-1.00%)` | |
   | ... and [20 more](https://codecov.io/gh/apache/hadoop-ozone/pull/1411/diff?src=pr&el=tree-more) | |
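
The totals in this report can be cross-checked with a few lines of Python. The sketch assumes that the coverage figure is hits divided by total lines and that hits, misses, and partials add up to the line count; both assumptions agree with the numbers reported (38221 and 38261 hits out of 50885 lines).

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals."""
    return round(100.0 * hits / lines, 2)


# Figures taken from the coverage diff in the report.
base = {"hits": 38221, "misses": 10280, "partials": 2384, "lines": 50885}
head = {"hits": 38261, "misses": 10238, "partials": 2386, "lines": 50885}

for report in (base, head):
    # Every line is either a hit, a miss, or a partial.
    total = report["hits"] + report["misses"] + report["partials"]
    assert total == report["lines"]

print(coverage_percent(base["hits"], base["lines"]))  # 75.11
print(coverage_percent(head["hits"], head["lines"]))  # 75.19
```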

[jira] [Updated] (HDDS-3689) Add various profiles to MiniOzoneChaosCluster to run different modes

2020-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3689:
-
Labels: pull-request-available  (was: )

> Add various profiles to MiniOzoneChaosCluster to run different modes
> 
>
> Key: HDDS-3689
> URL: https://issues.apache.org/jira/browse/HDDS-3689
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: pull-request-available
>
> Add various profiles to MiniOzoneChaosCluster to run different modes. This 
> will help in running different modes easily from MiniOzoneChaosCluster shell 
> script






[GitHub] [hadoop-ozone] mukul1987 opened a new pull request #1420: HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes.

2020-09-13 Thread GitBox


mukul1987 opened a new pull request #1420:
URL: https://github.com/apache/hadoop-ozone/pull/1420


   ## What changes were proposed in this pull request?
   
   This change adds datanode, OM, and a mix of both as chaos modes. The chaos 
tests have helped in finding multiple issues in Ozone.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3689
   
   ## How was this patch tested?
   Ran MiniOzoneChaosCluster.
   






[GitHub] [hadoop-ozone] elek opened a new pull request #1419: HDDS-3755 [DESIGN] Storage-class for Ozone

2020-09-13 Thread GitBox


elek opened a new pull request #1419:
URL: https://github.com/apache/hadoop-ozone/pull/1419


   ## What changes were proposed in this pull request?
   
   This is a design doc, which is moved from hackmd to make it easier to track 
the progress.
   






[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487497090



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,280 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (called *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From a container orchestrator as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File System (and CSI), Ozone should simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated by a flat namespace:
+
+ 1. Some key patterns can't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example, if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface) 
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) 
+ 4. Similar to listing, rename can be a costly operation, as it requires renaming many keys (renaming a first-level directory means renaming all the keys with the same prefix)
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client), which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in `OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, *DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example, the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in [HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this is enabled, intermediate directories are created even if the object store interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS can't be used together.
+
+To solve the performance problems of directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) was created, which proposes to use a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) was created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used. It means that if both S3 and HCFS are used, normalization is forced, and the S3 interface is not fully AWS S3 compatible. There is no option to use HCFS and S3 but with full AWS compatibility (and reduced HCFS compatibility). 
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibilty is prefered)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require to create intermediate directory entries (for example 
`/a/b` for the key `/b/c/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done with compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487496850



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,280 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces:
+
+ 1. From Hadoop Compatible File Systems (called *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File System (and CSI), Ozone should simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated by a flat namespace:
+
+ 1. Some key patterns can't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example, if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first level directory means renaming all the keys with the same prefix)
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client), which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in `OmClientProtocol.proto`):
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, *DeleteKey*,...)
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example, the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space).
+
+Today, a key created from the S3 interface can cause exceptions when the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in [HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this is enabled, intermediate directories are created even if the object store interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS can't be used together.
+
+To solve the performance problems of directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) was created, which proposes to use a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) was created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used. It means that if both S3 and HCFS are used, normalization is forced, and the S3 interface is not fully AWS S3 compatible. There is no option to use HCFS and S3 together with full AWS compatibility (and reduced HCFS compatibility).
+
+# Goals
+
+ * Out of the box, Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for regular, fs compatible key names)
+ * As 100% compatibility can't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` requires intermediate directory entries to be created (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second can't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487496485



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,280 @@
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` requires intermediate directory entries to be created (for example `/a/b` for the key `/a/b/c`)

Review comment:
   Good catch, thanks, fixed.






[GitHub] [hadoop-ozone] elek commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


elek commented on pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-691630554


   > > > Thank you @elek for the design document.
   > > > My understanding of the draft is as below. Let me know if I am missing something here.
   > > > https://user-images.githubusercontent.com/8586345/92635994-8856fe80-f28b-11ea-95bf-8864d48e488f.png
   > > 
   > > 
   > > Correct. But this is not a matrix anymore. You should turn on either the first or the second of the configs, but not both.
   > 
   > Not sure what is meant here: because we have 2 configs, we can now have 4 combinations. According to the proposal, 3 are valid and the 4th one is not.
   
   Agreed, but there are two ways to define these 3 options:
   
   1st approach:
   
   KEY1=true,KEY2=true --> option1
   KEY1=false,KEY2=false --> option2
   KEY1=true,KEY2=false --> option3
   KEY1=false,KEY2=true --> invalid 
   
   2nd approach:
   
   KEY1=true --> option1
   KEY2=true --> option2
   else --> option3
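
The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration: `KEY1` and `KEY2` stand for the two boolean configs discussed above, and the function names, the `ValueError`, and the rule that `KEY1` wins when both keys are set in the 2nd approach are assumptions, not part of the proposal.

```python
def resolve_matrix(key1: bool, key2: bool) -> str:
    """1st approach: each (KEY1, KEY2) pair is looked up; one pair is invalid."""
    table = {
        (True, True): "option1",
        (False, False): "option2",
        (True, False): "option3",
    }
    if (key1, key2) not in table:
        # KEY1=false,KEY2=true is the invalid combination (hypothetical error type)
        raise ValueError("invalid combination: KEY1=false, KEY2=true")
    return table[(key1, key2)]


def resolve_priority(key1: bool, key2: bool) -> str:
    """2nd approach: keys are checked in order, so no combination is invalid.

    Assumption: if both keys are true, KEY1 takes precedence.
    """
    if key1:
        return "option1"
    if key2:
        return "option2"
    return "option3"
```

With the 1st approach the configuration has to be validated at startup, while the 2nd approach always resolves to one of the three options.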
   






[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487496013



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) was created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used, which means that S3 and HCFS can't be used together without normalization.
+
+# Goals
+
+ * Out of the box, Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for regular, fs compatible key names)
+ * As 100% compatibility can't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` requires intermediate directory entries to be created (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second can't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 k

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487495900



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##

[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


xiaoyuyao commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486761072



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-13 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r487495511



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS and the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object-store for Hadoop ecosystem which can be used from multiple 
interfaces: 
+
+ 1. From Hadoop Compatible File Systems (will be called as *HCFS* in the 
remaining of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrator as mounted volume (CSI, alpha feature)
+
+As Ozone is an object store it stores key and values in a flat hierarchy which 
is enough to support S3 (2). But to support Hadoop Compatible File System (and 
CSI), Ozone should simulated file system hierarchy.
+
+There are multiple challenges when file system hierarchy is simulated by a 
flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to file system path (e.g. 
`/a/b/../c`, `/a/b//d`, or a real key with directory path like `/b/d/`)
+ 2. Directory entries (which may have own properties) require special handling 
as file system interface requires a dir entry even if it's not created 
explicitly (for example if key `/a/b/c` is created `/a/b` supposed to be a 
visible directory entry for file system interface) 
+ 3. Non-recursive listing of directories can be hard (Listing direct entries 
under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) 
+ 4. Similar to listing, rename can be a costly operation as it requires to 
rename many keys (renaming a first level directory means a rename of all the 
keys with the same prefix)
+
+See also the [Hadoop S3A 
documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client)
 which describes some of these problem when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions uses the same flat hierarchy under the hood but 
includes additional functionalities. For example the `createFile` call creates 
all the intermediate directories for a specific key (create file `/a/b/c` will 
create `/a/b` and `/a` entries in the key space)
+
+Today, if a key is created from the S3 interface can cause exceptions if the 
intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of directory listing / rename, 
[HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) was created, which 
proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) was created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used, which means that S3 and HCFS cannot be used together without 
normalization.
+
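To make the effect of normalization concrete, here is a small Python sketch of one possible rule set: drop empty and `.` components and resolve `..`. The exact rules are an assumption for illustration; `normalize_key` is a hypothetical helper and does not reproduce what HDDS-4097 implements.

```python
def normalize_key(key):
    """Hypothetical normalization of a key name to a file-system style path."""
    stack = []
    for part in key.split("/"):
        if part in ("", "."):
            # Skip empty components (from '//' or a leading '/') and '.'.
            continue
        if part == "..":
            # Step back one level if possible.
            if stack:
                stack.pop()
            continue
        stack.append(part)
    return "/".join(stack)


print(normalize_key("a/b/../c"))
print(normalize_key("a//b/./c"))
# Two distinct S3 key names collapse into the same normalized name:
print(normalize_key("a/b/../c") == normalize_key("a/c"))
```

Under such a rule, key names that S3 treats as different objects can map to a single entry, which is why normalization trades away part of the S3 compatibility.
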
+# Goals
+
+ * Out of the box, Ozone should support both S3 and HCFS interfaces without any 
settings. (This is possible only for regular, fs compatible key names.)
+ * As 100% compatibility can't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require intermediate directory entries to be created (for example 
`/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second can't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 k

[jira] [Created] (HDDS-4238) Test AWS S3 client compatibility with fs incompatible keys

2020-09-13 Thread Marton Elek (Jira)
Marton Elek created HDDS-4238:
-

 Summary: Test AWS S3 client compatibility with fs incompatible keys
 Key: HDDS-4238
 URL: https://issues.apache.org/jira/browse/HDDS-4238
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Marton Elek
Assignee: Marton Elek


There is a discussion in HDDS-4097 to define the ofs and s3 behavior (how to 
normalize / store keys).

Keys which have FS compatible names (like a/b/v/d) can be handled easily, but 
there are corner cases with fs incompatible paths (like a/bd or a/b/../c)

This patch creates a new robot test suite to test these cases (based on the 
behavior of AWS S3).

Note: based on the discussion in HDDS-4097, there are cases where the new test 
fails (with specific settings we can prefer ofs/o3fs compatibility/full-view 
instead of 100% s3 compatibility)


