[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486442745



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone should simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on top of a flat namespace:
+
+ 1. Some key patterns can't easily be transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a directory entry even if it was not created explicitly (for example, if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface) 
+ 3. Non-recursive listing of directories can be hard (listing the direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys; see the sketch after this list) 
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means renaming all the keys with the same prefix)
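
To make challenge 3 concrete, here is a minimal sketch (illustrative only, not the Ozone Manager implementation; the key set and helper name are invented) of what a non-recursive listing has to do over a flat keyspace:

```java
import java.util.List;
import java.util.TreeSet;

public class FlatListing {

  // Lists the direct children of a prefix in a flat keyspace: every key
  // sharing the prefix has to be visited just to discover the few
  // first-level entries.
  static TreeSet<String> listDirectChildren(List<String> keys, String prefix) {
    TreeSet<String> children = new TreeSet<>();
    for (String key : keys) {
      if (!key.startsWith(prefix)) {
        continue;
      }
      String rest = key.substring(prefix.length());
      int slash = rest.indexOf('/');
      // "a/b/c" under prefix "a/" collapses into the directory entry "b/",
      // while "a/x" contributes the file entry "x".
      children.add(slash < 0 ? rest : rest.substring(0, slash + 1));
    }
    return children;
  }

  public static void main(String[] args) {
    List<String> keys = List.of("a/b/c", "a/b/d", "a/x", "b/y");
    System.out.println(listDirectChildren(keys, "a/")); // [b/, x]
  }
}
```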
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used (*Warnings* section).
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in `OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, *DeleteKey*, ...)
+ 2. file system related functions (like *CreateFile*, *LookupFile*, ...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example, the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space).
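
As an illustration of that behavior (a minimal sketch, not the actual OM code path), the intermediate entries can be derived from the key name itself:

```java
import java.util.ArrayList;
import java.util.List;

public class IntermediateDirs {

  // Returns the directory entries implied by a key: for "a/b/c" the
  // entries "a/" and "a/b/" have to exist in the key space.
  static List<String> impliedDirectories(String key) {
    List<String> dirs = new ArrayList<>();
    for (int i = key.indexOf('/'); i >= 0; i = key.indexOf('/', i + 1)) {
      dirs.add(key.substring(0, i + 1));
    }
    return dirs;
  }

  public static void main(String[] args) {
    System.out.println(impliedDirectories("a/b/c")); // [a/, a/b/]
  }
}
```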
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in [HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this is enabled, intermediate directories are created even if the object store interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS can't be used together.
+
+To solve the performance problems of directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) was created, which proposes using a new prefix table to store the "directory" entries (= prefixes).
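
To sketch why a separate prefix table helps (assumed semantics for illustration, not the HDDS-2939 design itself): with directory entries in their own sorted table, a non-recursive listing becomes a bounded range scan instead of a filter over every key under the prefix:

```java
import java.util.TreeMap;

public class PrefixTableSketch {
  public static void main(String[] args) {
    // Hypothetical prefix table: one row per directory entry.
    TreeMap<String, String> prefixTable = new TreeMap<>();
    prefixTable.put("a/", "dir");
    prefixTable.put("a/b/", "dir");
    prefixTable.put("a/b/c/", "dir");

    // Direct sub-directories of "a/": scan only the range ["a/", "a0")
    // ('0' is the character right after '/'), dropping deeper levels.
    prefixTable.subMap("a/", "a0").keySet().stream()
        .filter(p -> !p.equals("a/"))
        .filter(p -> p.indexOf('/', "a/".length()) == p.length() - 1)
        .forEach(System.out::println); // prints a/b/
  }
}
```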
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) was created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used, which means that S3 and HCFS can't be used together without normalization.
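
For illustration, a minimal sketch of such file-system-style normalization (assumed behavior; the exact HDDS-4097 rules may differ), applied to the problematic patterns listed earlier:

```java
import java.nio.file.Paths;

public class KeyNormalization {

  // Normalizes a key the way a file system would interpret it:
  // collapses "//", resolves "." and "..", drops the leading "/".
  static String normalize(String key) {
    String p = Paths.get("/", key).normalize().toString();
    return p.startsWith("/") ? p.substring(1) : p;
  }

  public static void main(String[] args) {
    System.out.println(normalize("a/b/../c")); // a/c
    System.out.println(normalize("a/b//d"));   // a/b/d
  }
}
```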
+
+# Goals
+
+ * Out of the box, Ozone should support both S3 and HCFS interfaces without any settings. (This is possible only for the regular, fs-compatible key names.)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` requires creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special, file-system-incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non-fs-compatible key names), which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it possible to create inconsistent S3 

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486438477



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] elek commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-690376470


   > Thank You @elek for the design document.
   > 
   > My understanding from this draft is as below. Let me know if I am missing something here.
   > https://user-images.githubusercontent.com/8586345/92635994-8856fe80-f28b-11ea-95bf-8864d48e488f.png
   
   Correct. But this is not a matrix anymore. You should turn on either the first or the second of the configs, but not both. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486443111



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] captainzmc commented on pull request #1412: HDDS-3751. Ozone sh client support bucket quota option.

2020-09-10 Thread GitBox


captainzmc commented on pull request #1412:
URL: https://github.com/apache/hadoop-ozone/pull/1412#issuecomment-690427621


   Hi @ChenSammi @cxorm, this PR adds the bucket quota shell, based on the #1233 volume quota shell. Could you help review it?






[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


bharatviswa504 commented on pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#issuecomment-690583817


   Hi @lokeshj1703 
   https://issues.apache.org/jira/browse/HDDS-4121 This Jira changes the behavior so that the openKeyCleanup service does not call SCM directly: it moves the keys to the deleteTable, and they will be picked up by the KeyDeletingService, which will be the only one to send delete-keys requests to SCM from OM.






[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486439769



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486441102



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486535317



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1416: HDDS-4233. Interrupted exeception printed out from DatanodeStateMachine

2020-09-10 Thread GitBox


amaliujia commented on a change in pull request #1416:
URL: https://github.com/apache/hadoop-ozone/pull/1416#discussion_r486500932



##
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/DatanodeStateMachine.java
##
@@ -242,7 +241,7 @@ private void start() throws IOException {
       try {
         Thread.sleep(nextHB.get() - now);
       } catch (InterruptedException e) {
-        LOG.warn("Interrupt the execution.", e);
+        // triggerHeartbeat is called during the sleep

Review comment:
   Hmm, I might misunderstand this: will the InterruptedException then be silently hidden? If so, is this expected behavior?
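
   For context, the usual Java idiom (a general practice, not necessarily what this PR should adopt) is to restore the interrupt flag rather than swallow the exception, so callers can still observe the interruption:

```java
public class InterruptIdiom {
  public static void main(String[] args) {
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(10_000); // stands in for the heartbeat sleep
      } catch (InterruptedException e) {
        // Re-assert the interrupt status instead of swallowing it, so
        // code further up the stack can still see the signal.
        Thread.currentThread().interrupt();
      }
      System.out.println("interrupted: "
          + Thread.currentThread().isInterrupted()); // true
    });
    worker.start();
    worker.interrupt();
  }
}
```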








[jira] [Resolved] (HDDS-4196) Add an endpoint in Recon to query Prometheus

2020-09-10 Thread Vivek Ratnavel Subramanian (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Ratnavel Subramanian resolved HDDS-4196.
--
Resolution: Fixed

> Add an endpoint in Recon to query Prometheus
> 
>
> Key: HDDS-4196
> URL: https://issues.apache.org/jira/browse/HDDS-4196
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Affects Versions: 1.0.0
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
>
> Recon should have an endpoint to proxy requests to the configured Prometheus
> instance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4229) Upload Ozone 1.0.0 sources jars to Apache maven repo

2020-09-10 Thread Siyao Meng (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193826#comment-17193826
 ] 

Siyao Meng commented on HDDS-4229:
--

Any idea what tools I should use to properly (build and) deploy (upload) the 
jars to the Apache maven repos? [~elek]

I guess it's more than just running {{mvn deploy}} on my Mac.

> Upload Ozone 1.0.0 sources jars to Apache maven repo
> 
>
> Key: HDDS-4229
> URL: https://issues.apache.org/jira/browse/HDDS-4229
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Priority: Minor
>
> Ozone artifacts on the Apache maven repo don't have the corresponding sources 
> jars.
> This leads to a small inconvenience: when debugging an Ozone client program, 
> IDEs (e.g. IntelliJ) won't be able to fetch the sources jars directly from the 
> maven repo.
> A possible workaround is to run {{mvn clean source:jar install -DskipTests}} 
> so the local maven repo will have the sources jars available for debugging.
> e.g.
> for hadoop-ozone-client 1.0.0: 
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
> We don't have {{*-sources.jar}} files.
> for hadoop-client 3.3.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
> There are {{hadoop-client-3.3.0-sources.jar}} and 
> {{hadoop-client-3.3.0-test-sources.jar}}.






[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486439055



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1414: HDDS-4231. Background Service blocks on task results.

2020-09-10 Thread GitBox


amaliujia commented on a change in pull request #1414:
URL: https://github.com/apache/hadoop-ozone/pull/1414#discussion_r486505162



##
File path: hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/utils/BackgroundService.java
##
@@ -62,11 +56,11 @@ public BackgroundService(String serviceName, long interval,
     this.interval = interval;
     this.unit = unit;
     this.serviceName = serviceName;
-    this.serviceTimeout = serviceTimeout;
+    this.serviceTimeoutInNanos = TimeDuration.valueOf(serviceTimeout, unit)
+        .toLong(TimeUnit.NANOSECONDS);

Review comment:
   Usually a long is not enough to hold values in NANOSECONDS, but this variable seems to be just a service timeout, which should be a very small number, so it should be fine.
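
   For reference, a quick check with plain `java.util.concurrent.TimeUnit` (not the Ratis `TimeDuration` helper used in the patch) shows how much headroom a signed 64-bit long leaves for nanoseconds:

```java
import java.util.concurrent.TimeUnit;

public class NanosCapacity {
  public static void main(String[] args) {
    // The same kind of conversion the patch performs:
    long timeoutNanos = TimeUnit.NANOSECONDS.convert(300, TimeUnit.SECONDS);
    System.out.println(timeoutNanos); // 300000000000

    // Long.MAX_VALUE nanoseconds is roughly 292 years, so a service
    // timeout is nowhere near overflow.
    long years = TimeUnit.NANOSECONDS.toDays(Long.MAX_VALUE) / 365;
    System.out.println(years + " years"); // 292 years
  }
}
```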








[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486437934



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1314: HDDS-3988: DN can distinguish SCMCommand from stale leader SCM

2020-09-10 Thread GitBox


amaliujia commented on a change in pull request #1314:
URL: https://github.com/apache/hadoop-ozone/pull/1314#discussion_r486495949



##
File path: hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/ha/MockSCMHAManager.java
##
@@ -78,8 +79,8 @@ public void start() throws IOException {
* {@inheritDoc}
*/
   @Override
-  public boolean isLeader() {
-return isLeader;
+  public Optional isLeader() {

Review comment:
   Agreed. I have been thinking about it, and we might need some testing infrastructure to simulate these complicated consensus cases, e.g. split-brain. 
   
   
   OM HA might have done something similar. 








[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


amaliujia commented on a change in pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#discussion_r486502026



##
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyDeletingService.java
##
@@ -66,9 +66,6 @@
   private static final Logger LOG =
   LoggerFactory.getLogger(KeyDeletingService.class);
 
-  // The thread pool size for key deleting service.
-  private final static int KEY_DELETING_CORE_POOL_SIZE = 2;

Review comment:
   why not just change 2 to 1?








[GitHub] [hadoop-ozone] amaliujia commented on pull request #1414: HDDS-4231. Background Service blocks on task results.

2020-09-10 Thread GitBox


amaliujia commented on pull request #1414:
URL: https://github.com/apache/hadoop-ozone/pull/1414#issuecomment-690532823


   +1. I reviewed this PR and it looks good to me.






[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1398: HDDS-4210. ResolveBucket during checkAcls fails.

2020-09-10 Thread GitBox


bharatviswa504 commented on a change in pull request #1398:
URL: https://github.com/apache/hadoop-ozone/pull/1398#discussion_r486559825



##
File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -3523,6 +3533,47 @@ public ResolvedBucket resolveBucketLink(Pair<String, String> requested)
 visited);
   }
 
+  /**
+   * Resolves bucket symlinks. Read permission is required for following links.
+   *
+   * @param volumeAndBucket the bucket to be resolved (if it is a link)
+   * @param omClientRequest the {@link OMClientRequest} which has information
+   *   required to check permission.
+   * @param visited collects link buckets visited during the resolution to
+   *   avoid infinite loops
+   * @return bucket location possibly updated with its actual volume and bucket
+   *   after following bucket links
+   * @throws IOException (most likely OMException) if ACL check fails, bucket is
+   *   not found, loop is detected in the links, etc.
+   */
+  private Pair<String, String> resolveBucketLink(
+      Pair<String, String> volumeAndBucket,
+      Set<Pair<String, String>> visited,
+      OMClientRequest omClientRequest) throws IOException {

Review comment:
   I have taken this approach of passing OMClientRequest so that we don't need to touch all OMKey write requests to pass this info. (With this, the only class that needs to be updated is the base class OMKeyRequest.)








[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


bharatviswa504 edited a comment on pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#issuecomment-690583817


   Hi @lokeshj1703 
   https://issues.apache.org/jira/browse/HDDS-4121 This Jira changes the behavior of the openKeyCleanup service so that it does not call SCM directly: it moves the keys to the deleteTable by making a write request to OM. Later they will be picked up by the KeyDeletingService, which will be the only one to send delete-keys requests to SCM from OM.
   
   Do you think we still need this change even after that?









[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


amaliujia commented on a change in pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#discussion_r486502026



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyDeletingService.java
##
@@ -66,9 +66,6 @@
   private static final Logger LOG =
   LoggerFactory.getLogger(KeyDeletingService.class);
 
-  // The thread pool size for key deleting service.
-  private final static int KEY_DELETING_CORE_POOL_SIZE = 2;

Review comment:
   why not just change 2 to 1?
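
Either way, the effect under discussion can be sketched with plain `java.util.concurrent` (this is not the actual Ozone BackgroundService API; task names and intervals are made up): sharing a single thread between the two services serializes their runs, so at most one of them is talking to SCM at any moment.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class SingleThreadServicesDemo {
  public static void main(String[] args) throws InterruptedException {
    // One thread for both background tasks: their runs can never overlap,
    // so at most one of them sends delete requests to SCM at a time.
    ScheduledExecutorService pool =
        Executors.newSingleThreadScheduledExecutor();
    pool.scheduleWithFixedDelay(
        () -> System.out.println("openKeyCleanup: move expired open keys"),
        0, 300, TimeUnit.SECONDS);
    pool.scheduleWithFixedDelay(
        () -> System.out.println("keyDeleting: send pending deletes to SCM"),
        0, 60, TimeUnit.SECONDS);
    TimeUnit.SECONDS.sleep(1);  // let both tasks run once
    pool.shutdownNow();
  }
}
```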








[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486441859



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object-store for the Hadoop ecosystem which can be used from multiple 
interfaces: 
+
+ 1. From Hadoop Compatible File Systems (called *HCFS* in the remainder of 
this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrator as mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which 
is enough to support S3 (2). But to support Hadoop Compatible File System (and 
CSI), Ozone should simulate a file system hierarchy.
+
+There are multiple challenges when file system hierarchy is simulated by a 
flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. 
`/a/b/../c`, `/a/b//d`, or a real key with directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling 
as the file system interface requires a dir entry even if it's not created 
explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a 
visible directory entry for file system interface) 
+ 3. Non-recursive listing of directories can be hard (Listing direct entries 
under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) 
+ 4. Similar to listing, rename can be a costly operation, as it may need to 
rename many keys (renaming a first level directory means a rename of all the 
keys with the same prefix)
+
+See also the [Hadoop S3A 
documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client)
 which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but 
includes additional functionalities. For example the `createFile` call creates 
all the intermediate directories for a specific key (create file `/a/b/c` will 
create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the 
intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, 
[HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which 
proposes to use a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used, which means that S3 and HCFS couldn't be used together without 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example 
`/a/b` for the key `/b/c/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 
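
As a side note to the quoted draft, here is a minimal sketch of the kind of key-name normalization HDDS-4097 describes, assuming the simple rules listed in the excerpt (collapse `//`, resolve `..`, drop a trailing `/`); it is illustrative only, not the OM implementation:

```java
import java.nio.file.Paths;

public final class KeyNormalizeDemo {
  /** Normalizes an object-store key the way a file system would treat it. */
  static String normalize(String key) {
    // Paths.normalize() collapses "//" and resolves "." and ".." segments.
    String n = Paths.get("/", key).normalize().toString();
    return n.startsWith("/") ? n.substring(1) : n;  // keys carry no leading "/"
  }

  public static void main(String[] args) {
    System.out.println(normalize("a/b/../c"));  // a/c
    System.out.println(normalize("a/b//d"));    // a/b/d
    System.out.println(normalize("b/d/"));      // b/d (the trailing "/" marker is lost)
  }
}
```

The last case shows the S3-side cost the draft warns about: a "directory-like" key loses its trailing slash under normalization, so a fully AWS-compatible view and a normalized HCFS view cannot both be preserved.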

[GitHub] [hadoop-ozone] amaliujia commented on pull request #1413: HDDS-4228: add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-10 Thread GitBox


amaliujia commented on pull request #1413:
URL: https://github.com/apache/hadoop-ozone/pull/1413#issuecomment-690526588


   LGTM! Thanks!






[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1398: HDDS-4210. ResolveBucket during checkAcls fails.

2020-09-10 Thread GitBox


bharatviswa504 commented on a change in pull request #1398:
URL: https://github.com/apache/hadoop-ozone/pull/1398#discussion_r486559825



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -3523,6 +3533,47 @@ public ResolvedBucket resolveBucketLink(Pair<String, String> requested)
 visited);
   }
 
+  /**
+   * Resolves bucket symlinks. Read permission is required for following links.
+   *
+   * @param volumeAndBucket the bucket to be resolved (if it is a link)
+   * @param omClientRequest {@link OMClientRequest} which has information required to check
+   * permission.
+   * @param visited collects link buckets visited during the resolution to
+   *   avoid infinite loops
+   * @return bucket location possibly updated with its actual volume and bucket
+   *   after following bucket links
+   * @throws IOException (most likely OMException) if ACL check fails, bucket 
is
+   *   not found, loop is detected in the links, etc.
+   */
+  private Pair<String, String> resolveBucketLink(
+  Pair<String, String> volumeAndBucket,
+  Set<Pair<String, String>> visited,
+  OMClientRequest omClientRequest) throws IOException {

Review comment:
   I have taken this approach of passing OMClientRequest so that we don't need to touch all OMKey write requests to pass this info.








[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1398: HDDS-4210. ResolveBucket during checkAcls fails.

2020-09-10 Thread GitBox


bharatviswa504 commented on a change in pull request #1398:
URL: https://github.com/apache/hadoop-ozone/pull/1398#discussion_r486559825



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -3523,6 +3533,47 @@ public ResolvedBucket resolveBucketLink(Pair<String, String> requested)
 visited);
   }
 
+  /**
+   * Resolves bucket symlinks. Read permission is required for following links.
+   *
+   * @param volumeAndBucket the bucket to be resolved (if it is a link)
+   * @param omClientRequest {@link OMClientRequest} which has information required to check
+   * permission.
+   * @param visited collects link buckets visited during the resolution to
+   *   avoid infinite loops
+   * @return bucket location possibly updated with its actual volume and bucket
+   *   after following bucket links
+   * @throws IOException (most likely OMException) if ACL check fails, bucket 
is
+   *   not found, loop is detected in the links, etc.
+   */
+  private Pair<String, String> resolveBucketLink(
+  Pair<String, String> volumeAndBucket,
+  Set<Pair<String, String>> visited,
+  OMClientRequest omClientRequest) throws IOException {

Review comment:
   I have taken this approach of passing OMClientRequest so that we don't need to touch all OMKey write requests to pass this info. (With this, only one class needs to be updated: the base class, OMKeyRequest.)
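
A hedged sketch of the approach described in this comment, with stand-in types (`MetadataStore`, `AclChecker` and the untyped `omClientRequest` are assumptions, not the real OzoneManager classes): the request object rides along through the recursion so the READ ACL check can be made on every link that is followed, and the visited set catches link loops.

```java
import java.io.IOException;
import java.util.Set;
import org.apache.commons.lang3.tuple.Pair;

final class BucketLinkResolverSketch {

  /** Minimal stand-in for the bucket metadata: a bucket may link elsewhere. */
  static final class BucketInfo {
    final String sourceVolume;   // null when the bucket is not a link
    final String sourceBucket;
    BucketInfo(String sv, String sb) { sourceVolume = sv; sourceBucket = sb; }
    boolean isLink() { return sourceVolume != null; }
  }

  interface MetadataStore {
    BucketInfo getBucket(String volume, String bucket) throws IOException;
  }

  interface AclChecker {
    // The client request carries the caller identity needed for the READ check.
    void checkFollowLink(String volume, String bucket, Object omClientRequest)
        throws IOException;
  }

  private final MetadataStore store;
  private final AclChecker acl;

  BucketLinkResolverSketch(MetadataStore store, AclChecker acl) {
    this.store = store;
    this.acl = acl;
  }

  Pair<String, String> resolveBucketLink(
      Pair<String, String> volumeAndBucket,
      Set<Pair<String, String>> visited,
      Object omClientRequest) throws IOException {
    String volume = volumeAndBucket.getLeft();
    String bucket = volumeAndBucket.getRight();
    BucketInfo info = store.getBucket(volume, bucket);
    if (!info.isLink()) {
      return volumeAndBucket;               // a real bucket: resolution is done
    }
    if (!visited.add(volumeAndBucket)) {    // seen before: a link cycle
      throw new IOException("Loop detected in bucket links at "
          + volume + "/" + bucket);
    }
    // READ permission is checked on every link that is followed; the request
    // object is what makes that check possible without changing every caller.
    acl.checkFollowLink(volume, bucket, omClientRequest);
    return resolveBucketLink(
        Pair.of(info.sourceVolume, info.sourceBucket), visited, omClientRequest);
  }
}
```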








[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486757610



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,280 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object-store for the Hadoop ecosystem which can be used from multiple 
interfaces: 
+
+ 1. From Hadoop Compatible File Systems (called *HCFS* in the remainder of 
this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrator as mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which 
is enough to support S3 (2). But to support Hadoop Compatible File System (and 
CSI), Ozone should simulate a file system hierarchy.
+
+There are multiple challenges when file system hierarchy is simulated by a 
flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. 
`/a/b/../c`, `/a/b//d`, or a real key with directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling 
as the file system interface requires a dir entry even if it's not created 
explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a 
visible directory entry for file system interface) 
+ 3. Non-recursive listing of directories can be hard (Listing direct entries 
under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) 
+ 4. Similar to listing, rename can be a costly operation, as it may need to 
rename many keys (renaming a first level directory means a rename of all the 
keys with the same prefix)
+
+See also the [Hadoop S3A 
documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client)
 which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but 
includes additional functionalities. For example the `createFile` call creates 
all the intermediate directories for a specific key (create file `/a/b/c` will 
create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the 
intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, 
[HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which 
proposes to use a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used. It means that if both S3 and HCFS are used, normalization is 
forced, and S3 interface is not fully AWS S3 compatible. There is no option to 
use HCFS and S3 but with full AWS compatibility (and reduced HCFS 
compatibility). 
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example 
`/a/b` for the key `/b/c/c`)

Review comment:
   /b/c/c => /a/b/c
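
Taking the corrected example (key `/a/b/c` implies the directory entries `/a/` and `/a/b/`), a tiny sketch of deriving the implied parent entries from a key name; the helper is hypothetical, not OM code:

```java
import java.util.ArrayList;
import java.util.List;

public final class ImpliedParentsDemo {
  /** Returns the "directory" prefixes implied by a key, shortest first. */
  static List<String> impliedParents(String key) {
    List<String> parents = new ArrayList<>();
    String[] parts = key.split("/");
    StringBuilder prefix = new StringBuilder();
    for (int i = 0; i < parts.length - 1; i++) {  // skip the leaf itself
      if (parts[i].isEmpty()) {
        continue;                                 // tolerate "//" and a leading "/"
      }
      prefix.append(parts[i]).append('/');
      parents.add(prefix.toString());             // "a/", then "a/b/", ...
    }
    return parents;
  }

  public static void main(String[] args) {
    System.out.println(impliedParents("a/b/c"));  // [a/, a/b/]
  }
}
```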

##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,280 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 

[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


bharatviswa504 edited a comment on pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#issuecomment-690583817


   Hi @lokeshj1703 
   [HDDS-4120](https://issues.apache.org/jira/browse/HDDS-4120) changes the behavior of the openKeyCleanup service: it does not call SCM directly, but moves the keys to deleteTable by making a write request to OM. Later the keys will be picked up by KeyDeletingService, which will be the only one to send delete-keys requests to SCM from OM.
   
   Do you think we still need this change even after that?






[jira] [Updated] (HDDS-3805) [OFS] Remove usage of OzoneClientAdapter interface

2020-09-10 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-3805:
-
Priority: Minor  (was: Major)

> [OFS] Remove usage of OzoneClientAdapter interface
> --
>
> Key: HDDS-3805
> URL: https://issues.apache.org/jira/browse/HDDS-3805
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Filesystem
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Minor
>  Labels: pull-request-available
>
> Use ClientProtocol (proxy) directly instead of OzoneClient / ObjectStore in 
> BasicRootedOzoneClientAdapterImpl and BasicRootedOzoneFileSystem as [~elek] 
> have suggested.
> This is part of the OFS refactoring effort.






[jira] [Commented] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-10 Thread Li Cheng (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193955#comment-17193955
 ] 

Li Cheng commented on HDDS-4228:


PR is merged. Thanks [~glengeng] for working on this.

> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>  Labels: pull-request-available, pull-requests-available
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> One might be interested in the number of blocks allocated, so it is better to add a 
> field 'num' to ALLOCATE_BLOCK of the scm audit log.






[GitHub] [hadoop-ozone] timmylicheng merged pull request #1413: HDDS-4228: add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-10 Thread GitBox


timmylicheng merged pull request #1413:
URL: https://github.com/apache/hadoop-ozone/pull/1413


   






[jira] [Resolved] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-10 Thread Li Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Cheng resolved HDDS-4228.

Fix Version/s: 1.1.0
   Resolution: Fixed

> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>  Labels: pull-request-available, pull-requests-available
> Fix For: 1.1.0
>
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> One might be interested in the number of blocks allocated, so it is better to add a 
> field 'num' to ALLOCATE_BLOCK of the scm audit log.






[GitHub] [hadoop-ozone] vivekratnavel merged pull request #1390: HDDS-4196. Add an endpoint in Recon to query Prometheus

2020-09-10 Thread GitBox


vivekratnavel merged pull request #1390:
URL: https://github.com/apache/hadoop-ozone/pull/1390


   






[jira] [Updated] (HDDS-4229) Upload Ozone 1.0.0 sources jars to Apache maven repo

2020-09-10 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-4229:
-
Description: 
Ozone artifacts on the Apache maven repo don't have the corresponding sources 
jars.
This leads to a small inconvenience: when debugging an Ozone client program in 
IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
maven repo.
A possible workaround is to run {{mvn clean source:jar install -DskipTests}} so 
local maven repo will have the sources jars available for debugging.

e.g.

For hadoop-client 3.3.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
There are {{hadoop-client-3.3.0-sources.jar}} and 
{{hadoop-client-3.3.0-test-sources.jar}}.

For hadoop-ozone-client 1.0.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
We don't have {{*-sources.jar}} files.

Artifacts are also located here:
https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/


Found an article here: https://infra.apache.org/publishing-maven-artifacts.html

  was:
Ozone artifacts on the Apache maven repo don't have the corresponding sources 
jars.
This leads to a small inconvenience: when debugging an Ozone client program in 
IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
maven repo.
A possible workaround is to run {{mvn clean source:jar install -DskipTests}} so 
local maven repo will have the sources jars available for debugging.

e.g.

For hadoop-client 3.3.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
There are {{hadoop-client-3.3.0-sources.jar}} and 
{{hadoop-client-3.3.0-test-sources.jar}}.

For hadoop-ozone-client 1.0.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
We don't have {{*-sources.jar}} files.

Artifacts are also located here:
https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/


> Upload Ozone 1.0.0 sources jars to Apache maven repo
> 
>
> Key: HDDS-4229
> URL: https://issues.apache.org/jira/browse/HDDS-4229
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Priority: Minor
>
> Ozone artifacts on the Apache maven repo don't have the corresponding sources 
> jars.
> This leads to a small inconvenience: when debugging an Ozone client program 
> in IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
> maven repo.
> A possible workaround is to run {{mvn clean source:jar install -DskipTests}} 
> so local maven repo will have the sources jars available for debugging.
> e.g.
> For hadoop-client 3.3.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
> There are {{hadoop-client-3.3.0-sources.jar}} and 
> {{hadoop-client-3.3.0-test-sources.jar}}.
> For hadoop-ozone-client 1.0.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
> We don't have {{*-sources.jar}} files.
> Artifacts are also located here:
> https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/
> Found an article here: 
> https://infra.apache.org/publishing-maven-artifacts.html






[jira] [Updated] (HDDS-4229) Upload Ozone 1.0.0 sources jars to Apache maven repo

2020-09-10 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-4229:
-
Description: 
Ozone artifacts on the Apache maven repo don't have the corresponding sources 
jars.
This leads to a small inconvenience: when debugging an Ozone client program in 
IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
maven repo.
A possible workaround is to run {{mvn clean source:jar install -DskipTests}} so 
local maven repo will have the sources jars available for debugging.

e.g.

For hadoop-client 3.3.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
There are {{hadoop-client-3.3.0-sources.jar}} and 
{{hadoop-client-3.3.0-test-sources.jar}}.

For hadoop-ozone-client 1.0.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
We don't have {{*-sources.jar}} files.

Artifacts are also located here:
https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/


Found an article, probably relevant: 
https://infra.apache.org/publishing-maven-artifacts.html

  was:
Ozone artifacts on the Apache maven repo don't have the corresponding sources 
jars.
This leads to a small inconvenience: when debugging an Ozone client program in 
IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
maven repo.
A possible workaround is to run {{mvn clean source:jar install -DskipTests}} so 
local maven repo will have the sources jars available for debugging.

e.g.

For hadoop-client 3.3.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
There are {{hadoop-client-3.3.0-sources.jar}} and 
{{hadoop-client-3.3.0-test-sources.jar}}.

For hadoop-ozone-client 1.0.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
We don't have {{*-sources.jar}} files.

Artifacts are also located here:
https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/


Found an article here: https://infra.apache.org/publishing-maven-artifacts.html


> Upload Ozone 1.0.0 sources jars to Apache maven repo
> 
>
> Key: HDDS-4229
> URL: https://issues.apache.org/jira/browse/HDDS-4229
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Priority: Minor
>
> Ozone artifacts on the Apache maven repo don't have the corresponding sources 
> jars.
> This leads to a small inconvenience: when debugging an Ozone client program 
> in IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
> maven repo.
> A possible workaround is to run {{mvn clean source:jar install -DskipTests}} 
> so local maven repo will have the sources jars available for debugging.
> e.g.
> For hadoop-client 3.3.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
> There are {{hadoop-client-3.3.0-sources.jar}} and 
> {{hadoop-client-3.3.0-test-sources.jar}}.
> For hadoop-ozone-client 1.0.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
> We don't have {{*-sources.jar}} files.
> Artifacts are also located here:
> https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/
> Found an article, probably relevant: 
> https://infra.apache.org/publishing-maven-artifacts.html






[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


bharatviswa504 commented on pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-690808157


   > > Thank You @elek for the design document.
   > > My understanding from this is the draft is as below. Let me know if I am 
missing something here.
   > > https://user-images.githubusercontent.com/8586345/92635994-8856fe80-f28b-11ea-95bf-8864d48e488f.png
   > 
   > Correct. But this is not a matrix anymore. You should turn on either first 
or second of the configs, but not both.
   
   Not sure what is meant here: because we have 2 configs, we can have 4 combinations; according to the proposal, 3 are valid and the 4th one is not. 






[GitHub] [hadoop-ozone] timmylicheng commented on pull request #1413: HDDS-4228: add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-10 Thread GitBox


timmylicheng commented on pull request #1413:
URL: https://github.com/apache/hadoop-ozone/pull/1413#issuecomment-690833613


   Merging






[jira] [Updated] (HDDS-4234) Add comment to ListVolumes logic

2020-09-10 Thread Xie Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xie Lei updated HDDS-4234:
--
Description: 
When the following command is run once, the statistics count 2 list requests:
{code:java}
ozone sh volume ls
{code}
 

 

!image-2020-09-11-11-23-50-504.png!

  was:
When the following command is run once, the statistics count 2 list requests:
{code:java}
ozone sh volume ls
{code}
 


> Add comment to ListVolumes logic
> 
>
> Key: HDDS-4234
> URL: https://issues.apache.org/jira/browse/HDDS-4234
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Xie Lei
>Priority: Major
> Fix For: 1.0.0
>
> Attachments: image-2020-09-11-11-23-50-504.png
>
>
> When the following command is run once, the statistics count 2 list requests:
> {code:java}
> ozone sh volume ls
> {code}
>  
>  
> !image-2020-09-11-11-23-50-504.png!






[jira] [Updated] (HDDS-4234) Add comment to ListVolumes logic

2020-09-10 Thread Xie Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xie Lei updated HDDS-4234:
--
Attachment: image-2020-09-11-11-23-50-504.png

> Add comment to ListVolumes logic
> 
>
> Key: HDDS-4234
> URL: https://issues.apache.org/jira/browse/HDDS-4234
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Xie Lei
>Priority: Major
> Fix For: 1.0.0
>
> Attachments: image-2020-09-11-11-23-50-504.png
>
>
> When the following command is run once, the statistics count 2 list requests:
> {code:java}
> ozone sh volume ls
> {code}
>  






[jira] [Created] (HDDS-4234) Add comment to ListVolumes logic

2020-09-10 Thread Xie Lei (Jira)
Xie Lei created HDDS-4234:
-

 Summary: Add comment to ListVolumes logic
 Key: HDDS-4234
 URL: https://issues.apache.org/jira/browse/HDDS-4234
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Client
Affects Versions: 0.5.0
Reporter: Xie Lei
 Fix For: 1.0.0


When the following command is run once, the statistics count 2 list requests:
{code:java}
ozone sh volume ls
{code}
 






[jira] [Updated] (HDDS-4229) Upload Ozone 1.0.0 sources jars to Apache maven repo

2020-09-10 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-4229:
-
Description: 
Ozone artifacts on the Apache maven repo don't have the corresponding sources 
jars.
This leads to a small inconvenience: when debugging an Ozone client program in 
IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
maven repo.
A possible workaround is to run {{mvn clean source:jar install -DskipTests}} so 
local maven repo will have the sources jars available for debugging.

e.g.

For hadoop-client 3.3.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
There are {{hadoop-client-3.3.0-sources.jar}} and 
{{hadoop-client-3.3.0-test-sources.jar}}.

For hadoop-ozone-client 1.0.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
We don't have {{*-sources.jar}} files.

Artifacts are also located here:
https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/

  was:
Ozone artifacts on the Apache maven repo don't have the corresponding sources 
jars.
This leads to a small inconvenience: when debugging an Ozone client program in 
IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
maven repo.
A possible workaround is to run {{mvn clean source:jar install -DskipTests}} so 
local maven repo will have the sources jars available for debugging.

e.g.

for hadoop-ozone-client 1.0.0: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
We don't have {{*-sources.jar}} files.

for hadoop-client 3.3.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
There are {{hadoop-client-3.3.0-sources.jar}} and 
{{hadoop-client-3.3.0-test-sources.jar}}.



> Upload Ozone 1.0.0 sources jars to Apache maven repo
> 
>
> Key: HDDS-4229
> URL: https://issues.apache.org/jira/browse/HDDS-4229
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Priority: Minor
>
> Ozone artifacts on the Apache maven repo don't have the corresponding sources 
> jars.
> This leads to a small inconvenience: when debugging an Ozone client program 
> in IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from the 
> maven repo.
> A possible workaround is to run {{mvn clean source:jar install -DskipTests}} 
> so local maven repo will have the sources jars available for debugging.
> e.g.
> For hadoop-client 3.3.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
> There are {{hadoop-client-3.3.0-sources.jar}} and 
> {{hadoop-client-3.3.0-test-sources.jar}}.
> For hadoop-ozone-client 1.0.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
> We don't have {{*-sources.jar}} files.
> Artifacts are also located here:
> https://repository.apache.org/content/groups/public/org/apache/hadoop/hadoop-ozone-client/1.0.0/






[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


bharatviswa504 commented on pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#issuecomment-690739465


   And also a question: these are expired keys in the openKeyTable, so how could 2 threads have sent the request for the same block before? And also I see there is no logic for deleteExpiredKey today, and openKeyCleanup is not yet integrated in OM.
   
   






[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1214: HDDS-3981. Add more debug level log to XceiverClientGrpc for debug purpose

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1214:
URL: https://github.com/apache/hadoop-ozone/pull/1214#discussion_r486649479



##
File path: 
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/XceiverClientGrpc.java
##
@@ -354,7 +363,7 @@ private XceiverClientReply sendCommandWithRetry(
 responseProto = null;
   } catch (ExecutionException e) {
 LOG.debug("Failed to execute command {} on datanode {}",

Review comment:
   Good point, you are right on this. 








[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1214: HDDS-3981. Add more debug level log to XceiverClientGrpc for debug purpose

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1214:
URL: https://github.com/apache/hadoop-ozone/pull/1214#discussion_r486650134



##
File path: 
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/XceiverClientGrpc.java
##
@@ -354,7 +363,7 @@ private XceiverClientReply sendCommandWithRetry(
 responseProto = null;
   } catch (ExecutionException e) {
 LOG.debug("Failed to execute command {} on datanode {}",
-request, dn.getUuid(), e);

Review comment:
   I think the ip address should be sufficient for a production environment. The local mode is for mini cluster tests only. 








[GitHub] [hadoop-ozone] bharatviswa504 removed a comment on pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


bharatviswa504 removed a comment on pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#issuecomment-690739465


   And also a question: these are expired keys in the openKeyTable, so how could 2 threads have sent the request for the same block before? And also I see there is no logic for deleteExpiredKey today, and openKeyCleanup is not yet integrated in OM.
   
   






[jira] [Updated] (HDDS-4234) Add important comment to ListVolumes logic

2020-09-10 Thread Xie Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xie Lei updated HDDS-4234:
--
Summary: Add important comment to ListVolumes logic  (was: Add comment to 
ListVolumes logic)

> Add important comment to ListVolumes logic
> --
>
> Key: HDDS-4234
> URL: https://issues.apache.org/jira/browse/HDDS-4234
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Client
>Affects Versions: 0.5.0
>Reporter: Xie Lei
>Priority: Major
> Fix For: 1.0.0
>
> Attachments: image-2020-09-11-11-23-50-504.png
>
>
> When the following command is run once, the statistics count 2 list requests:
> {code:java}
> ozone sh volume ls
> {code}
>  
>  
> !image-2020-09-11-11-23-50-504.png!






[GitHub] [hadoop-ozone] timmylicheng commented on pull request #1340: HDDS-3188 Add failover proxy for SCM block location.

2020-09-10 Thread GitBox


timmylicheng commented on pull request #1340:
URL: https://github.com/apache/hadoop-ozone/pull/1340#issuecomment-690835337


   > The client failover logic is based on the suggested leader sent by SCM. 
The `String` value of suggested leader sent by SCM Server is 
`RaftPeer#getAddress`, but at client side this value is compare with 
`SCM_DUMMY_NODEID_PREFIX + ` which will never match. So suggested leader is 
never valued and we are always failing over to the next proxy in round robin.
   
   I have removed the suggestedLeader related parts in this PR until we think it through. Thanks for the heads up.
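
A toy sketch of the mismatch described in the quoted comment (names are assumptions, not the real proxy provider): when the suggested leader string never matches a client-side node id, selection degenerates to plain round robin.

```java
import java.util.Arrays;
import java.util.List;

public final class SuggestedLeaderDemo {
  /** Picks the next proxy: honor the suggested leader if it is recognized. */
  static int nextProxy(List<String> nodeIds, int current, String suggestedLeader) {
    int idx = nodeIds.indexOf(suggestedLeader);
    // If the server suggests RaftPeer#getAddress (e.g. "host2:9863") while the
    // client ids were built as a dummy prefix plus an index, indexOf() never
    // matches and every failover falls through to round robin.
    return idx >= 0 ? idx : (current + 1) % nodeIds.size();
  }

  public static void main(String[] args) {
    List<String> ids = Arrays.asList("scm0", "scm1", "scm2");
    System.out.println(nextProxy(ids, 0, "host2:9863"));  // 1 (round robin)
    System.out.println(nextProxy(ids, 0, "scm2"));        // 2 (leader honored)
  }
}
```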






[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


bharatviswa504 edited a comment on pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415#issuecomment-690583817


   Hi @lokeshj1703 
   [HDDS-4120](https://issues.apache.org/jira/browse/HDDS-4120) changes the behavior of the openKeyCleanup service: it does not call SCM directly, but moves the keys to deleteTable by making a write request to OM. Later the keys will be picked up by KeyDeletingService, which will be the only one to send delete-keys requests to SCM from OM.
   
   Do you think we still need this change even after that, for openKeyCleanup?






[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1214: HDDS-3981. Add more debug level log to XceiverClientGrpc for debug purpose

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1214:
URL: https://github.com/apache/hadoop-ozone/pull/1214#discussion_r486652573



##
File path: 
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/XceiverClientGrpc.java
##
@@ -339,6 +343,11 @@ private XceiverClientReply sendCommandWithRetry(
 // in case these don't exist for the specific datanode.
 reply.addDatanode(dn);
 responseProto = sendCommandAsync(request, dn).getResponse().get();
+if (LOG.isDebugEnabled()) {

Review comment:
   I mean we could move the latency LOG to around Line 464, where the container op latency metrics are updated. This way, we can piggyback the LOG on the existing metrics update:
   
   metrics.addContainerOpsLatency(request.getCmdType(),
   System.nanoTime() - requestTime);
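
A hedged sketch of that suggestion (the surrounding types are stand-ins, not XceiverClientGrpc; only `addContainerOpsLatency` comes from the quoted snippet): compute the latency once and feed both the metric and the debug log from the same value.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class LatencyLogDemo {
  private static final Logger LOG = LoggerFactory.getLogger(LatencyLogDemo.class);

  interface Metrics {
    void addContainerOpsLatency(String cmdType, long nanos);
  }

  static void recordAndLog(Metrics metrics, String cmdType, String datanode,
      long requestTimeNanos) {
    long latencyNs = System.nanoTime() - requestTimeNanos;
    metrics.addContainerOpsLatency(cmdType, latencyNs);  // existing metrics update
    if (LOG.isDebugEnabled()) {                          // piggybacked debug log
      LOG.debug("Executed {} on datanode {} in {} ns", cmdType, datanode, latencyNs);
    }
  }

  public static void main(String[] args) {
    recordAndLog((cmd, ns) -> { }, "WriteChunk", "dn-1", System.nanoTime());
  }
}
```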








[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1340: HDDS-3188 Add failover proxy for SCM block location.

2020-09-10 Thread GitBox


timmylicheng commented on a change in pull request #1340:
URL: https://github.com/apache/hadoop-ozone/pull/1340#discussion_r486736129



##
File path: 
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/proxy/SCMBlockLocationFailoverProxyProvider.java
##
@@ -0,0 +1,281 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.proxy;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.protocol.ScmBlockLocationProtocol;
+import org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolPB;
+import org.apache.hadoop.hdds.utils.LegacyHadoopConfigurationSource;
+import org.apache.hadoop.io.retry.FailoverProxyProvider;
+import org.apache.hadoop.io.retry.RetryPolicy;
+import org.apache.hadoop.io.retry.RetryPolicy.RetryAction;
+import org.apache.hadoop.ipc.ProtobufRpcEngine;
+import org.apache.hadoop.ipc.RPC;
+import org.apache.hadoop.net.NetUtils;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_NAMES;
+import static 
org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_SERVICE_IDS_KEY;
+import static org.apache.hadoop.hdds.HddsUtils.getScmAddressForBlockClients;
+import static org.apache.hadoop.hdds.HddsUtils.getPortNumberFromConfigKeys;
+import static org.apache.hadoop.hdds.HddsUtils.getHostName;
+
+/**
+ * Failover proxy provider for SCM.
+ */
+public class SCMBlockLocationFailoverProxyProvider implements
+FailoverProxyProvider<ScmBlockLocationProtocolPB>, Closeable {
+  public static final Logger LOG =
+  LoggerFactory.getLogger(SCMBlockLocationFailoverProxyProvider.class);
+
+  private Map<String, ProxyInfo<ScmBlockLocationProtocolPB>> scmProxies;
+  private Map<String, SCMProxyInfo> scmProxyInfoMap;
+  private List<String> scmNodeIDList;
+
+  private String currentProxySCMNodeId;
+  private int currentProxyIndex;
+
+  private final ConfigurationSource conf;
+  private final long scmVersion;
+
+  private final String scmServiceId;
+
+  private String lastAttemptedLeader;
+
+  private final int maxRetryCount;
+  private final long retryInterval;
+
+  public static final String SCM_DUMMY_NODEID_PREFIX = "scm";
+
+  public SCMBlockLocationFailoverProxyProvider(ConfigurationSource conf) {
+this.conf = conf;
+this.scmVersion = RPC.getProtocolVersion(ScmBlockLocationProtocol.class);
+this.scmServiceId = conf.getTrimmed(OZONE_SCM_SERVICE_IDS_KEY);
+this.scmProxies = new HashMap<>();
+this.scmProxyInfoMap = new HashMap<>();
+this.scmNodeIDList = new ArrayList<>();
+loadConfigs();
+
+this.currentProxyIndex = 0;
+currentProxySCMNodeId = scmNodeIDList.get(currentProxyIndex);
+
+this.maxRetryCount = conf.getObject(SCMBlockClientConfig.class)
+.getRetryCount();
+this.retryInterval = conf.getObject(SCMBlockClientConfig.class)

Review comment:
   Updated








[GitHub] [hadoop-ozone] lamber-ken opened a new pull request #1417: HDDS-4234. Add important comment to ListVolumes logic

2020-09-10 Thread GitBox


lamber-ken opened a new pull request #1417:
URL: https://github.com/apache/hadoop-ozone/pull/1417


   ## What changes were proposed in this pull request?
   
   When executing the `ozone sh volume ls` command only once, the statistics count 2 list requests.
   
   After delving into the logic, it may not be a bug. It's better to add this comment.
   
   
![image](https://user-images.githubusercontent.com/20113411/92853737-bf134d00-f422-11ea-8465-1f5e4172da9c.png)
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4234
   
   
![image](https://user-images.githubusercontent.com/20113411/92853577-94c18f80-f422-11ea-8cf3-dc2539490ea2.png)
   
   






[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486760564



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object-store for the Hadoop ecosystem which can be used from multiple 
interfaces: 
+
+ 1. From Hadoop Compatible File Systems (called *HCFS* in the remainder of 
this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrator as mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which 
is enough to support S3 (2). But to support Hadoop Compatible File System (and 
CSI), Ozone should simulate a file system hierarchy.
+
+There are multiple challenges when file system hierarchy is simulated by a 
flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. 
`/a/b/../c`, `/a/b//d`, or a real key with directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling 
as the file system interface requires a dir entry even if it's not created 
explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a 
visible directory entry for file system interface) 
+ 3. Non-recursive listing of directories can be hard (Listing direct entries 
under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) 
+ 4. Similar to listing, rename can be a costly operation, as it may need to 
rename many keys (renaming a first level directory means a rename of all the 
keys with the same prefix)
+
+See also the [Hadoop S3A 
documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client)
 which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but 
includes additional functionalities. For example the `createFile` call creates 
all the intermediate directories for a specific key (create file `/a/b/c` will 
create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the 
intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key /a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, 
[HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which 
proposes to use a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used, which means that S3 and HCFS couldn't be used together without 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example 
`/a/b` for the key `/b/c/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create 

[GitHub] [hadoop-ozone] amaliujia commented on a change in pull request #1412: HDDS-3751. Ozone sh client support bucket quota option.

2020-09-10 Thread GitBox


amaliujia commented on a change in pull request #1412:
URL: https://github.com/apache/hadoop-ozone/pull/1412#discussion_r486775970



##
File path: 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
##
@@ -174,6 +184,20 @@ public OzoneBucket(ConfigurationSource conf, 
ClientProtocol proxy,
 this.modificationTime = Instant.ofEpochMilli(modificationTime);
   }
 
+  @SuppressWarnings("parameternumber")

Review comment:
   Out of curiosity: what is the purpose of 
`@SuppressWarnings("parameternumber")`?

##
File path: 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
##
@@ -464,6 +469,8 @@ public void createBucket(
 .setStorageType(storageType)
 .setSourceVolume(bucketArgs.getSourceVolume())
 .setSourceBucket(bucketArgs.getSourceBucket())
+.setQuotaInBytes(quotaInBytes)
+.setQuotaInCounts(quotaInCounts)

Review comment:
   Do you need to verify whether `quotaInBytes` and `quotaInCounts` are 
valid? e.g. >= 0?
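
   For illustration, a minimal sketch of the validation suggested here; the 
field names mirror the quoted diff, and the exception type is an assumption, 
not the PR's actual error handling:

```java
// Sketch only: reject negative quota values before building the bucket args.
final class QuotaArgsCheck {
  static void verifyQuota(long quotaInBytes, long quotaInCounts) {
    if (quotaInBytes < 0 || quotaInCounts < 0) {
      throw new IllegalArgumentException(String.format(
          "Quota values must be >= 0 (quotaInBytes=%d, quotaInCounts=%d)",
          quotaInBytes, quotaInCounts));
    }
  }

  public static void main(String[] args) {
    verifyQuota(1024, 10); // OK
    verifyQuota(-1, 10);   // throws IllegalArgumentException
  }
}
```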





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486761072



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from 
multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the 
remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, 
which is enough to support S3 (2). But to support Hadoop Compatible File 
System (and CSI), Ozone has to simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated by a 
flat namespace:
+
+ 1. Some key patterns can't be easily transformed to a file system path (e.g. 
`/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special 
handling, as the file system interface requires a dir entry even if it's not 
created explicitly (for example, if key `/a/b/c` is created, `/a/b` is 
supposed to be a visible directory entry for the file system interface) 
+ 3. Non-recursive listing of directories can be hard (listing direct entries 
under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) 
+ 4. Similar to listing, rename can be a costly operation as it requires 
renaming many keys (renaming a first-level directory means a rename of all the 
keys with the same prefix)
+
+See also the [Hadoop S3A 
documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client)
 which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but 
include additional functionality. For example, the `createFile` call creates 
all the intermediate directories for a specific key (creating file `/a/b/c` 
will create `/a/b` and `/a` entries in the key space).
+
+Today, a key created from the S3 interface can cause exceptions if the 
intermediate directories are checked from HCFS:
+
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
cannot be used together.
+
+To solve the performance problems of directory listing / rename, 
[HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which 
proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used, which means that S3 and HCFS can only be used together with 
normalization.
+
+# Goals
+
+ * Out of the box, Ozone should support both S3 and HCFS interfaces without 
any settings. (This is possible only for the regular, fs-compatible key names.)
+ * As 100% compatibility couldn't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require the creation of intermediate directory entries (for 
example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second can't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create 

[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#discussion_r486775263



##
File path: 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneVolume.java
##
@@ -131,6 +133,18 @@ public OzoneVolume(ConfigurationSource conf, 
ClientProtocol proxy,
 this.modificationTime = Instant.ofEpochMilli(modificationTime);
   }
 
+  @SuppressWarnings("parameternumber")

Review comment:
   sounds good to me. 








[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


xiaoyuyao commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486761554



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,280 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from 
multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the 
remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, 
which is enough to support S3 (2). But to support Hadoop Compatible File 
System (and CSI), Ozone has to simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated by a 
flat namespace:
+
+ 1. Some key patterns can't be easily transformed to a file system path (e.g. 
`/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special 
handling, as the file system interface requires a dir entry even if it's not 
created explicitly (for example, if key `/a/b/c` is created, `/a/b` is 
supposed to be a visible directory entry for the file system interface) 
+ 3. Non-recursive listing of directories can be hard (listing direct entries 
under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys) 
+ 4. Similar to listing, rename can be a costly operation as it requires 
renaming many keys (renaming a first-level directory means a rename of all the 
keys with the same prefix)
+
+See also the [Hadoop S3A 
documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client)
 which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but 
include additional functionality. For example, the `createFile` call creates 
all the intermediate directories for a specific key (creating file `/a/b/c` 
will create `/a/b` and `/a` entries in the key space).
+
+Today, a key created from the S3 interface can cause exceptions if the 
intermediate directories are checked from HCFS:
+
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
cannot be used together.
+
+To solve the performance problems of directory listing / rename, 
[HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which 
proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used. This means that if both S3 and HCFS are used, normalization is 
forced, and the S3 interface is not fully AWS S3 compatible. There is no 
option to use HCFS and S3 with full AWS compatibility (and reduced HCFS 
compatibility). 
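
For illustration, a minimal sketch of what such key-name normalization could 
look like, assuming semantics similar to `java.nio.file.Path#normalize` (the 
actual HDDS-4097 implementation may differ):

```java
// Illustrative sketch only: collapse "//" and resolve "." and ".." against
// a virtual root, as the design above describes for fs-compatible key names.
import java.nio.file.Paths;

public final class KeyNormalizationSketch {

  static String normalizeKey(String key) {
    // Anchor at "/" so ".." cannot escape the bucket root, then normalize.
    String normalized = Paths.get("/", key).normalize().toString();
    // Keys are stored without the leading slash in this sketch.
    return normalized.startsWith("/") ? normalized.substring(1) : normalized;
  }

  public static void main(String[] args) {
    System.out.println(normalizeKey("a/b/../c")); // -> a/c
    System.out.println(normalizeKey("a/b//d"));   // -> a/b/d
  }
}
```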
+
+# Goals
+
+ * Out of the box, Ozone should support both S3 and HCFS interfaces without 
any settings. (This is possible only for the regular, fs-compatible key names.)
+ * As 100% compatibility couldn't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require the creation of intermediate directory entries (for 
example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second can't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which 

[GitHub] [hadoop-ozone] fapifta commented on a change in pull request #1405: HDDS-4143. Implement a factory for OM Requests that returns an instance based on layout version.

2020-09-10 Thread GitBox


fapifta commented on a change in pull request #1405:
URL: https://github.com/apache/hadoop-ozone/pull/1405#discussion_r486213861



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/bucket/OMBucketSetPropertyRequest.java
##
@@ -206,4 +207,8 @@ public OMClientResponse validateAndUpdateCache(OzoneManager 
ozoneManager,
   return omClientResponse;
 }
   }
+
+  public static String getRequestType() {

Review comment:
   The problem currently with an annotation approach is the fact that we 
cannot use expressions like `SetAcl.name() + "-" + ObjectType.KEY` in 
annotations, as that does not qualify as a constant.








[jira] [Created] (HDDS-4230) CLONE - Add failover proxy to SCM block protocol

2020-09-10 Thread Glen Geng (Jira)
Glen Geng created HDDS-4230:
---

 Summary: CLONE - Add failover proxy to SCM block protocol
 Key: HDDS-4230
 URL: https://issues.apache.org/jira/browse/HDDS-4230
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: SCM
Reporter: Glen Geng
Assignee: Li Cheng


Need to support 2N + 1 SCMs. Add configs and logic to support multiple SCMs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4230) SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException

2020-09-10 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4230:

Description: 
Like OMFailoverProxyProvider, SCMBlockLocationFailoverProxyProvider should 
also handle LeaderNotReadyException.

If the SCM client (like OzoneManager) has reached the leader SCM while that 
leader SCM is stuck replaying raft log entries, the SCM client should not 
round-robin to the next SCM; it should wait and retry the same SCM later.

  was:like OMFailoverProxyProvider, 


> SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException
> ---
>
> Key: HDDS-4230
> URL: https://issues.apache.org/jira/browse/HDDS-4230
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Glen Geng
>Assignee: Li Cheng
>Priority: Major
>  Labels: pull-request-available
>
> Like OMFailoverProxyProvider, SCMBlockLocationFailoverProxyProvider should 
> also handle LeaderNotReadyException.
> If the SCM client (like OzoneManager) has reached the leader SCM while that 
> leader SCM is stuck replaying raft log entries, the SCM client should not 
> round-robin to the next SCM; it should wait and retry the same SCM later.






[jira] [Created] (HDDS-4232) Use single thread for KeyDeletingService and OpenKeyCleanupService

2020-09-10 Thread Lokesh Jain (Jira)
Lokesh Jain created HDDS-4232:
-

 Summary: Use single thread for KeyDeletingService and 
OpenKeyCleanupService
 Key: HDDS-4232
 URL: https://issues.apache.org/jira/browse/HDDS-4232
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Lokesh Jain
Assignee: Lokesh Jain


KeyDeletingService and OpenKeyCleanupService scan the keys from a particular 
RocksDB table and send deletion requests to SCM. Every thread would scan the 
table and send deletion requests. This can lead to multiple deletion requests 
for a particular block. There is currently no way to distribute the keys to be 
deleted amongst multiple threads.






[jira] [Updated] (HDDS-4232) Use single thread for KeyDeletingService and OpenKeyCleanupService

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4232:
-
Labels: pull-request-available  (was: )

> Use single thread for KeyDeletingService and OpenKeyCleanupService
> --
>
> Key: HDDS-4232
> URL: https://issues.apache.org/jira/browse/HDDS-4232
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>
> KeyDeletingService and OpenKeyCleanupService scan the keys from a particular 
> RocksDB table and send deletion requests to SCM. Every thread would scan the 
> table and send deletion requests. This can lead to multiple deletion requests 
> for a particular block. There is currently no way to distribute the keys to 
> be deleted amongst multiple threads.






[GitHub] [hadoop-ozone] lokeshj1703 opened a new pull request #1415: HDDS-4232. Use single thread for KeyDeletingService and OpenKeyCleanupService.

2020-09-10 Thread GitBox


lokeshj1703 opened a new pull request #1415:
URL: https://github.com/apache/hadoop-ozone/pull/1415


   ## What changes were proposed in this pull request?
   
   KeyDeletingService and OpenKeyCleanupService scan the keys from a particular 
RocksDB table and send deletion requests to SCM. Every thread would scan the 
table and send deletion requests. This can lead to multiple deletion requests 
for a particular block. There is currently no way to distribute the keys to be 
deleted amongst multiple threads.
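
   For illustration, a minimal sketch of the single-thread idea, assuming a 
`ScheduledExecutorService` drives the periodic scan (the actual 
BackgroundService internals differ):

```java
// Sketch only: with a single thread and fixed-delay scheduling, the next
// scan starts only after the previous one finishes, so two scans can never
// send deletion requests for the same block concurrently.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class SingleThreadScanSketch {
  public static void main(String[] args) {
    ScheduledExecutorService scanner =
        Executors.newSingleThreadScheduledExecutor();
    scanner.scheduleWithFixedDelay(
        () -> System.out.println("scanning deleted-keys table..."),
        0, 60, TimeUnit.SECONDS);
  }
}
```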
   
   ## What is the link to the Apache JIRA
   
   HDDS-4232
   
   ## How was this patch tested?
   
   This led to a failure in TestBlockDeletion. Please refer to 
https://github.com/apache/hadoop-ozone/pull/1121/commits/49b610a9feb54899ae226d3b6f2e36c5f8cc784f
   






[GitHub] [hadoop-ozone] GlenGeng commented on a change in pull request #1314: HDDS-3988: DN can distinguish SCMCommand from stale leader SCM

2020-09-10 Thread GitBox


GlenGeng commented on a change in pull request #1314:
URL: https://github.com/apache/hadoop-ozone/pull/1314#discussion_r486282159



##
File path: 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/ha/MockSCMHAManager.java
##
@@ -78,8 +79,8 @@ public void start() throws IOException {
* {@inheritDoc}
*/
   @Override
-  public boolean isLeader() {
-return isLeader;
+  public Optional isLeader() {

Review comment:
   Good point. We need to add a lot of test cases, including UTs and 
acceptance tests, before merging 2823 back to master.








[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1340: HDDS-3188 Add failover proxy for SCM block location.

2020-09-10 Thread GitBox


timmylicheng commented on a change in pull request #1340:
URL: https://github.com/apache/hadoop-ozone/pull/1340#discussion_r486199846



##
File path: 
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/proxy/SCMBlockLocationFailoverProxyProvider.java
##
@@ -0,0 +1,281 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.proxy;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.protocol.ScmBlockLocationProtocol;
+import org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolPB;
+import org.apache.hadoop.hdds.utils.LegacyHadoopConfigurationSource;
+import org.apache.hadoop.io.retry.FailoverProxyProvider;
+import org.apache.hadoop.io.retry.RetryPolicy;
+import org.apache.hadoop.io.retry.RetryPolicy.RetryAction;
+import org.apache.hadoop.ipc.ProtobufRpcEngine;
+import org.apache.hadoop.ipc.RPC;
+import org.apache.hadoop.net.NetUtils;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_NAMES;
+import static 
org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_SERVICE_IDS_KEY;
+import static org.apache.hadoop.hdds.HddsUtils.getScmAddressForBlockClients;
+import static org.apache.hadoop.hdds.HddsUtils.getPortNumberFromConfigKeys;
+import static org.apache.hadoop.hdds.HddsUtils.getHostName;
+
+/**
+ * Failover proxy provider for SCM.
+ */
+public class SCMBlockLocationFailoverProxyProvider implements
+FailoverProxyProvider, Closeable {
+  public static final Logger LOG =
+  LoggerFactory.getLogger(SCMBlockLocationFailoverProxyProvider.class);
+
+  private Map> scmProxies;
+  private Map scmProxyInfoMap;
+  private List scmNodeIDList;
+
+  private String currentProxySCMNodeId;
+  private int currentProxyIndex;
+
+  private final ConfigurationSource conf;
+  private final long scmVersion;
+
+  private final String scmServiceId;
+
+  private String lastAttemptedLeader;
+
+  private final int maxRetryCount;
+  private final long retryInterval;
+
+  public static final String SCM_DUMMY_NODEID_PREFIX = "scm";
+
+  public SCMBlockLocationFailoverProxyProvider(ConfigurationSource conf) {
+this.conf = conf;
+this.scmVersion = RPC.getProtocolVersion(ScmBlockLocationProtocol.class);
+this.scmServiceId = conf.getTrimmed(OZONE_SCM_SERVICE_IDS_KEY);
+this.scmProxies = new HashMap<>();
+this.scmProxyInfoMap = new HashMap<>();
+this.scmNodeIDList = new ArrayList<>();
+loadConfigs();
+
+this.currentProxyIndex = 0;
+currentProxySCMNodeId = scmNodeIDList.get(currentProxyIndex);
+
+this.maxRetryCount = conf.getObject(SCMBlockClientConfig.class)
+.getRetryCount();
+this.retryInterval = conf.getObject(SCMBlockClientConfig.class)
+.getRetryInterval();
+  }
+
+  @VisibleForTesting
+  protected Collection getSCMAddressList() {
+Collection scmAddressList =
+conf.getTrimmedStringCollection(OZONE_SCM_NAMES);
+Collection resultList = new ArrayList<>();
+if (!scmAddressList.isEmpty()) {
+  final int port = getPortNumberFromConfigKeys(conf,
+  ScmConfigKeys.OZONE_SCM_BLOCK_CLIENT_ADDRESS_KEY)
+  .orElse(ScmConfigKeys.OZONE_SCM_BLOCK_CLIENT_PORT_DEFAULT);
+  for (String scmAddress : scmAddressList) {
+LOG.info("SCM Address for proxy is {}", scmAddress);
+
+Optional hostname = getHostName(scmAddress);
+if (hostname.isPresent()) {
+  resultList.add(NetUtils.createSocketAddr(
+  hostname.get() + ":" + port));
+}
+  }
+}
+if (resultList.isEmpty()) {
+  // fall back
+  resultList.add(getScmAddressForBlockClients(conf));
+}
+return resultList;
+  }
+
+  

[GitHub] [hadoop-ozone] timmylicheng commented on a change in pull request #1340: HDDS-3188 Add failover proxy for SCM block location.

2020-09-10 Thread GitBox


timmylicheng commented on a change in pull request #1340:
URL: https://github.com/apache/hadoop-ozone/pull/1340#discussion_r486199021



##
File path: 
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/proxy/SCMBlockLocationFailoverProxyProvider.java
##
@@ -0,0 +1,281 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.proxy;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.scm.ScmConfigKeys;
+import org.apache.hadoop.hdds.scm.protocol.ScmBlockLocationProtocol;
+import org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolPB;
+import org.apache.hadoop.hdds.utils.LegacyHadoopConfigurationSource;
+import org.apache.hadoop.io.retry.FailoverProxyProvider;
+import org.apache.hadoop.io.retry.RetryPolicy;
+import org.apache.hadoop.io.retry.RetryPolicy.RetryAction;
+import org.apache.hadoop.ipc.ProtobufRpcEngine;
+import org.apache.hadoop.ipc.RPC;
+import org.apache.hadoop.net.NetUtils;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.net.InetSocketAddress;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_NAMES;
+import static 
org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_SERVICE_IDS_KEY;
+import static org.apache.hadoop.hdds.HddsUtils.getScmAddressForBlockClients;
+import static org.apache.hadoop.hdds.HddsUtils.getPortNumberFromConfigKeys;
+import static org.apache.hadoop.hdds.HddsUtils.getHostName;
+
+/**
+ * Failover proxy provider for SCM.
+ */
+public class SCMBlockLocationFailoverProxyProvider implements
+FailoverProxyProvider, Closeable {
+  public static final Logger LOG =
+  LoggerFactory.getLogger(SCMBlockLocationFailoverProxyProvider.class);
+
+  private Map> scmProxies;
+  private Map scmProxyInfoMap;
+  private List scmNodeIDList;
+
+  private String currentProxySCMNodeId;
+  private int currentProxyIndex;
+
+  private final ConfigurationSource conf;
+  private final long scmVersion;
+
+  private final String scmServiceId;
+
+  private String lastAttemptedLeader;
+
+  private final int maxRetryCount;
+  private final long retryInterval;
+
+  public static final String SCM_DUMMY_NODEID_PREFIX = "scm";
+
+  public SCMBlockLocationFailoverProxyProvider(ConfigurationSource conf) {
+this.conf = conf;
+this.scmVersion = RPC.getProtocolVersion(ScmBlockLocationProtocol.class);
+this.scmServiceId = conf.getTrimmed(OZONE_SCM_SERVICE_IDS_KEY);
+this.scmProxies = new HashMap<>();
+this.scmProxyInfoMap = new HashMap<>();
+this.scmNodeIDList = new ArrayList<>();
+loadConfigs();
+
+this.currentProxyIndex = 0;
+currentProxySCMNodeId = scmNodeIDList.get(currentProxyIndex);
+
+this.maxRetryCount = conf.getObject(SCMBlockClientConfig.class)
+.getRetryCount();
+this.retryInterval = conf.getObject(SCMBlockClientConfig.class)
+.getRetryInterval();
+  }
+
+  @VisibleForTesting
+  protected Collection getSCMAddressList() {
+Collection scmAddressList =
+conf.getTrimmedStringCollection(OZONE_SCM_NAMES);
+Collection resultList = new ArrayList<>();
+if (!scmAddressList.isEmpty()) {
+  final int port = getPortNumberFromConfigKeys(conf,
+  ScmConfigKeys.OZONE_SCM_BLOCK_CLIENT_ADDRESS_KEY)
+  .orElse(ScmConfigKeys.OZONE_SCM_BLOCK_CLIENT_PORT_DEFAULT);
+  for (String scmAddress : scmAddressList) {
+LOG.info("SCM Address for proxy is {}", scmAddress);
+
+Optional hostname = getHostName(scmAddress);
+if (hostname.isPresent()) {
+  resultList.add(NetUtils.createSocketAddr(
+  hostname.get() + ":" + port));
+}
+  }
+}
+if (resultList.isEmpty()) {
+  // fall back
+  resultList.add(getScmAddressForBlockClients(conf));
+}
+return resultList;
+  }
+
+  

[GitHub] [hadoop-ozone] fapifta edited a comment on pull request #1405: HDDS-4143. Implement a factory for OM Requests that returns an instance based on layout version.

2020-09-10 Thread GitBox


fapifta edited a comment on pull request #1405:
URL: https://github.com/apache/hadoop-ozone/pull/1405#issuecomment-690123594


   Hi @avijayanhwx, thank you for working on this.
   Please find some comments below.
   
   1.
   First and foremost I would like to comment on the 
LayoutVersionInstanceFactory.instances data structure as this data structure 
seems to be the most performance critical, because this will be used to look up 
the implementation class for every request. I may have an idea to make this a 
bit better, which I would like to discuss.
   HashMap access is constant time however that constant time does include 
hashing.
   TreeMap access is O(log(n)).
   
   What we use here as the key in the Map is the name of an enum (this gets 
tricky when it comes to ACLs), and the Integer is the LayoutVersion, which 
also comes from an enum; although OMLayoutVersion uses a separate 
layoutVersion, it is still equal to the ordinal of the enum.
   
   So if we change two things:
   - OMLayoutVersion to use the enum's ordinal as the LayoutVersion
   - ACL requests to be enumerated as separate request types for volume bucket 
and key ACL operations, instead of the dynamic string approach for the request 
type.
   
   Then we can use an EnumMap of EnumMaps; if we want to make the factory 
usable by other parts of the code, we can supply OzoneManagerProtocolProtos.Type 
and OMLayoutFeature as the type parameters to the factory as needed.
   With that we would have an EnumMap of EnumMaps with guaranteed constant-time 
access (array access, basically), which is faster.
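
   For illustration, a minimal sketch of the EnumMap-of-EnumMaps lookup 
structure proposed here; `RequestType` and `LayoutVersion` are hypothetical 
stand-ins for OzoneManagerProtocolProtos.Type and OMLayoutFeature:

```java
// Sketch only: both lookups are array-backed EnumMap accesses, no hashing.
import java.util.EnumMap;
import java.util.function.Supplier;

final class EnumMapFactorySketch {
  enum RequestType { CREATE_KEY, SET_VOLUME_ACL, SET_BUCKET_ACL, SET_KEY_ACL }
  enum LayoutVersion { V0, V1 }

  private final EnumMap<RequestType, EnumMap<LayoutVersion, Supplier<Object>>>
      instances = new EnumMap<>(RequestType.class);

  void register(RequestType type, LayoutVersion version,
      Supplier<Object> creator) {
    instances.computeIfAbsent(type, t -> new EnumMap<>(LayoutVersion.class))
        .put(version, creator);
  }

  Object create(RequestType type, LayoutVersion version) {
    return instances.get(type).get(version).get();
  }
}
```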
   
   Of course this will result in some more memory consumption and a fair bit 
more complicated initialization as a tradeoff, because we would need to 
register class T for all LayoutVersions for every request type. But even with 
fairly large request (1-200) and version counts (~1000), the required memory 
for these lookup tables is still in the few MiB range, if my estimation skills 
are not screwing me over, so it seems to be a fair tradeoff.
   
   Having an enum that enumerates all the requests (ACLs for every type, and 
properties for every type separately) as an abstraction in OM, we could 
possibly use that in many other ways, and it also solves the 
getRequestType() -> annotation conversion: with the current request types we 
cannot introduce an annotation, as for example the 
`SetAcl.name() + "-" + ObjectType.KEY` expression does not qualify as a 
constant and cannot be used as a value inside an annotation.
   What do you think?
   
   2.
   It would be nice to add a test for OMClientRequest that ensures that it and 
all of its implementations have a declared constructor with an OMRequest 
parameter, so that if for some reason this changes, we get a nice, detailed 
notification about the problem, and based on the test error we can either 
change the OzoneManagerRatisUtils class to call the proper constructor or 
reconsider the constructor change.
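
   For illustration, a minimal sketch of the kind of test suggested in point 
2; the OMRequest import path is taken from the Ozone codebase, while the 
surrounding class and the way implementations are enumerated are assumptions:

```java
// Sketch only: assert that a request class declares a constructor taking an
// OMRequest, so OzoneManagerRatisUtils can keep instantiating it reflectively.
import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.OMRequest;

final class RequestCtorCheckSketch {
  static void assertHasOmRequestConstructor(Class<?> requestClass) {
    try {
      requestClass.getDeclaredConstructor(OMRequest.class);
    } catch (NoSuchMethodException e) {
      throw new AssertionError(requestClass.getName()
          + " must declare a constructor taking an OMRequest", e);
    }
  }
}
```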
   
   3.
   I am unsure if we need to restrict BelongsToLayoutFeature to classes and 
DisallowedUntilLayoutVersion to methods; what is the reasoning behind such a 
restriction? As I see it, the code does not understand the annotations in 
other places, and it is probably better to restrict; I am just curious whether 
we have thought about methods with features, or completely disallowed classes, 
in the future. So this is more a question than a concern for now.
   
   4.
   TestOmRequestFactory does not have a closing empty line.
   
   5.
   I am not fully convinced whether it is a consistent and good design 
decision not to have an interface for the LayoutVersionInstanceFactory and to 
leave types like OMLayoutVersionIstanceFactory lingering around in the API. 
Are there any plans to generalize this into an interface -> abstract impl -> 
concrete impl structure, as with LayoutVersionManager, and to use only the 
interface everywhere we need to pass it in? Are there any reasons not to do so?









[jira] [Created] (HDDS-4231) Background Service blocks on task results

2020-09-10 Thread Lokesh Jain (Jira)
Lokesh Jain created HDDS-4231:
-

 Summary: Background Service blocks on task results
 Key: HDDS-4231
 URL: https://issues.apache.org/jira/browse/HDDS-4231
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Lokesh Jain
Assignee: Lokesh Jain


Background service currently waits on the results of the tasks. The idea is to 
track the time it took for each task to execute and log if a task takes more 
than the configured timeout.
This does not require waiting on the task results and can be achieved by just 
comparing the execution time of a task with the timeout value.






[jira] [Updated] (HDDS-4231) Background Service blocks on task results

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4231:
-
Labels: pull-request-available  (was: )

> Background Service blocks on task results
> -
>
> Key: HDDS-4231
> URL: https://issues.apache.org/jira/browse/HDDS-4231
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>
> Background service currently waits on the results of the tasks. The idea is 
> to track the time it took for each task to execute and log if a task takes 
> more than the configured timeout.
> This does not require waiting on the task results and can be achieved by just 
> comparing the execution time of a task with the timeout value.






[GitHub] [hadoop-ozone] lokeshj1703 opened a new pull request #1414: HDDS-4231. Background Service blocks on task results.

2020-09-10 Thread GitBox


lokeshj1703 opened a new pull request #1414:
URL: https://github.com/apache/hadoop-ozone/pull/1414


   ## What changes were proposed in this pull request?
   
   Background service currently waits on the results of the tasks. The idea is 
to track the time it took for each task to execute and log if a task takes 
more than the configured timeout.
   This does not require waiting on the task results and can be achieved by 
just comparing the execution time of a task with the timeout value.
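
   For illustration, a minimal sketch of the non-blocking timing idea; the 
timeout constant and executor wiring are assumptions, not the PR's actual code:

```java
// Sketch only: record when each task starts and log if it ran longer than
// the configured timeout, without ever calling Future.get().
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class TaskTimingSketch {
  private static final long SERVICE_TIMEOUT_MS = 300_000;

  static void submit(ExecutorService executor, Runnable task) {
    executor.submit(() -> {
      long start = System.nanoTime();
      try {
        task.run();
      } finally {
        long elapsedMs =
            TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        if (elapsedMs > SERVICE_TIMEOUT_MS) {
          System.err.println("Background task exceeded timeout: "
              + elapsedMs + " ms > " + SERVICE_TIMEOUT_MS + " ms");
        }
      }
    });
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    submit(pool, () -> { /* task body */ });
    pool.shutdown();
  }
}
```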
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4231
   
   ## How was this patch tested?
   
   This was one of the reasons for the failure of TestBlockDeletion tracked in 
HDDS-3432. Please check 
https://github.com/apache/hadoop-ozone/pull/1121/commits/a3feb31ae94e2c1a127f3d84f08ccc4edfb0e0ac. 
HDDS-3432 is blocked on a Ratis bug.






[jira] [Updated] (HDDS-4230) SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException

2020-09-10 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4230:

Description: like OMFailoverProxyProvider,   (was: Need to support 2N + 1 
SCMs. Add configs and logic to support multiple SCMs.)

> SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException
> ---
>
> Key: HDDS-4230
> URL: https://issues.apache.org/jira/browse/HDDS-4230
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Glen Geng
>Assignee: Li Cheng
>Priority: Major
>  Labels: pull-request-available
>
> like OMFailoverProxyProvider, 






[GitHub] [hadoop-ozone] captainzmc closed pull request #1337: HDDS-4129. change MAX_QUOTA_IN_BYTES to Long.MAX_VALUE.

2020-09-10 Thread GitBox


captainzmc closed pull request #1337:
URL: https://github.com/apache/hadoop-ozone/pull/1337


   






[jira] [Updated] (HDDS-4230) SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException

2020-09-10 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4230:

Summary: SCMBlockLocationFailoverProxyProvider should handle 
LeaderNotReadyException  (was: CLONE - Add failover proxy to SCM block protocol)

> SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException
> ---
>
> Key: HDDS-4230
> URL: https://issues.apache.org/jira/browse/HDDS-4230
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Glen Geng
>Assignee: Li Cheng
>Priority: Major
>  Labels: pull-request-available
>
> Need to support 2N + 1 SCMs. Add configs and logic to support multiple SCMs.






[jira] [Updated] (HDDS-4230) SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException

2020-09-10 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4230:

Description: 
It is an enhancement for HDDS-3188.

Like OMFailoverProxyProvider, SCMBlockLocationFailoverProxyProvider should also 
handle LeaderNotReadyException.

If the SCM client (like OzoneManager) has reached the leader SCM while that 
leader SCM is stuck replaying raft log entries (e.g., the SCM restarts and 
becomes leader, and needs time to recover its state machine by replaying all 
raft log entries), the SCM client should not round-robin to the next SCM; it 
should wait and retry the same SCM later.

  was:
Like OMFailoverProxyProvider, SCMBlockLocationFailoverProxyProvider should 
also handle LeaderNotReadyException.

If the SCM client (like OzoneManager) has reached the leader SCM while that 
leader SCM is stuck replaying raft log entries, the SCM client should not 
round-robin to the next SCM; it should wait and retry the same SCM later.


> SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException
> ---
>
> Key: HDDS-4230
> URL: https://issues.apache.org/jira/browse/HDDS-4230
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Glen Geng
>Assignee: Li Cheng
>Priority: Major
>  Labels: pull-request-available
>
> It is an enhancement for HDDS-3188.
> Like OMFailoverProxyProvider, SCMBlockLocationFailoverProxyProvider should 
> also handle LeaderNotReadyException.
> If the SCM client (like OzoneManager) has reached the leader SCM while that 
> leader SCM is stuck replaying raft log entries (e.g., the SCM restarts and 
> becomes leader, and needs time to recover its state machine by replaying all 
> raft log entries), the SCM client should not round-robin to the next SCM; it 
> should wait and retry the same SCM later.
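
For illustration, a minimal sketch of the described retry behavior in the 
style of a Hadoop RetryPolicy decision; matching the exception by simple name 
is a simplification (the real provider would catch the concrete Ratis type):

```java
// Sketch only: on LeaderNotReadyException retry the SAME SCM after a delay,
// otherwise fail over (round-robin) to the next SCM.
import org.apache.hadoop.io.retry.RetryPolicy.RetryAction;

final class LeaderNotReadyRetrySketch {
  static RetryAction decide(Exception e, long retryIntervalMs) {
    if ("LeaderNotReadyException".equals(e.getClass().getSimpleName())) {
      // The leader exists but is still replaying raft log entries.
      return new RetryAction(
          RetryAction.RetryDecision.RETRY, retryIntervalMs);
    }
    return new RetryAction(
        RetryAction.RetryDecision.FAILOVER_AND_RETRY, retryIntervalMs);
  }
}
```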






[jira] [Assigned] (HDDS-4230) SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException

2020-09-10 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng reassigned HDDS-4230:
---

Assignee: (was: Li Cheng)

> SCMBlockLocationFailoverProxyProvider should handle LeaderNotReadyException
> ---
>
> Key: HDDS-4230
> URL: https://issues.apache.org/jira/browse/HDDS-4230
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Glen Geng
>Priority: Major
>  Labels: pull-request-available
>
> It is an enhancement for HDDS-3188.
> Like OMFailoverProxyProvider, SCMBlockLocationFailoverProxyProvider should 
> also handle LeaderNotReadyException.
> If the SCM client (like OzoneManager) has reached the leader SCM while that 
> leader SCM is stuck replaying raft log entries (e.g., the SCM restarts and 
> becomes leader, and needs time to recover its state machine by replaying all 
> raft log entries), the SCM client should not round-robin to the next SCM; it 
> should wait and retry the same SCM later.






[GitHub] [hadoop-ozone] GlenGeng commented on a change in pull request #1314: HDDS-3988: DN can distinguish SCMCommand from stale leader SCM

2020-09-10 Thread GitBox


GlenGeng commented on a change in pull request #1314:
URL: https://github.com/apache/hadoop-ozone/pull/1314#discussion_r486281665



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##
@@ -478,7 +551,26 @@ public void execute(ExecutorService service, long time, 
TimeUnit unit)
   public SCMCommand getNextCommand() {
 lock.lock();
 try {
-  return commandQueue.poll();
+  initTermOfLeaderSCM();
+  if (!termOfLeaderSCM.isPresent()) {
+return null;  // not ready yet
+  }
+
+  while (true) {
+SCMCommand command = commandQueue.poll();
+if (command == null) {
+  return null;
+}
+
+updateTermOfLeaderSCM(command);
+if (command.getTerm() == termOfLeaderSCM.get()) {

Review comment:
   Do you mean whether `termOfLeaderSCM` is updated during SCM leader 
election? No, it won't be. The datanode detects the latest SCM term via 
heartbeats with the SCMs, whose interval is larger than 30s.








[GitHub] [hadoop-ozone] maobaolong commented on pull request #1290: HDDS-4064. Show container verbose info with verbose option

2020-09-10 Thread GitBox


maobaolong commented on pull request #1290:
URL: https://github.com/apache/hadoop-ozone/pull/1290#issuecomment-690015532


   @adoroszlai @xiaoyuyao @elek Thanks for your review.






[jira] [Resolved] (HDDS-4064) Show container verbose info with verbose option

2020-09-10 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong resolved HDDS-4064.
--
Fix Version/s: 1.1.0
   Resolution: Fixed

> Show container verbose info with verbose option
> ---
>
> Key: HDDS-4064
> URL: https://issues.apache.org/jira/browse/HDDS-4064
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone CLI
>Affects Versions: 1.1.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>







[GitHub] [hadoop-ozone] GlenGeng commented on a change in pull request #1371: HDDS-2922. Balance ratis leader distribution in datanodes

2020-09-10 Thread GitBox


GlenGeng commented on a change in pull request #1371:
URL: https://github.com/apache/hadoop-ozone/pull/1371#discussion_r486090313



##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/ratis/RatisHelper.java
##
@@ -125,6 +131,17 @@ private static RaftGroup newRaftGroup(Collection 
peers) {
 : RaftGroup.valueOf(DUMMY_GROUP_ID, peers);
   }
 
+  public static RaftGroup newRaftGroup(RaftGroupId groupId,
+  List peers, List priorityList) {
+final List newPeers = new ArrayList<>();

Review comment:
   sanity check for `peers.size() == priorityList.size()`
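
   For illustration, a minimal sketch of the suggested check, assuming Guava's 
Preconditions (already widely used in the codebase):

```java
// Sketch only: fail fast when the two lists passed to newRaftGroup diverge.
import com.google.common.base.Preconditions;
import java.util.List;

final class PeerPrioritySanityCheck {
  static void check(List<?> peers, List<Integer> priorityList) {
    Preconditions.checkArgument(peers.size() == priorityList.size(),
        "peers (%s) and priorityList (%s) must have the same size",
        peers.size(), priorityList.size());
  }
}
```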

##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/protocol/DatanodeDetails.java
##
@@ -190,6 +195,18 @@ public Port getPort(Port.Name name) {
 return null;
   }
 
+  public int getSuggestedLeaderCount() {

Review comment:
   missing javadoc for these public methods.

##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/pipeline/Pipeline.java
##
@@ -123,6 +124,14 @@ public Instant getCreationTimestamp() {
 return creationTimestamp;
   }
 
+  public void setSuggestedLeader(UUID suggestedLeader) {

Review comment:
   missing javadoc for this public method.
   
   Why not set suggestedLeader in the ctor and remove the getter/setter? I 
guess it would be final/immutable after creation, just like type/factor.

##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/protocol/commands/CreatePipelineCommand.java
##
@@ -48,16 +50,34 @@ public CreatePipelineCommand(final PipelineID pipelineID,
 this.factor = factor;
 this.type = type;
 this.nodelist = datanodeList;
+this.priorityList = new ArrayList<>();
+for (DatanodeDetails dn : datanodeList) {

Review comment:
   ditto

##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/pipeline/Pipeline.java
##
@@ -61,6 +61,7 @@
   private UUID leaderId;
   // Timestamp for pipeline upon creation
   private Instant creationTimestamp;
+  private UUID suggestedLeader;

Review comment:
   how about `suggestedLeaderId`?

##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/CreatePipelineCommandHandler.java
##
@@ -82,18 +104,22 @@ public void handle(SCMCommand command, OzoneContainer 
ozoneContainer,
 final CreatePipelineCommandProto createCommand =
 ((CreatePipelineCommand)command).getProto();
 final HddsProtos.PipelineID pipelineID = createCommand.getPipelineID();
-final Collection peers =
+final List peers =
 createCommand.getDatanodeList().stream()
 .map(DatanodeDetails::getFromProtoBuf)
 .collect(Collectors.toList());
+final List priorityList = createCommand.getPriorityList();
+
+incSuggestedLeaderCount(priorityList, peers, dn);

Review comment:
   Increase the counter after the pipeline is successfully created? An 
exception might be thrown during creation.

##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
##
@@ -711,10 +711,23 @@ private long 
calculatePipelineBytesWritten(HddsProtos.PipelineID pipelineID) {
 
   @Override
   public void addGroup(HddsProtos.PipelineID pipelineId,
-  Collection peers) throws IOException {
+  List peers) throws IOException {
+List priorityList = new ArrayList<>();

Review comment:
   ```java
   List<Integer> priorityList = new ArrayList<>(peers.size());
   for (...)
   ```
   
   so that we won't be bothered by the unused var `dn`

##
File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/RatisPipelineProvider.java
##
@@ -98,8 +102,40 @@ private boolean exceedPipelineNumberLimit(ReplicationFactor 
factor) {
 return false;
   }
 
+  private DatanodeDetails getSuggestedLeader(List dns) {
+int minLeaderCount = Integer.MAX_VALUE;
+DatanodeDetails suggestedLeader = null;
+
+for (int i = 0; i < dns.size(); i++) {

Review comment:
   as suggested by IDEA, use an enhanced for loop: `for (dn : dns)`








[GitHub] [hadoop-ozone] maobaolong commented on pull request #1407: HDDS-4158. Provide a class type for Java based configuration

2020-09-10 Thread GitBox


maobaolong commented on pull request #1407:
URL: https://github.com/apache/hadoop-ozone/pull/1407#issuecomment-690017443


   @adoroszlai Please take a look at this PR; it will help us use the `Java based configuration` conveniently.






[GitHub] [hadoop-ozone] timmylicheng commented on pull request #1413: HDDS-4228: add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-10 Thread GitBox


timmylicheng commented on pull request #1413:
URL: https://github.com/apache/hadoop-ozone/pull/1413#issuecomment-690017940


   LGTM. +1






[jira] [Updated] (HDDS-4229) Upload Ozone 1.0.0 sources jars to Apache maven repo

2020-09-10 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-4229:
-
Summary: Upload Ozone 1.0.0 sources jars to Apache maven repo  (was: Upload 
Ozone 1.0.0 sources jar to Apache maven repo)

> Upload Ozone 1.0.0 sources jars to Apache maven repo
> 
>
> Key: HDDS-4229
> URL: https://issues.apache.org/jira/browse/HDDS-4229
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Priority: Minor
>
> Ozone artifacts on the Apache maven repo don't have the corresponding sources 
> jars.
> This leads to a small inconvenience: when debugging an Ozone client program 
> in IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly 
> from the maven repo.
> A possible workaround is to run {{mvn clean source:jar install -DskipTests}} 
> so the local maven repo will have the sources jars available for debugging.
> e.g.
> for hadoop-ozone-client 1.0.0: 
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
> We don't have {{*-sources.jar}} files.
> for hadoop-client 3.3.0:
> https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
> There are {{hadoop-client-3.3.0-sources.jar}} and 
> {{hadoop-client-3.3.0-test-sources.jar}}.






[jira] [Updated] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4228:
-
Labels: pull-request-available pull-requests-available  (was: 
pull-requests-available)

> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>  Labels: pull-request-available, pull-requests-available
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> One might be interested in the number of blocks allocated, so it is better to 
> add a field 'num' to the ALLOCATE_BLOCK entry of the scm audit log.
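A sketch of what the ALLOCATE_BLOCK parameters could contain with the proposed field (the map-based shape and parameter names are illustrative assumptions, not the actual SCM audit API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AllocateBlockAuditSketch {
  // Builds the ALLOCATE_BLOCK parameter map with the proposed `num` field.
  static Map<String, String> auditParams(String owner, long size,
      String type, String factor, int num) {
    Map<String, String> auditMap = new LinkedHashMap<>();
    auditMap.put("owner", owner);
    auditMap.put("size", String.valueOf(size));
    auditMap.put("type", type);
    auditMap.put("factor", factor);
    auditMap.put("num", String.valueOf(num)); // the proposed new field
    return auditMap;
  }
}
```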






[jira] [Created] (HDDS-4229) Upload Ozone 1.0.0 sources jar to Apache maven repo

2020-09-10 Thread Siyao Meng (Jira)
Siyao Meng created HDDS-4229:


 Summary: Upload Ozone 1.0.0 sources jar to Apache maven repo
 Key: HDDS-4229
 URL: https://issues.apache.org/jira/browse/HDDS-4229
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Siyao Meng


Ozone artifacts on the Apache maven repo don't have the corresponding sources 
jars.
This leads to a small inconvenience: when debugging an Ozone client program in 
IDEs (e.g. IntelliJ), the IDE won't be able to fetch sources jars directly from 
the maven repo.
A possible workaround is to run {{mvn clean source:jar install -DskipTests}} so 
the local maven repo will have the sources jars available for debugging.

e.g.

for hadoop-ozone-client 1.0.0: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-ozone-client/1.0.0/
We don't have {{*-sources.jar}} files.

for hadoop-client 3.3.0:
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-client/3.3.0/
There are {{hadoop-client-3.3.0-sources.jar}} and 
{{hadoop-client-3.3.0-test-sources.jar}}.







[jira] [Updated] (HDDS-4233) Interrupted exeception printed out from DatanodeStateMachine

2020-09-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-4233:
--
Summary: Interrupted exeception printed out from DatanodeStateMachine  
(was: Interrupted execption printed out from DatanodeStateMachine)

> Interrupted exeception printed out from DatanodeStateMachine
> 
>
> Key: HDDS-4233
> URL: https://issues.apache.org/jira/browse/HDDS-4233
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> A strange exception is visible in the log during normal run:
> {code}
> 2020-09-10 11:31:41 WARN  DatanodeStateMachine:245 - Interrupt the execution.
> java.lang.InterruptedException: sleep interrupted
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:243)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:405)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The most common reason for this is the triggering of a new HB request.
> As this is normal behavior, we shouldn't log the exception at WARN level.






[jira] [Updated] (HDDS-4233) Interrupted exeception printed out from DatanodeStateMachine

2020-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4233:
-
Labels: pull-request-available  (was: )

> Interrupted exeception printed out from DatanodeStateMachine
> 
>
> Key: HDDS-4233
> URL: https://issues.apache.org/jira/browse/HDDS-4233
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>
> A strange exception is visible in the log during normal run:
> {code}
> 2020-09-10 11:31:41 WARN  DatanodeStateMachine:245 - Interrupt the execution.
> java.lang.InterruptedException: sleep interrupted
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:243)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:405)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The most common reason for this is the triggering of a new HB request.
> As this is normal behavior, we shouldn't log the exception at WARN level.






[GitHub] [hadoop-ozone] elek opened a new pull request #1416: HDDS-4233. Interrupted exeception printed out from DatanodeStateMachine

2020-09-10 Thread GitBox


elek opened a new pull request #1416:
URL: https://github.com/apache/hadoop-ozone/pull/1416


   ## What changes were proposed in this pull request?
   
   Hide InterruptedExceptions from the WARN log level, as they are expected.

   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4233
   
   ## How was this patch tested?
   
   CI. Simple change.
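   A minimal sketch of the kind of change meant (exact log wording and the surrounding loop in the patch may differ):
   
   ```java
   import org.slf4j.Logger;
   import org.slf4j.LoggerFactory;
   
   // The interruption is expected (e.g. a new heartbeat was triggered), so
   // it is logged below WARN and the interrupt status is restored.
   class SleepLoopSketch {
     private static final Logger LOG =
         LoggerFactory.getLogger(SleepLoopSketch.class);
   
     void sleepQuietly(long millis) {
       try {
         Thread.sleep(millis);
       } catch (InterruptedException e) {
         LOG.debug("Sleep interrupted, continuing.", e);
         Thread.currentThread().interrupt();
       }
     }
   }
   ```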






[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486417710



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 [HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used, which means that S3 and HCFS couldn't be used together without normalization.
 
-## Goals
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require to create intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it possible to create inconsistent S3 keys)
+
+HDDS-3955 introduced `ozone.om.enable.filesystem.paths`; with this setting we have two possible usage patterns:
+
+| ozone.om.enable.filesystem.paths= | true | false |
+|-|-|-|
+| create intermediate dirs | YES | NO |
+| normalize key names from `ofs/o3fs` | YES | NO |
+| force to normalize key names of `s3` interface | YES (1) | NO |
+| `s3` key `/a/b/c` available from `ofs/o3fs` | YES | NO |
+| `s3` key `/a/b//c` available from `ofs/o3fs` | YES | NO |
+| `s3` key `/a/b//c` available from `s3` | AWS S3 incompatibility | YES |
+
+(1): Under implementation

Review comment:
   > Here AWS S3 incompatibility means, is it because we are showing 
normalized keys?
   
   Yes, keys are normalized. Content can be found under different key names.
   
   I started to define the 100% compatibility here:
   
   
https://github.com/elek/hadoop-ozone/blob/s3-compat/hadoop-ozone/dist/src/main/smoketest/s3/s3-vs-filepath.robot
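   As a concrete illustration of content ending up under a different key name after normalization (a sketch using `java.nio.file` normalization as a stand-in for OM's key normalization):
   
   ```java
   import java.nio.file.Paths;
   
   class KeyNormalizationSketch {
     public static void main(String[] args) {
       String uploaded = "/a/b//c/../d";
       // The stored key differs from the uploaded key name.
       String stored = Paths.get(uploaded).normalize().toString();
       System.out.println(uploaded + " -> " + stored); // /a/b//c/../d -> /a/b/d
     }
   }
   ```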








[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486418379



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 

Review comment:
   > I don't think that is true. Paths are normalized already on the S3 
interface when writing new keys.
   
   But not for reads, if I understood correctly. But happy to remove this line if it's confusing.








[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486423208



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 
- * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular path)
- * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations in case of incompatible key names
- * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible
+This proposal suggests to use a 3rd option where 100% AWS compatibility is guaranteed in exchange for a limited `ofs/o3fs` view:

Review comment:
   Different behavior on bucket level seems to be an interesting idea.
   
   > For buckets created via FS interface, the FS semantics will always take 
precedence. 
   
   How would you define the behavior if the bucket is created from S3? 
   
   I suppose in this case we should support 100% AWS S3 compatibility (without 
forced normalization).
   
   But how would o3fs/ofs work in case of `s3` buckets:
   
1. Partial view from ofs (incompatible keys are hidden)
2. `ofs/o3fs` is disabled (exception), no intermediate directories are 
created.

   
   
   
   








[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486424461



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 

Review comment:
   Also, one disadvantage: bucket level settings have increased complexity. It's harder to define the expected behavior for a specific path. A cluster level setting is easier, as there is one global behavior for the setup.








[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486428009



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes to use a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to normalize the key names based on file-system semantics if `ozone.om.enable.filesystem.paths` is enabled. But please note that `ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS are both used, which means that S3 and HCFS couldn't be used together without normalization.

Review comment:
   Yes, thanks. I also clarified this paragraph a little:
   
   > But please note that `ozone.om.enable.filesystem.paths` should always be 
turned on if S3 and HCFS are both used. It means that if both S3 and HCFS are 
used, normalization is forced, and S3 interface is not fully AWS S3 compatible. 
There is no option to use HCFS and S3 but with full AWS compatibility (and 
reduced HCFS compatibility).
   








[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486428560



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+ 1. `ofs/o3fs` require to create intermediate directory entries (for exapmle 
`/a/b` for the key `/b/c/c`)

Review comment:
   Fixed, thanks.









[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486429661



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486432322



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486433552



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] elek commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-10 Thread GitBox


elek commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r486434809



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[jira] [Created] (HDDS-4233) Interrupted execption printed out from DatanodeStateMachine

2020-09-10 Thread Marton Elek (Jira)
Marton Elek created HDDS-4233:
-

 Summary: Interrupted execption printed out from 
DatanodeStateMachine
 Key: HDDS-4233
 URL: https://issues.apache.org/jira/browse/HDDS-4233
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Marton Elek
Assignee: Marton Elek


A strange exception is visible in the log during normal run:

{code}
2020-09-10 11:31:41 WARN  DatanodeStateMachine:245 - Interrupt the execution.
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at 
org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:243)
at 
org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:405)
at java.lang.Thread.run(Thread.java:748)
{code}


The most common reason for this is the triggering of a new HB request.

As this is normal behavior, we shouldn't log the exception at WARN level.



