[jira] [Updated] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4228:
-
Labels: pull-request-available pull-requests-available  (was: 
pull-requests-available)

> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>  Labels: pull-request-available, pull-requests-available
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> One might be interested in the number of blocks allocated, so it would be
> better to add a field 'num' to the ALLOCATE_BLOCK entry of the scm audit log.
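For illustration, a minimal sketch of how such an audit map could carry the
block count; the class and method names below are hypothetical, not the actual
SCM implementation:
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: ALLOCATE_BLOCK audit parameters including 'num'. */
final class AllocateBlockAuditExample {
  static Map<String, String> auditMap(String owner, long size, int num,
      String type, String factor) {
    Map<String, String> m = new LinkedHashMap<>();
    m.put("owner", owner);
    m.put("size", String.valueOf(size));
    m.put("num", String.valueOf(num)); // the newly proposed field
    m.put("type", type);
    m.put("factor", factor);
    return m;
  }
}
{code}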






[GitHub] [hadoop-ozone] timmylicheng commented on pull request #1413: HDDS-4228: add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread GitBox


timmylicheng commented on pull request #1413:
URL: https://github.com/apache/hadoop-ozone/pull/1413#issuecomment-690017940


   LGTM. +1






[GitHub] [hadoop-ozone] maobaolong commented on pull request #1407: HDDS-4158. Provide a class type for Java based configuration

2020-09-09 Thread GitBox


maobaolong commented on pull request #1407:
URL: https://github.com/apache/hadoop-ozone/pull/1407#issuecomment-690017443


   @adoroszlai Please take a look at this PR; it will help us use the `Java 
based configuration` conveniently.






[GitHub] [hadoop-ozone] maobaolong commented on pull request #1290: HDDS-4064. Show container verbose info with verbose option

2020-09-09 Thread GitBox


maobaolong commented on pull request #1290:
URL: https://github.com/apache/hadoop-ozone/pull/1290#issuecomment-690015532


   @adoroszlai @xiaoyuyao @elek Thanks for your review.






[jira] [Resolved] (HDDS-4064) Show container verbose info with verbose option

2020-09-09 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong resolved HDDS-4064.
--
Fix Version/s: 1.1.0
   Resolution: Fixed

> Show container verbose info with verbose option
> ---
>
> Key: HDDS-4064
> URL: https://issues.apache.org/jira/browse/HDDS-4064
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone CLI
>Affects Versions: 1.1.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>







[GitHub] [hadoop-ozone] captainzmc closed pull request #1412: HDDS-3751. Ozone sh client support bucket quota option.

2020-09-09 Thread GitBox


captainzmc closed pull request #1412:
URL: https://github.com/apache/hadoop-ozone/pull/1412


   






[jira] [Updated] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:

Labels: pull-requests-available  (was: pull-request-available)

> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>  Labels: pull-requests-available
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> One might be interested in the number of blocks allocated, so it would be
> better to add a field 'num' to the ALLOCATE_BLOCK entry of the scm audit log.






[GitHub] [hadoop-ozone] GlenGeng opened a new pull request #1413: HDDS-4228: add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread GitBox


GlenGeng opened a new pull request #1413:
URL: https://github.com/apache/hadoop-ozone/pull/1413


   ## What changes were proposed in this pull request?
   
   The scm audit log for ALLOCATE_BLOCK is as follows:
   `2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, size=268435456, 
type=RATIS, factor=THREE} | ret=SUCCESS |`
    
   One might be interested in the number of blocks allocated, so it would be 
better to add a field 'num' to the ALLOCATE_BLOCK entry of the scm audit log.
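   
   For illustration only (the `num=1` value below is hypothetical), the audit 
line could then look like:
   `2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, size=268435456, 
num=1, type=RATIS, factor=THREE} | ret=SUCCESS |`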
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4228
   
   
   ## How was this patch tested?
   
   CI






[jira] [Updated] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4228:
-
Labels: pull-request-available  (was: )

> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>  Labels: pull-request-available
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> One might be interested in the number of blocks allocated, so it would be
> better to add a field 'num' to the ALLOCATE_BLOCK entry of the scm audit log.






[jira] [Updated] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:

Description: 
 

The scm audit log for ALLOCATE_BLOCK is as follows:
{code:java}
2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, size=268435456, 
type=RATIS, factor=THREE} | ret=SUCCESS |{code}
 

One might be interested in the number of blocks allocated, so it would be 
better to add a field 'num' to the ALLOCATE_BLOCK entry of the scm audit log.

  was:
 

The scm audit log for ALLOCATE_BLOCK is as follows:
{code:java}
2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, size=268435456, 
type=RATIS, factor=THREE} | ret=SUCCESS |{code}
 

Better add num of blocks allocated into the audit log, one might be interested 
about the num of blocks allocated, better add field 'num' to 


> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> One might be interested in the number of blocks allocated, so it would be
> better to add a field 'num' to the ALLOCATE_BLOCK entry of the scm audit log.






[jira] [Updated] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:

Description: 
 

The scm audit log for ALLOCATE_BLOCK is as follows:
{code:java}
2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, size=268435456, 
type=RATIS, factor=THREE} | ret=SUCCESS |{code}
 

Better add num of blocks allocated into the audit log, one might be interested 
about the num of blocks allocated, better add field 'num' to 

  was:
 

The sac
{code:java}
2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, size=268435456, 
type=RATIS, factor=THREE} | ret=SUCCESS |{code}


> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>
>  
> The scm audit log for ALLOCATE_BLOCK is as follows:
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}
>  
> Better add num of blocks allocated into the audit log, one might be 
> interested about the num of blocks allocated, better add field 'num' to 






[jira] [Updated] (HDDS-4228) add field 'num' to ALLOCATE_BLOCK of scm audit log.

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:

Summary: add field 'num' to ALLOCATE_BLOCK of scm audit log.  (was: 
ALLOCATE_BLOCK of scm audit log miss num)

> add field 'num' to ALLOCATE_BLOCK of scm audit log.
> ---
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>
>  
> The sac
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}






[jira] [Updated] (HDDS-4228) ALLOCATE_BLOCK of scm audit log miss num

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:

Description: 
 

The sac
{code:java}
2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, size=268435456, 
type=RATIS, factor=THREE} | ret=SUCCESS |{code}

  was:
Teragen reported to be slow with low number of mappers compared to HDFS.

In my test (one pipeline, 3 yarn nodes) 10 g teragen with HDFS was ~3 mins but 
with Ozone it was 6 mins. It could be fixed with using more mappers, but when I 
investigated the execution I found a few problems reagrding to the BufferPool 
management.

 1. IncrementalChunkBuffer is slow and it might not be required as BufferPool 
itself is incremental
 2. For each write operation the bufferPool.allocateBufferIfNeeded is called 
which can be a slow operation (positions should be calculated).
 3. There is no explicit support for write(byte) operations

In the flamegraph it's clearly visible that with low number of mappers the 
client is busy with buffer operations. After the patch the rpc call and the 
checksum calculation give the majority of the time. 


> ALLOCATE_BLOCK of scm audit log miss num
> 
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>
>  
> The sac
> {code:java}
> 2020-09-10 03:42:08,196 | INFO | SCMAudit | user=root | ip=172.16.90.221 | 
> op=ALLOCATE_BLOCK {owner=7da0b4c4-d053-4fa0-8648-44ff0b8ba1bf, 
> size=268435456, type=RATIS, factor=THREE} | ret=SUCCESS |{code}






[jira] [Assigned] (HDDS-4228) ALLOCATE_BLOCK of scm audit log miss num

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng reassigned HDDS-4228:
---

Assignee: Glen Geng  (was: Marton Elek)

> ALLOCATE_BLOCK of scm audit log miss num
> 
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>  Labels: pull-request-available
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 YARN nodes) a 10 GB teragen took ~3 mins with HDFS 
> but 6 mins with Ozone. It could be worked around by using more mappers, but 
> when I investigated the execution I found a few problems regarding the 
> BufferPool management:
>  1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
> itself is incremental.
>  2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
> which can be a slow operation (positions should be calculated).
>  3. There is no explicit support for write(byte) operations.
> In the flamegraph it's clearly visible that with a low number of mappers the 
> client is busy with buffer operations. After the patch, the rpc call and the 
> checksum calculation take the majority of the time.






[jira] [Updated] (HDDS-4228) ALLOCATE_BLOCK of scm audit log miss num

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:

Labels:   (was: pull-request-available)

> ALLOCATE_BLOCK of scm audit log miss num
> 
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Glen Geng
>Priority: Minor
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 YARN nodes) a 10 GB teragen took ~3 mins with HDFS 
> but 6 mins with Ozone. It could be worked around by using more mappers, but 
> when I investigated the execution I found a few problems regarding the 
> BufferPool management:
>  1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
> itself is incremental.
>  2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
> which can be a slow operation (positions should be calculated).
>  3. There is no explicit support for write(byte) operations.
> In the flamegraph it's clearly visible that with a low number of mappers the 
> client is busy with buffer operations. After the patch, the rpc call and the 
> checksum calculation take the majority of the time.






[jira] [Updated] (HDDS-4228) ALLOCATE_BLOCK of scm audit log miss num

2020-09-09 Thread Glen Geng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glen Geng updated HDDS-4228:

Priority: Minor  (was: Blocker)

> ALLOCATE_BLOCK of scm audit log miss num
> 
>
> Key: HDDS-4228
> URL: https://issues.apache.org/jira/browse/HDDS-4228
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Glen Geng
>Assignee: Marton Elek
>Priority: Minor
>  Labels: pull-request-available
>
> Teragen was reported to be slow with a low number of mappers compared to HDFS.
> In my test (one pipeline, 3 YARN nodes) a 10 GB teragen took ~3 mins with HDFS 
> but 6 mins with Ozone. It could be worked around by using more mappers, but 
> when I investigated the execution I found a few problems regarding the 
> BufferPool management:
>  1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
> itself is incremental.
>  2. For each write operation bufferPool.allocateBufferIfNeeded is called, 
> which can be a slow operation (positions should be calculated).
>  3. There is no explicit support for write(byte) operations.
> In the flamegraph it's clearly visible that with a low number of mappers the 
> client is busy with buffer operations. After the patch, the rpc call and the 
> checksum calculation take the majority of the time.






[jira] [Created] (HDDS-4228) ALLOCATE_BLOCK of scm audit log miss num

2020-09-09 Thread Glen Geng (Jira)
Glen Geng created HDDS-4228:
---

 Summary: ALLOCATE_BLOCK of scm audit log miss num
 Key: HDDS-4228
 URL: https://issues.apache.org/jira/browse/HDDS-4228
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Glen Geng
Assignee: Marton Elek


Teragen was reported to be slow with a low number of mappers compared to HDFS.

In my test (one pipeline, 3 YARN nodes) a 10 GB teragen took ~3 mins with HDFS 
but 6 mins with Ozone. It could be worked around by using more mappers, but 
when I investigated the execution I found a few problems regarding the 
BufferPool management:

 1. IncrementalChunkBuffer is slow and it might not be required, as BufferPool 
itself is incremental.
 2. For each write operation bufferPool.allocateBufferIfNeeded is called, which 
can be a slow operation (positions should be calculated).
 3. There is no explicit support for write(byte) operations.

In the flamegraph it's clearly visible that with a low number of mappers the 
client is busy with buffer operations. After the patch, the rpc call and the 
checksum calculation take the majority of the time.






[jira] [Updated] (HDDS-2416) Ozone Trash Feature

2020-09-09 Thread YiSheng Lien (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YiSheng Lien updated HDDS-2416:
---
Attachment: (was: Process deleting keys in trash-enabled buckets.html)

> Ozone Trash Feature
> ---
>
> Key: HDDS-2416
> URL: https://issues.apache.org/jira/browse/HDDS-2416
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Matthew Sharp
>Assignee: Matthew Sharp
>Priority: Minor
> Attachments: Ozone_Trash_Feature.docx
>
>
> This Jira is a proposal to add a new feature to Ozone that provides a user 
> with the ability to recover keys that may have been deleted accidentally.  
> This would be similar to the HDFS trash feature.
> The attached document outlines the proposal and considerations for this 
> feature.
> And this [Question-Doc|https://hackmd.io/@cxorm/B1Hmsd4t8] addresses 
> questions related to the feature; it will be continuously updated, so please 
> feel free to update it. (Or let us know your question and we will update it, 
> thanks.)






[jira] [Updated] (HDDS-4227) Implement a "prepareForUpgrade" step that applies all committed transactions onto the OM state machine.

2020-09-09 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-4227:

Description: 
*Why is this needed?*
Through HDDS-4143, we have a generic factory to handle multiple versions of 
apply-transaction implementations based on layout version. Hence, this factory 
can be used to handle versioned requests across layout versions whenever both 
versions need to exist in the code (say, for HDDS-2939).

However, it has been noticed that the OM Ratis requests are still undergoing a 
lot of minor changes (HDDS-4007, HDDS-3903), and in these cases it will become 
hard to maintain two versions of the code just to support clean upgrades.

Hence, the plan is to build a pre-upgrade utility (client API) that makes sure 
an OM instance has no "un-applied" transactions in its Raft log. Invoking this 
client API makes sure that the upgrade starts with a clean state. Of course, 
this is needed only in an HA setup. In a non-HA setup it can either be skipped, 
or when invoked it will be a no-op (non-Ratis) or cause no harm (single-node 
Ratis).

*How does it work?*
Before updating the software bits, the goal is to get the OMs to the latest 
state with respect to applied transactions, so that the same version of the 
code executes the apply-transaction (AT) step on all 3 OMs. At a high level, 
the flow is as follows (a rough sketch follows the list).

* Before the upgrade, *stop* the OMs.
* Start the OMs with a special flag --prepareUpgrade (something like --init: a 
special state that stops the ephemeral OM instance after doing some work).
* When an OM is started with the --prepareUpgrade flag, it does not start the 
RPC server, so no new requests can get in.
* In this state, we give every OM time to apply transactions up to the last txn.
* We know that at least 2 OMs would have gotten the last client request 
transaction committed into their log. Hence, those 2 OMs are expected to apply 
transactions to that index faster.
* At every OM, the Raft log will be purged after this wait period (so that 
replay does not happen), and a Ratis snapshot taken at the last txn.
* Even if there is a lagging OM that is unable to reach the last applied txn 
index, its logs will be purged after the wait time expires.
* Now when the OMs are started with the newer version, all of them will run the 
new code.
* The lagging OM will get the new Ratis snapshot, since there are no logs to 
replay from.
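A minimal sketch of the wait-and-purge idea described above, assuming
hypothetical helper methods in place of the real OM/Ratis API:
{code:java}
/** Illustrative sketch of the --prepareUpgrade flow; helpers are hypothetical. */
abstract class PrepareUpgradeSketch {
  abstract long lastCommittedTxnIndex(); // from the Raft log
  abstract long lastAppliedTxnIndex();   // from the OM state machine
  abstract void takeRatisSnapshot(long txnIndex);
  abstract void purgeRaftLog();

  void prepareForUpgrade(long waitMillis) throws InterruptedException {
    long deadline = System.currentTimeMillis() + waitMillis;
    // The RPC server is not started in this mode, so no new requests arrive.
    while (lastAppliedTxnIndex() < lastCommittedTxnIndex()
        && System.currentTimeMillis() < deadline) {
      Thread.sleep(1000); // give the state machine time to apply transactions
    }
    // Snapshot at the last applied txn and purge the log so no replay happens,
    // even for a lagging OM whose applied index is still behind.
    takeRatisSnapshot(lastAppliedTxnIndex());
    purgeRaftLog();
  }
}
{code}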

> Implement a "prepareForUpgrade" step that applies all committed transactions 
> onto the OM state machine.
> ---
>
> Key: HDDS-4227
> URL: https://issues.apache.org/jira/browse/HDDS-4227
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
> Fix For: 1.1.0
>
>
> *Why is this needed?*
> Through HDDS-4143, we have a generic factory to handle multiple versions of 
> apply-transaction implementations based on layout version. Hence, this 
> factory can be used to handle versioned requests across layout versions 
> whenever both versions need to exist in the code (say, for HDDS-2939).
> However, it has been noticed that the OM Ratis requests are still undergoing 
> a lot of minor changes (HDDS-4007, HDDS-3903), and in these cases it will 
> become hard to maintain two versions of the code just to support clean 
> upgrades.
> Hence, the plan is to build a pre-upgrade utility (client API) that makes 
> sure an OM instance has no "un-applied" transactions in its Raft log. 
> Invoking this client API makes sure that the upgrade starts with a clean 
> state. Of course, this is needed only in an HA setup. In a non-HA setup it 
> can either be skipped, or when invoked it will be a no-op (non-Ratis) or 
> cause no harm (single-node Ratis).
> *How does it work?*
> Before updating the software bits, the goal is to get the OMs to the latest 
> state with respect to applied transactions, so that the same version of the 
> code executes the apply-transaction (AT) step on all 3 OMs. At a high level, 
> the flow is as follows.
> * Before the upgrade, *stop* the OMs.
> * Start the OMs with a special flag --prepareUpgrade (something like --init: 
> a special state that stops the ephemeral OM instance after doing some work).
> * When an OM is started with the --prepareUpgrade flag, it does not start the 
> RPC server, so no new requests can get in.
> * In this state, we give every OM time to apply transactions up to the last 
> txn.
> * We know that at least 2 OMs would have gotten the last client request 
> transaction committed into their log. Hence, those 2 OMs are expected to 
> apply transactions to that index faster.
> * At every OM, the Raft log will be purged after this wait period (so that 
> replay does not happen), and a Ratis snapshot taken at the last txn.
> * Even if there is a lagging OM that is unable to reach the last applied txn 
> index, its logs will be purged after the wait time expires.
> * Now when the OMs are started with the newer version, all of them will run 
> the new code.
> * The lagging OM will get the new Ratis snapshot, since there are no logs to 
> replay from.

[GitHub] [hadoop-ozone] amaliujia edited a comment on pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-09 Thread GitBox


amaliujia edited a comment on pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#issuecomment-689826469


   R @timmylicheng @nandakumar131 
   
   I am thinking maybe we can first merge this PR and create a JIRA to track 
the remaining work. Right now, per feedback, this command could print more 
information about Ratis peers, e.g. leader/follower roles, leader term, etc.
   
   I took a look at how OM HA does it: 
https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java#L2535
   
   Basically it seems such a request will directly hit the leader OM, so 
getting the status will be much easier. Currently, in SCM HA, we haven't 
reached the point of having a robust Ratis setup.
   






[GitHub] [hadoop-ozone] amaliujia commented on pull request #1346: HDDS-4115. CLI command to show current SCM leader and follower status.

2020-09-09 Thread GitBox


amaliujia commented on pull request #1346:
URL: https://github.com/apache/hadoop-ozone/pull/1346#issuecomment-689826469


   R @timmylicheng @nandakumar131 
   
   I am thinking maybe we can first merge this PR and create a JIRA to track 
the remaining work. Right now, per feedback, this command could print more 
information about Ratis peers, e.g. leader/follower roles, leader term, etc.
   
   I took a look at how OM HA does it: 
https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java#L2535
   
   Basically it seems such a request will directly hit the leader OM, so 
getting the status will be much easier. Currently, in SCM HA, we haven't 
reached the point of having a robust Ratis setup.
   






[jira] [Created] (HDDS-4227) Implement a "prepareForUpgrade" step that applies all committed transactions onto the OM state machine.

2020-09-09 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-4227:
---

 Summary: Implement a "prepareForUpgrade" step that applies all 
committed transactions onto the OM state machine.
 Key: HDDS-4227
 URL: https://issues.apache.org/jira/browse/HDDS-4227
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Manager
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan
 Fix For: 1.1.0









[GitHub] [hadoop-ozone] smengcl commented on pull request #1401: HDDS-3927. Rename Ozone OM,DN,SCM runtime options to conform to naming conventions

2020-09-09 Thread GitBox


smengcl commented on pull request #1401:
URL: https://github.com/apache/hadoop-ozone/pull/1401#issuecomment-689797013


   Thanks @elek for the review. I have made changes accordingly.
   
   Do the changes look good to you?






[jira] [Created] (HDDS-4226) Cleanup OM snapshots left after a failed installSnapshot

2020-09-09 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created HDDS-4226:
---

 Summary: Cleanup OM snapshots left after a failed installSnapshot
 Key: HDDS-4226
 URL: https://issues.apache.org/jira/browse/HDDS-4226
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Mukul Kumar Singh


OzoneManager tries to install the snapshot:
{code:java}
2020-09-09 22:07:14,830 [pool-144-thread-1] INFO  om.OzoneManager 
(OzoneManager.java:installCheckpoint(3159)) - Installing checkpoint with 
OMTransactionInfo 2#68754
2020-09-09 22:07:14,831 [grpc-default-executor-50] INFO  impl.RaftServerImpl 
(RaftServerImpl.java:installSnapshot(1127)) - omNode-2@group-D62218D261DE: 
reply installSnapshot: omNode-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
{code}

It failed because of the issues from HDDS-4224.
{code:java}
2020-09-09 22:07:14,831 [pool-144-thread-1] ERROR om.OzoneManager 
(OzoneManager.java:installSnapshotFromLeader(3141)) - Failed to install 
snapshot from Leader OM: {}
java.lang.NullPointerException
at 
org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3168)
at 
org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3162)
at 
org.apache.hadoop.ozone.om.OzoneManager.installSnapshotFromLeader(OzoneManager.java:3139)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$notifyInstallSnapshotFromLeader$4(OzoneManagerStateMachine.java:372)
at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
 

The checkpoint is left in the snapshot directory.
{code:java}
➜  chaos-2020-09-09-22-05-33-IST ls 
MiniOzoneClusterImpl-71baac34-2321-4756-ba1e-5834c5628047/omNode-2/ratis/snapshot/om.db-omNode-1-1599669
om.db-omNode-1-1599669432684/  om.db-omNode-1-1599669451421/  
om.db-omNode-1-1599669478149/  om.db-omNode-1-1599669504818/  
om.db-omNode-1-1599669533577/  om.db-omNode-1-1599669566509/
om.db-omNode-1-1599669433775/  om.db-omNode-1-1599669453030/  
om.db-omNode-1-1599669480273/  om.db-omNode-1-1599669507385/  
om.db-omNode-1-1599669535603/  om.db-omNode-1-1599669568325/
om.db-omNode-1-1599669434867/  om.db-omNode-1-1599669454688/  
om.db-omNode-1-1599669482206/  om.db-omNode-1-1599669509373/  
om.db-omNode-1-1599669537716/  om.db-omNode-1-1599669570186/
om.db-omNode-1-1599669435886/  om.db-omNode-1-1599669456346/  
om.db-omNode-1-1599669484256/  om.db-omNode-1-1599669511241/  
om.db-omNode-1-1599669540574/  om.db-omNode-1-1599669572150/
om.db-omNode-1-1599669437199/  om.db-omNode-1-1599669458194/  
om.db-omNode-1-1599669486200/  om.db-omNode-1-1599669513051/  
om.db-omNode-1-1599669543136/  om.db-omNode-1-1599669574811/
om.db-omNode-1-1599669438519/  om.db-omNode-1-1599669459992/  
om.db-omNode-1-1599669487968/  om.db-omNode-1-1599669515343/  
om.db-omNode-1-1599669546272/  om.db-omNode-1-1599669576833/
om.db-omNode-1-1599669439819/  om.db-omNode-1-1599669461897/  
om.db-omNode-1-1599669490218/  om.db-omNode-1-1599669517332/  
om.db-omNode-1-1599669548363/  om.db-omNode-1-1599669578680/
om.db-omNode-1-1599669441209/  om.db-omNode-1-1599669463871/  
om.db-omNode-1-1599669492005/  om.db-omNode-1-1599669519320/  
om.db-omNode-1-1599669551596/  om.db-omNode-1-1599669580427/
om.db-omNode-1-1599669442606/  om.db-omNode-1-1599669465810/  
om.db-omNode-1-1599669493727/  om.db-omNode-1-1599669521491/  
om.db-omNode-1-1599669554153/  om.db-omNode-1-1599669582124/
om.db-omNode-1-1599669443967/  om.db-omNode-1-1599669467909/  
om.db-omNode-1-1599669495587/  om.db-omNode-1-1599669523436/  
om.db-omNode-1-1599669556370/  om.db-omNode-1-1599669583768/
om.db-omNode-1-1599669445468/  om.db-omNode-1-1599669470054/  
om.db-omNode-1-1599669497445/  om.db-omNode-1-1599669525567/  
om.db-omNode-1-1599669558461/  om.db-omNode-1-1599669585501/
om.db-omNode-1-1599669446937/  om.db-omNode-1-1599669472125/  
om.db-omNode-1-1599669499362/  om.db-omNode-1-1599669527648/  
om.db-omNode-1-1599669560578/
om.db-omNode-1-1599669448360/  om.db-omNode-1-1599669474051/  
om.db-omNode-1-1599669501269/  om.db-omNode-1-1599669529648/  
om.db-omNode-1-1599669562666/
om.db-omNode-1-1599669449867/  om.db-omNode-1-1599669476078/  
om.db-omNode-1-1599669503036/  om.db-omNode-1-1599669531573/  
om.db-omNode-1-1599669564620/ {code}
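A minimal sketch of what such a cleanup could look like, assuming the om.db-*
naming seen in the listing above; this is only an illustration, not the actual
fix (a real implementation must keep any checkpoint that is still in use):
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Illustrative only: remove leftover om.db-* checkpoint directories. */
final class StaleCheckpointCleaner {
  static void cleanup(Path snapshotDir) throws IOException {
    try (Stream<Path> entries = Files.list(snapshotDir)) {
      for (Path dir : (Iterable<Path>) entries.filter(
          p -> p.getFileName().toString().startsWith("om.db-"))::iterator) {
        deleteRecursively(dir); // left behind by a failed installSnapshot
      }
    }
  }

  private static void deleteRecursively(Path root) throws IOException {
    try (Stream<Path> walk = Files.walk(root)) {
      // Delete children before parents.
      for (Path p : (Iterable<Path>) walk.sorted(
          Comparator.reverseOrder())::iterator) {
        Files.delete(p);
      }
    }
  }
}
{code}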






[GitHub] [hadoop-ozone] avijayanhwx commented on pull request #1400: HDDS-4141. Implement Finalize command in Ozone Manager client.

2020-09-09 Thread GitBox


avijayanhwx commented on pull request #1400:
URL: https://github.com/apache/hadoop-ozone/pull/1400#issuecomment-689756069


   Thanks for working on this @fapifta, and for the reviews @linyiqun , 
@sodonnel. 






[jira] [Updated] (HDDS-4143) Implement a factory for OM Requests that returns an instance based on layout version.

2020-09-09 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-4143:

Description: 
* Add the current layout version (MLV) to the OM Ratis request. If there is no 
layout version   present, we can default to '0'.
* Implement a generic factory which stores different instances of type 'T' 
sharded by a key & version. A single key can be associated with different 
versions of 'T'. This is to support a typical use case during upgrade: to have 
multiple versions of a class / method / object and choose them based on the 
current layout version at runtime. Before finalizing, an older version is typically 
needed, and after finalize, a newer version is needed.
* Using the generic factory, we scan all the different OM "write" requests and 
associate them with versions.
* Layout feature code refactoring. Added more comments and tests.
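A minimal sketch of such a version-aware factory, with hypothetical names (not
the actual HDDS-4143 code): lookup picks the newest registered instance whose
layout version does not exceed the current metadata layout version (MLV).
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: instances of T keyed by (requestName, layoutVersion). */
final class LayoutVersionFactory<T> {
  private final Map<String, TreeMap<Integer, T>> instances = new HashMap<>();

  void register(String key, int layoutVersion, T instance) {
    instances.computeIfAbsent(key, k -> new TreeMap<>())
        .put(layoutVersion, instance);
  }

  /** Returns the newest instance usable at the given layout version. */
  T get(String key, int currentLayoutVersion) {
    TreeMap<Integer, T> versions = instances.get(key);
    if (versions == null) {
      throw new IllegalArgumentException("Unknown key: " + key);
    }
    Map.Entry<Integer, T> e = versions.floorEntry(currentLayoutVersion);
    if (e == null) {
      throw new IllegalStateException("No instance of " + key
          + " for layout version " + currentLayoutVersion);
    }
    return e.getValue();
  }
}
{code}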

  was:Add the current layout version (MLV) to the OM Ratis request. If there is 
no layout version   present, we can default to '0'.


> Implement a factory for OM Requests that returns an instance based on layout 
> version.
> -
>
> Key: HDDS-4143
> URL: https://issues.apache.org/jira/browse/HDDS-4143
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> * Add the current layout version (MLV) to the OM Ratis request. If there is 
> no layout version   present, we can default to '0'.
> * Implement a generic factory which stores different instances of type 'T' 
> sharded by a key & version. A single key can be associated with different 
> versions of 'T'. This is to support a typical use case during upgrade: to have 
> multiple versions of a class / method / object and choose them based on the 
> current layout version at runtime. Before finalizing, an older version is 
> typically needed, and after finalize, a newer version is needed.
> * Using the generic factory, we scan all the different OM "write" requests 
> and associate them with versions.
> * Layout feature code refactoring. Added more comments and tests.






[jira] [Resolved] (HDDS-4141) Implement Finalize command in Ozone Manager client.

2020-09-09 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan resolved HDDS-4141.
-
Resolution: Fixed

PR merged.

> Implement Finalize command in Ozone Manager client.
> ---
>
> Key: HDDS-4141
> URL: https://issues.apache.org/jira/browse/HDDS-4141
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: István Fajth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> * On the client side, add a new command to finalize OM through CLI.






[GitHub] [hadoop-ozone] avijayanhwx merged pull request #1400: HDDS-4141. Implement Finalize command in Ozone Manager client.

2020-09-09 Thread GitBox


avijayanhwx merged pull request #1400:
URL: https://github.com/apache/hadoop-ozone/pull/1400


   






[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1400: HDDS-4141. Implement Finalize command in Ozone Manager client.

2020-09-09 Thread GitBox


avijayanhwx commented on a change in pull request #1400:
URL: https://github.com/apache/hadoop-ozone/pull/1400#discussion_r485845129



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/upgrade/OMFinalizeUpgradeProgressResponse.java
##
@@ -0,0 +1,45 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+
+package org.apache.hadoop.ozone.om.response.upgrade;
+
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OmMetadataManagerImpl;
+import org.apache.hadoop.ozone.om.response.CleanupTableInfo;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+
+import java.io.IOException;
+
+/**
+ * Response for finalizeUpgradeProgress request.
+ */
+// yepp this will not be a write request, adding a table here to the annotation
+// just to pass tests related to this annotation.
+@CleanupTableInfo(cleanupTables = { OmMetadataManagerImpl.USER_TABLE })

Review comment:
   @fapifta I am OK with handling this in a follow-up patch. 








[GitHub] [hadoop-ozone] smengcl merged pull request #1397: HDDS-4211. [OFS] Better owner and group display for listing Ozone volumes and buckets

2020-09-09 Thread GitBox


smengcl merged pull request #1397:
URL: https://github.com/apache/hadoop-ozone/pull/1397


   






[jira] [Updated] (HDDS-4211) [OFS] Better owner and group display for listing Ozone volumes and buckets

2020-09-09 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-4211:
-
   Fix Version/s: 1.1.0
Target Version/s:   (was: 1.1.0)
  Resolution: Fixed
  Status: Resolved  (was: Patch Available)

> [OFS] Better owner and group display for listing Ozone volumes and buckets
> --
>
> Key: HDDS-4211
> URL: https://issues.apache.org/jira/browse/HDDS-4211
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> Improve volumes' and buckets' owner and group display when listing in OFS.
> 1. Display short name instead of full Kerberos principal.
> 2. For volumes, get the actual group of the owner (currently it is the volume 
> admin name, which is incorrect).
> 3. For buckets, display the owner and group of its parent volume.






[GitHub] [hadoop-ozone] smengcl commented on a change in pull request #1397: HDDS-4211. [OFS] Better owner and group display for listing Ozone volumes and buckets

2020-09-09 Thread GitBox


smengcl commented on a change in pull request #1397:
URL: https://github.com/apache/hadoop-ozone/pull/1397#discussion_r485820978



##
File path: 
hadoop-ozone/ozonefs-common/src/main/java/org/apache/hadoop/fs/ozone/BasicRootedOzoneClientAdapterImpl.java
##
@@ -972,24 +992,26 @@ private static FileStatusAdapter 
getFileStatusAdapterForVolume(
* Generate a FileStatusAdapter for a bucket.
* @param ozoneBucket OzoneBucket object.
* @param uri Full URI to OFS root.
+   * @param owner Owner of the parent volume of the bucket.
+   * @param group Group of the parent volume of the bucket.
* @return FileStatusAdapter for a bucket.
*/
   private static FileStatusAdapter getFileStatusAdapterForBucket(
-  OzoneBucket ozoneBucket, URI uri, String username) {
+  OzoneBucket ozoneBucket, URI uri, String owner, String group) {
 String pathStr = uri.toString() +
 OZONE_URI_DELIMITER + ozoneBucket.getVolumeName() +
 OZONE_URI_DELIMITER + ozoneBucket.getName();
 if (LOG.isDebugEnabled()) {
-  LOG.debug("getFileStatusAdapterForBucket: ozoneBucket={}, pathStr={}, "
-  + "username={}", ozoneBucket.getVolumeName() + 
OZONE_URI_DELIMITER
-  + ozoneBucket.getName(), pathStr, username);
+  LOG.debug("getFileStatusAdapterForBucket: ozoneBucket={}, pathStr={}",
+  ozoneBucket.getVolumeName() + OZONE_URI_DELIMITER +
+  ozoneBucket.getName(), pathStr);
 }
 Path path = new Path(pathStr);
 return new FileStatusAdapter(0L, path, true, (short)0, 0L,
 ozoneBucket.getCreationTime().getEpochSecond() * 1000, 0L,
-FsPermission.getDirDefault().toShort(),  // TODO: derive from ACLs 
later
-// TODO: revisit owner and group
-username, username, path, new BlockLocation[0]);
+FsPermission.getDirDefault().toShort(),
+// TODO: maybe derive owner and group from ACLs later

Review comment:
   Thanks! Removed the comment.








[GitHub] [hadoop-ozone] arp7 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


arp7 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485820636



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 [HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
 
-## Goals
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibilty is prefered)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require to create intermediate directory entries (for exapmle 
`/a/b` for the key `/b/c/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done with compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 keys)
+
+HDDS-3955 introduced `ozone.om.enable.filesystem.paths`, with this setting we 
will have two possible usage pattern:
+
+| ozone.om.enable.filesystem.paths= | true | false
+|-|-|-|
+| create itermediate dirs | YES | NO |
+| normalize key names from `ofs/o3fs` | YES | NO
+| force to normalize key names of `s3` interface | YES (1) | NO 
+| `s3` key `/a/b/c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `s3` | AWS S3 incompatibility | YES
+
+(1): Under implementation
 
- * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular path)
- * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations in case of incompatible key names
- * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible
+This proposal suggest to use a 3rd option where 100% AWS compatiblity is 
guaranteed in exchange of a limited `ofs/o3fs` view:

Review comment:
   > an existing bucket created via CLI can be exposed to S3; what semantics 
will that bucket get?
   If the bucket was created via the FS interface, it will support FS semantics.
   
   > Bucket creation is possible only via OFS; what about o3fs?
   Good point; for buckets created via the Ozone shell, we could accept a 
command-line flag. The default can be filesystem, because S3 buckets are 
traditionally created via the S3 API. You're right, this needs some more 
discussion.


[GitHub] [hadoop-ozone] arp7 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


arp7 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485820636



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 [HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
 
-## Goals
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibilty is prefered)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require to create intermediate directory entries (for exapmle 
`/a/b` for the key `/b/c/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done with compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 keys)
+
+HDDS-3955 introduced `ozone.om.enable.filesystem.paths`, with this setting we 
will have two possible usage pattern:
+
+| ozone.om.enable.filesystem.paths= | true | false
+|-|-|-|
+| create itermediate dirs | YES | NO |
+| normalize key names from `ofs/o3fs` | YES | NO
+| force to normalize key names of `s3` interface | YES (1) | NO 
+| `s3` key `/a/b/c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `s3` | AWS S3 incompatibility | YES
+
+(1): Under implementation
 
- * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular path)
- * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations in case of incompatible key names
- * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible
+This proposal suggests a 3rd option where 100% AWS compatibility is guaranteed in exchange for a limited `ofs/o3fs` view:

Review comment:
   bq. an existing bucket can be exposed created via CLI to be exposed to 
S3, what semantics that bucket will get?
   If the bucket was created via FS interface, it will support FS semantics.
   
   bq. Buckets creation is possible via only OFS, what about O3fs?
   Good point, for buckets created via the Ozone shell, we could accept a 
command-line flag. The default can be filesystem because S3 buckets are 
traditionally created via the S3 API. You're right this needs some more 
discussion.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] fapifta commented on a change in pull request #1400: HDDS-4141. Implement Finalize command in Ozone Manager client.

2020-09-09 Thread GitBox


fapifta commented on a change in pull request #1400:
URL: https://github.com/apache/hadoop-ozone/pull/1400#discussion_r485819842



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/upgrade/OMFinalizeUpgradeProgressResponse.java
##
@@ -0,0 +1,45 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+
+package org.apache.hadoop.ozone.om.response.upgrade;
+
+import org.apache.hadoop.hdds.utils.db.BatchOperation;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OmMetadataManagerImpl;
+import org.apache.hadoop.ozone.om.response.CleanupTableInfo;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+
+import java.io.IOException;
+
+/**
+ * Response for finalizeUpgradeProgress request.
+ */
+// yepp this will not be a write request, adding a table here to the annotation
+// just to pass tests related to this annotation.
+@CleanupTableInfo(cleanupTables = { OmMetadataManagerImpl.USER_TABLE })

Review comment:
   I already started to work on the server side; one of the first steps is to make this request a read request and delete this class, as read requests are handled in a different way. I can push that commit to this PR, or it will be addressed in the PR coming for HDDS-4172.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485817494



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 [HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
 
-## Goals
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`; see the sketch below)
+ 2. Special file-system incompatible key names require special attention
+
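To make the first point concrete, the implied parent entries of a key can be derived like this (a sketch under assumed names, not the `createFile` implementation):

```java
import java.util.ArrayList;
import java.util.List;

public final class IntermediateDirs {
  private IntermediateDirs() { }

  // For the key "a/b/c" an fs view implies directory entries "a/" and
  // "a/b/" even though only the leaf key exists in the flat store.
  static List<String> parentEntries(String keyName) {
    List<String> dirs = new ArrayList<>();
    int slash = keyName.indexOf('/');
    while (slash >= 0 && slash < keyName.length() - 1) {
      dirs.add(keyName.substring(0, slash + 1));
      slash = keyName.indexOf('/', slash + 1);
    }
    return dirs;
  }

  public static void main(String[] args) {
    System.out.println(parentEntries("a/b/c")); // [a/, a/b/]
  }
}
```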
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 keys)
+
+HDDS-3955 introduced `ozone.om.enable.filesystem.paths`; with this setting we have two possible usage patterns:
+
+| ozone.om.enable.filesystem.paths= | true | false |
+|-|-|-|
+| create intermediate dirs | YES | NO |
+| normalize key names from `ofs/o3fs` | YES | NO |
+| force to normalize key names of `s3` interface | YES (1) | NO |
+| `s3` key `/a/b/c` available from `ofs/o3fs` | YES | NO |
+| `s3` key `/a/b//c` available from `ofs/o3fs` | YES | NO |
+| `s3` key `/a/b//c` available from `s3` | AWS S3 incompatibility | YES |
+
+(1): Under implementation
 
- * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular path)
- * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations in case of incompatible key names
- * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible
+This proposal suggests a 3rd option where 100% AWS compatibility is guaranteed in exchange for a limited `ofs/o3fs` view:

Review comment:
   We also have another case, right: an existing bucket created via the CLI can be exposed to S3. What semantics will that bucket get?
   
   >For buckets created via FS interface, the FS semantics will always take 
precedence
   Buckets creation is possible via only OFS, what about O3fs?
   
   >If the global setting is enabled, then the value of the setting at the time 
of bucket creation is sampled and that takes >effect for the lifetime of the 
bucket.
   
   A bucket created via the shell when the global flag is ozone.om.enable.filesystem.paths=true will follow FS semantics, with a slight S3 incompatibility.
   A bucket created via the shell when the global flag is ozone.om.enable.filesystem.paths=false will follow S3 semantics, with broken FS semantics (or FS access completely disallowed).
   
   Written from my understanding, as I don't have the complete context of the proposal.
   
   I might be missing something here.
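For what it's worth, the "sampled at creation time" behavior discussed in this thread could be pictured roughly like this (all names hypothetical, not Ozone APIs):

```java
// Hypothetical sketch: each bucket freezes a copy of the cluster-wide flag
// when it is created, so flipping the global flag later does not change
// the semantics of existing buckets.
class BucketSemantics {
  final String bucketName;
  final boolean filesystemPaths; // sampled from the global flag at creation

  BucketSemantics(String bucketName, boolean globalFlagAtCreation) {
    this.bucketName = bucketName;
    this.filesystemPaths = globalFlagAtCreation;
  }

  boolean normalizeKeys() {
    return filesystemPaths;
  }
}
```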
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485813294



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone must simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on a flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys; see the sketch after this list)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix)
+
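Challenge 3 can be pictured with a small sketch over a sorted key set, roughly how a scan over a flat table behaves (assumed code, not the OM listing implementation):

```java
import java.util.SortedSet;
import java.util.TreeSet;

public final class FlatListing {
  private FlatListing() { }

  // Direct children of "a/" are the distinct first path segments of every
  // key under the prefix; the scan has to skip over all deeper keys.
  static SortedSet<String> listDirectChildren(SortedSet<String> keys, String prefix) {
    SortedSet<String> children = new TreeSet<>();
    for (String key : keys.tailSet(prefix)) {
      if (!key.startsWith(prefix)) {
        break; // sorted order: past the prefix range
      }
      String rest = key.substring(prefix.length());
      int slash = rest.indexOf('/');
      children.add(slash < 0 ? rest : rest.substring(0, slash + 1));
    }
    return children;
  }

  public static void main(String[] args) {
    SortedSet<String> keys = new TreeSet<>();
    keys.add("a/b/x");
    keys.add("a/b/c/y");
    keys.add("a/z");
    System.out.println(listDirectChildren(keys, "a/")); // [b/, z]
  }
}
```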
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
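The same put can also be issued programmatically; a sketch with the AWS SDK for Java v1, where the endpoint, region, and credentials handling are assumptions for a local test setup:

```java
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class PutObjectExample {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        // point the client at the Ozone S3 gateway instead of AWS
        .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
            "http://localhost:9878", "us-east-1"))
        .enablePathStyleAccess()
        .build();
    // credentials come from the default provider chain
    s3.putObject("bucket1", "a/b/c/d", "content");
  }
}
```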
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsi

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485805457



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone must simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on a flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix; see the sketch after this list)
+
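Challenge 4 can be sketched the same way (assumed code, not how OM performs renames): renaming a first-level "directory" rewrites every key under the prefix, so the cost grows with the number of affected keys.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public final class PrefixRename {
  private PrefixRename() { }

  // Renaming "a/" to "x/" copies and removes every key sharing the prefix,
  // so the cost is proportional to the number of keys under the directory.
  static void renamePrefix(TreeMap<String, byte[]> keys, String from, String to) {
    Map<String, byte[]> moved = new TreeMap<>();
    Iterator<Map.Entry<String, byte[]>> it =
        keys.tailMap(from).entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, byte[]> e = it.next();
      if (!e.getKey().startsWith(from)) {
        break; // sorted order: past the prefix range
      }
      moved.put(to + e.getKey().substring(from.length()), e.getValue());
      it.remove();
    }
    keys.putAll(moved);
  }
}
```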
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsi

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485805715



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone must simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on a flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix)
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsi

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485804988



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone must simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on a flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix)
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsi

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485802830



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone must simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on a flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix)
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsi

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485802418



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone must simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on a flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix)
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
+
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when S3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require creating intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsi

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485801742



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+---
+title: S3/Ozone Filesystem inter-op 
+summary: How to support both S3 and HCFS at the same time
+date: 2020-09-09
+jira: HDDS-4097
+status: draft
+author: Marton Elek, 
+---
+
+
+# Ozone S3 vs file-system semantics
+
+Ozone is an object store for the Hadoop ecosystem which can be used from multiple interfaces: 
+
+ 1. From Hadoop Compatible File Systems (referred to as *HCFS* in the remainder of this document) (RPC)
+ 2. From S3 compatible applications (REST)
+ 3. From container orchestrators as a mounted volume (CSI, alpha feature)
+
+As Ozone is an object store, it stores keys and values in a flat hierarchy, which is enough to support S3 (2). But to support Hadoop Compatible File Systems (and CSI), Ozone must simulate a file system hierarchy.
+
+There are multiple challenges when a file system hierarchy is simulated on a flat namespace:
+
+ 1. Some key patterns couldn't be easily transformed to a file system path (e.g. `/a/b/../c`, `/a/b//d`, or a real key with a directory path like `/b/d/`)
+ 2. Directory entries (which may have their own properties) require special handling, as the file system interface requires a dir entry even if it's not created explicitly (for example if key `/a/b/c` is created, `/a/b` is supposed to be a visible directory entry for the file system interface)
+ 3. Non-recursive listing of directories can be hard (listing direct entries under `/a` should ignore all the `/a/b/...`, `/a/b/c/...` keys)
+ 4. Similar to listing, rename can be a costly operation as it requires renaming many keys (renaming a first-level directory means a rename of all the keys with the same prefix)
+
+See also the [Hadoop S3A documentation](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) which describes some of these problems when AWS S3 is used. (*Warnings* section)
+
+# Current status
+
+As of today *Ozone Manager* has two different interfaces (both are defined in 
`OmClientProtocol.proto`): 
+
+ 1. object store related functions (like *CreateKey*, *LookupKey*, 
*DeleteKey*,...)  
+ 2. file system related functions (like *CreateFile*, *LookupFile*,...)
+
+File system related functions use the same flat hierarchy under the hood but include additional functionality. For example the `createFile` call creates all the intermediate directories for a specific key (creating file `/a/b/c` will create `/a/b` and `/a` entries in the key space)
+
+Today, a key created from the S3 interface can cause exceptions if the intermediate directories are checked from HCFS:
+
+
+```shell
+$ aws s3api put-object --endpoint http://localhost:9878 --bucket bucket1 --key 
/a/b/c/d
+
+$ ozone fs -ls o3fs://bucket1.s3v/a/
+ls: `o3fs://bucket1.s3v/a/': No such file or directory
+```
+
+This problem is reported in 
[HDDS-3955](https://issues.apache.org/jira/browse/HDDS-3955), where a new 
configuration key is introduced (`ozone.om.enable.filesystem.paths`). If this 
is enabled, intermediate directories are created even if the object store 
interface is used.
+
+This configuration is turned off by default, which means that S3 and HCFS 
couldn't be used together.
+
+To solve the performance problems of the directory listing / rename, [HDDS-2939](https://issues.apache.org/jira/browse/HDDS-2939) is created, which proposes using a new prefix table to store the "directory" entries (=prefixes).
+
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
+
+# Goals
+
+ * Out of the box, Ozone should support both S3 and HCFS interfaces without any settings. (This is possible only for regular, fs-compatible key names.)
+ * As 100% compatibility couldn't be achieved on both sides, we need a configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require the creation of intermediate directory entries (for example `/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second cannot be solved without compromise:
+
+ 1. We either support all key names (including non fs-compatible key names), which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it possible to create inconsistent S3 keys: for example, the distinct S3 keys `/a//b` and `/a/b` would collapse to the same normalized entry)

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485801364



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485798906



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485794138



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485794525



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[jira] [Created] (HDDS-4225) NPE while installing snapshots on ozone manager

2020-09-09 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created HDDS-4225:
---

 Summary: NPE while installing snapshots on ozone manager
 Key: HDDS-4225
 URL: https://issues.apache.org/jira/browse/HDDS-4225
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Mukul Kumar Singh


{code}
2020-09-09 22:07:14,830 [pool-144-thread-1] INFO  om.OzoneManager (OzoneManager.java:installCheckpoint(3159)) - Installing checkpoint with OMTransactionInfo 2#68754
2020-09-09 22:07:14,831 [grpc-default-executor-50] INFO  impl.RaftServerImpl (RaftServerImpl.java:installSnapshot(1127)) - omNode-2@group-D62218D261DE: reply installSnapshot: omNode-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
2020-09-09 22:07:14,831 [grpc-default-executor-49] INFO  impl.RaftServerImpl (RaftServerImpl.java:installSnapshot(1117)) - omNode-2@group-D62218D261DE: receive installSnapshot: omNode-1->omNode-2#0-t2,notify:(t:2, i:68440)
2020-09-09 22:07:14,836 [grpc-default-executor-49] INFO  impl.RaftServerImpl (RaftServerImpl.java:notifyStateMachineToInstallSnapshot(1282)) - omNode-2@group-D62218D261DE: Snapshot Installation by StateMachine is in progress.
2020-09-09 22:07:14,836 [grpc-default-executor-49] INFO  impl.RaftServerImpl (RaftServerImpl.java:installSnapshot(1127)) - omNode-2@group-D62218D261DE: reply installSnapshot: omNode-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
2020-09-09 22:07:14,832 [grpc-default-executor-53] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onNext(375)) - omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: received a reply omNode-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
2020-09-09 22:07:14,831 [grpc-default-executor-50] INFO  server.GrpcServerProtocolService (GrpcServerProtocolService.java:onCompleted(138)) - omNode-2: Completed INSTALL_SNAPSHOT, lastRequest: omNode-1->omNode-2#0-t2,notify:(t:2, i:68440)
2020-09-09 22:07:14,831 [pool-144-thread-1] ERROR om.OzoneManager (OzoneManager.java:installSnapshotFromLeader(3141)) - Failed to install snapshot from Leader OM: {}
java.lang.NullPointerException
	at org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3168)
	at org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3162)
	at org.apache.hadoop.ozone.om.OzoneManager.installSnapshotFromLeader(OzoneManager.java:3139)
	at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$notifyInstallSnapshotFromLeader$4(OzoneManagerStateMachine.java:372)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485789295



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@


[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485785304



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485784568



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 [HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.
 
-## Goals
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibilty is prefered)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require to create intermediate directory entries (for exapmle 
`/a/b` for the key `/b/c/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done with compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 keys)
+
+HDDS-3955 introduced `ozone.om.enable.filesystem.paths`, with this setting we 
will have two possible usage pattern:
+
+| ozone.om.enable.filesystem.paths= | true | false |
+|-|-|-|
+| create intermediate dirs | YES | NO |
+| normalize key names from `ofs/o3fs` | YES | NO |
+| force to normalize key names of `s3` interface | YES (1) | NO |
+| `s3` key `/a/b/c` available from `ofs/o3fs` | YES | NO |
+| `s3` key `/a/b//c` available from `ofs/o3fs` | YES | NO |
+| `s3` key `/a/b//c` available from `s3` | AWS S3 incompatibility | YES |
+
+(1): Under implementation

Review comment:
   Here AWS S3 incompatibility means, is it because we are showing 
normalized keys?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485778737



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
+ 1. `ofs/o3fs` require to create intermediate directory entries (for exapmle 
`/a/b` for the key `/b/c/c`)

Review comment:
   exapmle -> example





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485778176



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -0,0 +1,282 @@
+[HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used which means that S3 and HCFS couldn't be used together with 
normalization.

Review comment:
S3 and HCFS couldn't be used together **without** normalization??





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-4209) S3A Filesystem does not work with Ozone S3

2020-09-09 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193033#comment-17193033
 ] 

Bharat Viswanadham edited comment on HDDS-4209 at 9/9/20, 4:56 PM:
---

Hi [~elek]
For your test, have you enabled the config *ozone.om.enable.filesystem.paths*?
And how did you create the directory? (If using ofs/o3fs, it will work fine, as mkdir for ofs is createDirectory, not a put with a 0-byte file.)


[root@bvoz-1 ~]# oapi create-bucket --bucket sample
{
    "Location": "https://bvoz-1.bvoz.root.hwx.site:9879/sample"
}

[root@bvoz-1 ~]# hdfs dfs -mkdir -p s3a://sample/dir1/dir2

[root@bvoz-1 ~]# oapi list-objects --bucket sample
{
    "Contents": [
        {
            "LastModified": "2020-09-09T16:50:36.888Z",
            "ETag": "2020-09-09T16:50:36.888Z",
            "StorageClass": "STANDARD",
            "Key": "dir1/",
            "Size": 0
        },
        {
            "LastModified": "2020-09-09T16:50:36.981Z",
            "ETag": "2020-09-09T16:50:36.981Z",
            "StorageClass": "STANDARD",
            "Key": "dir1/dir2",
            "Size": 0
        }
    ]
}

As explained in the Jira description, when *mkdir -p* is run via S3A it creates a 0-byte file without appending "/", so Ozone will not consider it a directory.



hdfs dfs -put /etc/hadoop/conf/ozone-site.xml s3a://sample/dir1/dir2/file1

This fails with the below error, as it considers /dir1/dir2 to be a file, not a directory.


{code:java}
4:54:16.945 PM	ERROR	ObjectEndpoint
Exception occurred in PutObject
NOT_A_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Can not create file: dir1/dir2/file1._COPYING_ as there is already file in the given path
	at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:593)
	at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.openKey(OzoneManagerProtocolClientSideTranslatorPB.java:584)
	at org.apache.hadoop.ozone.client.rpc.RpcClient.createKey(RpcClient.java:688)
	at org.apache.hadoop.ozone.client.OzoneBucket.createKey(OzoneBucket.java:396)
	at org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.put(ObjectEndpoint.java:168)
	at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
{code}







[jira] [Commented] (HDDS-4209) S3A Filesystem does not work with Ozone S3

2020-09-09 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193033#comment-17193033
 ] 

Bharat Viswanadham commented on HDDS-4209:
--

Hi [~elek]
For your test, have you enabled this config?
And how did you create the directory? (If using ofs/o3fs, it will work fine, 
as mkdir for ofs is createDirectory, not a put with a 0-byte file.)


[root@bvoz-1 ~]# oapi create-bucket --bucket sample
{
    "Location": "https://bvoz-1.bvoz.root.hwx.site:9879/sample"
}

[root@bvoz-1 ~]# hdfs dfs -mkdir -p s3a://sample/dir1/dir2


[root@bvoz-1 ~]# oapi list-objects --bucket sample
{
    "Contents": [
        {
            "LastModified": "2020-09-09T16:50:36.888Z",
            "ETag": "2020-09-09T16:50:36.888Z",
            "StorageClass": "STANDARD",
            "Key": "dir1/",
            "Size": 0
        },
        {
            "LastModified": "2020-09-09T16:50:36.981Z",
            "ETag": "2020-09-09T16:50:36.981Z",
            "StorageClass": "STANDARD",
            "Key": "dir1/dir2",
            "Size": 0
        }
    ]
}

As explained in the Jira description, when *mkdir -p* is run on S3A it creates 
a 0-byte file, so it does not append "/" and Ozone will not consider it a 
directory.



hdfs dfs -put /etc/hadoop/conf/ozone-site.xml s3a://sample/dir1/dir2/file1

Fails with the below error, as it considers /dir1/dir2 as a file, not a directory.


{code:java}
4:54:16.945 PM  ERROR   ObjectEndpoint  
Exception occurred in PutObject
NOT_A_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Can not create 
file: dir1/dir2/file1._COPYING_ as there is already file in the given path
at 
org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:593)
at 
org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.openKey(OzoneManagerProtocolClientSideTranslatorPB.java:584)
at 
org.apache.hadoop.ozone.client.rpc.RpcClient.createKey(RpcClient.java:688)
at 
org.apache.hadoop.ozone.client.OzoneBucket.createKey(OzoneBucket.java:396)
at 
org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.put(ObjectEndpoint.java:168)
at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
at 
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
at 
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
{code}





> S3A Filesystem does not work with Ozone S3
> --
>
> Key: HDDS-4209
> URL: https://issues.apache.org/jira/browse/HDDS-4209
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Blocker
>
> When *ozone.om.enable.filesystem.paths* is enabled
>  
> hdfs dfs -mkdir -p s3a://b12345/d11/d12 -> Success
> hdfs dfs -put /tmp/file1 s3a://b12345/d11/d12/file1 -> fails with below error
>  
> {code:java}
> 2020-09-04 03:53:51,377 ERROR 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest: Key creation 
> failed. Volume:s3v, Bucket:b1234, Keyd11/d12/file1._COPYING_. Exception:{}
> NOT_A_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Can not create 
> file: cp/k1._COPYING_ as there is already file in the given path
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:256)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:227)
>  at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:428)
>  at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$applyTransaction$1(OzoneManagerStateMachine.java:246)
>  at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748){code}
> *Reason for this*
>  The S3A filesystem, when creating a directory, creates an empty file
> *Now entries in Ozone KeyTable after create directory*
>  d11/
>  d11/d12
> Because of this, OMFileRequest.VerifyInFilesPath fails with 
> FILE_EXISTS_IN_GIVEN_PATH, because d11/d12 is considered a file, not a 
> directory.

[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1405: HDDS-4143. Implement a factory for OM Requests that returns an instance based on layout version.

2020-09-09 Thread GitBox


avijayanhwx commented on a change in pull request #1405:
URL: https://github.com/apache/hadoop-ozone/pull/1405#discussion_r485769011



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/bucket/OMBucketSetPropertyRequest.java
##
@@ -206,4 +207,8 @@ public OMClientResponse validateAndUpdateCache(OzoneManager 
ozoneManager,
   return omClientResponse;
 }
   }
+
+  public static String getRequestType() {

Review comment:
   @bharatviswa504 Yes, I pondered adding an annotation for this. Given 
that we already have annotations for Cleanup tables and BelongsToLayoutVersion, 
I wanted to hold off on adding more annotations. If it is ok with you, after 
the first round of changes, we can change this to an annotation.
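
   As a rough illustration of the annotation alternative discussed above (the 
annotation name here is hypothetical, not an existing Ozone annotation), the 
request type could be declared declaratively and read via reflection:

{code:java}
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface RequestType {
  String value();
}

// Stand-in for a request class such as OMBucketSetPropertyRequest.
@RequestType("SetBucketProperty")
class BucketSetPropertyRequestExample {
}

public class RequestTypeDemo {
  public static void main(String[] args) {
    // A request factory could read the type from the annotation instead of
    // calling a static getRequestType() on every request class.
    RequestType t = BucketSetPropertyRequestExample.class
        .getAnnotation(RequestType.class);
    System.out.println(t.value()); // SetBucketProperty
  }
}
{code}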








[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1405: HDDS-4143. Implement a factory for OM Requests that returns an instance based on layout version.

2020-09-09 Thread GitBox


avijayanhwx commented on a change in pull request #1405:
URL: https://github.com/apache/hadoop-ozone/pull/1405#discussion_r485769011



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/bucket/OMBucketSetPropertyRequest.java
##
@@ -206,4 +207,8 @@ public OMClientResponse validateAndUpdateCache(OzoneManager 
ozoneManager,
   return omClientResponse;
 }
   }
+
+  public static String getRequestType() {

Review comment:
   @bharatviswa504 Yes, I pondered adding an annotation for this. Given 
that we already have annotations for Cleanup tables and BelongsToLayoutVersion, 
I wanted to hold off on adding more annotations. If it is ok with you, after 
the first round of changes, we can change this to an annotation.








[jira] [Resolved] (HDDS-4173) Implement HDDS Version management using the LayoutVersionManager interface.

2020-09-09 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan resolved HDDS-4173.
-
Resolution: Fixed

PR merged through Github.

> Implement HDDS Version management using the LayoutVersionManager interface.
> ---
>
> Key: HDDS-4173
> URL: https://issues.apache.org/jira/browse/HDDS-4173
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode, SCM
>Affects Versions: 1.1.0
>Reporter: Aravindan Vijayan
>Assignee: Prashant Pogde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> * Create HDDS Layout Feature Catalog similar to the OM Layout Feature Catalog.
> * Any layout change to SCM and Datanode needs to be recorded here as a Layout 
> Feature.
> * This includes new SCM HA requests, new container layouts in DN etc.
> * Create a HDDSLayoutVersionManager similar to OMLayoutVersionManager.






[jira] [Updated] (HDDS-4224) OM failed to install snapshots after OM failover

2020-09-09 Thread Mukul Kumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-4224:

Labels: MiniOzoneChaosCluster  (was: )

> OM failed to install snapshots after OM failover
> 
>
> Key: HDDS-4224
> URL: https://issues.apache.org/jira/browse/HDDS-4224
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> OM failed to install snapshots after OM failover
> {code}
> 2020-09-09 22:07:13,746 
> [org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$380/117485186@47069ab2]
>  INFO  server.GrpcLogAppender (GrpcLogAppender.java:installSnapshot(495)) - 
> omNode-1@group-D62
> 218D261DE->omNode-2-GrpcLogAppender: followerNextIndex = 65949 but 
> logStartIndex = 68440, notify follower to install snapshot-(t:2, i:68440)
> 2020-09-09 22:07:13,746 [grpc-default-executor-52] INFO  impl.RaftServerImpl 
> (RaftServerImpl.java:notifyStateMachineToInstallSnapshot(1282)) - 
> omNode-2@group-D62218D261DE: Snapshot Installation by StateMach
> ine is in progress.
> 2020-09-09 22:07:13,752 [grpc-default-executor-52] INFO  impl.RaftServerImpl 
> (RaftServerImpl.java:installSnapshot(1127)) - omNode-2@group-D62218D261DE: 
> reply installSnapshot: omNode-1<-omNode-2#0:FAIL-t2,IN
> _PROGRESS
> 2020-09-09 22:07:13,746 [grpc-default-executor-51] INFO  
> server.GrpcLogAppender (GrpcLogAppender.java:onNext(375)) - 
> omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: 
> received a reply om
> Node-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
> 2020-09-09 22:07:13,752 [grpc-default-executor-51] INFO  
> server.GrpcLogAppender (GrpcLogAppender.java:onNext(392)) - 
> omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: 
> InstallSnapshot in
> progress.
> 2020-09-09 22:07:13,746 [grpc-default-executor-22] INFO  
> server.GrpcServerProtocolService 
> (GrpcServerProtocolService.java:onCompleted(138)) - omNode-2: Completed 
> INSTALL_SNAPSHOT, lastRequest: omNode-1->omN
> ode-2#0-t2,notify:(t:2, i:68440)
> 2020-09-09 22:07:13,753 [grpc-default-executor-51] INFO  
> server.GrpcLogAppender (GrpcLogAppender.java:onNext(375)) - 
> omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: 
> received a reply om
> Node-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
> 2020-09-09 22:07:13,753 [grpc-default-executor-51] INFO  
> server.GrpcLogAppender (GrpcLogAppender.java:onNext(392)) - 
> omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: 
> InstallSnapshot in
> progress.
> 2020-09-09 22:07:13,752 [grpc-default-executor-52] INFO  
> server.GrpcServerProtocolService 
> (GrpcServerProtocolService.java:onCompleted(138)) - omNode-2: Completed 
> INSTALL_SNAPSHOT, lastRequest: omNode-1->omN
> ode-2#0-t2,notify:(t:2, i:68440)
> 2020-09-09 22:07:13,747 
> [org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$380/117485186@47069ab2]
>  INFO  server.GrpcLogAppender (GrpcLogAppender.java:installSnapshot(503)) - 
> omNode-1@group-D62
> 218D261DE->omNode-2-GrpcLogAppender: send 
> omNode-1->omNode-2#0-t2,notify:(t:2, i:68440)
> 2020-09-09 22:07:13,756 [pool-144-thread-1] ERROR om.OzoneManager 
> (OzoneManager.java:installCheckpoint(3178)) - Failed to stop/ pause the 
> services. Cannot proceed with installing the new checkpoint.
> 2020-09-09 22:07:13,759 [pool-144-thread-1] ERROR om.OzoneManager 
> (OzoneManager.java:installSnapshotFromLeader(3141)) - Failed to install 
> snapshot from Leader OM: {}
> java.lang.IllegalStateException: ILLEGAL TRANSITION: In 
> OzoneManagerStateMachine:omNode-2:group-D62218D261DE, PAUSED -> PAUSING
> at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
> at org.apache.ratis.util.LifeCycle$State.validate(LifeCycle.java:115)
> at org.apache.ratis.util.LifeCycle.transition(LifeCycle.java:155)
> at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.pause(OzoneManagerStateMachine.java:305)
> at 
> org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3176)
> at 
> org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3162)
> at 
> org.apache.hadoop.ozone.om.OzoneManager.installSnapshotFromLeader(OzoneManager.java:3139)
> at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$notifyInstallSnapshotFromLeader$4(OzoneManagerStateMachine.java:372)
> at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)

[jira] [Created] (HDDS-4224) OM failed to install snapshots after OM failover

2020-09-09 Thread Mukul Kumar Singh (Jira)
Mukul Kumar Singh created HDDS-4224:
---

 Summary: OM failed to install snapshots after OM failover
 Key: HDDS-4224
 URL: https://issues.apache.org/jira/browse/HDDS-4224
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Mukul Kumar Singh


OM failed to install snapshots after OM failover

{code}
2020-09-09 22:07:13,746 
[org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$380/117485186@47069ab2]
 INFO  server.GrpcLogAppender (GrpcLogAppender.java:installSnapshot(495)) - 
omNode-1@group-D62
218D261DE->omNode-2-GrpcLogAppender: followerNextIndex = 65949 but 
logStartIndex = 68440, notify follower to install snapshot-(t:2, i:68440)
2020-09-09 22:07:13,746 [grpc-default-executor-52] INFO  impl.RaftServerImpl 
(RaftServerImpl.java:notifyStateMachineToInstallSnapshot(1282)) - 
omNode-2@group-D62218D261DE: Snapshot Installation by StateMach
ine is in progress.
2020-09-09 22:07:13,752 [grpc-default-executor-52] INFO  impl.RaftServerImpl 
(RaftServerImpl.java:installSnapshot(1127)) - omNode-2@group-D62218D261DE: 
reply installSnapshot: omNode-1<-omNode-2#0:FAIL-t2,IN
_PROGRESS
2020-09-09 22:07:13,746 [grpc-default-executor-51] INFO  server.GrpcLogAppender 
(GrpcLogAppender.java:onNext(375)) - 
omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: received 
a reply om
Node-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
2020-09-09 22:07:13,752 [grpc-default-executor-51] INFO  server.GrpcLogAppender 
(GrpcLogAppender.java:onNext(392)) - 
omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: 
InstallSnapshot in
progress.
2020-09-09 22:07:13,746 [grpc-default-executor-22] INFO  
server.GrpcServerProtocolService 
(GrpcServerProtocolService.java:onCompleted(138)) - omNode-2: Completed 
INSTALL_SNAPSHOT, lastRequest: omNode-1->omN
ode-2#0-t2,notify:(t:2, i:68440)
2020-09-09 22:07:13,753 [grpc-default-executor-51] INFO  server.GrpcLogAppender 
(GrpcLogAppender.java:onNext(375)) - 
omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: received 
a reply om
Node-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
2020-09-09 22:07:13,753 [grpc-default-executor-51] INFO  server.GrpcLogAppender 
(GrpcLogAppender.java:onNext(392)) - 
omNode-1@group-D62218D261DE->omNode-2-InstallSnapshotResponseHandler: 
InstallSnapshot in
progress.
2020-09-09 22:07:13,752 [grpc-default-executor-52] INFO  
server.GrpcServerProtocolService 
(GrpcServerProtocolService.java:onCompleted(138)) - omNode-2: Completed 
INSTALL_SNAPSHOT, lastRequest: omNode-1->omN
ode-2#0-t2,notify:(t:2, i:68440)
2020-09-09 22:07:13,747 
[org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$380/117485186@47069ab2]
 INFO  server.GrpcLogAppender (GrpcLogAppender.java:installSnapshot(503)) - 
omNode-1@group-D62
218D261DE->omNode-2-GrpcLogAppender: send omNode-1->omNode-2#0-t2,notify:(t:2, 
i:68440)
2020-09-09 22:07:13,756 [pool-144-thread-1] ERROR om.OzoneManager 
(OzoneManager.java:installCheckpoint(3178)) - Failed to stop/ pause the 
services. Cannot proceed with installing the new checkpoint.
2020-09-09 22:07:13,759 [pool-144-thread-1] ERROR om.OzoneManager 
(OzoneManager.java:installSnapshotFromLeader(3141)) - Failed to install 
snapshot from Leader OM: {}
java.lang.IllegalStateException: ILLEGAL TRANSITION: In 
OzoneManagerStateMachine:omNode-2:group-D62218D261DE, PAUSED -> PAUSING
at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
at org.apache.ratis.util.LifeCycle$State.validate(LifeCycle.java:115)
at org.apache.ratis.util.LifeCycle.transition(LifeCycle.java:155)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.pause(OzoneManagerStateMachine.java:305)
at 
org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3176)
at 
org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3162)
at 
org.apache.hadoop.ozone.om.OzoneManager.installSnapshotFromLeader(OzoneManager.java:3139)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$notifyInstallSnapshotFromLeader$4(OzoneManagerStateMachine.java:372)
at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-09-09 22:07:13,760 
[org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$380/117485186@47069ab2]
 INFO  server.GrpcLogAppender (GrpcLogAppender.java:installSnapshot(495)) - 
omNode-1@group-D62218D261DE->omNode-2-GrpcLogAppender: followerNextIndex = 
65949 but logStartIndex = 68440, notify follower to install snapshot-(t:2, 
i:68440)
2020-09-09 22:07:13,759 [grpc-default-executor-52] INFO

[GitHub] [hadoop-ozone] avijayanhwx merged pull request #1392: HDDS-4173. Implement HDDS Version management using the LayoutVersion…

2020-09-09 Thread GitBox


avijayanhwx merged pull request #1392:
URL: https://github.com/apache/hadoop-ozone/pull/1392


   






[GitHub] [hadoop-ozone] avijayanhwx edited a comment on pull request #1392: HDDS-4173. Implement HDDS Version management using the LayoutVersion…

2020-09-09 Thread GitBox


avijayanhwx edited a comment on pull request #1392:
URL: https://github.com/apache/hadoop-ozone/pull/1392#issuecomment-689684096


   Thanks for working on this @prashantpogde. Thanks for the review @linyiqun. 
   LGTM +1






[GitHub] [hadoop-ozone] avijayanhwx commented on pull request #1392: HDDS-4173. Implement HDDS Version management using the LayoutVersion…

2020-09-09 Thread GitBox


avijayanhwx commented on pull request #1392:
URL: https://github.com/apache/hadoop-ozone/pull/1392#issuecomment-689684096


   Thanks for working on this @prashantpogde. LGTM +1






[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1392: HDDS-4173. Implement HDDS Version management using the LayoutVersion…

2020-09-09 Thread GitBox


avijayanhwx commented on a change in pull request #1392:
URL: https://github.com/apache/hadoop-ozone/pull/1392#discussion_r485765534



##
File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/upgrade/NewSCMFeatureUpgradeAction.java
##
@@ -0,0 +1,34 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.server.upgrade;
+
+import org.apache.hadoop.hdds.scm.server.StorageContainerManager;
+import org.apache.hadoop.hdds.upgrade.HDDSUpgradeAction;
+
+/**
+ * Example SCM Action class to help with understanding.
+ */
+public class NewSCMFeatureUpgradeAction implements

Review comment:
   Yes, sure.








[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1361: HDDS-4155. Directory and filename can end up with same name in a path.

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1361:
URL: https://github.com/apache/hadoop-ozone/pull/1361#discussion_r485756050



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneFSWithObjectStoreCreate.java
##
@@ -303,6 +304,12 @@ public void 
testMPUFailDuetoDirectoryCreationBeforeComplete()
 
   }
 
+  @Test(expected = FileAlreadyExistsException.class)

Review comment:
   You mean use the OzoneBucket API and create a key, or add a robot test 
using S3? If a robot test, we don't have a test suite with this flag enabled; 
there is already an open Jira for this, and we can cover this additional 
scenario there:
   https://issues.apache.org/jira/browse/HDDS-4154








[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1361: HDDS-4155. Directory and filename can end up with same name in a path.

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1361:
URL: https://github.com/apache/hadoop-ozone/pull/1361#discussion_r485756050



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneFSWithObjectStoreCreate.java
##
@@ -303,6 +304,12 @@ public void 
testMPUFailDuetoDirectoryCreationBeforeComplete()
 
   }
 
+  @Test(expected = FileAlreadyExistsException.class)

Review comment:
   You mean use the OzoneBucket API and create a key, or add a robot test 
using S3? If a robot test, we don't have a test suite with this flag enabled; 
opened a Jira: 
   https://issues.apache.org/jira/browse/HDDS-4154








[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1361: HDDS-4155. Directory and filename can end up with same name in a path.

2020-09-09 Thread GitBox


bharatviswa504 commented on a change in pull request #1361:
URL: https://github.com/apache/hadoop-ozone/pull/1361#discussion_r485756050



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneFSWithObjectStoreCreate.java
##
@@ -303,6 +304,12 @@ public void 
testMPUFailDuetoDirectoryCreationBeforeComplete()
 
   }
 
+  @Test(expected = FileAlreadyExistsException.class)

Review comment:
   You mean use OzoneBucket or add a robot test using S3?
   If a robot test, we don't have a test suite with this flag enabled; opened 
a jira: 
   https://issues.apache.org/jira/browse/HDDS-4154








[GitHub] [hadoop-ozone] arp7 commented on a change in pull request #1361: HDDS-4155. Directory and filename can end up with same name in a path.

2020-09-09 Thread GitBox


arp7 commented on a change in pull request #1361:
URL: https://github.com/apache/hadoop-ozone/pull/1361#discussion_r485738948



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneFSWithObjectStoreCreate.java
##
@@ -303,6 +304,12 @@ public void 
testMPUFailDuetoDirectoryCreationBeforeComplete()
 
   }
 
+  @Test(expected = FileAlreadyExistsException.class)

Review comment:
   Thanks for the additional test. Can we also add a test case that 
exercises the REST API to create a file and dir with the same name, in both 
orders?








[jira] [Comment Edited] (HDDS-4222) [OzoneFS optimization] Provide a mechanism for efficient path lookup

2020-09-09 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192947#comment-17192947
 ] 

Yiqun Lin edited comment on HDDS-4222 at 9/9/20, 3:29 PM:
--

Thanks for attaching the dir cache design, [~rakeshr]!
 I agree with most of the current design details.

>For the consistency part, this is a very good point and will take care during 
>the implementation phase. I was thinking to update the cache during write and 
>read paths to avoid additional cache refresh cycle.
 I'm +1 for this way as the current initial implementation.

>Rename and Delete ops will require only one entry update as it maintains 
>similar structure in the DB Directory Table.
 Delete ops are also not friendly for Approach-3.

Example:

*DirTable:*
|CacheKey(PathElement)|{color:#ff8b00}*ObjectID*{color}|
|512/a|1025|
|1025/b|1026|
|1026/c|1027|
|1027/d|1028|
|1025/e|1029|

If we delete dir 512/a, we have to look up the whole dir cache, find the keys 
whose parent objectID is 1025, and then delete them. So the delete op here 
still seems very expensive.

>Delete ops will require only one entry update
 If we use the sync delete way, it will not update only one entry (as 
explained in the above example). 
 So does this mean the async delete way here, like the bucket key deletion 
mechanism?
 1) Mark the deleted key so it cannot be accessed (e.g. add a prefix to the key)
 2) Asynchronously remove the keys that need to be deleted.


was (Author: linyiqun):
Thanks for attaching the dir cache design, [~rakeshr]!
 I agree with most of the current design details.

>For the consistency part, this is a very good point and will take care during 
>the implementation phase. I was thinking to update the cache during write and 
>read paths to avoid additional cache refresh cycle.
 I'm +1 for this way as the current initial implementation.

>Rename and Delete ops will require only one entry update as it maintains 
>similar structure in the DB Directory Table.
 Delete ops are also not friendly for Approach-3.

Example:

*DirTable:*
|CacheKey(PathElement)|{color:#ff8b00}*ObjectID*{color}|
|512/a|1025|
|1025/b|1026|
|1026/c|1027|
|1027/d|1028|
|1025/e|1029|

If we delete dir 512/a, we have to look up the whole dir cache, find the keys 
whose parent objectID is 1025, and then delete them. So the delete op here 
still seems very expensive.

>Delete ops will require only one entry update
 If we use the sync delete way, it will not update only one entry (as 
explained in the above example). 
 So does this mean the async delete way here, like the bucket key deletion 
mechanism?
 1) Mark the deleted key (add a prefix to the key so it cannot be accessed)
 2) Asynchronously remove the keys that need to be deleted.

> [OzoneFS optimization] Provide a mechanism for efficient path lookup
> 
>
> Key: HDDS-4222
> URL: https://issues.apache.org/jira/browse/HDDS-4222
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Rakesh Radhakrishnan
>Assignee: Rakesh Radhakrishnan
>Priority: Major
> Attachments: Ozone FS Optimizations - Efficient Lookup using cache.pdf
>
>
> With the new file system HDDS-2939 like semantics design it requires multiple 
> DB lookups to traverse the path component in top-down fashion. This task to 
> discuss use cases and proposals to reduce the performance penalties during 
> path lookups.






[GitHub] [hadoop-ozone] arp7 commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


arp7 commented on pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-689638762


   I have an alternate proposal, idea left in a comment. cc @bharatviswa504 
@elek 






[jira] [Commented] (HDDS-4222) [OzoneFS optimization] Provide a mechanism for efficient path lookup

2020-09-09 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192947#comment-17192947
 ] 

Yiqun Lin commented on HDDS-4222:
-

Thanks for attaching the dir cache design, [~rakeshr]!
 I agree with most of the current design details.

>For the consistency part, this is a very good point and will take care during 
>the implementation phase. I was thinking to update the cache during write and 
>read paths to avoid additional cache refresh cycle.
 I'm +1 for this way as the current initial implementation.

>Rename and Delete ops will require only one entry update as it maintains 
>similar structure in the DB Directory Table.
 Delete ops are also not friendly for Approach-3.

Example:

*DirTable:*
|CacheKey(PathElement)|{color:#ff8b00}*ObjectID*{color}|
|512/a|1025|
|1025/b|1026|
|1026/c|1027|
|1027/d|1028|
|1025/e|1029|

If we delete dir 512/a, we have to look up the whole dir cache, find the keys 
whose parent objectID is 1025, and then delete them. So the delete op here 
still seems very expensive.

>Delete ops will require only one entry update
 If we use the sync delete way, it will not update only one entry (as 
explained in the above example). 
 So does this mean the async delete way here, like the bucket key deletion 
mechanism?
 1) Mark the deleted key (add a prefix to the key so it cannot be accessed)
 2) Asynchronously remove the keys that need to be deleted.
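
To illustrate the two-step async delete idea above, here is a minimal Java 
sketch (the marker prefix and in-memory table are hypothetical stand-ins for 
Ozone's actual tables):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AsyncDeleteDemo {
  private static final String DELETED_PREFIX = "#deleted#/";
  private final Map<String, Long> dirTable = new ConcurrentHashMap<>();

  // Step 1 (foreground): re-key the entry under a marker prefix so normal
  // lookups no longer see it; O(1), no scan of the whole table.
  void markDeleted(String pathKey) {
    Long objectId = dirTable.remove(pathKey);
    if (objectId != null) {
      dirTable.put(DELETED_PREFIX + pathKey, objectId);
    }
  }

  // Step 2 (background): the expensive scan for child entries (keys of the
  // form "<objectId>/<name>") is deferred here instead of blocking the
  // client. One level is shown; a real pass would recurse.
  void purge() {
    List<String> marked = new ArrayList<>();
    for (Map.Entry<String, Long> e : dirTable.entrySet()) {
      if (e.getKey().startsWith(DELETED_PREFIX)) {
        marked.add(e.getKey());
        String childPrefix = e.getValue() + "/";
        dirTable.keySet().removeIf(k -> k.startsWith(childPrefix));
      }
    }
    marked.forEach(dirTable::remove);
  }

  public static void main(String[] args) {
    AsyncDeleteDemo demo = new AsyncDeleteDemo();
    demo.dirTable.put("512/a", 1025L);   // dir "a", objectID 1025
    demo.dirTable.put("1025/b", 1026L);  // child dir "b"
    demo.markDeleted("512/a");           // fast foreground step
    demo.purge();                        // deferred cleanup of "1025/..."
    System.out.println(demo.dirTable);   // {}
  }
}
{code}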

> [OzoneFS optimization] Provide a mechanism for efficient path lookup
> 
>
> Key: HDDS-4222
> URL: https://issues.apache.org/jira/browse/HDDS-4222
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Rakesh Radhakrishnan
>Assignee: Rakesh Radhakrishnan
>Priority: Major
> Attachments: Ozone FS Optimizations - Efficient Lookup using cache.pdf
>
>
> With the new file system HDDS-2939 like semantics design it requires multiple 
> DB lookups to traverse the path component in top-down fashion. This task to 
> discuss use cases and proposals to reduce the performance penalties during 
> path lookups.






[GitHub] [hadoop-ozone] arp7 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


arp7 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485699608



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 [HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used, which means that S3 and HCFS couldn't be used together without 
normalization.
 
-## Goals
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
 1. `ofs/o3fs` require creating intermediate directory entries (for example 
`/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 keys)
+
+HDDS-3955 introduced `ozone.om.enable.filesystem.paths`, with this setting we 
will have two possible usage patterns:
+
+| ozone.om.enable.filesystem.paths= | true | false
+|-|-|-|
+| create intermediate dirs | YES | NO |
+| normalize key names from `ofs/o3fs` | YES | NO
+| force to normalize key names of `s3` interface | YES (1) | NO 
+| `s3` key `/a/b/c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `s3` | AWS S3 incompatibility | YES
+
+(1): Under implementation
 
- * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular path)
- * As 100% compatibility couldn't be achieved on both side we need a 
configuration to set the expectations in case of incompatible key names
- * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as possible
+This proposal suggests a 3rd option where 100% AWS compatibility is 
guaranteed in exchange for a limited `ofs/o3fs` view:

Review comment:
   I completely disagree with this trade-off. The FS limited view is 
neither here nor there. You can insert keys via the S3 interface that are not 
visible via the FS view at all. To me this is the same as a corrupted 
filesystem. Marton, I liked your offline suggestion much better - disable FS 
access completely when operating in S3-compatible mode.
   
   Taking this one step further, I have a different approach in mind. Let's 
make this a per-bucket setting. For buckets created via the S3 interface, by 
default the S3 semantics will be preserved 100% unless the global setting is 
enabled and FS access will not be allowed at all. For buckets created via FS 
interface, the FS semantics will always take precedence. If the global setting 
is enabled, then the value of the setting at the time of bucket creation is 
sampled and that takes effect for the lifetime of the bucket. Basically you 
can't change the behavior for a given bucket.
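
   To make the per-bucket idea above concrete, a rough sketch (the names and 
fields are hypothetical, not Ozone's actual bucket metadata) of sampling the 
global setting once at bucket-creation time:

{code:java}
public class BucketCompatDemo {

  enum CompatMode { S3_STRICT, FILESYSTEM }

  static final class BucketInfo {
    final String name;
    // Frozen at creation; the behavior of the bucket never changes later.
    final CompatMode mode;

    BucketInfo(String name, boolean createdViaS3, boolean fsPathsEnabledNow) {
      this.name = name;
      // Buckets created via S3 keep strict S3 semantics unless the global
      // filesystem-paths setting was on at creation time.
      this.mode = (createdViaS3 && !fsPathsEnabledNow)
          ? CompatMode.S3_STRICT
          : CompatMode.FILESYSTEM;
    }
  }

  public static void main(String[] args) {
    BucketInfo viaS3 = new BucketInfo("b12345", true, false);
    BucketInfo viaFs = new BucketInfo("vol-bucket", false, false);
    System.out.println(viaS3.name + " -> " + viaS3.mode); // S3_STRICT
    System.out.println(viaFs.name + " -> " + viaFs.mode); // FILESYSTEM
  }
}
{code}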








[GitHub] [hadoop-ozone] arp7 commented on a change in pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


arp7 commented on a change in pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#discussion_r485693614



##
File path: hadoop-hdds/docs/content/design/s3_hcfs.md
##
@@ -67,45 +66,100 @@ To solve the performance problems of the directory listing 
/ rename, [HDDS-2939]
 
 [HDDS-4097](https://issues.apache.org/jira/browse/HDDS-4097) is created to 
normalize the key names based on file-system semantics if 
`ozone.om.enable.filesystem.paths` is enabled. But please note that 
`ozone.om.enable.filesystem.paths` should always be turned on if S3 and HCFS 
are both used, which means that S3 and HCFS couldn't be used together without 
normalization.
 
-## Goals
+# Goals
+
+ * Out of the box Ozone should support both S3 and HCFS interfaces without any 
settings. (It's possible only for the regular, fs compatible key names)
+ * As 100% compatibility couldn't be achieved on both sides, we need a 
configuration to set the expectations for incompatible key names
+ * Default behavior of `o3fs` and `ofs` should be as close to `s3a` as 
possible (when s3 compatibility is preferred)
+
+# Possible cases to support
+
+There are two main aspects of supporting both `ofs/o3fs` and `s3` together:
+
 1. `ofs/o3fs` require creating intermediate directory entries (for example 
`/a/b` for the key `/a/b/c`)
+ 2. Special file-system incompatible key names require special attention
+
+The second couldn't be done without compromise.
+
+ 1. We either support all key names (including non fs compatible key names), 
which means `ofs/o3fs` can provide only a partial view
+ 2. Or we can normalize the key names to be fs compatible (which makes it 
possible to create inconsistent S3 keys)
+
+HDDS-3955 introduced `ozone.om.enable.filesystem.paths`, with this setting we 
will have two possible usage patterns:
+
+| ozone.om.enable.filesystem.paths= | true | false
+|-|-|-|
+| create intermediate dirs | YES | NO |
+| normalize key names from `ofs/o3fs` | YES | NO
+| force to normalize key names of `s3` interface | YES (1) | NO 
+| `s3` key `/a/b/c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `ofs/o3fs` | YES | NO
+| `s3` key `/a/b//c` available from `s3` | AWS S3 incompatibility | YES
+
+(1): Under implementation

Review comment:
   I don't think that is true. Paths are normalized already on the S3 
interface when writing new keys.








[GitHub] [hadoop-ozone] linyiqun commented on pull request #1404: HDDS-2949: mkdir : store directory entries in a separate table

2020-09-09 Thread GitBox


linyiqun commented on pull request #1404:
URL: https://github.com/apache/hadoop-ozone/pull/1404#issuecomment-689607498


   > Next task am planning to do CreateFile and then get/listStatus. IMHO, both 
file/dir delete can be done together in one patch which involves changes in 
KeyDeletingService.
   > 
   > Does this make sense to you.
   > 
   Makes sense to me.
   






[GitHub] [hadoop-ozone] captainzmc commented on a change in pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-09 Thread GitBox


captainzmc commented on a change in pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#discussion_r485570651



##
File path: 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmVolumeArgs.java
##
@@ -46,6 +47,7 @@
   private long quotaInBytes;
   private long quotaInCounts;
   private final OmOzoneAclMap aclMap;
+  private LongAdder quotaUsageInBytes = new LongAdder();

Review comment:
   QuotaUsageInBytes is a property of the volume that needs to be updated on 
each CreateKey, AllocateBlock, CommitKey, and DeleteKey; at the beginning we 
used the volume lock for this. 
   But previously only bucket locks were used for these operations, and taking 
the volume lock greatly affects the concurrency of different buckets under the 
same volume. 
   So, to avoid the performance cost of volume locking, LongAdder is used here 
to make the update of quotaUsageInBytes atomic.
   I did a performance test with freon: multiple threads wrote to different 
buckets under the same volume, and using LongAdder had little impact on the 
original performance.
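
   A tiny self-contained illustration of the trade-off described above (a 
simplified stand-in for OmVolumeArgs, not the real class): LongAdder gives 
lock-free accumulation across writer threads, at the cost of a slightly more 
expensive read via sum():

{code:java}
import java.util.concurrent.atomic.LongAdder;

public class QuotaUsageDemo {
  // Many handler threads can add deltas concurrently without a volume lock.
  private final LongAdder quotaUsageInBytes = new LongAdder();

  void onCommitKey(long bytes) { quotaUsageInBytes.add(bytes); }

  void onDeleteKey(long bytes) { quotaUsageInBytes.add(-bytes); }

  long usage() { return quotaUsageInBytes.sum(); } // read sums internal cells

  public static void main(String[] args) throws InterruptedException {
    QuotaUsageDemo demo = new QuotaUsageDemo();
    Thread t1 = new Thread(() -> demo.onCommitKey(256));
    Thread t2 = new Thread(() -> demo.onCommitKey(256));
    t1.start(); t2.start();
    t1.join(); t2.join();
    demo.onDeleteKey(128);
    System.out.println(demo.usage()); // 384
  }
}
{code}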








[GitHub] [hadoop-ozone] captainzmc commented on a change in pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-09 Thread GitBox


captainzmc commented on a change in pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#discussion_r485570651



##
File path: 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmVolumeArgs.java
##
@@ -46,6 +47,7 @@
   private long quotaInBytes;
   private long quotaInCounts;
   private final OmOzoneAclMap aclMap;
+  private LongAdder quotaUsageInBytes = new LongAdder();

Review comment:
   QuotaUsageInBytes is a property of the volume that needs to be updated on 
each CreateKey, AllocateBlock, CommitKey, and DeleteKey, and the volume lock 
is usually used for that. 
   Previously, only bucket locks were used for these operations, and volume 
locks can greatly affect the concurrency of different buckets. 
   So, to avoid the performance cost of volume locking, LongAdder is used here 
to make the update of quotaUsageInBytes atomic.
   I did a performance test with freon: multiple threads wrote to different 
buckets under the same volume, and using LongAdder had little impact on the 
original performance.








[GitHub] [hadoop-ozone] captainzmc commented on pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-09 Thread GitBox


captainzmc commented on pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#issuecomment-689582328


   Thanks to @cxorm and @xiaoyuyao for the review. The new commit has been 
updated. Could you take another look?






[GitHub] [hadoop-ozone] captainzmc opened a new pull request #1412: HDDS-3751. Ozone sh client support bucket quota option.

2020-09-09 Thread GitBox


captainzmc opened a new pull request #1412:
URL: https://github.com/apache/hadoop-ozone/pull/1412


   ## What changes were proposed in this pull request?
   
   By HDDS-3725, Ozone currently supports setting a volume quota. This PR 
follows the implementation of HDDS-3725 and makes the Ozone shell support 
setting a bucket quota.
   The current quota setting does not take effect yet; HDDS-541 tracks all 
the work needed to complete quota support.
   This PR is a subtask of HDDS-541. 
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3751
   
   ## How was this patch tested?
   
   UT added
   






[jira] [Updated] (HDDS-3751) Ozone sh bucket client support quota option.

2020-09-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3751:
-
Labels: pull-request-available  (was: )

> Ozone sh bucket client support quota option.
> 
>
> Key: HDDS-3751
> URL: https://issues.apache.org/jira/browse/HDDS-3751
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Simon Su
>Assignee: mingchao zhao
>Priority: Major
>  Labels: pull-request-available
>







[GitHub] [hadoop-ozone] elek edited a comment on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


elek edited a comment on pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-689541807


   Opened as a **DRAFT** pull request as this is only a proposal. 
   
   @arp7 and @bharatviswa504 still have concerns about the proposed approach.






[GitHub] [hadoop-ozone] elek commented on pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


elek commented on pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411#issuecomment-689541807


   Opened as a **DRAFT** pull request as this is only a proposal. 
   
   @arp7 and @bharatviswa504 still have concerns about the proposed approach.






[jira] [Updated] (HDDS-4097) S3/Ozone Filesystem inter-op

2020-09-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4097:
-
Labels: pull-request-available  (was: )

> S3/Ozone Filesystem inter-op
> 
>
> Key: HDDS-4097
> URL: https://issues.apache.org/jira/browse/HDDS-4097
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Attachments: Ozone FileSystem Paths Enabled.docx, Ozone filesystem 
> path enabled.xlsx
>
>
> This Jira is to implement changes required to use Ozone buckets when data is 
> ingested via S3 and use the bucket/volume via OzoneFileSystem. Initial 
> implementation for this is done as part of HDDS-3955. There are a few APIs 
> which missed the changes during the implementation of HDDS-3955. 
> The attached design document discusses each API and what changes are 
> required.
> Excel sheet has information about each API, from what all interfaces the OM 
> API is used, and what changes are required for the API to support 
> inter-operability.
> Note: The proposal for delete/rename is still under discussion, not yet 
> finalized. 






[GitHub] [hadoop-ozone] elek opened a new pull request #1411: HDDS-4097. [DESIGN] S3/Ozone Filesystem inter-op

2020-09-09 Thread GitBox


elek opened a new pull request #1411:
URL: https://github.com/apache/hadoop-ozone/pull/1411


   ## What changes were proposed in this pull request?
   
   A new design doc is included about S3/HCFS interoperability. Earlier it was 
discussed under https://issues.apache.org/jira/browse/HDDS-4097. 
   
   But I created this PR because:
   
1. I promised to do it, to make it easier to include all the context-specific 
comments
2. it makes it easier to follow the document-specific changes / discussions






[jira] [Commented] (HDDS-4209) S3A Filesystem does not work with Ozone S3

2020-09-09 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192839#comment-17192839
 ] 

Marton Elek commented on HDDS-4209:
---

I checked it with the aws s3api CLI, and after the first step I can see a good 
entry:

{code}
aws s3api list-objects --bucket ozonetest --prefix=o11
{
    "Contents": [
        {
            "Key": "o11/o12/",
            "LastModified": "2020-09-09T12:34:36.000Z",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "Size": 0,
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "e1428",
                "ID": "b8c021b4343e316b28b545df160c6720479a998001ebf7019328b64417fe152d"
            }
        }
    ]
}
{code}

As the `/` suffix is added to the path, the intermediate directory creation 
logic can understand it's a directory and can reuse the object, IMHO.

> S3A Filesystem does not work with Ozone S3
> --
>
> Key: HDDS-4209
> URL: https://issues.apache.org/jira/browse/HDDS-4209
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Blocker
>
> When *ozone.om.enable.filesystem.paths* is enabled
>  
> hdfs dfs -mkdir -p s3a://b12345/d11/d12 -> Success
> hdfs dfs -put /tmp/file1 s3a://b12345/d11/d12/file1 -> fails with below error
>  
> {code:java}
> 2020-09-04 03:53:51,377 ERROR 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest: Key creation 
> failed. Volume:s3v, Bucket:b1234, Keyd11/d12/file1._COPYING_. Exception:{}
> NOT_A_FILE org.apache.hadoop.ozone.om.exceptions.OMException: Can not create 
> file: cp/k1._COPYING_ as there is already file in the given path
>  at 
> org.apache.hadoop.ozone.om.request.key.OMKeyCreateRequest.validateAndUpdateCache(OMKeyCreateRequest.java:256)
>  at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:227)
>  at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:428)
>  at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$applyTransaction$1(OzoneManagerStateMachine.java:246)
>  at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748){code}
> *Reason for this*
>  The S3A filesystem, when creating a directory, creates an empty file
> *Now entries in Ozone KeyTable after create directory*
>  d11/
>  d11/d12
> Because of this, OMFileRequest.VerifyInFilesPath fails with 
> FILE_EXISTS_IN_GIVEN_PATH, because d11/d12 is considered a file, not a 
> directory. (In Ozone, directories currently end with a trailing "/".)
> So, when d11/d12/file is created, we check that the parent exists; d11/d12 is 
> considered a file, and the check fails with NOT_A_FILE.
> When disabled it works fine: during key create we do not check any filesystem 
> semantics and do not create intermediate directories.
> {code:java}
> [root@bvoz-1 ~]# hdfs dfs -mkdir -p s3a://b12345/d11/d12
> [root@bvoz-1 ~]# hdfs dfs -put /etc/hadoop/conf/ozone-site.xml 
> s3a://b12345/d11/d12/k1
> [root@bvoz-1 ~]# hdfs dfs -ls s3a://b12345/d11/d12
> Found 1 items
> -rw-rw-rw-   1 systest systest   2373 2020-09-04 04:45 
> s3a://b12345/d11/d12/k1
> {code}
>  






[GitHub] [hadoop-ozone] captainzmc commented on a change in pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-09 Thread GitBox


captainzmc commented on a change in pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#discussion_r485573765



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCommitRequest.java
##
@@ -182,8 +179,16 @@ public OMClientResponse 
validateAndUpdateCache(OzoneManager ozoneManager,
   new CacheKey<>(dbOzoneKey),
   new CacheValue<>(Optional.of(omKeyInfo), trxnLogIndex));
 
+  long scmBlockSize = ozoneManager.getScmBlockSize();
+  int factor = omKeyInfo.getFactor().getNumber();
+  omVolumeArgs = getVolumeInfo(omMetadataManager, volumeName);
+  long updateNum = omKeyInfo.getDataSize() * factor -
+  locationInfoList.size() * scmBlockSize * factor;

Review comment:
   I will add a comment to describe the calculation here.
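
   For reference, a hedged sketch of the reasoning that comment might capture 
(an assumption based on the diff above, not the final wording of the patch):

   ```java
   // committed bytes   = dataSize  * factor                 (actual usage)
   // pre-allocated     = numBlocks * scmBlockSize * factor  (reserved at allocate)
   // updateNum (delta) = committed - pre-allocated          (correction on commit)
   long committedBytes    = omKeyInfo.getDataSize() * factor;
   long preAllocatedBytes = (long) locationInfoList.size() * scmBlockSize * factor;
   long updateNum = committedBytes - preAllocatedBytes; // typically <= 0
   ```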





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] captainzmc commented on a change in pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-09 Thread GitBox


captainzmc commented on a change in pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#discussion_r485570651



##
File path: 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmVolumeArgs.java
##
@@ -46,6 +47,7 @@
   private long quotaInBytes;
   private long quotaInCounts;
   private final OmOzoneAclMap aclMap;
+  private LongAdder quotaUsageInBytes = new LongAdder();

Review comment:
   QuotaUsageInBytes is a property of the volume that needs to be updated on 
every CreateKey, AllocateBlock, CommitKey, and DeleteKey, which would normally 
require taking the volume lock. 
   Previously, only bucket locks were used for these operations, and taking 
volume locks would greatly reduce the concurrency across different buckets. 
   So, to avoid volume locking for performance reasons, LongAdder is used here 
to make the updates of quotaUsageInBytes atomic.
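
   To illustrate the pattern, a minimal sketch (the class and method names 
here are illustrative, not the actual OmVolumeArgs API):

   ```java
   import java.util.concurrent.atomic.LongAdder;

   // Lock-free usage accounting: concurrent adds go to striped per-thread
   // cells, so no volume lock is needed on the write path.
   class VolumeUsageSketch {
     private final LongAdder quotaUsageInBytes = new LongAdder();

     void updateUsage(long delta) {     // delta may be negative (e.g. on delete)
       quotaUsageInBytes.add(delta);    // atomic, contention-friendly
     }

     long currentUsage() {
       return quotaUsageInBytes.sum();  // folds the per-thread cells on read
     }
   }
   ```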





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] captainzmc commented on a change in pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-09 Thread GitBox


captainzmc commented on a change in pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#discussion_r485547522



##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java
##
@@ -269,6 +269,7 @@ private OzoneConsts() {
   public static final String KEY = "key";
   public static final String SRC_KEY = "srcKey";
   public static final String DST_KEY = "dstKey";
+  public static final String QUOTA_USAGE_IN_BYTES = "quotaUsageInBytes";

Review comment:
   Using quotaUsage here may not be appropriate, because it makes it 
impossible to distinguish between quotaUsageInBytes and the later 
quotaUsageInCount. 
   Here we can follow the naming in ContainerInfo, [which uses 
usedBytes](https://github.com/apache/hadoop-ozone/blob/971a36eea16b38257892244d6de862b4f9461138/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerInfo.java#L55).
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] rakeshadr commented on pull request #1404: HDDS-2949: mkdir : store directory entries in a separate table

2020-09-09 Thread GitBox


rakeshadr commented on pull request #1404:
URL: https://github.com/apache/hadoop-ozone/pull/1404#issuecomment-689493252


   > With the new layout version added, the latest PR logic looks clearer and 
more independent.
   > I just left my minor comments. I am thinking the next part of the work is 
the delete dir implementation, right? I only see the create dir implementation here.
   > In addition, please have a look at the GitHub build failure; it doesn't 
show green.
   
   Will fix the TestOzoneConfigurationFields test case.
   Next, I am planning to do CreateFile and then get/listStatus. IMHO, both 
file and dir delete can be done together in one patch, which involves changes in 
KeyDeletingService.
   
   Does this make sense to you?
   
   Thanks for the reviews!
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] rakeshadr commented on a change in pull request #1404: HDDS-2949: mkdir : store directory entries in a separate table

2020-09-09 Thread GitBox


rakeshadr commented on a change in pull request #1404:
URL: https://github.com/apache/hadoop-ozone/pull/1404#discussion_r485527504



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestOzoneDirectory.java
##
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.ozone;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.*;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.utils.db.Table;
+import org.apache.hadoop.hdds.utils.db.TableIterator;
+import org.apache.hadoop.ozone.MiniOzoneCluster;
+import org.apache.hadoop.ozone.OzoneConsts;
+import org.apache.hadoop.ozone.TestDataUtil;
+import org.apache.hadoop.ozone.client.OzoneBucket;
+import org.apache.hadoop.ozone.om.OMConfigKeys;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.helpers.OmBucketInfo;
+import org.apache.hadoop.ozone.om.helpers.OmDirectoryInfo;
+import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;
+import org.apache.hadoop.util.StringUtils;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.Timeout;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.concurrent.TimeoutException;
+
+import static 
org.apache.hadoop.fs.CommonConfigurationKeysPublic.FS_TRASH_INTERVAL_KEY;
+import static 
org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_FS_ITERATE_BATCH_SIZE;
+import static org.junit.Assert.fail;
+
+/**
+ * Test verifies the entries and operations in directory table.
+ */
+public class TestOzoneDirectory {
+
+  @Rule
+  public Timeout timeout = new Timeout(30);
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(TestOzoneFileSystem.class);

Review comment:
   Will do it. Thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] rakeshadr commented on a change in pull request #1404: HDDS-2949: mkdir : store directory entries in a separate table

2020-09-09 Thread GitBox


rakeshadr commented on a change in pull request #1404:
URL: https://github.com/apache/hadoop-ozone/pull/1404#discussion_r485527375



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequestV1.java
##
@@ -0,0 +1,313 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.request.file;
+
+import com.google.common.base.Optional;
+import org.apache.commons.lang3.tuple.ImmutablePair;
+import org.apache.hadoop.hdds.utils.db.cache.CacheKey;
+import org.apache.hadoop.hdds.utils.db.cache.CacheValue;
+import org.apache.hadoop.ozone.OzoneAcl;
+import org.apache.hadoop.ozone.audit.AuditLogger;
+import org.apache.hadoop.ozone.audit.OMAction;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.*;
+import org.apache.hadoop.ozone.om.ratis.utils.OzoneManagerDoubleBufferHelper;
+import org.apache.hadoop.ozone.om.request.util.OmResponseUtil;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.file.OMDirectoryCreateResponseV1;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos
+.CreateDirectoryRequest;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos
+.CreateDirectoryResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos
+.KeyArgs;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos
+.OMRequest;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos
+.OMResponse;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos
+.Status;
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+
+import static 
org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.FILE_ALREADY_EXISTS;
+import static 
org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.INVALID_KEY_NAME;
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.BUCKET_LOCK;
+import static 
org.apache.hadoop.ozone.om.request.file.OMFileRequest.OMDirectoryResult.*;
+
+/**
+ * Handle create directory request. It will add path components to the 
directory
+ * table and maintains file system semantics.
+ */
+public class OMDirectoryCreateRequestV1 extends OMDirectoryCreateRequest {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(OMDirectoryCreateRequestV1.class);
+
+  public OMDirectoryCreateRequestV1(OMRequest omRequest) {
+super(omRequest);
+  }
+
+  @Override
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+  long trxnLogIndex, OzoneManagerDoubleBufferHelper omDoubleBufferHelper) {
+
+CreateDirectoryRequest createDirectoryRequest = getOmRequest()
+.getCreateDirectoryRequest();
+KeyArgs keyArgs = createDirectoryRequest.getKeyArgs();
+
+String volumeName = keyArgs.getVolumeName();
+String bucketName = keyArgs.getBucketName();
+String keyName = keyArgs.getKeyName();
+
+OMResponse.Builder omResponse = OmResponseUtil.getOMResponseBuilder(
+getOmRequest());
+
omResponse.setCreateDirectoryResponse(CreateDirectoryResponse.newBuilder());
+OMMetrics omMetrics = ozoneManager.getMetrics();
+omMetrics.incNumCreateDirectory();
+
+AuditLogger auditLogger = ozoneManager.getAuditLogger();
+OzoneManagerProtocolProtos.UserInfo userInfo = 
getOmRequest().getUserInfo();
+
+Map auditMap = buildKeyArgsAuditMap(keyArgs);
+OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+boolean acquiredLock = false;
+IOException exception

[GitHub] [hadoop-ozone] captainzmc commented on a change in pull request #1296: HDDS-4053. Volume space: add quotaUsageInBytes and update it when write and delete key.

2020-09-09 Thread GitBox


captainzmc commented on a change in pull request #1296:
URL: https://github.com/apache/hadoop-ozone/pull/1296#discussion_r485447342



##
File path: 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneVolume.java
##
@@ -131,6 +133,18 @@ public OzoneVolume(ConfigurationSource conf, 
ClientProtocol proxy,
 this.modificationTime = Instant.ofEpochMilli(modificationTime);
   }
 
+  @SuppressWarnings("parameternumber")

Review comment:
   I think that's good advice, and it's a problem that also exists in 
OzoneBucket. We can optimize these problems separately; I opened JIRA 
HDDS-4223 to track and optimize this part.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-4223) Optimize the construction method of OzoneVolume and OzoneBucket.

2020-09-09 Thread mingchao zhao (Jira)
mingchao zhao created HDDS-4223:
---

 Summary: Optimize the construction method of OzoneVolume and 
OzoneBucket.
 Key: HDDS-4223
 URL: https://issues.apache.org/jira/browse/HDDS-4223
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: mingchao zhao


There are many construction methods with different parameters in OzoneVolume and 
OzoneBucket; we can consider adding a builder to avoid managing constructors with 
different parameter lists and to facilitate code maintenance.
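
For illustration, a minimal sketch of the builder pattern being proposed (the 
class and field names are hypothetical, not the real OzoneVolume/OzoneBucket 
API):

{code:java}
// Hypothetical builder sketch: one fluent entry point instead of many
// constructor overloads with different parameter lists.
final class VolumeArgs {
  private final String name;
  private final String admin;
  private final long quotaInBytes;

  private VolumeArgs(Builder b) {
    this.name = b.name;
    this.admin = b.admin;
    this.quotaInBytes = b.quotaInBytes;
  }

  static final class Builder {
    private String name;
    private String admin;
    private long quotaInBytes;

    Builder setName(String name) { this.name = name; return this; }
    Builder setAdmin(String admin) { this.admin = admin; return this; }
    Builder setQuotaInBytes(long quota) { this.quotaInBytes = quota; return this; }

    VolumeArgs build() { return new VolumeArgs(this); }
  }
}

// Usage: new VolumeArgs.Builder().setName("vol1").setQuotaInBytes(1L << 30).build();
{code}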



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2939) Ozone FS namespace

2020-09-09 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192704#comment-17192704
 ] 

Rakesh Radhakrishnan commented on HDDS-2939:


Thanks a lot [~linyiqun] for the comments. I've raised a separate task, HDDS-4222, 
for a more detailed and focused discussion. It probably requires multiple 
sub-tasks to develop this module, and we can do that there.

> Ozone FS namespace
> --
>
> Key: HDDS-2939
> URL: https://issues.apache.org/jira/browse/HDDS-2939
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Rakesh Radhakrishnan
>Priority: Major
>  Labels: Triaged
> Attachments: Ozone FS Namespace Proposal v1.0.docx
>
>
> Create the structures and metadata layout required to support efficient FS 
> namespace operations in Ozone - operations involving folders/directories 
> required to support the Hadoop compatible Filesystem interface.
> The details are described in the attached document. The work is divided up 
> into sub-tasks as per the task list in the document.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4222) [OzoneFS optimization] Provide a mechanism for efficient path lookup

2020-09-09 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192703#comment-17192703
 ] 

Rakesh Radhakrishnan commented on HDDS-4222:


Hi [~linyiqun], I have moved the lookup discussion to this jira task as this 
requires more detailed and focussed discussion. Thanks a lot for the help!

Please refer to [~linyiqun]'s original comment:
{quote}Here the directory cache is used to avoid the additional lookup 
overheads. The latest design of the directory cache hasn't been attached, but 
here are some thoughts from me:
 Two types of mapping cache will be useful, I think:
 <...>, like <...>, so that we can 
skip the traverse search from the dir table to the key table.
 <...>, this is used for the listStatus scenario; a list-files 
call can be a very expensive call under the Ozone fs namespace.
 The cache introduced here can speed up metadata access, but there are also two 
aspects we need to consider.
{quote}
Yes, the directory cache is most useful during the path component traversal. IMHO, 
this would be the first candidate to target and would greatly help to get the 
maximum performance benefit during path lookups. That doesn't mean that other 
entities like files, listings, etc. are not important. I believe it depends on many 
factors, like workloads, hardware (RAM, NVMe) capabilities, how much of the FS 
namespace is metadata (dirs, files), the directory hierarchy, etc. To 
begin with, I am planning to implement a cache framework where OM will provide a 
facility to plug in different cache entities based on user requirements. Here, 
based on the tradeoffs, users can add more built-in cache policies and configure 
and tune them accordingly.
{quote}Cache entry eviction policy: we cannot cache all the dir/file 
entries.
 Consistency between the dir cache and the underlying store: a cache entry will 
become stale when the db store is updated but the corresponding cache entry is 
not synced. A cache refresh interval time can be introduced here: only when a 
cache entry has not been updated for more than the given refresh interval do we 
trigger an update of the cache entry by querying the db table. Users can set 
different refresh interval times to ensure cache freshness based on their 
scenarios. They can also disable this cache by setting the interval to 0, which 
means each query will directly access the db.
 The current OM table cache seems not very helpful for the dir cache, so I came 
up with the above thoughts.
{quote}
Yes, the OM table cache is not helpful. I completely agree with you that the cache 
eviction policy is very important for keeping the useful entries within the cache 
capacity. In the attached document, I proposed an optimized directory 
cache (Approach-3) with minimal data, to incorporate more entries that benefit 
the path component traversal.

For the consistency part, this is a very good point and will be taken care of during 
the implementation phase. I was thinking of updating the cache during the write and 
read paths to avoid an additional cache refresh cycle. But I don't have concrete 
thoughts on this yet and need to look into the OM code to do a deeper analysis.
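
As a concrete illustration of the refresh-interval idea above, a minimal 
sketch using Guava's CacheBuilder (the key/value types, numbers, and helper 
names are assumptions for illustration, not the design in the attached PDF):

{code:java}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public class DirCacheSketch {
  // Bounded dir-path -> object-id cache; time-based expiry stands in for
  // the "refresh interval" described above.
  private final Cache<String, Long> dirCache = CacheBuilder.newBuilder()
      .maximumSize(1_000_000)                  // eviction: cap the entry count
      .expireAfterWrite(60, TimeUnit.SECONDS)  // stale entries re-read from db
      .build();

  Long lookup(String path) {
    Long dirId = dirCache.getIfPresent(path);
    if (dirId == null) {
      dirId = readFromDbTable(path);           // fall back to the db table
      dirCache.put(path, dirId);
    }
    return dirId;
  }

  private Long readFromDbTable(String path) {
    return 42L;                                // hypothetical stand-in
  }
}
{code}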

> [OzoneFS optimization] Provide a mechanism for efficient path lookup
> 
>
> Key: HDDS-4222
> URL: https://issues.apache.org/jira/browse/HDDS-4222
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Rakesh Radhakrishnan
>Assignee: Rakesh Radhakrishnan
>Priority: Major
> Attachments: Ozone FS Optimizations - Efficient Lookup using cache.pdf
>
>
> With the new HDDS-2939 file-system-like semantics design, it requires multiple 
> DB lookups to traverse the path components in top-down fashion. This task is to 
> discuss use cases and proposals to reduce the performance penalties during 
> path lookups.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4222) [OzoneFS optimization] Provide a mechanism for efficient path lookup

2020-09-09 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192688#comment-17192688
 ] 

Rakesh Radhakrishnan commented on HDDS-4222:


The attached document captures more details. Please review it. Thanks!

Thanks a lot [~arp], [~msingh], [~bharat], [~hanishakoneru], [~avijayan], 
[~linyiqun] for the discussions.

> [OzoneFS optimization] Provide a mechanism for efficient path lookup
> 
>
> Key: HDDS-4222
> URL: https://issues.apache.org/jira/browse/HDDS-4222
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Rakesh Radhakrishnan
>Assignee: Rakesh Radhakrishnan
>Priority: Major
> Attachments: Ozone FS Optimizations - Efficient Lookup using cache.pdf
>
>
> With the new HDDS-2939 file-system-like semantics design, it requires multiple 
> DB lookups to traverse the path components in top-down fashion. This task is to 
> discuss use cases and proposals to reduce the performance penalties during 
> path lookups.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-4222) [OzoneFS optimization] Provide a mechanism for efficient path lookup

2020-09-09 Thread Rakesh Radhakrishnan (Jira)
Rakesh Radhakrishnan created HDDS-4222:
--

 Summary: [OzoneFS optimization] Provide a mechanism for efficient 
path lookup
 Key: HDDS-4222
 URL: https://issues.apache.org/jira/browse/HDDS-4222
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
Reporter: Rakesh Radhakrishnan
Assignee: Rakesh Radhakrishnan
 Attachments: Ozone FS Optimizations - Efficient Lookup using cache.pdf

With the new HDDS-2939 file-system-like semantics design, it requires multiple 
DB lookups to traverse the path components in top-down fashion. This task is to 
discuss use cases and proposals to reduce the performance penalties during path 
lookups.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org


