[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-22 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823567#comment-16823567
 ] 

Anu Engineer commented on HDDS-1301:


Thank you. I will sync with you, and I will also write and post a design
document that explains all these changes and the possible changes in the
output committer. Then, based on your feedback, we can shape the Ozone
Manager API.

> Optimize recursive ozone filesystem apis
> 
>
> Key: HDDS-1301
> URL: https://issues.apache.org/jira/browse/HDDS-1301
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-1301.001.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> This Jira aims to optimise recursive apis in ozone file system. These are the 
> apis which have a recursive flag which requires an operation to be performed 
> on all the children of the directory. The Jira would add support for 
> recursive apis in Ozone manager in order to reduce the number of rpc calls to 
> Ozone Manager. Also currently these operations are not atomic. This Jira 
> would make all the operations in ozone filesystem atomic.
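
As a rough illustration of the RPC saving described above, here is a minimal
sketch in Java. It is not the patch and does not use any Ozone API: the
in-memory `FakeManager`, its `listKeys`, `deleteKey` and `deleteRecursive`
methods, and the class name `RecursiveRpcSketch` are all hypothetical, and
each method call merely stands in for one round trip to Ozone Manager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical model: counts manager round trips for a recursive delete. */
class RecursiveRpcSketch {

    /** In-memory stand-in for Ozone Manager; each operation counts as one "RPC". */
    static final class FakeManager {
        private final TreeSet<String> keys = new TreeSet<>();
        int rpcCalls = 0;

        /** Test setup only; not counted as an RPC. */
        void seed(String key) { keys.add(key); }

        int size() { return keys.size(); }

        /** One RPC: list every key that starts with the prefix. */
        List<String> listKeys(String prefix) {
            rpcCalls++;
            return new ArrayList<>(keys.subSet(prefix, prefix + Character.MAX_VALUE));
        }

        /** One RPC: delete a single key. */
        void deleteKey(String key) {
            rpcCalls++;
            keys.remove(key);
        }

        /** Assumed server-side recursive call: one RPC, applied under one lock. */
        synchronized void deleteRecursive(String prefix) {
            rpcCalls++;
            keys.subSet(prefix, prefix + Character.MAX_VALUE).clear();
        }
    }

    /** Client-side recursion: one list call plus one delete call per child. */
    static void clientSideDelete(FakeManager om, String dir) {
        for (String key : om.listKeys(dir)) {
            om.deleteKey(key);
        }
    }

    public static void main(String[] args) {
        FakeManager perKey = new FakeManager();
        FakeManager recursive = new FakeManager();
        for (int i = 0; i < 1000; i++) {
            perKey.seed("dir/file-" + i);
            recursive.seed("dir/file-" + i);
        }
        clientSideDelete(perKey, "dir/");
        recursive.deleteRecursive("dir/");
        System.out.println("client-side recursion RPCs: " + perKey.rpcCalls);
        System.out.println("server-side recursion RPCs: " + recursive.rpcCalls);
    }
}
```

In this model a directory with n children costs n + 1 calls when the client
recurses and 1 call when the manager does, and only the single call can be
applied as one unit. The work done on the manager is still proportional to n.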



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-19 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822092#comment-16822092
 ] 

Steve Loughran commented on HDDS-1301:
--

All the work in MapReduce for the s3a committer can be applied to Ozone too:
you can define a new implementation of PathOutputCommitProtocol and declare
the factory class for it ({{PathOutputCommitterFactory}}). Then, when an MR
client (or a suitably configured Spark app) looks for an output committer for
a FileOutputFormat, you can serve up your committer instead of the legacy
"let's commit by rename" algorithm.

Happy to talk about how to do this or point you at the s3a impl. Key point:
for your store, no matter what the applications think, rename is optional.




[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-18 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821611#comment-16821611
 ] 

Anu Engineer commented on HDDS-1301:


{quote}I think we need to move beyond thinking rename is a low cost way to 
commit work.
{quote}
Yes, I agree 100%. Thanks for that comment. (y)




[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-18 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821609#comment-16821609
 ] 

Steve Loughran commented on HDDS-1301:
--

I think we need to move beyond thinking rename is a low cost way to commit work.

+[~mackrorysd], [~gabor.bota]




[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-18 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821598#comment-16821598
 ] 

Anu Engineer commented on HDDS-1301:


[~ljain], I am still looking at the code and just thinking aloud. If we
support this feature and allow a bucket-level rename (I am sure Owen, Gopal,
and Steve are going to be really happy), would we not get into a situation
where an application can take a lock on a directory with millions of
sub-keys? The added danger is that we have trained all our HDFS users that
rename is an atomic and fast operation, so an application that is designed to
work against HDFS will assume that rename is cheap and attempt to do it.

I am just wondering how long it would take to rename a directory with one
million keys. More specifically, if rename for us is an O(n) operation and
for normal file systems it is O(1), do we run into the danger of a DDoS of
Ozone via rename? Should we support a completely different operation on the
client side (for lack of a better word, I am going to call it "File Commit"),
where the user can collect a set of file names and send it to us, with the
prefix either already replaced or with a request that the server replace the
prefix? That way, we will still allow bucket operations to continue, and we
will only be concerned with the list of Ozone keys that the client gave us.

From the OzoneFS point of view, it might simply be renaming a small set of
files under a directory (say 100K files, in a bucket with 10 million keys), so
in the "File Commit" operation, the client specifies the file names explicitly
or via regex, and we will collect all those files and replace the prefix. I am
hoping that OzoneFS will be able to simulate a rename operation via "File
Commit". What do you think? Should we get on a call to brainstorm this?
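
To make the "File Commit" idea concrete, here is a minimal sketch in Java,
assuming the bucket can be modelled as an in-memory map from key name to
value. Nothing here is an existing Ozone API: `FileCommitSketch`,
`fileCommit`, and the parameter names are hypothetical. The sketch validates
every requested key first and only then rewrites prefixes, so the cost
depends on the number of keys the client listed, not on the size of the
bucket.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of the proposed "File Commit" operation. */
class FileCommitSketch {

    /**
     * Replaces oldPrefix with newPrefix on exactly the listed keys.
     * All keys are checked before any key is moved, so a bad request
     * leaves the bucket untouched. Returns the number of keys renamed.
     */
    static int fileCommit(Map<String, String> bucket, List<String> keys,
                          String oldPrefix, String newPrefix) {
        // Phase 1: plan the renames and reject the whole request on any problem.
        Map<String, String> renames = new LinkedHashMap<>();
        for (String key : keys) {
            if (!key.startsWith(oldPrefix) || !bucket.containsKey(key)) {
                throw new IllegalArgumentException("cannot commit key: " + key);
            }
            String target = newPrefix + key.substring(oldPrefix.length());
            if (bucket.containsKey(target)) {
                throw new IllegalArgumentException("target already exists: " + target);
            }
            renames.put(key, target);
        }
        // Phase 2: apply the plan; only the listed keys are touched.
        for (Map.Entry<String, String> e : renames.entrySet()) {
            bucket.put(e.getValue(), bucket.remove(e.getKey()));
        }
        return renames.size();
    }

    public static void main(String[] args) {
        Map<String, String> bucket = new TreeMap<>();
        bucket.put("tmp/job1/part-0", "block-a");
        bucket.put("tmp/job1/part-1", "block-b");
        bucket.put("data/unrelated", "block-c");

        int moved = fileCommit(bucket,
                List.of("tmp/job1/part-0", "tmp/job1/part-1"),
                "tmp/job1/", "out/");
        System.out.println(moved + " keys committed: " + bucket.keySet());
    }
}
```

A real implementation would have to do both phases under whatever locking or
transaction mechanism Ozone Manager provides; the in-memory map only shows
the shape of the request and the all-or-nothing check.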

 

 

 

 




[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-18 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821588#comment-16821588
 ] 

Anu Engineer commented on HDDS-1301:


[~ljain] The patch looks good, but I am going to open a discussion with a few
other people since I need some advice. It has nothing really to do with this
patch; I thought that this discussion would be interesting to you too.

[~gopalv], based on your feedback we are optimizing some features and access
patterns of O3FS. One of the issues is rename: we can do a single-file rename
within a bucket. I know it would be better if we could collect a large number
of files and rename them transactionally.

 

Thoughts, Comments?

 

cc: [~ste...@apache.org], [~jnp], [~arpitagarwal]




[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812684#comment-16812684
 ] 

Hadoop QA commented on HDDS-1301:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 34s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 54s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 59s | trunk passed |
| +1 | compile | 3m 22s | trunk passed |
| +1 | checkstyle | 0m 51s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 13m 25s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 12s | trunk passed |
| 0 | spotbugs | 4m 34s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 7m 51s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 | mvninstall | 6m 38s | the patch passed |
| +1 | compile | 3m 29s | the patch passed |
| +1 | cc | 3m 29s | the patch passed |
| +1 | javac | 3m 29s | the patch passed |
| -0 | checkstyle | 0m 26s | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -0 | checkstyle | 0m 27s | hadoop-ozone: The patch generated 24 new + 0 unchanged - 0 fixed = 24 total (was 0) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 0s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 1s | the patch passed |
| +1 | findbugs | 7m 3s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 3m 24s | hadoop-hdds in the patch failed. |
| -1 | unit | 0m 33s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 74m 10s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.common.TestDatanodeStateMachine |
| | hadoop.ozone.om.exceptions.TestResultCodes |
\\
\\
|| Subsystem || 

[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-08 Thread Lokesh Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812538#comment-16812538
 ] 

Lokesh Jain commented on HDDS-1301:
---

Uploaded v1 patch for review. Will create pull request for the same.
