[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823567#comment-16823567 ]

Anu Engineer commented on HDDS-1301:

Thank you, I will sync with you, but I will write and post a design document that explains all these changes and the possible changes in the output committer. Then based on your feedback, we can shape the Ozone manager API.

> Optimize recursive ozone filesystem apis
>
> Key: HDDS-1301
> URL: https://issues.apache.org/jira/browse/HDDS-1301
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Labels: pull-request-available
> Attachments: HDDS-1301.001.patch
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> This Jira aims to optimise recursive apis in ozone file system. These are the
> apis which have a recursive flag which requires an operation to be performed
> on all the children of the directory. The Jira would add support for
> recursive apis in Ozone manager in order to reduce the number of rpc calls to
> Ozone Manager. Also currently these operations are not atomic. This Jira
> would make all the operations in ozone filesystem atomic.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
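The issue description quoted above argues that a recursive API on the Ozone Manager side reduces the number of RPC calls. The following is only a rough sketch of that argument, not the patch itself: `RecursiveDeleteSketch` and all of its methods are hypothetical names, the key table is an in-memory `TreeMap`, and each method call on it stands in for one RPC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy stand-in for an Ozone Manager key table; every name here is hypothetical. */
final class RecursiveDeleteSketch {
    private final TreeMap<String, byte[]> keys = new TreeMap<>();
    private int rpcCount = 0;

    void put(String key) { keys.put(key, new byte[0]); }
    int rpcCount() { return rpcCount; }
    int size() { return keys.size(); }

    /** Copies the keys that start with the given prefix (ASCII keys assumed). */
    private List<String> keysUnder(String prefix) {
        return new ArrayList<>(keys.subMap(prefix, prefix + Character.MAX_VALUE).keySet());
    }

    /** Simulated RPC: list all keys under a prefix. */
    List<String> listKeys(String prefix) {
        rpcCount++;
        return keysUnder(prefix);
    }

    /** Simulated RPC: delete one key. */
    void deleteKey(String key) {
        rpcCount++;
        keys.remove(key);
    }

    /** Simulated RPC: the server removes every key under the prefix in one locked step. */
    synchronized int deleteRecursive(String prefix) {
        rpcCount++;
        List<String> doomed = keysUnder(prefix);
        doomed.forEach(keys::remove);
        return doomed.size();
    }

    /** Client-side recursion: one list call plus one delete call per key. */
    static int clientSideDelete(RecursiveDeleteSketch om, String prefix) {
        for (String key : om.listKeys(prefix)) {
            om.deleteKey(key);
        }
        return om.rpcCount();
    }

    public static void main(String[] args) {
        RecursiveDeleteSketch clientSide = new RecursiveDeleteSketch();
        RecursiveDeleteSketch serverSide = new RecursiveDeleteSketch();
        for (int i = 0; i < 1000; i++) {
            clientSide.put("dir/file-" + i);
            serverSide.put("dir/file-" + i);
        }
        clientSideDelete(clientSide, "dir/");
        serverSide.deleteRecursive("dir/");
        System.out.println("client-side RPCs: " + clientSide.rpcCount());
        System.out.println("server-side RPCs: " + serverSide.rpcCount());
    }
}
```

In this model the client-side loop costs one call per key plus the listing, and other clients can observe a half-deleted directory between calls. The single server-side call avoids both, but the server still touches every key while holding its lock.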
[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822092#comment-16822092 ]

Steve Loughran commented on HDDS-1301:

All the work in mapreduce for the s3a committer can be applied to Ozone too: you can define a new implementation of PathOutputCommitProtocol and then declare the factory class for it ({{PathOutputCommitterFactory}}). When an MR client (or suitably configured Spark app) looks for an output committer for a FileOutputFormat, you can serve up your committer instead of the legacy "let's commit by rename" algorithm. Happy to talk about how to do this/point you at the s3a impl.

Key point: for your store, no matter what the applications think, rename is optional.
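To illustrate how a store-specific committer factory might be selected, here is a simplified, dependency-free model of the lookup. It uses a plain `Map` instead of Hadoop's `Configuration`. The property names follow the `mapreduce.outputcommitter.factory.*` keys used by the s3a committers as I understand them, and the lookup order is my reading of `PathOutputCommitterFactory`; both should be checked against the Hadoop source. `org.example.ozone.OzoneCommitterFactory` is a made-up class name, not something that exists in Ozone.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Simplified model of committer-factory resolution; not Hadoop's actual code. */
final class CommitterFactoryLookupSketch {
    // Assumed property names, modelled on the keys used for the s3a committers.
    static final String FACTORY_CLASS = "mapreduce.outputcommitter.factory.class";
    static final String FACTORY_SCHEME_PATTERN = "mapreduce.outputcommitter.factory.scheme.%s";
    // Assumed default: the factory for the classic rename-based committer.
    static final String DEFAULT_FACTORY =
        "org.apache.hadoop.mapreduce.lib.output.FileOutputCommitterFactory";

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    /** Explicit factory first, then the per-scheme factory, then the default. */
    static String resolveFactory(Map<String, String> conf, URI outputPath) {
        String explicit = conf.get(FACTORY_CLASS);
        if (isSet(explicit)) {
            return explicit.trim();
        }
        if (outputPath != null && outputPath.getScheme() != null) {
            String schemeKey = String.format(FACTORY_SCHEME_PATTERN, outputPath.getScheme());
            String perScheme = conf.get(schemeKey);
            if (isSet(perScheme)) {
                return perScheme.trim();
            }
        }
        return DEFAULT_FACTORY;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical factory class bound to the o3fs scheme.
        conf.put("mapreduce.outputcommitter.factory.scheme.o3fs",
                 "org.example.ozone.OzoneCommitterFactory");
        System.out.println(resolveFactory(conf, URI.create("o3fs://bucket.volume/out")));
        System.out.println(resolveFactory(conf, URI.create("hdfs://nn/out")));
    }
}
```

Binding by URI scheme means jobs writing to other file systems keep their existing committer, while output paths on the new scheme pick up the store-specific one without any change to the application.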
[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821611#comment-16821611 ]

Anu Engineer commented on HDDS-1301:

{quote}I think we need to move beyond thinking rename is a low cost way to commit work.
{quote}
Yes, 100% of that; Thanks for that comment. (y)
[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821609#comment-16821609 ]

Steve Loughran commented on HDDS-1301:

I think we need to move beyond thinking rename is a low cost way to commit work.

+[~mackrorysd], [~gabor.bota]
[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821598#comment-16821598 ]

Anu Engineer commented on HDDS-1301:

[~ljain], I am still looking at the code and just thinking aloud. If we support this feature and allow a bucket-level rename (I am sure Owen, Gopal, and Steve are going to be really happy), would we not get into a situation where an application can take a lock on a directory with millions of sub-keys? The added danger is that we have trained all our HDFS users that rename is an atomic and fast operation, so an application designed to work against HDFS will assume that rename is a cheap operation and attempt to do it.

I am just wondering how long it would take to rename a directory with one million keys. More specifically, if rename for us is an O(n) operation and for normal file systems it is O(1), do we run into the danger of a DDoS of Ozone via rename?

Should we support a completely different operation on the client side? For lack of a better word, I am going to call it "File Commit": the user collects a set of file names and sends it to us, either with the prefix already replaced or with advice to the server to replace the prefix. That way, we will still allow bucket operations to continue, and we will only be concerned with the list of Ozone keys that the client gave us.

From the OzoneFS point of view, it might simply be renaming a small set of files under a directory (say 100K files, in a bucket with 10 million keys). So in the "File Commit" operation, the client specifies the file names explicitly or via regex, and we collect all those files and replace the prefix. I am hoping that OzoneFS will be able to simulate a rename operation via "File Commit".

What do you think? Should we get on a call to brainstorm this?
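The "File Commit" idea in the comment above could look roughly like the sketch below. This is only one reading of the proposal, not an Ozone API: `FileCommitSketch`, `fileCommit`, and the in-memory key table are all hypothetical, and the regex variant is left out. The point it tries to show is that the work is proportional to the list the client sends, not to the number of keys in the bucket, and that validating everything before mutating keeps the operation all-or-nothing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a "File Commit" call; all names are hypothetical. */
final class FileCommitSketch {
    // key name -> opaque block information
    private final TreeMap<String, String> keyTable = new TreeMap<>();

    synchronized void put(String key, String blockInfo) { keyTable.put(key, blockInfo); }
    synchronized boolean contains(String key) { return keyTable.containsKey(key); }
    synchronized int size() { return keyTable.size(); }

    /**
     * Replaces srcPrefix with dstPrefix on exactly the keys the client listed.
     * All keys are validated before any is moved, so a failure changes nothing.
     */
    synchronized int fileCommit(List<String> keys, String srcPrefix, String dstPrefix) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (String key : keys) {
            if (!key.startsWith(srcPrefix) || !keyTable.containsKey(key)) {
                throw new IllegalArgumentException("cannot commit " + key);
            }
            String target = dstPrefix + key.substring(srcPrefix.length());
            if (keyTable.containsKey(target)) {
                throw new IllegalStateException("destination exists: " + target);
            }
            renames.put(key, target);
        }
        // Validation passed: apply every rename while still holding the lock.
        for (Map.Entry<String, String> rename : renames.entrySet()) {
            keyTable.put(rename.getValue(), keyTable.remove(rename.getKey()));
        }
        return renames.size();
    }

    public static void main(String[] args) {
        FileCommitSketch om = new FileCommitSketch();
        om.put("job/_temporary/part-0", "blocks-0");
        om.put("job/_temporary/part-1", "blocks-1");
        om.put("job/unrelated", "blocks-2");
        int moved = om.fileCommit(
            Arrays.asList("job/_temporary/part-0", "job/_temporary/part-1"),
            "job/_temporary/", "job/");
        System.out.println("moved " + moved + " keys");
    }
}
```

Because the lock is held only for the listed keys, an unrelated writer in the same bucket is blocked for a time that depends on the size of the commit list rather than on the 10 million keys mentioned in the comment.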
[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821588#comment-16821588 ]

Anu Engineer commented on HDDS-1301:

[~ljain] The patch looks good, but I am going to open a discussion with a few other people since I need some advice. It has nothing really to do with this patch, but I thought this discussion would be interesting to you too.

[~gopalv] Based on your feedback, we are optimizing some features and access patterns of O3FS. One of the issues is rename: we can do a single-file rename within a bucket. I know it would be better if we could collect a large number of files and rename them transactionally. Thoughts, comments?

cc: [~ste...@apache.org], [~jnp], [~arpitagarwal]
[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812684#comment-16812684 ]

Hadoop QA commented on HDDS-1301:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 34s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 54s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 59s | trunk passed |
| +1 | compile | 3m 22s | trunk passed |
| +1 | checkstyle | 0m 51s | trunk passed |
| +1 | mvnsite | 0m 0s | trunk passed |
| +1 | shadedclient | 13m 25s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 12s | trunk passed |
| 0 | spotbugs | 4m 34s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 7m 51s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 | mvninstall | 6m 38s | the patch passed |
| +1 | compile | 3m 29s | the patch passed |
| +1 | cc | 3m 29s | the patch passed |
| +1 | javac | 3m 29s | the patch passed |
| -0 | checkstyle | 0m 26s | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -0 | checkstyle | 0m 27s | hadoop-ozone: The patch generated 24 new + 0 unchanged - 0 fixed = 24 total (was 0) |
| +1 | mvnsite | 0m 0s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 0s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 1s | the patch passed |
| +1 | findbugs | 7m 3s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 3m 24s | hadoop-hdds in the patch failed. |
| -1 | unit | 0m 33s | hadoop-ozone in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 74m 10s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.common.TestDatanodeStateMachine |
| | hadoop.ozone.om.exceptions.TestResultCodes |

|| Subsystem ||
[jira] [Commented] (HDDS-1301) Optimize recursive ozone filesystem apis
[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812538#comment-16812538 ]

Lokesh Jain commented on HDDS-1301:

Uploaded v1 patch for review. Will create pull request for the same.