[ https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821598#comment-16821598 ]
Anu Engineer edited comment on HDDS-1301 at 4/19/19 1:11 AM:
-------------------------------------------------------------

[~ljain], I am still looking at the code and just thinking aloud. If we support this feature and allow a bucket-level rename (I am sure Owen, Gopal, and Steve are going to be really happy), would we not get into a situation where an application can take a lock on a directory with millions of sub-keys?

The added danger is that we have trained all our HDFS users that rename is an atomic and fast operation, so an application designed to work against HDFS will assume that rename is cheap and attempt to do it. I am just wondering how long it would take to rename a directory with one million keys. More specifically, if rename is an O(n) operation for us and O(1) for normal file systems, do we run into the danger of a DDoS of Ozone via rename?

Should we support a completely different operation on the client side? For lack of a better word, I am going to call it "File Commit": the user collects a set of file names and sends it to us, either with the prefix already replaced or with an instruction for the server to replace the prefix. That way, we will still allow bucket operations to continue, and we will only be concerned with the list of Ozone keys that the client gave us.

From the OzoneFS point of view, it might simply be renaming a small set of files under a directory (say 100K files, in a bucket with 10 million keys). So in the "File Commit" operation, the client specifies the file names explicitly or via regex, and we collect all those files and replace the prefix. I am hoping that OzoneFS will be able to simulate a rename operation via "File Commit".

What do you think? Should we get on a call to brainstorm this?
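To make the "File Commit" idea above concrete, here is a minimal sketch in Python. It is not Ozone code: the bucket is modelled as an in-memory dict of key name to value, and the function name file_commit, its parameters, and the all-or-nothing validation step are assumptions made purely for illustration.

```python
import re
from typing import Dict, Iterable, Optional


def file_commit(
    keys: Dict[str, str],
    old_prefix: str,
    new_prefix: str,
    names: Optional[Iterable[str]] = None,
    pattern: Optional[str] = None,
) -> int:
    """Replace old_prefix with new_prefix on the selected keys only.

    Hypothetical model: `keys` stands in for a bucket's key table.
    Returns the number of keys that were renamed.
    """
    if names is not None:
        # Explicit list: work is proportional to the list, not the bucket.
        selected = list(names)
    elif pattern is not None:
        # Regex selection: in this toy model it scans every key in the dict.
        regex = re.compile(pattern)
        selected = [k for k in keys if regex.search(k)]
    else:
        raise ValueError("specify either names or pattern")

    # Validate everything first so the commit is all-or-nothing.
    moves = {}
    for key in selected:
        if key not in keys:
            raise KeyError(key)
        if not key.startswith(old_prefix):
            raise ValueError(f"{key!r} does not start with {old_prefix!r}")
        target = new_prefix + key[len(old_prefix):]
        if target in keys or target in moves.values():
            raise ValueError(f"target {target!r} already exists")
        moves[key] = target

    # Apply the renames; keys outside `selected` are never touched.
    for source, target in moves.items():
        keys[target] = keys.pop(source)
    return len(moves)


if __name__ == "__main__":
    bucket = {"tmp/job1/a": "blockA", "tmp/job1/b": "blockB", "data/x": "blockX"}
    renamed = file_commit(bucket, "tmp/job1/", "out/", names=["tmp/job1/a"])
    print(renamed, sorted(bucket))
```

With an explicit name list, the work in this sketch grows with the number of names the client sends rather than with the size of the bucket, which is the property the comment is after. A real server-side version would also need locking and persistence, neither of which is modelled here.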
> Optimize recursive ozone filesystem apis
> ----------------------------------------
>
>                 Key: HDDS-1301
>                 URL: https://issues.apache.org/jira/browse/HDDS-1301
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDDS-1301.001.patch
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> This Jira aims to optimise recursive apis in ozone file system. These are the
> apis which have a recursive flag which requires an operation to be performed
> on all the children of the directory. The Jira would add support for
> recursive apis in Ozone manager in order to reduce the number of rpc calls to
> Ozone Manager. Also currently these operations are not atomic. This Jira
> would make all the operations in ozone filesystem atomic.