[
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821598#comment-16821598
]
Anu Engineer edited comment on HDDS-1301 at 4/19/19 1:11 AM:
-------------------------------------------------------------
[~ljain], I am still looking at the code and just thinking aloud. If we support
this feature and allow a bucket-level rename (I am sure Owen, Gopal, and Steve
are going to be really happy), would we not get into a situation where an
application can take a lock on a directory with millions of sub-keys? The added
danger is that we have trained all our HDFS users that rename is an atomic and
fast operation, so an application designed to work against HDFS will assume
that rename is cheap and attempt it.
I am just wondering how long it would take to rename a directory with one
million keys. More specifically, if rename is an O(n) operation for us and
O(1) for normal file systems, do we run into the danger of a DDoS of Ozone
via rename? Should we support a completely different operation on the client
side – for lack of a better word, I am going to call it "File Commit" –
where the user collects a set of file names and sends it to us, with the
prefix either already replaced or with a request that the server replace the
prefix? That way, we will still allow bucket operations to continue, and we
will only be concerned with the list of Ozone keys that the client gave us.
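To make the O(n) concern concrete, here is a minimal sketch, assuming the
bucket is a flat, sorted key-to-value map. The TreeMap, the class name, and
the renamePrefix helper are all hypothetical stand-ins for illustration, not
Ozone Manager's actual key table or API. The point is only that a "directory"
rename over a flat namespace has to rewrite every key under the prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only: a TreeMap stands in for a flat key table.
class PrefixRenameCost {

    /** Rewrites every key under oldPrefix and returns how many keys were touched. */
    static int renamePrefix(TreeMap<String, String> keys,
                            String oldPrefix, String newPrefix) {
        // Keys sharing a prefix are contiguous in sorted order, so scan from
        // the prefix and stop at the first key that no longer matches.
        List<String> matching = new ArrayList<>();
        for (String key : keys.tailMap(oldPrefix, true).keySet()) {
            if (!key.startsWith(oldPrefix)) {
                break;
            }
            matching.add(key);
        }
        // One remove plus one put per key: the work grows linearly with the
        // number of keys under the "directory".
        for (String key : matching) {
            String value = keys.remove(key);
            keys.put(newPrefix + key.substring(oldPrefix.length()), value);
        }
        return matching.size();
    }

    public static void main(String[] args) {
        TreeMap<String, String> keys = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            keys.put("dir/a/file" + i, "block" + i);
        }
        keys.put("dir/b/other", "blockX");
        int touched = renamePrefix(keys, "dir/a/", "dir/c/");
        System.out.println("keys touched: " + touched);
    }
}
```

In this toy model, any lock needed to keep the rename atomic would be held for
the whole loop, which is where the worry about a directory with a million keys
comes from.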
From the OzoneFS point of view, it might simply be renaming a small set of
files under a directory (say 100K files, in a bucket with 10 million keys), so
in the "File Commit" operation, the client specifies the file names explicitly
or via regex, and we collect all those files and replace the prefix. I am
hoping that OzoneFS will be able to simulate a rename operation via "File
Commit". What do you think? Should we get on a call to brainstorm this?
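A rough sketch of what "File Commit" could look like, under the same toy
assumption that a bucket is just a key-to-value map. The fileCommit name, its
parameters, and the class are hypothetical and only meant to seed the
discussion; only the explicit-list variant is shown, not the regex one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not an existing Ozone API.
class FileCommitSketch {

    /**
     * Replaces oldPrefix with newPrefix for the keys the client listed, and
     * returns how many keys were renamed. Keys that are missing from the
     * bucket or that do not start with oldPrefix are skipped.
     */
    static int fileCommit(Map<String, String> bucket, List<String> keyNames,
                          String oldPrefix, String newPrefix) {
        int renamed = 0;
        for (String key : keyNames) {
            if (!key.startsWith(oldPrefix) || !bucket.containsKey(key)) {
                continue;
            }
            String value = bucket.remove(key);
            bucket.put(newPrefix + key.substring(oldPrefix.length()), value);
            renamed++;
        }
        return renamed;
    }

    public static void main(String[] args) {
        Map<String, String> bucket = new HashMap<>();
        bucket.put("job/_temporary/part-0", "b0");
        bucket.put("job/_temporary/part-1", "b1");
        bucket.put("job/_temporary/part-2", "b2");
        bucket.put("unrelated/key", "b3");

        // The client commits only the two files it names explicitly.
        List<String> toCommit = Arrays.asList(
            "job/_temporary/part-0", "job/_temporary/part-1");
        int renamed = fileCommit(bucket, toCommit, "job/_temporary/", "job/");
        System.out.println("renamed: " + renamed);
    }
}
```

The work here is bounded by the size of the list the client sends rather than
by the number of keys in the bucket, so the rest of the bucket is never
scanned or locked.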
> Optimize recursive ozone filesystem apis
> ----------------------------------------
>
> Key: HDDS-1301
> URL: https://issues.apache.org/jira/browse/HDDS-1301
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Labels: pull-request-available
> Attachments: HDDS-1301.001.patch
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> This Jira aims to optimise the recursive APIs in the Ozone file system, i.e.
> the APIs that take a recursive flag and must apply an operation to all the
> children of a directory. The Jira would add support for recursive APIs in
> Ozone Manager in order to reduce the number of RPC calls to Ozone Manager.
> Also, these operations are currently not atomic. This Jira would make all
> the operations in the Ozone filesystem atomic.