[ 
https://issues.apache.org/jira/browse/HDDS-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821598#comment-16821598
 ] 

Anu Engineer edited comment on HDDS-1301 at 4/19/19 1:11 AM:
-------------------------------------------------------------

[~ljain], I am still looking at the code and just thinking aloud. If we support 
this feature and allow a bucket-level rename (I am sure Owen, Gopal, and Steve 
are going to be really happy), would we not get into a situation where an 
application can take a lock on a directory with millions of sub-keys? The added 
danger is that we have trained all our HDFS users that rename is an atomic and 
fast operation, so an application that is designed to work against HDFS will 
assume that rename is cheap and attempt to do it. 

I am just wondering how long it would take to rename a directory with one 
million keys. More specifically, if rename for us is an O(n) operation and 
for normal file systems it is O(1), do we run into the danger of a DDoS of Ozone 
via rename? Should we support a completely different operation on the client 
side – for lack of a better word, I am going to call it "File Commit" – 
where the user can collect a set of file names and send it to us, either with 
the prefix already replaced or with a request that the server replace the 
prefix? That way, we will still allow bucket operations to continue, and we 
will only be concerned with the list of Ozone keys that the client gave us. 
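
To make the "File Commit" idea a bit more concrete, here is a minimal sketch in 
Java. Everything in it is an assumption for illustration only: the class name 
FileCommitSketch, the method fileCommit, and the use of an in-memory TreeMap as 
a stand-in for the key table are all hypothetical and are not existing Ozone 
APIs. It only shows the shape of the operation: the client supplies an explicit 
key list, the server validates the whole list first, and then rewrites the 
prefix for those keys alone, so the work grows with the size of the list rather 
than with the size of the bucket.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class FileCommitSketch {

    /**
     * Hypothetical "File Commit": replaces oldPrefix with newPrefix for the
     * explicitly listed keys only. keyTable is an in-memory stand-in for the
     * bucket's key table (key name -> opaque key info).
     * Returns the number of keys that were renamed.
     */
    public static int fileCommit(Map<String, String> keyTable, List<String> keys,
                                 String oldPrefix, String newPrefix) {
        // Pass 1: validate every key before mutating anything, so that a bad
        // request leaves the table untouched.
        List<String> targets = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String key : keys) {
            if (!seen.add(key)) {
                throw new IllegalArgumentException("Duplicate key in request: " + key);
            }
            if (!key.startsWith(oldPrefix)) {
                throw new IllegalArgumentException("Key lacks prefix: " + key);
            }
            if (!keyTable.containsKey(key)) {
                throw new IllegalArgumentException("No such key: " + key);
            }
            String target = newPrefix + key.substring(oldPrefix.length());
            if (keyTable.containsKey(target)) {
                throw new IllegalArgumentException("Target already exists: " + target);
            }
            targets.add(target);
        }
        // Pass 2: apply the renames. Only the listed keys are touched.
        for (int i = 0; i < keys.size(); i++) {
            String info = keyTable.remove(keys.get(i));
            keyTable.put(targets.get(i), info);
        }
        return keys.size();
    }

    public static void main(String[] args) {
        Map<String, String> bucket = new TreeMap<>();
        bucket.put("tmp/job1/part-0", "infoA");
        bucket.put("tmp/job1/part-1", "infoB");
        bucket.put("tmp/job2/part-0", "infoC"); // not part of the commit

        int renamed = fileCommit(bucket,
                List.of("tmp/job1/part-0", "tmp/job1/part-1"), "tmp/", "out/");
        System.out.println(renamed + " keys renamed; bucket now holds " + bucket.keySet());
    }
}
```

A real implementation would also have to decide how the two passes are made 
atomic on the server and how large a single request is allowed to be; the 
sketch deliberately leaves both open. 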

From the OzoneFS point of view, it might simply be renaming a small set of 
files under a directory (say 100K files in a bucket with 10 million keys). So 
in the "File Commit" operation, the client specifies the file names explicitly 
or via regex, and we collect all those files and replace the prefix. I am 
hoping that OzoneFS will be able to simulate a rename operation via "File 
Commit". What do you think? Should we get on a call to brainstorm this? 


> Optimize recursive ozone filesystem apis
> ----------------------------------------
>
>                 Key: HDDS-1301
>                 URL: https://issues.apache.org/jira/browse/HDDS-1301
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDDS-1301.001.patch
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> This Jira aims to optimise the recursive APIs in the Ozone file system. These 
> are the APIs that take a recursive flag, which requires the operation to be 
> performed on all the children of a directory. The Jira would add support for 
> recursive APIs in Ozone Manager in order to reduce the number of RPC calls to 
> Ozone Manager. Also, these operations are currently not atomic. This Jira 
> would make all the operations in the Ozone file system atomic.


