[jira] [Commented] (HDDS-2328) Support large-scale listing

2019-10-21 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956216#comment-16956216
 ] 

Anu Engineer commented on HDDS-2328:


Agree. We should probably do what S3AFileSystem has done. 

> Support large-scale listing 
> 
>
> Key: HDDS-2328
> URL: https://issues.apache.org/jira/browse/HDDS-2328
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Rajesh Balamohan
> Assignee: Hanisha Koneru
> Priority: Major
> Labels: performance
>
> Large-scale listing of directory contents takes a very long time and can 
> also run out of memory (OOM). I have more than 1 million entries at the same 
> level, and listing them with {{RemoteIterator}} took so long that it did not 
> complete (it was stuck in RDB::seek).
> S3A batches its listing at about 5K entries per fetch, IIRC. It would be good 
> to have this feature in Ozone as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2328) Support large-scale listing

2019-10-20 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955756#comment-16955756
 ] 

Lokesh Jain commented on HDDS-2328:
---

Currently we do not implement the FileSystem#listLocatedStatus API in Ozone. 
Therefore, it ends up calling listStatus for the entire directory at once, 
which can lead to OOM. I think we just need an implementation of 
listLocatedStatus and other related APIs in BasicOzoneFileSystem.
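
To make the batching idea concrete, here is a minimal, self-contained sketch of an iterator that keeps at most one page of entries in memory. It is only an illustration under stated assumptions, not the Ozone or S3A code: {{RemoteIter}}, {{PageFetcher}}, {{PagedIterator}} and {{inMemory}} are hypothetical names, and the in-memory key set stands in for whatever paged listing call the Ozone Manager would actually serve. A real implementation would presumably implement Hadoop's {{RemoteIterator}} instead of the stand-in interface.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableSet;
import java.util.NoSuchElementException;
import java.util.TreeSet;

final class PagedListingSketch {

  /** Hypothetical stand-in for Hadoop's RemoteIterator (same two methods). */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Hypothetical paged call: up to 'limit' keys strictly after 'startAfter' (null = from start). */
  interface PageFetcher {
    List<String> fetch(String startAfter, int limit) throws IOException;
  }

  /** Iterator that fetches one page at a time instead of the whole directory. */
  static final class PagedIterator implements RemoteIter<String> {
    private final PageFetcher fetcher;
    private final int pageSize;
    private Iterator<String> page = Collections.emptyIterator();
    private String lastKey = null;      // resume point for the next fetch
    private boolean exhausted = false;  // set once a short page is seen

    PagedIterator(PageFetcher fetcher, int pageSize) {
      if (pageSize <= 0) {
        throw new IllegalArgumentException("pageSize must be positive");
      }
      this.fetcher = fetcher;
      this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() throws IOException {
      // Fetch the next page only when the current one is fully consumed.
      while (!page.hasNext() && !exhausted) {
        List<String> batch = fetcher.fetch(lastKey, pageSize);
        if (batch.size() < pageSize) {
          exhausted = true;  // a short page means nothing is left after it
        }
        page = batch.iterator();
      }
      return page.hasNext();
    }

    @Override
    public String next() throws IOException {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      lastKey = page.next();
      return lastKey;
    }
  }

  /** In-memory fetcher used only for this demo; keys are returned in sorted order. */
  static PageFetcher inMemory(NavigableSet<String> keys) {
    return (startAfter, limit) -> {
      NavigableSet<String> tail =
          (startAfter == null) ? keys : keys.tailSet(startAfter, false);
      List<String> out = new ArrayList<>();
      for (String k : tail) {
        if (out.size() == limit) {
          break;
        }
        out.add(k);
      }
      return out;
    };
  }

  /** Drains the iterator and returns the number of entries seen. */
  static long countAll(RemoteIter<String> it) throws IOException {
    long n = 0;
    while (it.hasNext()) {
      it.next();
      n++;
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    NavigableSet<String> keys = new TreeSet<>();
    for (int i = 0; i < 12_345; i++) {
      keys.add(String.format("key-%06d", i));
    }
    RemoteIter<String> it = new PagedIterator(inMemory(keys), 5_000);
    System.out.println("entries listed: " + countAll(it));
  }
}
```

The page size of 5,000 in {{main}} simply mirrors the S3A figure quoted in the issue description; the caller's memory use is bounded by one page regardless of how many entries the directory holds.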




[jira] [Commented] (HDDS-2328) Support large-scale listing

2019-10-20 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955650#comment-16955650
 ] 

Rajesh Balamohan commented on HDDS-2328:



Here is the small snippet of code that was used for the large listing (the 
directory I used had millions of entries, which were populated earlier).

Ozone source details: https://github.com/apache/hadoop-ozone (commit 
b4a1afd60e3a3c7319a1ffa97d5ace3a95ed26f6).

{noformat}
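// Get path details
...
...
// Time a full iteration over the directory listing.
// RemoteIterator and LocatedFileStatus are in org.apache.hadoop.fs.
long sTime = System.currentTimeMillis();
RemoteIterator<LocatedFileStatus> rit = fs.listLocatedStatus(path);
long count = 0;
while (rit.hasNext()) {
  rit.next();
  count++;
}
long eTime = System.currentTimeMillis();
...
...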
{noformat}




[jira] [Commented] (HDDS-2328) Support large-scale listing

2019-10-18 Thread Anu Engineer (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954802#comment-16954802
 ] 

Anu Engineer commented on HDDS-2328:


The Listing API interface already does that. I will take a look at why we are 
not paging. Can you please provide repro steps and tell me which version or 
branch you tried this with?
