[jira] [Updated] (HADOOP-13704) S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
[ https://issues.apache.org/jira/browse/HADOOP-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HADOOP-13704:
    Labels: pull-request-available  (was: )

> S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13704
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13704
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive and a bit of Spark use {{getContentSummary()}} to get some summary stats
> of a filesystem. This is very expensive on S3A (and any other object store),
> especially as the base implementation does the recursive tree walk.
> Because of HADOOP-13208, we have a full enumeration of files under a path
> without directory costs...S3A can/should switch to this to speed up those
> places where the operation is called.
> Also
> * API call needs FS spec and contract tests
> * S3A could instrument invocation, so as to enable real-world popularity to
> be measured

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
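The idea in the issue description can be illustrated with a small, self-contained sketch. This is not the S3A implementation: the `Entry` and `Summary` classes and the `summarize` method are hypothetical names introduced only for illustration, and the in-memory list stands in for the results a real client would get from `FileSystem.listFiles(path, true)`. The sketch assumes that a flat listing returns only files, and that directories can be inferred from the parent components of each file path, so no per-directory calls are needed. Whether the base directory itself should be counted is left out here; this sketch counts only directories below the base.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FlatListingSummary {

    /** Hypothetical stand-in for one file returned by a flat, recursive listing. */
    static final class Entry {
        final String path;
        final long length;
        Entry(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    /** Hypothetical aggregate: file count, inferred directory count, total bytes. */
    static final class Summary {
        final long fileCount;
        final long directoryCount;
        final long totalLength;
        Summary(long fileCount, long directoryCount, long totalLength) {
            this.fileCount = fileCount;
            this.directoryCount = directoryCount;
            this.totalLength = totalLength;
        }
    }

    /** Single pass over the flat listing; no per-directory lookups. */
    static Summary summarize(String base, List<Entry> listing) {
        String root = base.endsWith("/") ? base : base + "/";
        Set<String> dirs = new HashSet<>();
        long files = 0;
        long bytes = 0;
        for (Entry e : listing) {
            if (!e.path.startsWith(root)) {
                continue; // ignore entries outside the base path
            }
            files++;
            bytes += e.length;
            // Record every parent directory between the base and this file.
            String rel = e.path.substring(root.length());
            int idx = rel.indexOf('/');
            while (idx >= 0) {
                dirs.add(rel.substring(0, idx));
                idx = rel.indexOf('/', idx + 1);
            }
        }
        return new Summary(files, dirs.size(), bytes);
    }

    public static void main(String[] args) {
        List<Entry> listing = Arrays.asList(
            new Entry("/data/a/b/f1", 10L),
            new Entry("/data/a/f2", 20L),
            new Entry("/data/c/f3", 5L),
            new Entry("/other/x", 99L));
        Summary s = summarize("/data", listing);
        System.out.println("files=" + s.fileCount
            + " dirs=" + s.directoryCount
            + " bytes=" + s.totalLength);
    }
}
```

The cost of this approach scales with the number of listing pages rather than the number of directories, which is the saving the description attributes to the HADOOP-13208 work.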
[jira] [Updated] (HADOOP-13704) S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
[ https://issues.apache.org/jira/browse/HADOOP-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13704:
    Parent: HADOOP-18067  (was: HADOOP-17566)
[jira] [Updated] (HADOOP-13704) S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
[ https://issues.apache.org/jira/browse/HADOOP-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13704:
    Parent: HADOOP-15220  (was: HADOOP-14831)
[jira] [Updated] (HADOOP-13704) S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
[ https://issues.apache.org/jira/browse/HADOOP-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13704:
    Priority: Minor  (was: Major)
[jira] [Updated] (HADOOP-13704) S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
[ https://issues.apache.org/jira/browse/HADOOP-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13704:
    Description:
Hive and a bit of Spark use {{getContentSummary()}} to get some summary stats
of a filesystem. This is very expensive on S3A (and any other object store),
especially as the base implementation does the recursive tree walk.

Because of HADOOP-13208, we have a full enumeration of files under a path
without directory costs...S3A can/should switch to this to speed up those
places where the operation is called.

Also
* API call needs FS spec and contract tests
* S3A could instrument invocation, so as to enable real-world popularity to be measured

  was:
Hive and a bit of Spark use {{getContentSummary()} to get some summary stats
of a filesystem. This is very expensive on S3A (and any other object store),
especially as the base implementation does the recursive tree walk.

Because of HADOOP-13208, we have a full enumeration of files under a path
without directory costs...S3A can/should switch to this to speed up those
places where the operation is called.

Also
* API call needs FS spec and contract tests
* S3A could instrument invocation, so as to enable real-world popularity to be measured
[jira] [Updated] (HADOOP-13704) S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
[ https://issues.apache.org/jira/browse/HADOOP-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-13704:
    Parent Issue: HADOOP-14831  (was: HADOOP-13204)