[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation
[ https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743646#comment-16743646 ]

Nihal Jain commented on HBASE-21672:
------------------------------------

bq. Can we do that here?

Sounds good. I will soon submit a patch with the suggested change.

> Allow skipping HDFS block distribution computation
> --------------------------------------------------
>
>                 Key: HBASE-21672
>                 URL: https://issues.apache.org/jira/browse/HBASE-21672
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Nihal Jain
>            Assignee: Nihal Jain
>            Priority: Major
>              Labels: S3
>
> We should have a configuration to skip HDFS block distribution calculation in
> HBase. For example, on file systems that do not surface locality, such as S3,
> calculating block distribution is not useful.
> Currently, we do not have a way to skip HDFS block distribution computation.
> For this, we can provide a new configuration key, say
> {{hbase.block.distribution.skip.computation}} (which would be {{false}} by
> default).
> Users of filesystems such as S3 may choose to set this to {{true}}, thus
> skipping block distribution computation.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
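For illustration only, here is a minimal sketch of how a boolean flag like the one proposed in the issue description might gate the computation. It uses java.util.Properties as a stand-in for the real configuration object, and the class and method names (BlockDistributionGate, shouldSkip, compute) are hypothetical, not HBase APIs. The key name is the one suggested in the description and was only a proposal at this point in the thread.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of HBase. */
final class BlockDistributionGate {
  // Key name as proposed in the issue description (a proposal, not a shipped setting).
  static final String SKIP_KEY = "hbase.block.distribution.skip.computation";

  private BlockDistributionGate() {}

  /** Returns true when the operator has opted out; defaults to false as proposed. */
  static boolean shouldSkip(Properties conf) {
    return Boolean.parseBoolean(conf.getProperty(SKIP_KEY, "false"));
  }

  /**
   * Runs the potentially expensive computation unless skipping is enabled,
   * in which case an empty host-to-weight map is returned instead.
   */
  static Map<String, Long> compute(Properties conf, Supplier<Map<String, Long>> expensive) {
    if (shouldSkip(conf)) {
      return Collections.emptyMap();
    }
    return expensive.get();
  }
}
```

With the key unset, compute(...) calls the supplier. After setting the key to "true", it returns an empty map without calling it. This is the manual switch that the later comments in the thread argue should be automated.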
[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation
[ https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742502#comment-16742502 ]

Sean Busbey commented on HBASE-21672:
-------------------------------------

That would be great.
[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation
[ https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742460#comment-16742460 ]

Andrew Purtell commented on HBASE-21672:
----------------------------------------

HBASE-18478 added the configuration option {{hbase.master.balancer.uselocality}}. [~busbey]'s concern is relevant there too, even though it has already been committed.

Ideally we can have one change that automatically disables locality calculation after determining what filesystem implementation is in use. Can we do that here? Record the determination in a static hash map keyed by filesystem impl classname or similar. Then consult this information before attempting to use locality information anywhere.
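A rough sketch of the "static hash map keyed by filesystem impl classname" idea from the comment above, under stated assumptions: the class LocalitySupportCache and the probe predicate are hypothetical, and how the probe would actually decide whether a filesystem surfaces locality is left open, as it is in the thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical registry for illustration; names are not HBase APIs. */
final class LocalitySupportCache {
  // One cached determination per filesystem implementation class name.
  private static final Map<String, Boolean> SUPPORTS_LOCALITY = new ConcurrentHashMap<>();

  private LocalitySupportCache() {}

  /**
   * Returns the cached answer for this implementation class name, running the
   * (assumed) probe only the first time that class name is seen.
   */
  static boolean supportsLocality(String fsImplClassName, Predicate<String> probe) {
    return SUPPORTS_LOCALITY.computeIfAbsent(fsImplClassName, probe::test);
  }
}
```

Callers would consult supportsLocality(...) before building any block distribution, passing something like fs.getClass().getName() as the key. Using ConcurrentHashMap.computeIfAbsent keeps the determination to one probe per class name even when several threads ask at once.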
[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation
[ https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737513#comment-16737513 ]

Sean Busbey commented on HBASE-21672:
-------------------------------------

Here's my concern: as an operator, why should I have to know this detail of the implementation? This is effectively a hidden "magically go faster" button.

Why can't this be something that we take care of for the operator? Either by whitelisting FileSystems that should skip it, or by pushing the providers of those FileSystems to implement something that tells us as a downstream user that there isn't going to be locality? Or by doing a startup check that tells us there isn't going to be locality (e.g. for the case where we are talking to HDFS but that HDFS is a distinct set of nodes from our HBase nodes)?
[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation
[ https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737414#comment-16737414 ]

Nihal Jain commented on HBASE-21672:
------------------------------------

[~jojochuang] What do you think about this?
[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation
[ https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737370#comment-16737370 ]

Nihal Jain commented on HBASE-21672:
------------------------------------

{quote}Shouldn't this either be a no-op for filesystems that don't have locality, or something we can just ask the filesystem?
{quote}
The filesystem does not directly return locality as such. We have some logic in HBase to calculate it, based on the {{HDFSBlocksDistribution}} information that we create from the block location information returned by the underlying fs.

I think this solution should be fine, and will be useful when we know the fs cannot give us meaningful locality and building this {{HDFSBlocksDistribution}} information would only waste CPU cycles. In fact, we already have something similar in HBase; see [HBASE-18478|https://issues.apache.org/jira/browse/HBASE-18478].
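To make the comment above concrete, here is a simplified model of how a locality figure can be derived from block locations. It is a sketch under assumptions, not HBase's actual {{HDFSBlocksDistribution}} class: the name SimpleBlockDistribution and its methods are hypothetical, and the weighting (block length counted once in the total and once per replica host) is an illustrative choice.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for illustration; this is not HBase's HDFSBlocksDistribution. */
final class SimpleBlockDistribution {
  private final Map<String, Long> weightByHost = new HashMap<>();
  private long totalWeight = 0L;

  /** Records one block: its length counts once overall and once per replica host. */
  void addBlock(List<String> hosts, long blockLength) {
    if (hosts == null || hosts.isEmpty()) {
      return; // no location information reported for this block
    }
    totalWeight += blockLength;
    for (String host : hosts) {
      weightByHost.merge(host, blockLength, Long::sum);
    }
  }

  /** Fraction of recorded bytes that have a replica on the given host. */
  double localityIndex(String host) {
    if (totalWeight == 0L) {
      return 0.0;
    }
    long hostWeight = weightByHost.getOrDefault(host, 0L);
    return (double) hostWeight / totalWeight;
  }
}
```

For example, after adding a 100-byte block on hosts rs1 and rs2 and a 300-byte block on rs1 only, localityIndex("rs1") is 1.0 and localityIndex("rs2") is 0.25. Every block of every store file contributes to this bookkeeping, which is the work the issue proposes to avoid when the result cannot be meaningful.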
[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation
[ https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734269#comment-16734269 ]

Sean Busbey commented on HBASE-21672:
-------------------------------------

This feels a bit hacky. Shouldn't this either be a no-op for filesystems that don't have locality, or something we can just ask the filesystem (even if the current FileSystem API can't do that)?