[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation

2019-01-15 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743646#comment-16743646
 ] 

Nihal Jain commented on HBASE-21672:


bq. Can we do that here?
Sounds good. I will soon submit a patch with the suggested change.

> Allow skipping HDFS block distribution computation
> --
>
> Key: HBASE-21672
> URL: https://issues.apache.org/jira/browse/HBASE-21672
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nihal Jain
>Assignee: Nihal Jain
>Priority: Major
>  Labels: S3
>
> We should have a configuration option to skip HDFS block distribution 
> calculation in HBase. For example, on file systems that do not surface 
> locality, such as S3, calculating block distribution is not useful.
> Currently, we do not have a way to skip HDFS block distribution computation. 
> For this, we can provide a new configuration key, say 
> {{hbase.block.distribution.skip.computation}} (which would be {{false}} by 
> default).
> Users of file systems such as S3 may choose to set this to {{true}}, thus 
> skipping block distribution computation.
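
As a rough illustration of the proposal in the description, the sketch below shows how such a flag might gate the computation. It is not the patch attached to this issue: the class name, the {{hostWeights}} helper, and the use of a plain map in place of the real configuration object are all assumptions made to keep the example self-contained. Only the key name and its {{false}} default come from the description.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class SkipBlockDistributionSketch {
    // Key name as proposed in the issue description; the final name may differ.
    static final String SKIP_KEY = "hbase.block.distribution.skip.computation";

    // Reads the flag from a plain map standing in for the real configuration
    // object. A missing key means false, matching the proposed default.
    static boolean shouldSkip(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(SKIP_KEY, "false"));
    }

    // Hypothetical call site: returns an empty host-to-bytes map when the flag
    // is set, otherwise runs the (potentially expensive) computation.
    static Map<String, Long> hostWeights(Map<String, String> conf,
                                         Supplier<Map<String, Long>> compute) {
        if (shouldSkip(conf)) {
            return Collections.emptyMap();
        }
        return compute.get();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SKIP_KEY, "true");
        // The supplier is never invoked because the flag is set.
        Map<String, Long> weights = hostWeights(conf, () -> {
            throw new IllegalStateException("should not be computed when skipping");
        });
        System.out.println("hosts with weight: " + weights.size());
    }
}
```

Returning an empty result rather than null keeps callers that iterate over the distribution working unchanged when the computation is skipped.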



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation

2019-01-14 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742502#comment-16742502
 ] 

Sean Busbey commented on HBASE-21672:
-

That would be great.



[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation

2019-01-14 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742460#comment-16742460
 ] 

Andrew Purtell commented on HBASE-21672:


HBASE-18478 added the configuration option {{hbase.master.balancer.uselocality}}. 
[~busbey]'s concern is relevant there too, even though it has already been 
committed. Ideally we can have one change that automatically disables locality 
calculation after determining which filesystem implementation is in use. Can we 
do that here? Record the determination in a static hash map keyed by filesystem 
implementation class name or similar, then consult this information before 
attempting to use locality information anywhere.
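
A minimal sketch of the static-map idea follows, under stated assumptions: the class name {{LocalitySupportCache}}, the hard-coded set of implementations without locality, and the use of a class-name string as the key are all hypothetical. How the determination is actually made is left open in this thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class LocalitySupportCache {
    // Assumption for illustration only: implementations treated as having no
    // useful locality. A real change might derive this differently.
    private static final Set<String> NO_LOCALITY_IMPLS =
        Set.of("org.apache.hadoop.fs.s3a.S3AFileSystem");

    // The determination is recorded once per filesystem implementation class name.
    private static final ConcurrentMap<String, Boolean> SUPPORTS_LOCALITY =
        new ConcurrentHashMap<>();

    // Consulted before any locality computation; the lambda runs only on the
    // first lookup for a given class name.
    static boolean supportsLocality(String fsImplClassName) {
        return SUPPORTS_LOCALITY.computeIfAbsent(
            fsImplClassName, name -> !NO_LOCALITY_IMPLS.contains(name));
    }

    public static void main(String[] args) {
        System.out.println(supportsLocality("org.apache.hadoop.fs.s3a.S3AFileSystem"));
        System.out.println(supportsLocality("org.apache.hadoop.hdfs.DistributedFileSystem"));
    }
}
```

In real code the key would presumably come from the filesystem instance itself, for example its class name, so that callers do not need to know which implementation is in use.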



[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation

2019-01-08 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737513#comment-16737513
 ] 

Sean Busbey commented on HBASE-21672:
-

Here's my concern: as an operator, why should I have to know this detail of the 
implementation? This is effectively a hidden "magically go faster" button. Why 
can't this be something that we take care of for the operator? Either by 
whitelisting FileSystems that should skip it, or by pushing the providers of 
those FileSystems to implement something that tells us, as a downstream user, 
that there isn't going to be locality? Or by doing a startup check that tells us 
there isn't going to be locality (e.g. for the case where we are talking to HDFS 
but that HDFS is a distinct set of nodes from our HBase nodes)?
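
The startup check mentioned above could, in its simplest form, compare the hosts that store data with the hosts that run HBase. The sketch below assumes both host sets are already known. The class and method names are hypothetical, and nothing in this thread specifies how such a check would gather the host names.

```java
import java.util.Collections;
import java.util.Set;

final class StartupLocalityCheck {
    // Locality is only possible if at least one host that stores data also
    // runs HBase. An empty storage host set therefore yields false.
    static boolean localityPossible(Set<String> storageHosts, Set<String> hbaseHosts) {
        return !Collections.disjoint(storageHosts, hbaseHosts);
    }

    public static void main(String[] args) {
        Set<String> hbaseHosts = Set.of("rs1.example.com", "rs2.example.com");
        // Storage on a separate set of nodes: no overlap, so no locality.
        System.out.println(localityPossible(Set.of("dn1.example.com"), hbaseHosts));
        // Co-located storage: overlap, so locality is possible.
        System.out.println(localityPossible(Set.of("rs1.example.com"), hbaseHosts));
    }
}
```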



[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation

2019-01-08 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737414#comment-16737414
 ] 

Nihal Jain commented on HBASE-21672:


[~jojochuang] What do you think about this?



[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation

2019-01-08 Thread Nihal Jain (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737370#comment-16737370
 ] 

Nihal Jain commented on HBASE-21672:


{quote}Shouldn't this either be a no-op for filesystems that don't have 
locality, or something we can just ask the filesystem?
{quote}
The filesystem does not directly return locality as such. We have some logic 
to calculate it in HBase. It is based on {{HDFSBlocksDistribution}} 
information, which we create using the block location information returned by 
the underlying filesystem.
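
To make the calculation concrete, here is a simplified model of the idea: sum block bytes per host, then divide a host's bytes by the total. This is a sketch, not the {{HDFSBlocksDistribution}} class itself. The class and method names below are hypothetical, and the real implementation may differ in its details.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

final class BlockDistributionSketch {
    private final Map<String, Long> bytesPerHost = new HashMap<>();
    private long totalBytes = 0;

    // Records one block: its length counts once towards the total and once
    // towards every distinct host that holds a replica.
    void addBlock(long length, List<String> hosts) {
        totalBytes += length;
        for (String host : new HashSet<>(hosts)) {
            bytesPerHost.merge(host, length, Long::sum);
        }
    }

    // Fraction of all block bytes that have a replica on the given host.
    double localityIndex(String host) {
        if (totalBytes == 0) {
            return 0.0;
        }
        return (double) bytesPerHost.getOrDefault(host, 0L) / totalBytes;
    }

    public static void main(String[] args) {
        BlockDistributionSketch dist = new BlockDistributionSketch();
        dist.addBlock(100, List.of("host-a", "host-b"));
        dist.addBlock(300, List.of("host-a", "host-c"));
        System.out.println("host-b locality: " + dist.localityIndex("host-b"));
    }
}
```

On a filesystem that reports no meaningful block hosts, every region server would see the same uninformative index, which is why the work is wasted there.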

I think this solution should be fine and will be useful when we know that 
locality information from our filesystem would not do us any good and that 
creating this {{HDFSBlocksDistribution}} information may waste CPU cycles. In 
fact, we already have something similar in HBase; see 
[HBASE-18478|https://issues.apache.org/jira/browse/HBASE-18478].



[jira] [Commented] (HBASE-21672) Allow skipping HDFS block distribution computation

2019-01-04 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16734269#comment-16734269
 ] 

Sean Busbey commented on HBASE-21672:
-

This feels a bit hacky. Shouldn't this either be a no-op for filesystems that 
don't have locality, or something we can just ask the filesystem? (even if the 
current FileSystem API can't do that)
