[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-04-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835852#comment-17835852
 ] 

Andrew Kyle Purtell commented on HBASE-28447:
-

PR #5820 introduces a new configuration setting - "hfile.block.size" - that, if 
set, will define the default blocksize to use when writing HFiles if a column 
family schema does not define its own non-default block size. This is a bit 
complicated but required for compatability. The rules are:
 * If the schema specifies a non default block size, use it.
 * Otherwise, if the configuration specifies a non default block size, use it.
 * Otherwise, use the default block size. The default is defined by 
HConstants.DEFAULT_BLOCKSIZE.

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Andrew Kyle Purtell
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.7.0, 3.0.0-beta-2, 2.5.9
>
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a global config property that would go 
> on hbase-site.xm which can control this value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-04-08 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834963#comment-17834963
 ] 

Andrew Kyle Purtell commented on HBASE-28447:
-

(y)

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a global config property that would go 
> on hbase-site.xm which can control this value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-04-08 Thread Gourab Taparia (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834845#comment-17834845
 ] 

Gourab Taparia commented on HBASE-28447:


[~apurtell] It will be while for me to start on this - Please feel free to pick 
it up

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a global config property that would go 
> on hbase-site.xm which can control this value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-04-05 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834450#comment-17834450
 ] 

Andrew Kyle Purtell commented on HBASE-28447:
-

[~gourab.taparia] Are you planning to open a PR for this? 

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a global config property that would go 
> on hbase-site.xm which can control this value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-03-22 Thread Gourab Taparia (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829785#comment-17829785
 ] 

Gourab Taparia commented on HBASE-28447:


Yes. Thanks [~wchevreuil] Updated the description to have "global config 
property that would go on hbase-site.xml" explicitly.

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a new config which can control this 
> value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-03-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829327#comment-17829327
 ] 

Viraj Jasani commented on HBASE-28447:
--

Yes, that is correct. This is reg introducing global site configuration.

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a new config which can control this 
> value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-03-20 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829194#comment-17829194
 ] 

Bryan Beaudreault commented on HBASE-28447:
---

Ah ok, thanks for clarifying. That makes sense, +1

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a new config which can control this 
> value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-03-20 Thread Wellington Chevreuil (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828954#comment-17828954
 ] 

Wellington Chevreuil commented on HBASE-28447:
--

[~bbeaudreault] , I think [~gourab.taparia] means a global config property that 
would go on hbase-site.xml. IIRC, currently we just make it configurable at 
table or CF level, meaning if you want to change it for all your schema, you 
need to update that individually for all tables/CFs. Is that right, 
[~gourab.taparia] ?

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a new config which can control this 
> value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize

2024-03-20 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828936#comment-17828936
 ] 

Bryan Beaudreault commented on HBASE-28447:
---

I'm not sure I follow. You can define blocksize at a few different levels, as 
you described. What exactly are you proposing to add here?

> New configuration to override the hfile specific blocksize
> --
>
> Key: HBASE-28447
> URL: https://issues.apache.org/jira/browse/HBASE-28447
> Project: HBase
>  Issue Type: Improvement
>Reporter: Gourab Taparia
>Assignee: Gourab Taparia
>Priority: Minor
>
> Right now there is no config attached to the HFile block size by which we can 
> override the default. The default is set to 64 KB in 
> HConstants.DEFAULT_BLOCKSIZE . We need a new config which can control this 
> value.
> Since the BLOCKSIZE is tracked at the column family level - we will need to 
> respect the CFD value first. Also, configuration settings are also something 
> that can be set in schema, at the column or table level, and will override 
> the relevant values from the site file. Below is the precedence order we can 
> use to get the final blocksize value :
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides 
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config “hbase.mapreduce.hfileoutputformat.blocksize” 
> however that is specific to map-reduce jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)