[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835852#comment-17835852 ]

Andrew Kyle Purtell commented on HBASE-28447:
---------------------------------------------

PR #5820 introduces a new configuration setting, "hfile.block.size", that, if set, defines the default block size to use when writing HFiles if a column family schema does not define its own non-default block size. This is a bit complicated but required for compatibility. The rules are:

* If the schema specifies a non-default block size, use it.
* Otherwise, if the configuration specifies a non-default block size, use it.
* Otherwise, use the default block size. The default is defined by HConstants.DEFAULT_BLOCKSIZE.

> New configuration to override the hfile specific blocksize
> ----------------------------------------------------------
>
>                 Key: HBASE-28447
>                 URL: https://issues.apache.org/jira/browse/HBASE-28447
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Gourab Taparia
>            Assignee: Andrew Kyle Purtell
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 2.6.0, 2.7.0, 3.0.0-beta-2, 2.5.9
>
>
> Right now there is no config attached to the HFile block size by which we can
> override the default. The default is set to 64 KB in
> HConstants.DEFAULT_BLOCKSIZE. We need a global config property that would go
> in hbase-site.xml which can control this value.
> Since the BLOCKSIZE is tracked at the column family level, we will need to
> respect the CFD value first. Configuration settings can also be set in the
> schema, at the column or table level, and will override the relevant values
> from the site file. Below is the precedence order we can use to get the final
> blocksize value:
> {code:java}
> ColumnFamilyDescriptor.BLOCKSIZE > schema level site configuration overrides
> > site configuration > HConstants.DEFAULT_BLOCKSIZE{code}
> PS: There is one related config, “hbase.mapreduce.hfileoutputformat.blocksize”;
> however, that is specific to map-reduce jobs.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
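
The three rules in the comment on PR #5820 can be sketched as a small standalone Java class. This is only an illustration of the stated precedence, not the code from the PR: the class name BlockSizeResolver, the method resolve, and the use of a null Integer to mean "hfile.block.size is not set" are all assumptions made for this example. The 64 KB constant mirrors the value the issue description attributes to HConstants.DEFAULT_BLOCKSIZE.

```java
// Hypothetical helper illustrating the precedence rules; not HBase code.
final class BlockSizeResolver {

    // 64 KB, the default the issue attributes to HConstants.DEFAULT_BLOCKSIZE.
    static final int DEFAULT_BLOCKSIZE = 64 * 1024;

    /**
     * @param schemaBlockSize block size from the column family schema
     * @param siteBlockSize   value of "hfile.block.size", or null if unset
     *                        (the null convention is an assumption of this sketch)
     * @return the block size to use when writing an HFile
     */
    static int resolve(int schemaBlockSize, Integer siteBlockSize) {
        // Rule 1: a non-default schema value always wins.
        if (schemaBlockSize != DEFAULT_BLOCKSIZE) {
            return schemaBlockSize;
        }
        // Rule 2: otherwise a non-default site configuration value applies.
        if (siteBlockSize != null && siteBlockSize != DEFAULT_BLOCKSIZE) {
            return siteBlockSize;
        }
        // Rule 3: fall back to the built-in default.
        return DEFAULT_BLOCKSIZE;
    }
}
```

Under these rules as stated, a column family whose schema value equals the default cannot be told apart from one that never set a value, so a site-level setting would apply to it. That appears to be the trade-off behind the "required for compatibility" remark.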
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834963#comment-17834963 ]

Andrew Kyle Purtell commented on HBASE-28447:
---------------------------------------------

(y)
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834845#comment-17834845 ]

Gourab Taparia commented on HBASE-28447:
----------------------------------------

[~apurtell] It will be a while before I can start on this. Please feel free to pick it up.
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834450#comment-17834450 ]

Andrew Kyle Purtell commented on HBASE-28447:
---------------------------------------------

[~gourab.taparia] Are you planning to open a PR for this?
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829785#comment-17829785 ]

Gourab Taparia commented on HBASE-28447:
----------------------------------------

Yes. Thanks, [~wchevreuil]. I updated the description to say "global config property that would go on hbase-site.xml" explicitly.
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829327#comment-17829327 ]

Viraj Jasani commented on HBASE-28447:
--------------------------------------

Yes, that is correct. This is about introducing a global site configuration.
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829194#comment-17829194 ]

Bryan Beaudreault commented on HBASE-28447:
-------------------------------------------

Ah ok, thanks for clarifying. That makes sense, +1
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828954#comment-17828954 ]

Wellington Chevreuil commented on HBASE-28447:
----------------------------------------------

[~bbeaudreault], I think [~gourab.taparia] means a global config property that would go in hbase-site.xml. IIRC, currently the block size is only configurable at the table or CF level, meaning that if you want to change it for your whole schema, you need to update it individually for every table/CF. Is that right, [~gourab.taparia]?
[jira] [Commented] (HBASE-28447) New configuration to override the hfile specific blocksize
[ https://issues.apache.org/jira/browse/HBASE-28447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828936#comment-17828936 ]

Bryan Beaudreault commented on HBASE-28447:
-------------------------------------------

I'm not sure I follow. You can define blocksize at a few different levels, as you described. What exactly are you proposing to add here?