[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513580#comment-17513580 ]

Bryan Beaudreault commented on HBASE-21065:
-------------------------------------------

Just as an update, we integrated this patch into our environment and also
updated all of our existing hbase2 clusters to have this setup. That's
currently only about 70 smallish clusters, but we'll be getting to our bigger
clusters soon. Will loop back if we encounter any issues, but until then
assume that no news is good :)

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.6.0
>
>
> Some users end up hitting meta hard. Bulk is probably because our client goes
> to meta too often, and the real 'fix' for a saturated meta is splitting it,
> but the encoding that came in with HBASE-16213, ROW_INDEX_V1, could help in
> the near term. It adds an index on hfile blocks and helped improve random
> reads against user-space tables (fewer compares as we used index to go direct
> to requested Cells rather than look at each Cell in turn until we found what
> we wanted -- see RN on HBASE-16213 for citation).
> I also noticed code-reading that we don't enable blooms on hbase:meta tables;
> that could save some CPU and speed things up a bit too:
> {code}
>     // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
>     .setBloomFilterType(BloomType.NONE)
> {code}
> This issue is about doing a bit of perf compare of encoding *on* vs current
> default (and will check diff in size of indexed blocks).
> Meta access is mostly random-read I believe (a review of a user's access
> showed this, so at least for their workload). The nice addition, HBASE-19722
> Meta query statistics metrics source, would help verify if it saw some usage
> on a prod cluster.
> If all is good, I'd like to make a small patch, one that could be easily
> backported, with minimal changes in it.
> As is, it's all a little awkward as the meta table schema is hard-coded and
> meta is immutable -- stuff we'll have to fix if we want to split meta -- so
> in the meantime it requires a code change to enable (and a backport of
> HBASE-16213 -- this patch is in 1.4.0 only currently, perhaps that is
> enough). Code change to enable is small:
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> index 28c7ec3c2f..8f08f94dc1 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> @@ -160,6 +160,7 @@ public class FSTableDescriptors implements TableDescriptors {
>                      .setScope(HConstants.REPLICATION_SCOPE_LOCAL)
>                      // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
>                      .setBloomFilterType(BloomType.NONE)
> +                    .setDataBlockEncoding(org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.ROW_INDEX_V1)
>                      .build())
>              .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(HConstants.TABLE_FAMILY)
>                  .setMaxVersions(conf.getInt(HConstants.HBASE_META_VERSIONS,
> {code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
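The "fewer compares" argument in the issue description can be illustrated with a small, self-contained sketch. This is not HBase code and does not reproduce the actual ROW_INDEX_V1 block layout; the class `RowIndexSketch`, its methods, and the `row-NNNN` keys are hypothetical names made up for this illustration. It only assumes that rows within a block are sorted, and compares walking them one by one against a binary search over a per-block list of row positions.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: counts key comparisons for a scan vs. an indexed seek. */
public class RowIndexSketch {

    /** Position of the row that was found (or -1) plus the comparisons spent. */
    public static final class Seek {
        public final int position;
        public final int compares;

        Seek(int position, int compares) {
            this.position = position;
            this.compares = compares;
        }
    }

    /** No index: look at each row in turn until the target is reached or passed. */
    public static Seek linearSeek(List<String> sortedRows, String target) {
        int compares = 0;
        for (int i = 0; i < sortedRows.size(); i++) {
            compares++;
            int c = sortedRows.get(i).compareTo(target);
            if (c == 0) {
                return new Seek(i, compares);
            }
            if (c > 0) {
                break; // rows are sorted, so the target cannot appear later
            }
        }
        return new Seek(-1, compares);
    }

    /** With a row index: binary search over row positions inside the block. */
    public static Seek indexedSeek(List<String> sortedRows, String target) {
        int lo = 0;
        int hi = sortedRows.size() - 1;
        int compares = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            compares++;
            int c = sortedRows.get(mid).compareTo(target);
            if (c == 0) {
                return new Seek(mid, compares);
            }
            if (c < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return new Seek(-1, compares);
    }

    /** Hypothetical block contents: zero-padded keys so string order matches numeric order. */
    public static List<String> sampleRows(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(String.format("row-%04d", i));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = sampleRows(1024);
        Seek linear = linearSeek(rows, "row-0900");
        Seek indexed = indexedSeek(rows, "row-0900");
        System.out.println("linear:  position=" + linear.position + " compares=" + linear.compares);
        System.out.println("indexed: position=" + indexed.position + " compares=" + indexed.compares);
    }
}
```

For a random-read workload such as the meta lookups described above, the linear walk costs a number of comparisons proportional to the target's position in the block, while the indexed seek is bounded by the logarithm of the row count. Whether that translates into a measurable gain on hbase:meta is exactly what the issue proposes to measure.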
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512232#comment-17512232 ]

Hudson commented on HBASE-21065:
--------------------------------

Results for branch branch-2
[build #497 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/General_20Nightly_20Build_20Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for details|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3)

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.6.0
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512072#comment-17512072 ]

Hudson commented on HBASE-21065:
--------------------------------

Results for branch branch-2.5
[build #74 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for details|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3)

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.6.0
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511412#comment-17511412 ]

Viraj Jasani commented on HBASE-21065:
--------------------------------------

Thank you [~bbeaudreault]! Changes look good, let's wait for full build QA results.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.6.0
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511342#comment-17511342 ]

Bryan Beaudreault commented on HBASE-21065:
-------------------------------------------

[~apurtell] I backported this to my company's branch-2-based fork and it
required some minor conflict resolution. To save you the same work, I also
pushed it as a PR for the branch-2 backport:
[https://github.com/apache/hbase/pull/4268]. The commit there should apply
cleanly to branch-2.5 as well.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.6.0
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510866#comment-17510866 ]

Andrew Kyle Purtell commented on HBASE-21065:
---------------------------------------------

That is my fault. I shouldn’t be on JIRA on mobile. I put the 3.0.0 fix
version back. We can resolve this once the changes have been committed to
branch-2 and branch-2.5 and all will be well.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510842#comment-17510842 ]

Bryan Beaudreault commented on HBASE-21065:
-------------------------------------------

Hey [~apurtell] just to clarify since you're traveling. Sorry if I committed a
faux pas by commenting on this old, resolved issue, but here's what happened:
 * This issue was originally resolved with a fix version of 3.0.0-alpha-1,
   because the attached GH PR was merged into master only.
 * I stumbled across this and was wondering why we couldn't do this for
   branch-2, since it seems valuable. So I commented here to ask the question.
 * Now you've removed the 3.0.0-alpha fix version, which I don't think is
   correct since it is resolved there, as Duo mentions.

I think what I probably should have done was to create a new Jira for
"Backport HBASE-21065 to branch-2". Since we've already re-opened this issue,
I think we have 2 options:
 * Return fix versions to 3.0.0-alpha-1 and resolve as fixed, then create the
   above Jira for backport.
 * Set fix versions to 3.0.0-alpha-1, 2.5.0, 2.6.0 and resolve once the PR has
   been applied to branch-2.5 and branch-2.

Sorry for the confusion here.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 2.5.0, 2.6.0
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
[ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510800#comment-17510800 ]

Andrew Kyle Purtell commented on HBASE-21065:
---------------------------------------------

Great. I am traveling so did not check the code before posting. No concerns
obviously if already done. Let me update fix versions here just for 2.x.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510687#comment-17510687 ]

Duo Zhang commented on HBASE-21065:
-----------------------------------

I assume you mean making this change in 2.x? We already have this landed on master, i.e., 3.0.0-alpha? I do not have big concerns about landing it on 2.x; it should be an improvement.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3
>
>
> Some users end up hitting meta hard. Bulk is probably because our client goes
> to meta too often, and the real 'fix' for a saturated meta is splitting it,
> but the encoding that came in with HBASE-16213, ROW_INDEX_V1, could help in
> the near term. It adds an index on hfile blocks and helped improve random
> reads against user-space tables (less compares as we used index to go direct
> to requested Cells rather than look at each Cell in turn until we found what
> we wanted -- see RN on HBASE-16213 for citation).
>
> I also noticed code-reading that we don't enable blooms on hbase:meta tables;
> that could save some CPU and speed things up a bit too:
> {code}
>     // Disable blooms for meta. Needs work. Seems to mess w/ getClosestOrBefore.
>     .setBloomFilterType(BloomType.NONE)
> {code}
>
> This issue is about doing a bit of perf compare of encoding *on* vs current
> default (and will check diff in size of indexed blocks).
>
> Meta access is mostly random-read I believe (A review of a user's access
> showed this so at least for their workload). The nice addition, HBASE-19722
> Meta query statistics metrics source, would help verify if it saw some usage
> on a prod cluster.
>
> If all is good, I'd like to make a small patch, one that could be easily
> backported, with minimal changes in it.
>
> As is, it's all a little awkward as the meta table schema is hard-coded and
> meta is immutable -- stuff we'll have to fix if we want to split meta -- so
> in the meantime it requires a code change to enable (and a backport of
> HBASE-16213 -- this patch is in 1.4.0 only currently, perhaps that is
> enough). Code change to enable is small:
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> index 28c7ec3c2f..8f08f94dc1 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> @@ -160,6 +160,7 @@ public class FSTableDescriptors implements TableDescriptors {
>                      .setScope(HConstants.REPLICATION_SCOPE_LOCAL)
>                      // Disable blooms for meta. Needs work. Seems to mess w/ getClosestOrBefore.
>                      .setBloomFilterType(BloomType.NONE)
> +                    .setDataBlockEncoding(org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.ROW_INDEX_V1)
>                      .build())
>              .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(HConstants.TABLE_FAMILY)
>                      .setMaxVersions(conf.getInt(HConstants.HBASE_META_VERSIONS,
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
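The issue description above credits ROW_INDEX_V1 with "less compares" because a reader can use a per-block index to go directly to the requested row instead of looking at each Cell in turn. The following is only a rough sketch of that idea, not HBase's actual encoder or seeker: the class `RowIndexSketch`, its methods, and the `row-NNNN` sample keys are all hypothetical, and a plain sorted list stands in for the cells of one hfile block. It counts key comparisons for a sequential walk versus a binary search over the same rows.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a toy model of "seek via row index", not HBase code. */
class RowIndexSketch {

    /** Position of the row found (or -1) plus the number of key comparisons made. */
    static final class SeekResult {
        final int position;
        final int comparisons;

        SeekResult(int position, int comparisons) {
            this.position = position;
            this.comparisons = comparisons;
        }
    }

    /** Look at each row in turn until the target is found or passed. */
    static SeekResult linearSeek(List<String> sortedRows, String target) {
        int comparisons = 0;
        for (int i = 0; i < sortedRows.size(); i++) {
            comparisons++;
            int cmp = sortedRows.get(i).compareTo(target);
            if (cmp == 0) {
                return new SeekResult(i, comparisons);
            }
            if (cmp > 0) {
                break; // rows are sorted, so the target cannot appear later
            }
        }
        return new SeekResult(-1, comparisons);
    }

    /** Binary search over the sorted rows, standing in for a per-block row index. */
    static SeekResult indexedSeek(List<String> sortedRows, String target) {
        int comparisons = 0;
        int lo = 0;
        int hi = sortedRows.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            comparisons++;
            int cmp = sortedRows.get(mid).compareTo(target);
            if (cmp == 0) {
                return new SeekResult(mid, comparisons);
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return new SeekResult(-1, comparisons);
    }

    /** Build n zero-padded keys so lexicographic order matches numeric order. */
    static List<String> sampleRows(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(String.format("row-%04d", i));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = sampleRows(1000);
        SeekResult linear = linearSeek(rows, "row-0900");
        SeekResult indexed = indexedSeek(rows, "row-0900");
        System.out.println("linear comparisons:  " + linear.comparisons);
        System.out.println("indexed comparisons: " + indexed.comparisons);
    }
}
```

The gap between the two counts grows with the number of rows per block, which fits the thread's expectation that a mostly random-read table such as hbase:meta could benefit. The sketch ignores the extra bytes an index adds to each block, which the issue also proposes to measure.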
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510558#comment-17510558 ]

Andrew Kyle Purtell commented on HBASE-21065:
---------------------------------------------

(y)

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510439#comment-17510439 ]

Bryan Beaudreault commented on HBASE-21065:
-------------------------------------------

What you summarized is what I was thinking

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509947#comment-17509947 ]

Andrew Kyle Purtell commented on HBASE-21065:
---------------------------------------------

bq. I agree it might be nice to do this by default for 2.5.0 if we already have it enabled in some production.

It is unlikely this will receive testing beyond what we can muster on a voluntary basis for the release, and the act of enabling it would be an opt-out change enabling wide-scale testing. I am confident the mechanics of the encoding and bloomfilters themselves are not problematic, but a change like this could introduce interesting multi-factor consequences for some environments. This would be documented as such in a release note, including instructions on how to opt out. I think that can address generic concerns, but I would be interested if you have a specific test scenario in mind.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509339#comment-17509339 ]

Bryan Beaudreault commented on HBASE-21065:
-------------------------------------------

Thanks for the details both. Took a closer look at the patch. Glad I can just enable for my case.

I agree it might be nice to do this by default for 2.5.0 if we already have it enabled in some production. The biggest issue with ROW_INDEX_V1 seems to be size, and meta should not be large. I find this encoding is great for random read, which I agree is often the case with meta.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509331#comment-17509331 ]

Andrew Kyle Purtell commented on HBASE-21065:
---------------------------------------------

bq. If you are just interested in addressing meta hotspot issues, edit your hbase:meta and enable ROW_INDEX_V1 and BLOOMFILTER... You can since 2.3.0. I checked that at least one cluster where I work has this in place – 2.4.x, BLOOMFILTER => 'ROW', DATA_BLOCK_ENCODING => 'ROW_INDEX_V1' on all hbase:meta columnfamilies. Enabling meta replicas also helped.

With the exception of enabling replicas, is this reasonable to try as default in upcoming 2.5.0? It is coming in for a landing.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509330#comment-17509330 ]

Michael Stack commented on HBASE-21065:
---------------------------------------

[~bbeaudreault] making hbase:meta schema default enabling ROW_INDEX_V1? Probably not a technical issue. IIRC, probably thought changing default hbase:meta schema should wait on major version release (I think the patch and the subject on this Jira are out of alignment so there might be some confusion here as to what this Jira did).

If you are just interested in addressing meta hotspot issues, edit your hbase:meta and enable ROW_INDEX_V1 and BLOOMFILTER... You can since 2.3.0. I checked that at least one cluster where I work has this in place – 2.4.x, BLOOMFILTER => 'ROW', DATA_BLOCK_ENCODING => 'ROW_INDEX_V1' on all hbase:meta columnfamilies. Enabling meta replicas also helped.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509118#comment-17509118 ]

Bryan Beaudreault commented on HBASE-21065:
-------------------------------------------

Looks like a handful of conflicts, so maybe complicated. But curious if there was a technical reason beyond that.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509117#comment-17509117 ]

Bryan Beaudreault commented on HBASE-21065:
-------------------------------------------

[~stack] sorry to ping on an old issue, but I'm curious – is there a reason this can't land in branch-2? We have meta hotspot issues often, so any improvement there will help. I see HBASE-23705 landed in branch-2. I can take a look at a backport if there were no real reasons.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022976#comment-17022976 ]

Hudson commented on HBASE-21065:

Results for branch master
	[build #1607 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1607/]: (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/1607//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/1607//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/1607//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-21065
>                 URL: https://issues.apache.org/jira/browse/HBASE-21065
>             Project: HBase
>          Issue Type: Improvement
>          Components: meta, Performance
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Some users end up hitting meta hard. Bulk is probably because our client goes
> to meta too often, and the real 'fix' for a saturated meta is splitting it,
> but the encoding that came in with HBASE-16213, ROW_INDEX_V1, could help in
> the near term. It adds an index on hfile blocks and helped improve random
> reads against user-space tables (fewer compares as we used index to go direct
> to requested Cells rather than look at each Cell in turn until we found what
> we wanted -- see RN on HBASE-16213 for citation).
> I also noticed code-reading that we don't enable blooms on hbase:meta tables;
> that could save some CPU and speed things up a bit too:
> {code}
>     // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
>     .setBloomFilterType(BloomType.NONE)
> {code}
> This issue is about doing a bit of perf compare of encoding *on* vs current
> default (and will check diff in size of indexed blocks).
> Meta access is mostly random-read I believe (A review of a user's access
> showed this so at least for their workload). The nice addition, HBASE-19722
> Meta query statistics metrics source, would help verify if it saw some usage
> on a prod cluster.
> If all is good, I'd like to make a small patch, one that could be easily
> backported, with minimal changes in it.
> As is, it's all a little awkward as the meta table schema is hard-coded and
> meta is immutable -- stuff we'll have to fix if we want to split meta -- so
> in the meantime it requires a code change to enable (and a backport of
> HBASE-16213 -- this patch is in 1.4.0 only currently, perhaps that is
> enough). Code change to enable is small:
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> index 28c7ec3c2f..8f08f94dc1 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> @@ -160,6 +160,7 @@ public class FSTableDescriptors implements TableDescriptors {
>                      .setScope(HConstants.REPLICATION_SCOPE_LOCAL)
>                      // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
>                      .setBloomFilterType(BloomType.NONE)
> +                    .setDataBlockEncoding(org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.ROW_INDEX_V1)
>                      .build())
>              .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(HConstants.TABLE_FAMILY)
>                  .setMaxVersions(conf.getInt(HConstants.HBASE_META_VERSIONS,
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
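The issue description above says ROW_INDEX_V1 saves compares by using an index to go direct to the requested Cells instead of looking at each Cell in turn. The following is a minimal sketch of that idea only, not HBase's encoder: plain strings stand in for rows in one block, and the class and method names (RowIndexSketch, linearSeek, indexedSeek, sampleBlock) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: sequential seek vs. binary search over sorted rows in one block. */
public class RowIndexSketch {
    /** Number of comparisons made by the most recent seek. */
    static int compares;

    /** Look at each row in turn until the target is reached; returns its position or -1. */
    static int linearSeek(List<String> rows, String target) {
        compares = 0;
        for (int i = 0; i < rows.size(); i++) {
            compares++;
            int c = rows.get(i).compareTo(target);
            if (c == 0) return i;
            if (c > 0) return -1; // already past where the target would sit
        }
        return -1;
    }

    /** Binary search over the sorted rows, which a per-block row index makes possible. */
    static int indexedSeek(List<String> rows, String target) {
        compares = 0;
        int lo = 0, hi = rows.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            compares++;
            int c = rows.get(mid).compareTo(target);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    /** Build n sorted row keys (assumed data, for illustration only). */
    static List<String> sampleBlock(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) rows.add(String.format("row-%05d", i));
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = sampleBlock(1000);
        int p1 = linearSeek(rows, "row-00900");
        int linear = compares;
        int p2 = indexedSeek(rows, "row-00900");
        int indexed = compares;
        System.out.println("found at " + p1 + "/" + p2
            + ", compares: linear=" + linear + " indexed=" + indexed);
    }
}
```

For a random-read workload such as the one described for meta, the sequential seek costs on the order of the number of rows in the block, while the indexed seek costs on the order of its logarithm. The trade-off the description mentions, the size of the indexed blocks, is not modelled here.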
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019677#comment-17019677 ]

Michael Stack commented on HBASE-21065:
---------------------------------------

HBASE-23705 is about fixing our cellcomparator handling. It needs to land before this can go in.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
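The issue description in this thread notes that blooms are disabled on hbase:meta because they seem "to mess w/ getClosestOrBefore". One plausible reading, offered here as an assumption rather than a confirmed diagnosis, is that a bloom filter only answers exact-membership questions, while a closest-or-before lookup usually probes with a key that is not itself stored. The sketch below uses a toy two-hash bloom and a TreeSet; BloomFloorSketch and all of its methods are hypothetical names, not HBase code.

```java
import java.util.BitSet;
import java.util.TreeSet;

/** Hypothetical sketch: an exact-membership bloom cannot gate a closest-or-before lookup. */
public class BloomFloorSketch {
    static final int BITS = 1024;
    final BitSet bits = new BitSet(BITS);
    final TreeSet<String> rows = new TreeSet<>();

    /** Store a row and set its two bloom bits. */
    void add(String row) {
        rows.add(row);
        bits.set(h1(row));
        bits.set(h2(row));
    }

    static int h1(String s) { return Math.floorMod(s.hashCode(), BITS); }
    static int h2(String s) { return Math.floorMod(s.hashCode() * 31 + 17, BITS); }

    /** Exact-membership test: false means this exact row was never added. */
    boolean mightContain(String row) {
        return bits.get(h1(row)) && bits.get(h2(row));
    }

    /** Closest stored row at or before the probe, without consulting the bloom. */
    String closestOrBefore(String probe) {
        return rows.floor(probe);
    }

    /** Same lookup, but skipped whenever the bloom says the probe key itself is absent. */
    String closestOrBeforeWithBloom(String probe) {
        return mightContain(probe) ? rows.floor(probe) : null;
    }

    public static void main(String[] args) {
        BloomFloorSketch f = new BloomFloorSketch();
        f.add("region-100");
        f.add("region-200");
        // "region-150" was never stored, yet a row at or before it exists.
        System.out.println("without bloom: " + f.closestOrBefore("region-150"));
        System.out.println("with bloom:    " + f.closestOrBeforeWithBloom("region-150"));
    }
}
```

Whenever the bloom reports the probe key as absent, the gated lookup returns nothing even though an earlier row would have been the correct answer, which is the kind of interference the code comment hints at.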
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018270#comment-17018270 ]

Michael Stack commented on HBASE-21065:
---------------------------------------

Changing the encoding on meta exposes the fact that the ROW_INDEX_V1 encoder does not work on the hbase:meta table; it has hard-coded the user-space CellComparator. Reviewing how CellComparators are instantiated around the codebase, we are inconsistent and encoding context does not have what CellComparator is appropriate. Let me fix this first. Will fix the UT failures we're seeing in the PR here.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
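The comment in this message says the ROW_INDEX_V1 encoder hard-codes the user-space CellComparator and so does not work on hbase:meta. A minimal sketch of why a mismatched comparator breaks an index lookup follows. It is a simplification under stated assumptions: meta rows are modelled as "table,startKey,regionId" strings, the META comparator compares those three fields in turn after splitting on the first and last comma, and USER is plain lexicographic order. MetaComparatorSketch and its members are hypothetical names, not HBase classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: binary search fails when its comparator differs from the sort order. */
public class MetaComparatorSketch {
    /** Plain lexicographic order, standing in for the user-space comparator. */
    static final Comparator<String> USER = Comparator.naturalOrder();

    /** Simplified meta-style order: table name, then start key, then region id. */
    static final Comparator<String> META = (a, b) -> {
        String[] x = split(a), y = split(b);
        for (int i = 0; i < 3; i++) {
            int c = x[i].compareTo(y[i]);
            if (c != 0) return c;
        }
        return 0;
    };

    /** Split on the first and last comma; assumes the row holds at least two commas. */
    static String[] split(String row) {
        int first = row.indexOf(',');
        int last = row.lastIndexOf(',');
        return new String[] {
            row.substring(0, first), row.substring(first + 1, last), row.substring(last + 1)
        };
    }

    /** Binary search that trusts cmp to match the order the rows were written in. */
    static int seek(List<String> rows, String target, Comparator<String> cmp) {
        int lo = 0, hi = rows.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int c = cmp.compare(rows.get(mid), target);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    /** Three assumed rows, sorted the way the meta-style comparator orders them. */
    static List<String> metaSortedRows() {
        List<String> rows = new ArrayList<>(Arrays.asList("t,a!,9", "t,a,1", "t,b,5"));
        rows.sort(META);
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = metaSortedRows();
        System.out.println("rows in meta order: " + rows);
        System.out.println("seek with META: " + seek(rows, "t,a,1", META));
        System.out.println("seek with USER: " + seek(rows, "t,a,1", USER));
    }
}
```

In this sketch the start key "a!" sorts after "a" field by field, but the full row "t,a!,9" sorts before "t,a,1" byte by byte, because '!' is smaller than ','. A search driven by the lexicographic comparator therefore walks the wrong way and misses a row that is present.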
[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)
    [ https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012252#comment-17012252 ]

Michael Stack commented on HBASE-21065:
---------------------------------------

Put up a patch. Let's see how it does.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)